Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

June 2019

Addition of {{rootsee}} to general entries

Discussion moved from Wiktionary:Tea room/2019/June.

@Ankitdimania has been applying {{rootsee}} to general entries such as exclamation in the "Derived terms" section. Just wanted to check if this is appropriate, as I thought {{rootsee}} was intended for use on entries concerning roots only, such as "Reconstruction:Proto-Indo-European/kelh₁-". — SGconlaw (talk) 19:07, 2 June 2019 (UTC)

Thank you SGconlaw for getting general feedback on this. My intention of using {{rootsee}} template is that all words associated with a root are clubbed in one place and then can be used to show etymologically related words with just one edit. Also, any addition/subtraction is dynamic, i.e. if I add a new word to the category, it will get reflected in all the places where related words with root is used. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)
I'm also open to finding any better way to achieve this. Please help. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)
I found one flaw in using rootsee though. The list is expanded by default, and as such, the page is long and difficult to comprehend. If the list could be unexpanded by default, or some other way, it would be easier to read the entry and then expand the list to find etymologically related words. Ankitdimania (talk) 00:37, 3 June 2019 (UTC)
I enjoy seeing this information, but I think it shouldn't be expanded by default. Same with cognates, I prefer a collapsed list. Ultimateria (talk) 02:32, 3 June 2019 (UTC)
Perhaps it makes sense for {{rootsee}} lists to be expanded in root entries? I don't know. — SGconlaw (talk) 03:53, 3 June 2019 (UTC)
  • I'd vote against extensive use of these, specifically any use in English and taxonomic entries. We already have intimidating and tedious lists of cognates in etymology sections wasting our normal users' time. I don't think we need more of this kind of thing. DCDuring (talk) 01:36, 3 June 2019 (UTC)
    Isn't this a multi-entry matter? As such, BP seems like the right place for it. DCDuring (talk) 01:38, 3 June 2019 (UTC)
    I've moved the discussion. — SGconlaw (talk) 03:53, 3 June 2019 (UTC)

Wiktionary:Votes/2016-07/Adding PIE root box. —Suzukaze-c 03:59, 3 June 2019 (UTC)

I'm glad somebody remembered this. This seems like an open-and-shut case. The removal should begin. DCDuring (talk) 08:21, 3 June 2019 (UTC)
So, for clarity's sake, the suggestion is that the previous vote on PIE root boxes suggests that PIE root descendants should similarly not be added to entries via {{rootsee}}? Because the vote didn't actually touch on the current issue. — SGconlaw (talk) 10:38, 3 June 2019 (UTC)
I would interpret it that way, though the wording was absurdly narrow. As worded, it would not forbid a yellow and red display of each PIE-derived related term at different random locations in the entry flashing at seizure-inducing intervals. DCDuring (talk) 12:03, 3 June 2019 (UTC)
Thank you for feedback on this. Also, if we find this information still relevant, I can update the {{rootsee}} template to have the box collapsed by default. Alternatively, we can just put links similar to "English terms derived from the PIE kelh₁" — Ankitdimania (talk) 17:42, 3 June 2019 (UTC)
Yes, we already have the category pages with precisely this information. Or is it just a matter of presenting the information directly in the entry? – Jberkel 05:06, 4 June 2019 (UTC)
I support the use of this template, as it's bound to give a more complete picture of all related terms. Moreover, it avoids duplication of related terms across entries. —Rua (mew) 19:10, 3 June 2019 (UTC)
I checked how to collapse the box, we can change the depth value in https://en.wiktionary.org/w/index.php?title=Template:rootsee&action=edit, to 0. The usage is documented in CategoryTree. I can test and make the update if it is acceptable. Ankitdimania (talk) 03:22, 4 June 2019 (UTC)
  • What is the point of presenting this information in lieu of lists curated by humans? Some of the items included are just silly, eg, blends that don't contain the root.
In any event the extensive information is available to anyone who cares at the entry for the PIE term. There is already a category link in the entries. If it is too exhausting for those few who are interested in PIE cognates, a link to the reconstructed PIE term could be inserted in an Etymology section. DCDuring (talk) 13:58, 4 June 2019 (UTC)
  • To give an example how this information is structurally better, I have a recent edit as an example vs pugnacious#Related_terms (here we can see it takes less effort, is quite compact and is dynamically updatable at other places like pugilism's related entries).
While other approaches give a bit of pain, e.g.
1.) Human curated list is not always up to date, or extensive. A related list is present in one place, but not in other places. Some words are added in a few entries but skipped in other related entries, etc.
2.) Also, Human curated list will require more manual effort to add a new word across all related pages.
3.) Category links in the entries are at the bottom of the page (sometimes after a long scroll through other languages' entries, which is not intuitive). e.g. pen is interestingly related to feather and pinion, which is esoteric due to the long scroll on pen's entry. We can, though, add the category link in related entries itself and that would be preferable to me.
Link to reconstructed PIE terms in Etymology section is a good middle ground here. Another benefit here is that the etymology section would have to be a bit more detailed, which would be nice.
Also, I agree that the list is a bit intimidating and tedious, but listing weird entries such as blends or composites give beautiful insights into the relationship of words. e.g. Insights by Norman Lewis are an interesting read on this. I'm still in favor of listing the {{rootsee}}, just the list should be collapsed to give user an option to expand it if relevant to him/her. With a collapsed list, user can just skip the section and it's not intimidating anymore.
Please LMK how you would want to structure the page. Ankitdimania (talk) 19:25, 9 June 2019 (UTC)
By reverting to the human-curated material. What is the problem with simply having a link to the PIE root and having all the {{rootsee}} there or on its subpages or hidden under the direct derivatives in each language? The romance of the "beautiful insights into the relationship of words" is of no appeal except to the amateur linguists. I am concerned that some of them apparently have no sense of responsibility for making Wiktionary useful to normal users and are instead using this project to indulge themselves. DCDuring (talk) 22:56, 9 June 2019 (UTC)
The related terms for calyx are: apocalypse, calyx and occult. Not very helpful to show that the entry is related to itself. – Jberkel 06:54, 12 June 2019 (UTC)
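The two points raised in this thread (lists that update in one place, and an entry being listed as related to itself) can be illustrated with a minimal sketch. It is not how {{rootsee}} or CategoryTree actually works: `CATEGORY_MEMBERS`, the category key and `related_terms` are hypothetical names standing in for a category lookup, and the only real data are the three terms quoted above for calyx.

```python
# Hypothetical stand-in for a category of terms sharing one root.
# The key is a made-up label; the members are the three terms quoted
# in the discussion for "calyx".
CATEGORY_MEMBERS = {
    "root-category-for-calyx": ["apocalypse", "calyx", "occult"],
}


def related_terms(entry, category, members=CATEGORY_MEMBERS):
    """Return the category's members, sorted, without the entry itself."""
    return sorted(term for term in members.get(category, []) if term != entry)


# The entry no longer lists itself among its related terms.
print(related_terms("calyx", "root-category-for-calyx"))  # ['apocalypse', 'occult']
```

Because every entry reads from the same category data, adding a term there once changes the list shown on all related entries, which is the dynamic behaviour argued for above.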


For ease of reference, what follows is a list of editors who support and do not support the use of {{rootsee}} in ordinary entries. Please add your names to the poll after you have participated in the above discussion to your satisfaction. — SGconlaw (talk) 09:43, 10 June 2019 (UTC)

Do not support

A proposal for WikiJournals to become a new sister project

Over the last few years, the WikiJournal User Group has been building and testing a set of peer reviewed academic journals on a mediawiki platform. The main types of articles are:

  • Existing Wikipedia articles submitted for external review and feedback (example)
  • From-scratch articles that, after review, are imported to Wikipedia (example)
  • Original research articles that are not imported to Wikipedia (example)

Proposal: WikiJournals as a new sister project

From a Wikipedian point of view, this is a complementary system to Featured article review, but bridging the gap with external experts, implementing established scholarly practices, and generating citable, doi-linked publications.

Please take a look and support/oppose/comment! Evolution and evolvability (talk) 04:24, 3 June 2019 (UTC)

Request for rights

Hi there. I came here to request autopatrolled rights. I used to edit here as user:Diego Grez-Cañete, but no longer have access to that account. I also used to have the autopatrolled and rollbacker rights, but lost them a long time ago. The autopatrolled right would allow me to create entries faster, as I am forbidden to create more than two or three entries per minute. My interest, atm, is to create entries for gentilicios of Chile. Nothing that can't be cited. Thanks in advance. --Cuatro Remos (talk) 19:01, 3 June 2019 (UTC)

Done. If you have retired the User:Diego Grez-Cañete account, please update your current user account so that it does not redirect to it. Thanks. — SGconlaw (talk) 19:13, 3 June 2019 (UTC)
Thank you. Have done so. --Cuatro Remos (talk) 19:26, 3 June 2019 (UTC)

User Stephen G. Brown

User:Stephen G. Brown, one of our most active long-time editors, who contributed in a large number of languages and scripts, hasn't been active since the 10th of Feb this year. I was in contact with him outside Wiktionary. He hasn't responded to any messages. It makes me worry about his health. --Anatoli T. (обсудить/вклад) 04:22, 4 June 2019 (UTC)

I hope he's alright. I searched (briefly) for obituaries of people with that name and didn't spot any (except one from 2018, clearly not him). - -sche (discuss) 05:13, 4 June 2019 (UTC)
I always wondered why somebody with such an outstanding command of languages across multiple continents would waste his time here. Maybe he got a hobby. It's our loss. Equinox 06:56, 4 June 2019 (UTC)
Last WP contribution also February. DCDuring (talk) 14:07, 4 June 2019 (UTC)

{{ja-spellings}} doesn't work well with wago at kanji

As is shown by the vote, many editors do not support lemmatizing all wago at kana entries. This means that a large number of wago would be lemmatized at kanji, such as 戦う, and consequently require both {{ja-spellings}} and {{ja-kanjitab}}:


This complicates the entry layout because floating elements are laid right to left, as shown at 敷居. It would be possible to make them stack vertically by using the floatright class, but as Eirikr explains, this causes other problems.

Moreover, the kanji spellings in {{ja-spellings}} are shown in a larger size than the kana spellings. This works fine if wago are lemmatized at kana and the reader wants to look up by kanji, but not if wago are lemmatized at kanji and the reader wants to look up by reading (kana).

Therefore {{ja-spellings}} doesn't work well with wago entries at kanji. Given that a lot of wago entries would be lemmatized at kanji, I would like to remove the template and propose the following scheme instead:

  1. Move the kanji spellings to {{ja-kanjitab}}. Extend {{ja-kanjitab}} to accept "alternative kanji spellings", like this:
    Kanji in this term
    Grade: 4
    Alternative spelling 闘う

    {{ja-kanjitab|たたか|yomi=k}} // followed by a {{ja-see}}
    Kanji in this term
    Grade: S

    Alternative spelling 然して
  2. Move the modern and historical kana spellings to {{ja-pron}}. This might not be feasible at the moment and we can keep them first in the headword templates, but in the long run we can modify {{ja-pron}} to accept both modern and historical kana spellings:
    見違ふ (literary form of 見違える; the entry may also be created at みちがふ/みちがう)

What do you think of this approach?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 16:15, 6 June 2019 (UTC)

(1) I still think that option 1 of the vote is conceptually nicer and simpler. But if we must, perhaps this is alright. Or we could revive the "Alternative spellings" header, keeping more in line with standard entry format (although we do have things like لباس#Persian).
(1, 2) As long as it doesn't get too confusing. —Suzukaze-c 19:05, 6 June 2019 (UTC)
@Suzukaze-c: Um..., actually I never liked headers. They're fine for Wikipedia, but using them in Wiktionary entries is like writing XML like
<key level="1">Note</key>

<key level="2">To</key>

<key level="2">From</key>

<key level="2">Heading</key>

<key level="2">Body</key>
<value>Don't forget the meeting!</value>

for what's usually

    <body>Don't forget the meeting!</body>
--Dine2016 (talk) 16:17, 13 June 2019 (UTC)

CFI issue

I raised a question about inclusion of hyphenated compounds on the CFI talk page here, but now I look again at that page, there seems to be surprisingly little activity, so I am mentioning it here too just in case no one ever sees it in that place. Mihia (talk) 22:03, 6 June 2019 (UTC)

Vote: Language code into reference template names

FYI, I created Wiktionary:Votes/2019-06/Language code into reference template names.

Let's postpone the vote as much as discussion needs, if at all. --Dan Polansky (talk) 08:00, 7 June 2019 (UTC)

Template:archaic synonym of

According to the deletion message, this was deleted per WT:RFDO. But where is the deletion discussion? There's nothing on the talk page and nothing among the pages that link to it either. —Rua (mew) 14:23, 9 June 2019 (UTC)

I assume MK deleted it during a deletion spree. Who cares though, right? --I learned some phrases (talk) 14:12, 10 June 2019 (UTC)

{{top3}} in descendants section (e.g. Proto-Slavic)

(Notifying Rua, Wikitiki89, Atitarev, Benwing2, Guldrelokk, Bezimenen, Jurischroeer, Greenismean2016, Chignon):

Is it useful or useless? Compare descendants with and without {{top3}}. In my view, the advantage is that it shortens long, narrow lists by filling the empty space on the right side, thus making the entry easier to read. —Игорь Тълкачь (talk) 15:20, 9 June 2019 (UTC)

I think it looks ugly, especially when the columns don't line up with the three Slavic subgroups, which is often the case. We should take into account mobile users, for which the single list is better than the columns. Also, just from a customary point of view, single lists are most definitely the norm on Wiktionary, with columns only used in a few cases. —Rua (mew) 15:21, 9 June 2019 (UTC)
From what i see: 1) It's a problem of some browsers: correct in Google Chrome, incorrect in Firefox, Internet Explorer, unknown in Opera. 2) In the mobile version, columns are single. 3) It's hard to count in the Main namespace, but in the Reconstruction namespace there are ~118 cases (e.g. *ćwíšah, *xātun): ~80 (Iranian), ~14 (Turkic), 9 (Germanic), 6 (Algonquian), 5 (Semitic), ~5 (other). Anyway it's just extrapolating from other sections (e.g. Translations, Derived terms, ...). —Игорь Тълкачь (talk) 15:27, 10 June 2019 (UTC)
Correct in Opera. —Игорь Тълкачь (talk) 22:37, 11 June 2019 (UTC)
Exception (incorrect in all 4 browsers): More brokenness in {{top3}}, {{mid3}}. —Игорь Тълкачь (talk) 15:17, 16 June 2019 (UTC)
On mobile, the list reverts to a single column. It would be nice if {{mid3}} actually worked to force breaks. @Erutuon --{{victar|talk}} 02:58, 12 June 2019 (UTC)
@Victar: Columns are overridden by .derivedterms, .term-list { -moz-column-count: 1 !important; -ms-column-count: 1 !important; -webkit-column-count: 1 !important; column-count: 1 !important; } in MediaWiki:Mobile.css. User:DTLHS added that rule in these edits. I'm not confident that mobile screens can always show three columns, so would rather not make a decision in the matter. — Eru·tuon 03:06, 12 June 2019 (UTC)
Yeah, that was added at my behest. I pinged you about {{mid3}} though. --{{victar|talk}} 03:08, 12 June 2019 (UTC)

I see little response here, so now i tried to count users who added/removed {{top3}} in Proto-Slavic (the list below is incomplete):

This discussion is not new, the earliest probably was in 2015/04/15, but it didn't get any objections. 4 years have passed and now i notice that Rua (2019/04/15) started removing {{top3}}. Such actions can lead to numerous edit conflicts, because {{top3}} is used in 1000+ Proto-Slavic entries. —Игорь Тълкачь (talk) 16:43, 16 June 2019 (UTC)

@Rua, Useigor: User talk:Wikitiki89/2018 § top3, mid3, bottom in Proto-Slavic entries. I think I support Rua's proposal to remove the template, because it's broken all too often for me. (I'm Chignon) Canonicalization (talk) 11:37, 17 June 2019 (UTC)
If i'm not mistaken, it's broken since using autobalancing (2017). There are solutions: revert edits (unlikely), create another template (possibly), fix via CSS (uncertain). —Игорь Тълкачь (talk) 18:00, 19 June 2019 (UTC)
@Canonicalization: There was no proposal to remove {{top3}} from all Proto-Slavic. If you start doing so en masse, as you suggest, I will revert your edits. That said, there may be some individual cases in entries where it shouldn't be used, i.e. when only found in 1/3 or 2/3 of branches (as Useigor exampled above). --{{victar|talk}} 19:34, 19 June 2019 (UTC)
Just realized who you are now. Congrats on the 50's username change. --{{victar|talk}} 19:36, 19 June 2019 (UTC)
@Victar: Please don't put words into my mouth: I never said I was going to do anything of the sort unilaterally. Anyway, I see @Useigor has been doing some work on fixing the template (I think?). Canonicalization (talk) 20:17, 19 June 2019 (UTC)
And you the same, citing proposals when none such were made. --{{victar|talk}} 20:24, 19 June 2019 (UTC)
Possible solutions:
  • First solution gets complicated, because it requires additional code to handle tag ol. It's easier to revive the old template, because columns without breaks are half-useless. Therefore i created {{xtop}} and soon i will start replacing incorrect cases with it.—Игорь Тълкачь (talk) 21:08, 13 July 2019 (UTC)

I came up with a programming solution. This bit should fix things in Internet Explorer:

@media screen and (min-width:0\0) and (min-resolution: +72dpi) {
    .derivedterms ul {
        -webkit-column-break-inside: avoid;
        break-inside: avoid;
    }
}

And I created {{mid}} to override auto-breaks and manually add breaks where you want, which is semi working...

--{{victar|talk}} 05:34, 26 June 2019 (UTC)

@Erutuon, did you want to try adding the above to common.css? It might also work for Firefox (I haven't tested it) but to enable it for FF you'll need to add something like @-moz-document url-prefix() --{{victar|talk}} 18:20, 26 June 2019 (UTC)

Proposal: Make Latin the primary script for Serbo-Croatian

@Ivan Štambuk, Crom daba, Vorziblix At the moment, we duplicate a huge amount of information by having the exact same entry at both the Latin and the Cyrillic spelling. For English, we eventually relented and made colour link to color. I think the same should be done for Serbo-Croatian: the Cyrillic spelling should be defined as an alternative spelling of the Latin spelling (or "Cyrillic spelling"), and all information that is already present on the Latin page, such as etymology and pronunciation, should be removed from the Cyrillic page. Descendants and translations should be given only in Latin script, so no more cumbersome nesting. The reason I think Latin should be the primary script is that it's used in all four countries, and appears to be favoured in everyday use even in those that use both. —Rua (mew) 18:29, 11 June 2019 (UTC)

LOL, never going to happen. --{{victar|talk}} 03:09, 12 June 2019 (UTC)
It might work, it's all up to the community. We're almost there with dual re-transliterations into the other side - Roman/Cyrillic and vice versa but more needs to be done.
  1. Cyrillic to Roman converts one-to-one but there are cases when Roman to Cyrillic need to be decided
  2. All inflection tables need to display both Roman and Cyrillic.
  3. Consider using new Serbo-Croatian language-specific templates like {{sh-l}} with automated conversions, compare with {{zh-l}}, e.g. 中國/中国 (Zhōngguó), which displays traditional Chinese, simplified Chinese and transliterations with only traditional Chinese 中國 in the input.
  4. Cyrillic entries shouldn't be deleted, IMO but be converted to soft-redirects.
  5. We should also address how we display translations, there's a lot language-specific templates can do what {{t+}} or {{t}} can't, e.g.:
宮崎県 (みやざきけん) は九州地方 (きゅうしゅうちほう) の南東部 (なんとうぶ) に位置 (いち) する県 (けん)。
Miyazaki ken wa Kyūshū chihō no nantōbu ni ichi suru ken.
Miyazaki prefecture is situated in the south-east part of the Kyūshū region.
The above Japanese example doesn't have any Roman script in the input. I can go on talking about Chinese, Thai, Korean, Khmer templates. --Anatoli T. (обсудить/вклад) 06:14, 12 June 2019 (UTC)
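Point 1 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of any existing Wiktionary module: the function names `cyr_to_lat` and `lat_to_cyr` and the `overrides` parameter are hypothetical. It shows why Cyrillic to Latin is a plain letter-by-letter mapping, while Latin to Cyrillic has to guess whether lj, nj and dž are digraphs and so needs a manual override for some words.

```python
# Serbo-Croatian Cyrillic letters and their Latin (Gaj) equivalents.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}


def cyr_to_lat(text):
    """One-to-one conversion; an uppercase Љ becomes title-case 'Lj'."""
    out = []
    for ch in text:
        lat = CYR_TO_LAT.get(ch.lower())
        if lat is None:
            out.append(ch)  # not a Cyrillic letter: keep unchanged
        else:
            out.append(lat.capitalize() if ch != ch.lower() else lat)
    return "".join(out)


def lat_to_cyr(word, overrides=None):
    """Greedy conversion that reads lj/nj/dž as digraphs.

    `overrides` (hypothetical) maps whole words to a manual Cyrillic
    spelling for the cases where the greedy reading is wrong.
    """
    if overrides and word in overrides:
        return overrides[word]
    out, i = [], 0
    while i < len(word):
        pair = word[i:i + 2].lower()
        if len(pair) == 2 and pair in LAT_TO_CYR:
            key, step = pair, 2
        else:
            key, step = word[i].lower(), 1
        cyr = LAT_TO_CYR.get(key, word[i])
        out.append(cyr.upper() if word[i].isupper() else cyr)
        i += step
    return "".join(out)


print(cyr_to_lat("чутура"))  # čutura
# "injekcija" is in+jekcija, so the greedy digraph reading needs an override.
print(lat_to_cyr("injekcija", overrides={"injekcija": "инјекција"}))
```

Without the override, the greedy reading of "injekcija" yields ињекција, which is the kind of case that has to be decided by hand.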
I think it's a good idea. Trying to keep two separate entries for every S-C lemma and nonlemma form synchronized is absurd. To Anatoli's points:
  1. Sure, there may be times when a manual Cyrillicization needs to override the automatic one. Ought to be trivial.
  2. Agreed.
  3. Agreed.
  4. Well, duh.
  5. See point 1 above.
It feels like a lot of work, but reducing unnecessary duplication will be worth it. —Mahāgaja · talk 15:02, 12 June 2019 (UTC)
I’m inclined to agree with this proposal; making duplicate entries for every term is frustrating, and maintaining them so they stay synchronized is next to impossible. Latin script predominates even in Serbia (at least outside of official contexts). (It’s worth noting, though, that Cyrillic is also "used in all four countries", though its usage share in each has been rapidly declining over the past two centuries.) Of course, we’d still have duplication between entries for ekavian/ijekavian variants, but a two-way duplication is a decided improvement over a four-way one. — Vorziblix (talk · contribs) 15:15, 12 June 2019 (UTC)
Support (my main concern is duplication of content, so I'd be fine too with making the Cyrillic spellings the lemmas). @Victar, are you opposed to the proposal, or do you simply think it won't garner enough support? Canonicalization (talk) 16:08, 12 June 2019 (UTC)

@Fay Freak Canonicalization (talk) 17:41, 12 June 2019 (UTC)

It would be easier to create entries. And indeed the inflection tables need more stuff done automatically. All that saves time. It does not depend on new linking templates, though those are possible. Changing existing Cyrillic templates to display the new order is however critical insofar as they are already out of sync. What happens if some noob has added additional information to the Cyrillic entry that is lacking in the Latin entry? Ivan Štambuk had some machine to detect whether entries are exact mirrors. But if there isn’t the parallelism and this is detected, I am afraid, a human must clean up and move because no machine can decide.
Or what’s with some kind of gadget that could convert a Latin entry into a Serbo-Croatian one, and what with a bot that applies changes done to one side only after some time to other?
At some point I have suggested on Wiktionary already – I’d need to search where – to have some template on Cyrillic pages (or the opposite) that fetch the content or whole language section of the other page (like {{desctree}}): so the scripts look treated equally and one finds all at every page but it isn’t twice the work. This looks the best to me. Ideally one could perhaps even write only one line: instead of putting ==Serbo-Croatian== one calls a template on that line that invokes a module. For why write {{spelling of|sh|Cyrillic|čutura}} plus header plus L3 and L2 and possibly altforms and inflection there too if you can get the whole with one line and then it even displays all there? Least work for editors, greatest gain for readers because they find all at every spelling and virtually synchronously. It would be fun to expand Serbo-Croatian on Wiktionary with such an architecture. That’s the utmost concentration.
Coding is needed for every alternative. Fay Freak (talk) 23:02, 12 June 2019 (UTC)
@Fay Freak: Perhaps labeled section transclusion (LST) could be used to grab the Serbo-Croatian section instead of a Lua function. That would save Lua memory, but maybe not much work otherwise; lots of entries would have to be edited (both the source and target of transclusion) and templates would still have to be made to display the right script on each page. — Eru·tuon 23:31, 14 June 2019 (UTC)
Not to forget the table templates. Those that I have created a month ago like the playing cards one have a based |sc= parameter whereby the display switches the script according to the script code. Others like the colours template just display all scripts. The list templates have subpages for all. This must be regularized for all Serbo-Croatian tables or list templates for the module to get it. Fay Freak (talk) 23:22, 12 June 2019 (UTC)

(in response to what has been posted so far) I'm not proposing to do it all in one go, or to make huge sweeping changes to our infrastructure for SC. The proposal is just to codify our intent to convert Cyrillic spellings to alternative forms of the Latin ones, and to remove the nested structure that is currently present in descendants and translations. This could be done on a page-by-page basis, whenever someone happens to come across it. As long as we know what direction we're going to be moving in on this. I don't really see the need for special-purpose templates for SC, let alone page-copying stuff like what Fay Freak proposes. In proposing Latin as the primary script, I meant that when we link to a SC term, we link only to the Latin script form, which in turn lists the Cyrillic script for those interested. —Rua (mew) 10:21, 13 June 2019 (UTC)

But it is preferable if one can treat the display equivalently. So one can give in a Cyrillic term without needing to follow a soft-redirect just to use the dictionary like a dictionary. Linking Latin and Cyrillic forms at the same time is the least problem. What you propose is also a decrease in usability – why convert Cyrillic spellings to alternative forms of the Latin ones if they are successfully parallel at the time being? Why force people to click on Latin links if we could also link both, and even easily via {{sh-l}}? “I don't really see the need.” Fay Freak (talk) 11:57, 13 June 2019 (UTC)
The same argument could be made in opposition to the proposal entirely, because we're removing definitions and etymologies from the Cyrillic pages. They are "successfully parallel at the time being" too, after all. Why treat them equally in some respects but not in others? —Rua (mew) 12:50, 14 June 2019 (UTC)
I don’t think the entries at the time being are even successfully parallel; in the absence of Ivan Štambuk’s watchful eye a noticeable number have drifted apart, and a good many lack a Cyrillic or Latin counterpart entirely. — Vorziblix (talk · contribs) 16:20, 14 June 2019 (UTC)
But I did not talk about the whole, but about those which are parallel (why—if); in fact I warned that the conversion is manual work for those that are not equal. If a Cyrillic page is made a soft redirect this means a usability decrease. Readers want Cyrillic main entries. But we want to repeat less and we want to save attention from the synchronization. So the idea is to have the whole at the Cyrillic pages but in an automated fashion. The page only as displayed will be copied, in the source code we won’t have to do anything but insert a template which fetches the one page from the other.
Oh, I see; sorry, I misinterpreted a bit. I agree that your proposal (automated fetching) would definitely be ideal if it’s technically workable. I must admit I have no idea whether or not that’s the case. If not, though, I’d still support alt-form-style conversion of entries as preferable to the current system. — Vorziblix (talk · contribs) 23:13, 14 June 2019 (UTC)
I don’t understand what “Why treat them equally in some respects but not in others?” is supposed to mean. I have said it already: Different interests: If we treat them unequally only to save work, we only have to do it to the extent to which it saves work. What I have outlined is a milder measure, and it will be more agreeable. The hard fans of Cyrillic will say: That’s a great measure, you save work but it does not look like the Cyrillic script is inferior or something. Then I am confident that in the future it will never be in question “why we have done that”. It is best if the end user does not see the problems the creator of the application had. Now the end user can choose arbitrarily which alphabet he types in and the system exerts no pressure to use Latin. It would be sad to sacrifice the Cyrillic script for an occasional danger of asynchronicity and some saved repetitions only because we have no module to get the most out of all. And if we have, it enlightens the attitude of any editor who is pro-Cyrillic. Fetching all through a template with a module is danker than creating alternative forms. Think about the editors we possibly lose because they personally dislike being subjected to such a ranking of Latin. If it is done differently, the resistance will be less. Forever – I think it is the best possible solution. Think about our marketing claims: “In Wiktionary you can type in Latin and Cyrillic and it will be equal.” Fay Freak (talk) 18:32, 14 June 2019 (UTC)

"heading" label[edit]

Examples of the "heading" label at draw#verb:

  1. (heading) To move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
  2. (heading) To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.

To me, this "heading" label seems superfluous almost to the point of being confusing. I am tempted to remove it where I see it. What do other people think? Does anyone think the label is useful? Mihia (talk) 20:02, 15 June 2019 (UTC)

  • I don't know what it is supposed to be saying. Feel free to remove it. SemperBlotto (talk) 20:05, 15 June 2019 (UTC)
I agree it is unclear. It's an attempt to group related senses together; my suggestion would be either to provide a definition that is a gloss, or a non-gloss definition, as appropriate, like this:
  1. Senses meaning to move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
  2. To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.
SGconlaw (talk) 22:26, 15 June 2019 (UTC)
AFAICR, it was an invention of @-sche intended to allow grouping of senses where there is no single definition that the contributor can think of that could stand in that location. It is necessary to make it clear to a user that what is on such a line is NOT a definition, even though it is positioned where one would expect a definition. Usually someone comes up with some more helpful label or non-gloss definition than "(heading)", along the lines Sgconlaw suggests. MWOnline and other dictionaries have groups of subsenses that do not have a sense-level definition. It is an artifact of wikiformat ("#" and "##") that we cannot duplicate their numbering scheme. DCDuring (talk) 22:38, 15 June 2019 (UTC)
It is an empirical question whether italics, even with the good wording Sgconlaw uses, are a sufficient indication that the content of the definition line should not be read as a definition. Sadly we don't have reliable means of running an experiment. DCDuring (talk) 22:45, 15 June 2019 (UTC)
I find subsenses confusing, headings or not, but the consensus seems to be that they should get used more (a while ago: Wiktionary:Beer parlour/2015/May#ELE: explicitly ban nested subdefinitions/subsenses? Or allow in rare cases?). In any case, they should at least get mentioned in WT:EL. – Jberkel 23:28, 15 June 2019 (UTC)
I am not keen on the "Senses meaning ..." suggestion. If we are going to use this format, we should just make the heading line read as a broad definition, in my opinion. Mihia (talk) 00:45, 16 June 2019 (UTC)
(@DCDuring's comment above) you are probably thinking of times I've converted entries to use subsenses :) but I always use coherent gloss or non-gloss definitions for the "super-sense"; "heading" labels are not my doing and I remove them when I see them. - -sche (discuss) 01:06, 16 June 2019 (UTC)
@Mihia: I would say give a broad definition wherever possible, but in some cases you may find that a non-gloss definition beginning with “Senses meaning […]” may be more appropriate, so I wouldn’t rule it out. For example, in some entries it seems appropriate to use NGDs like “Nautical senses” or “Senses relating to animals”. — SGconlaw (talk) 02:40, 16 June 2019 (UTC)
More extreme measures may be required for this entry. The highest level groups seem to me to be too abstract. For this word MWOnline has no more than five definitions in any of their groups of definitions, many of which have no master definition. They have nearly 50 definitions, compared to our 39. If we want to extirpate this kind of definition structuring, User:ReidAA, active from early 2013 to late 2015 did (some of?) them and used "structuring" in his edit summaries, AFAICT. DCDuring (talk) 03:32, 16 June 2019 (UTC)
I think we should usually keep the subsenses / top-level senses, and only remove "{{lb|en|heading}}". - -sche (discuss) 04:22, 16 June 2019 (UTC)
That's what we should do in the ambulance, but when we get this particular patient to the hospital, we can't just send it home. DCDuring (talk) 05:06, 16 June 2019 (UTC)
  • OK, well, notwithstanding the other issues, there does not seem to be support for the "heading" label, so I have removed it. Mihia (talk) 17:16, 16 June 2019 (UTC)

Partial blocks deployment to Wiktionary

Hello Wiktionary contributors,

The Wikimedia Foundation Anti-Harassment Tools team is continuing to make improvements to Special:Block with the addition of the ability to set a partial block.

While no functionality will change for sitewide blocks, Special:Block will change to allow for the ability to block a named user account or IP address from:

  • Editing one or more specific page(s)
  • Editing all pages within one or more namespace(s)

Additionally, changes are being made to the design of the user interface for Special:Block to enable admins to set partial blocks.

Until now partial block has only been deployed on Wikipedias. Since Wikipedia administrators found partial blocks useful and there are no serious known issues or bugs, our team is planning to introduce partial blocks into more Foundation wikis. We think it is important to find any bugs that might exist for Wikisource, Wiktionary, Commons, Wikidata, etc. that might not be on Wikipedias so we are going to deploy to a few of these wikis next week with our software developers ready to respond to any issues that may arise.

Currently it is scheduled to SWAT deploy to English Wiktionary on Monday, June 17, 2019.

Let me know if you have any questions or thoughts about introducing partial blocks on Wiktionary. For the Anti-Harassment Tools team. SPoore (WMF) (talk) 22:21, 15 June 2019 (UTC)

We always welcome useful hand-me-downs.
Why is this specifically an Anti-Harassment matter? Is the idea that we can partially implement IBANs by not letting alleged harassers post on individual user talk pages and on Wiktionary discussion space? DCDuring (talk) 23:07, 15 June 2019 (UTC)
Yes, there are times when a full site block might not address the issue as well as other editing restrictions might. One of our working hypotheses is that some users are not given a full site block because it is too harsh. So, partial block is a more targeted option. This page lists some uses.
Additionally, partial blocks are being used to block IP contributors and vandals from one or a few pages to prevent collateral damage to other good users. Also, I can share documentation with you that shows how other wikis are changing their local block policy and writing help pages about setting a partial block. SPoore (WMF) (talk) 12:41, 17 June 2019 (UTC)
Partial blocks are now deployed. Let us know if you notice any issues or have questions.
Here is a description of the use of partial blocks. Also, here is a page that the Italian Wikipedia created about partial blocks. This wiki might want to update its policies accordingly with something similar. SPoore (WMF) (talk) 20:56, 17 June 2019 (UTC)
  • Do we want to make policies first or use these partial blocks and develop policies as needed? DCDuring (talk) 21:16, 17 June 2019 (UTC)
I put something informational on WT:Blocking policy#Partial blocks. Does it need a vote? DCDuring (talk) 21:34, 17 June 2019 (UTC)
I don't think we ought to use partial blocks for limiting interaction between people, if someone is harassing someone else to the point that I would block them from editing a particular talk page I would want to block them from editing altogether. I think there may be potential use in the (rare) cases where otherwise reasonable editors get into revert wars over the content of a particular entry, it could be used to enforce a cooling-off period. Previously we have just protected the entry. Really I don't see much value in this tool here. - TheDaveRoss 12:16, 18 June 2019 (UTC)
While it's true that we're more interested in managing access to languages rather than individual entries, it does allow us to stop certain types of edit wars at a given entry without cutting off access to non-involved parties. There might be an abuse filter or two that we won't have to employ in a few special cases. Chuck Entz (talk) 13:37, 18 June 2019 (UTC)

Languages that "use English"

Wikipedia has articles at hak:Mauritius and nan:Mauritius.

These two links seem to imply something about the nature of Hakka and Min Nan dialects. They are using the "English" spelling as the name of their page for that nation in their Wikipedias. So IS 'Mauritius' a Hakka word? Is it a Min Nan word? If not, is there ANY place on this website where we would link to hak:Mauritius and nan:Mauritius?

--Geographyinitiative (talk) 04:43, 18 June 2019 (UTC)

You're making the mistake of assuming that a Wikipedia in a language is necessarily an accurate reflection of that language. Chinese is a macrolanguage, which means that the dominant lect tends to be used for many topics rather than the people's native lects. In languages such as these without an extensive corpus of writings in every possible subject, it's often impossible to find an authentic native word for everything that requires an article, so Wikipedia editors tend to make stuff up or borrow it from other languages. Of course, that's not unlike the kind of borrowing that happens at some time in the history of every language, but in the case of Wikipedias, the words tend not to be used by actual speakers who aren't writing Wikipedia articles. Not only that, but sometimes authentic words do exist that Wikipedia editors don't know about, so you have made-up words taking the place of real ones. I can't tell you how many times we've had to revert people who add bad translations in languages they don't know, "borrowed" from wikipedias in those languages. Chuck Entz (talk) 05:57, 18 June 2019 (UTC)
That's a pain of many languages but it also reflects the lack of language policies, especially when there is no such thing with mostly spoken dialects. Even Vietnamese, which has a rather peculiar situation with foreign place names, has a native word for Mauritius, it's Mô-ri-xơ, which we want in the dictionary, even if they often "borrow" the English name for country names, e.g. "Mauritius" (which will still be pronounced "Mô-ri-xơ") and many others. I think it's best not to add the "borrowed" spelling. 毛里求斯 (Máolǐqiúsī) has the Min Nan form, even if Min Nan Wikipedia uses "Mauritius". --Anatoli T. (обсудить/вклад) 06:17, 18 June 2019 (UTC)
Why is it better to misrepresent the language as it is actually used by ignoring such forms as Mauritius? Should we delete the English entry of Côte d'Ivoire? I'm not a fan of actually using it, but it certainly is used in English.--Prosfilaes (talk) 16:59, 18 June 2019 (UTC)
Because, eg Hakka dialect may not have an established/approved/standard, etc. name for a small country like Mauritius but they can still have an article about it. It’s not exactly a borrowing but a missing term in a language or a dialect (or editors don’t know the word or don’t care as in the case of Min Nan or Vietnamese). Anatoli T. (обсудить/вклад) 21:48, 18 June 2019 (UTC)
Wikipedia is not a good source under CFI, but this seems to be an evasion. A language using a word from another language for a missing term (or term that's not known to the speakers) is exactly a borrowing. Côte d'Ivoire is not a good English word, with "ô" and "d'", but we record it because it is used.--Prosfilaes (talk) 05:15, 20 June 2019 (UTC)
As to the question of whether there's anywhere we would link to such pages: if there's anywhere we'd link to the Hakka or Min Nan Wikipedia article on a country if its name were spelled in Chinese characters (such as: I see we add such links to 中國, so I guess we'd add them to 毛里求斯), then I guess for this country the target of our links would be that Latin-character string, since that's where those Wikipedias put their entries on that country... even if we decided we should "alias" them like [[w:nan:Mauritius|毛里求斯]] (or to link to nan:毛里求斯, if that entry existed as a redirect to the entry where the content is)... - -sche (discuss) 16:46, 18 June 2019 (UTC)
@-sche: In the case of Hakka Wikipedia, linking to "Chûng-koet" is appropriate because Hakka Wikipedia is written mostly in Pha̍k-fa-sṳ (PFS) and the PFS transliteration of 中國/中国 (Zhōngguó) is "Chûng-koet", but "Mauritius" is not a transliteration of 毛里求斯 (Máolǐqiúsī) in any Chinese lect, nor is it a loanword. --Anatoli T. (обсудить/вклад) 04:34, 19 June 2019 (UTC)
If Hakka is written in PFS, then PFS is no longer a transliteration; it is a script, and writings in it should be recorded as such, no matter what other scripts might show.--Prosfilaes (talk) 05:15, 20 June 2019 (UTC)
The Bible along with psalms has been translated into PFS along with Chinese characters for Hakka. Wikipedia dialect editors like to write their articles in PFS and make up new words but we go by dictionaries and our CFI and I don't think "Mauritius" will be attested in text written in the Hakka dialect. --Anatoli T. (обсудить/вклад) 05:56, 20 June 2019 (UTC)

Species names - sum of parts?

What lexicographic information do binomial specific names have that isn't in their two parts? DTLHS (talk) 01:45, 19 June 2019 (UTC)

Does penelope have a meaning apart from Anas so that one can know what kind of dabbling duck Anas penelope is simply by knowing both the generic and the species name? It seems to me that species' names have no lexical value on their own. They have to be used in tandem with generic names to mean anything. For instance, townsendii does not function as an adjective describing Scapanus or Microtus. It doesn't actually tell me anything more specific about the vole or mole than Scapanus or Microtus do, unless I already know something about Townsend's vole/mole. Andrew Sheedy (talk) 02:17, 19 June 2019 (UTC)
Indeed, this is true to the extent that any taxonomic names are lexicographically relevant (some authorities would say they are not; we choose to include them). —Μετάknowledgediscuss/deeds 02:21, 19 June 2019 (UTC)
True. I don't see them as being vastly different than common names, however, and I would say that those are about as inclusion-worthy as fried egg. Andrew Sheedy (talk) 02:32, 19 June 2019 (UTC)
We have chosen not to include the proper names of individuals with rare exceptions. We, like some other standard 'unabridged' dictionaries, have chosen to include these proper names of taxonomic entities. We also have such entries as Fermat's little theorem, Fermat's Last Theorem. DCDuring (talk) 02:40, 19 June 2019 (UTC)
  • I do often wonder why DCDuring spends so much time making species names, and would like to tell him it is pretty stupid and that he should stop, but he argues much better than me and I'd probably get blocked again. --I learned some phrases (talk) 22:17, 23 June 2019 (UTC)
    Thanks for the compliment buried in your comment. Taxonomic entries are part of many languages (hence Translingual). Species names are useful to clarify what vernacular names in various languages and regions are actually referring to. Taxonomic entries are good places to have things like images and links to specialized external sources. DCDuring (talk) 22:57, 23 June 2019 (UTC)
Species names, like any proper noun, are not SOP- because they refer to specific entities. Sometimes they're descriptive enough to distinguish the entity from all others: for instance, Aristolochia californica is the only species of Aristolochia native to California, and it's not native anywhere else. Mostly, though, you can't identify a species from the literal meaning of the species name, alone: sometimes the name is inaccurate- Simmondsia chinensis is native to the southwestern US, not China- and sometimes the description isn't unique to the one species. For instance, there are a number of species of white water lily, but only one Nymphaea alba. Then there are species that are named after someone or something that has nothing to do with that species, and those that are completely arbitrary. If you need more proof, consider: I can take a species with the specific epithet "minutiflora" because it has tiny flowers and develop a cultivar with huge flowers, but that doesn't change its specific epithet to "grandiflora". A favorite example is Eriogonum inflatum, which gets its name from its odd-looking swollen, hollow stems. Someone published a description for Eriogonum inflatum var. deflatum based on specimens without that characteristic, but, sadly, the variety is apparently not taxonomically valid. Chuck Entz (talk) 03:49, 24 June 2019 (UTC)


Hyphenation

Discussion moved from Wiktionary talk:English entry guidelines#Hyphenation.

What are the guidelines regarding hyphenation data? I've just come across the one in cromulent which seems phonetic rather than orthographic --Backinstadiums (talk) 10:36, 21 June 2019 (UTC)

I don't think we have guidelines at the moment. I prefer hyphenation that is based on the etymology of the word rather than how it is pronounced, where this is feasible. I suggest raising the issue at the Beer Parlour for general discussion. — SGconlaw (talk) 11:12, 21 June 2019 (UTC)
@Sgconlaw: I do not know how to move this post; can you do it? --Backinstadiums (talk) 14:51, 21 June 2019 (UTC)
@Backinstadiums: Done. — SGconlaw (talk) 17:53, 21 June 2019 (UTC)
Hyphenation should probably be based on references, or perhaps we could look for real-world examples. See also Wiktionary:Tea room/2019/April#Hyphenation_at_supercalifragilisticexpialidocious. - -sche (discuss) 18:20, 21 June 2019 (UTC)
Surely we can find sources with general rules for hyphenation rather than looking for specific hyphenated examples for each word. DTLHS (talk) 18:26, 21 June 2019 (UTC)
I looked at the hyphenation of words of the form “XVCulent” in OneLook dictionaries. Most of the time it is like XVC·u·lent: crap·u·lent; fec·u·lent; flat·u·lent; flor·u·lent; muc·u·lent; op·u·lent; poc·u·lent; strid·u·lent; tem·u·lent; vir·u·lent. But not always: frau·du·lent; lu·cu·lent; lu·tu·lent; pu·ru·lent; ro·ru·lent. In one case I saw disagreement: while the American Heritage has truc·u·lent, Merriam–Webster has tru·cu·lent.
I see no clear pattern. Etymology is clearly not a guiding principle here, otherwise we’d see, e.g., luc·u·lent and pur·u·lent.  --Lambiam 23:25, 21 June 2019 (UTC)
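As an illustrative aside, the tally in the comment above can be reproduced with a short sketch. The word list is copied from that comment; the function names `break_pattern` and `group_by_pattern` and the pattern labels are assumptions made up for this example, and nothing here is an actual Wiktionary tool.

```python
# Hyphenations exactly as quoted in the comment above (the middle dot marks a break).
HYPHENATIONS = [
    "crap·u·lent", "fec·u·lent", "flat·u·lent", "flor·u·lent", "muc·u·lent",
    "op·u·lent", "poc·u·lent", "strid·u·lent", "tem·u·lent", "vir·u·lent",
    "frau·du·lent", "lu·cu·lent", "lu·tu·lent", "pu·ru·lent", "ro·ru·lent",
]


def break_pattern(hyphenated: str) -> str:
    """Classify a "...ulent" hyphenation by the chunk just before "lent".

    "VC·u" means the consonant stays with the preceding vowel (crap·u·lent);
    "V·Cu" means the consonant moves to the "u" (frau·du·lent).
    Both labels are hypothetical names for this sketch.
    """
    parts = hyphenated.split("·")
    middle = parts[-2]  # the chunk immediately before "lent"
    return "VC·u" if middle == "u" else "V·Cu"


def group_by_pattern(words):
    """Sort the hyphenated words into the two break patterns."""
    groups = {"VC·u": [], "V·Cu": []}
    for word in words:
        groups[break_pattern(word)].append(word)
    return groups


if __name__ == "__main__":
    for pattern, members in group_by_pattern(HYPHENATIONS).items():
        print(pattern, len(members), ", ".join(members))
```

The sketch only counts what the dictionaries were reported to print; it does not explain why a given word falls into one group or the other.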
Note that we have "Wiktionary:Pronunciation#Hyphenation", which states: "British hyphenation more often considers word etymologies, whereas American English hyphenation more often follows syllabification". So far I've generally been hyphenating on the basis of etymology (unless the etymology is unclear or, for some reason, impractical to follow), with the caveat that a word should not be hyphenated in such a way as to leave a single letter at the start or end of a line (so, per·se·cut·ion rather than per·se·cu·tion, and not *e·squa·mul·ose – esqua·mul·ose to be used instead). I suppose, if there is consensus, we could provide both etymology-based and syllable-based hyphenation as alternatives. — SGconlaw (talk) 05:51, 22 June 2019 (UTC)
There really don't seem to be clear-cut rules for where to hyphenate in English, and as noted above there are discrepancies between en-GB and en-US, and sometimes even between dictionaries of the same national variety. The rule I learned (phonologically based for en-US) is generally to hyphenate after vowels, except that a stressed checked vowel should be followed by a consonant. That rule would explain crap·u·lent, fec·u·lent, flat·u·lent, muc·u·lent, op·u·lent, poc·u·lent, strid·u·lent, tem·u·lent, vir·u·lent as well as frau·du·lent since /ɔ(ː)/ is a free vowel, not a checked one. (I don't know how to pronounce luculent and lutulent, and intervocalic r is tricky in American English since we've lost most contrasts between checked and free vowels in that context.) At any rate, my personal intuition is for crom·u·lent. —Mahāgaja · talk 06:12, 22 June 2019 (UTC)
Perhaps we should consider whether this is worthwhile information to provide at all, given how the rules do not appear to be consistent from reference to reference (even within dictionaries of one dialect), and are surely inconsistent in real-world usage, which we would theoretically privilege, being descriptivist... - -sche (discuss) 17:45, 26 June 2019 (UTC)

Pitch in to help with FWOTD.

There are consistently not enough Foreign Word of the Day nominations ready for me to set them far in advance, but I will have less time to dedicate to Wiktionary in the coming months, and I often need other editors' help when it comes to languages I'm not comfortable with. I don't want to annoy people too much, so if you're willing for me to ping you with various requests related to the languages you know, please add your username at User:Metaknowledge/FWOTD help. —Μετάknowledgediscuss/deeds 22:28, 21 June 2019 (UTC)

Pinyin conventions

User:Geographyinitiative has been insisting on being inclusive in terms of different Pinyin conventions (as presented in various dictionaries and other sources), including but not limited to capitalization and hyphenation. However, having both pǔtōnghuà shuǐpíng cèshì and Pǔtōnghuà Shuǐpíng Cèshì at 普通話水平測試 or both huàshétiānzú and huàshé-tiānzú at 畫蛇添足 just looks unprofessional and confusing. I see a need in formulating a set of guidelines on Pinyin to ensure consistency across entries. (See User talk:Justinrleung#perspective on capitalization for the latest discussion on this.) — justin(r)leung (t...) | c=› } 03:59, 23 June 2019 (UTC)

I'm sorry! If it's too inconvenient forget it! --Geographyinitiative (talk) 04:01, 23 June 2019 (UTC)
@Geographyinitiative: At Wiktionary_talk:About_Chinese#Capitalisation_of_demonyms_and_language_names_-_a_mini-vote and elsewhere you expressed the view that ALL possible pinyin variations (pinyin, space, capitalisations, numbering), etc. should be included, which I find disturbing and unsustainable. User:Suzukaze-c seems to back you up. Do you still hold this view? --Anatoli T. (обсудить/вклад) 05:17, 24 June 2019 (UTC)
As for the question for standardisation, even if pinyin is just a tool and not a writing system, we need to set standards and conventions and stick to them. Adding hard-redirects to the agreed version is fine, IMO but the exposed/displayed pinyin should be consistent. It's impossible to include all possible (even attested) romanisations. --Anatoli T. (обсудить/вклад) 05:21, 24 June 2019 (UTC)
Yeah- everything. It's disturbing to me because it's the reality, and it's not in the dictionary. --Geographyinitiative (talk) 09:45, 24 June 2019 (UTC)
“It is a damn poor mind that can think of only one way to spell a word.” ― Andrew Jackson. Same with Chinese romanizations. Let all the forms that can be included be included (with appropriate notation telling us what the differences mean). It is a kind of far-fetched long term goal. No standard is better than the standard of "everything". --Geographyinitiative (talk) 10:09, 24 June 2019 (UTC)
@Geographyinitiative: You will have to learn to work cooperatively and stop pushing your point of view in actual edits when it's controversial and other Chinese editors disagree with you and you engage in edit-warring. Dictionaries don't work like this - all possible standards and variations included. I already had to protect the page 吃飽/吃饱 (chībǎo) from your edits. I don't want to have to block you and I don't have time to read your endless ranting. --Anatoli T. (обсудить/вклад) 12:27, 24 June 2019 (UTC)
@Atitarev I don't know about dictionaries, but we do work like this. If there's a spelling that's attestably used, we do cite it, no matter what the standard or variation.--Prosfilaes (talk) 03:01, 28 June 2019 (UTC)
Pinyin isn't written Chinese. It's a way to transcribe written Chinese for people who don't know the characters or who don't know a particular reading. It also allows people to search for the character entries if they know the pronunciation. For the latter use, having both an uppercase and a lower case entry means that people only find the case form they search on. That means that you either have to make one case form a redirect to the other, or you have to make absolutely sure that when a character spelling gets added to one case form, it also gets added to the other- good luck with that. Capitalization of Pinyin is a matter of style, not of substance, so it's silly to get all wrapped up in it- I somehow doubt that you'll find separate entries for both uppercase and lowercase pinyin spellings in the same dictionary. Chuck Entz (talk) 14:00, 24 June 2019 (UTC)
Chinese written in Pinyin isn't written Chinese? That's transparently false. Transcription is writing.--Prosfilaes (talk) 03:01, 28 June 2019 (UTC)
How wonderful it would be for you if everybody stopped using their squigglies and used Latin letters instead! No, the transcription is NOT writing and any dictionary can choose the transcription/transliteration, even if it's based on an existing standard. --Anatoli T. (обсудить/вклад) 03:13, 28 June 2019 (UTC)
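To make the redirect idea from this thread concrete, here is a minimal sketch, assuming a hypothetical helper `pinyin_lookup_key` that is not part of any Wiktionary module. It collapses the capitalisation, spacing and hyphenation variants quoted at the top of the section into a single lookup key, while leaving tone marks untouched. Whether such variants should be merged at all is exactly what is disputed above, so this illustrates one option rather than an agreed convention.

```python
import unicodedata


def pinyin_lookup_key(pinyin: str) -> str:
    """Return a key that ignores case, spaces and hyphens (hypothetical helper).

    Tone marks are kept, so the key still distinguishes different readings.
    """
    # Normalise to composed form so "ǔ" compares equal however it was typed.
    composed = unicodedata.normalize("NFC", pinyin)
    lowered = composed.lower()
    return "".join(ch for ch in lowered if ch not in " -")


if __name__ == "__main__":
    variants = [
        "pǔtōnghuà shuǐpíng cèshì",
        "Pǔtōnghuà Shuǐpíng Cèshì",
        "huàshétiānzú",
        "huàshé-tiānzú",
    ]
    for variant in variants:
        print(variant, "->", pinyin_lookup_key(variant))
```

With a key like this, a search for either case form would land on the same entry, which addresses the concern that people only find the case form they search on.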

MANDARIN CHINESE PINYIN: PRONUNCIATION, ORTHOGRAPHY AND TONE by Sunny Ifeanyi Odinye --Backinstadiums (talk) 15:02, 24 June 2019 (UTC)

"Pinyin isn't written Chinese" Irrelevant, but okay fine. I have no opinion on the issue, nor do I need one.
"It's a way to transcribe written Chinese for people who don't know the characters or who don't know a particular reading." Sure, it has that function.
"For the latter use, having both an uppercase and a lower case entry means that people only find the case form they search on." Why? Add a 'see also' to the top of the page and problem solved.
"That means that you either have to make one case form a redirect to the other, or you have to make absolutely sure that when a character spelling gets added to one case form, it also gets added to the other- good luck with that." It would take a lot of work and there would be a lot of mistakes involved. That is the nature of all human activity.
"Capitalization of Pinyin is a matter of style, not of substance, so it's silly to get all wrapped up in it" I know you believe that. The bald assertion of it does not prove it to be accurate.
"I somehow doubt that you'll find separate entries for both uppercase and lowercase pinyin spellings in the same dictionary" Correct, because the other dictionaries are trying to bend the readers to their view of Hanyu Pinyin rather than be a dictionary like Wiktionary. --Geographyinitiative (talk) 21:11, 24 June 2019 (UTC)
Let us have greater respect for the variant and historical forms of Hanyu Pinyin. --Geographyinitiative (talk) 21:37, 24 June 2019 (UTC)
Let's follow agreed conventions and styles, let's focus on the language itself, not the tools to transliterate it. Let us not act unilaterally and let's stop shouting. If you can formulate votes, make a vote but don't force us to protect pages from your controversial edits or block you. --Anatoli T. (обсудить/вклад) 11:44, 26 June 2019 (UTC)
I find this a very problematic post. Geographyinitiative is not shouting; there's no uppercase there. If you mean something else, well, it's not clear what you mean. Making unclear complaints about their argument style and threatening to block someone is not consistent with discussion about the issue under hand.--Prosfilaes (talk) 03:07, 28 June 2019 (UTC)
You can discuss away, as long as you don't engage in controversial edits people have been opposing and it doesn't match the existing policies and practices, as long as you don't edit-war. He has been shouting. He knows what we are talking about. --Anatoli T. (обсудить/вклад) 03:13, 28 June 2019 (UTC)

Mysterious messages on Facebook

Some years ago (in April 2013), I created Facebook pages for Wiktionary and Wikisource, just for fun. Several people are co-administrators of these pages and can post messages. (But most often, nothing is posted.) The page for Wiktionary has 1293 followers and the one for Wikisource has 2928. These are not impressive numbers, but all is nice and good. Lately, however, an increasing number of people send personal messages to the Wiktionary page containing a single word. Apparently, they believe this is some look-up service. Who gave them that idea? How can we make it stop? Another co-admin and I have started to ask the people who send such messages, but we have received no useful responses so far. One person mentioned "a messenger app", but could not be more specific than that. --LA2 (talk) 16:28, 25 June 2019 (UTC)

You could make it stop by deleting the page. I object to having any official or semi-official presence on Facebook. DTLHS (talk) 16:44, 25 June 2019 (UTC)
In the past couple of years, we've gotten tons of accidental bad edits from mobile-network IPs in India, Pakistan, and some other countries. I think there's some kind of dictionary app that sends people to Wiktionary when they search for a word, and many people in these countries don't have the English skills to understand that they're at a third-party website and not in something internal to the app, or at some kind of service that comes with their mobile account. We have abuse filters that stop edits with nothing but x's (probably kids looking for porn) and page creations that are too short to be actual content. We also get a lot of bogus new-user-pages where people post social-media-style profiles as if they're on Facebook or something. Is there any kind of link to the Facebook page anywhere on Wiktionary? If so, that may be how they're getting there. Chuck Entz (talk) 02:41, 26 June 2019 (UTC)
There’s a link to the Facebook page in this thread, but perhaps you mean the other way around. The About page on Facebook has https://wiktionary.org, which redirects to https://www.wiktionary.org. The first link on that landing page is to the English Wiktionary.  --Lambiam 09:06, 26 June 2019 (UTC)
Right, better delete the Facebook presence for your and Wiktionary’s benefit. It is diametrically opposed to the GPL spirit, and any efforts put into Facebook are wasted. It only supports enslavement by the algorithm instead of responsible use of information. Fay Freak (talk) 14:45, 27 June 2019 (UTC)
Maybe deactivate the account instead of deleting the page, to prevent the "wiktionary.org" name from being taken by somebody else. – Jberkel 09:46, 29 June 2019 (UTC)
It would seem unnecessary to delete the page. Keep it going. BTW, Wonderfool used to control a Twitter account to promote the use of Wiktionary. It was very stimulating --I learned some phrases (talk) 06:59, 30 June 2019 (UTC)
I don't think FB has much in the way of anti-spam features. (They're happy to hide legit posts they don't like, though.) Maybe you could make some bot or API thing to delete single-word posts from your group, if they allow such things. Equinox 18:14, 30 June 2019 (UTC)
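The check such a bot would need is small. Below is a minimal sketch under stated assumptions: `is_single_word_message` and `messages_to_review` are hypothetical names, the messages are assumed to be already available as plain strings, and no Facebook API call is shown or implied, since it is not established here that the platform permits this.

```python
def is_single_word_message(text: str) -> bool:
    """True if the message consists of exactly one whitespace-delimited word."""
    return len(text.split()) == 1


def messages_to_review(messages):
    """Return the messages that look like dictionary look-up attempts."""
    return [message for message in messages if is_single_word_message(message)]


if __name__ == "__main__":
    inbox = ["cromulent", "What does this word mean?", "  hello  ", ""]
    print(messages_to_review(inbox))
```

Flagging for review rather than deleting outright seems the safer design, since a one-word message could also be a genuine greeting.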

Kazakh transliteration update

I think we can now update to the latest Kazakh romanisation (2018) in Module:kk-translit and WT:KK TR. Calling on everyone involved so far: @Vtgnoq7238rmqco, Metaknowledge, Rua. --Anatoli T. (обсудить/вклад) 01:19, 26 June 2019 (UTC)

I disagree. Our romanisation need not match the schemes used in Kazakhstan, which are 1) intended as a primary script rather than as a romanisation, 2) in a state of flux and poorly clarified, and 3) more flawed than the romanisations we use for Cyrillic-script languages. Vtgnoq7238rmqco, who actually works on Kazakh, commented elsewhere that they think this as well. —Μετάknowledgediscuss/deeds 04:42, 28 June 2019 (UTC)
@Metaknowledge: User:Vtgnoq7238rmqco disagreed because of some issues with the new script. Kazakhstan doesn't use either the old or the new romanisation, it's just a... well romanisation. One of the schemes was occasionally used in Kazakhstan and the new one is likely to be promoted and used and our users may want to be more familiar with it. --Anatoli T. (обсудить/вклад) 05:28, 28 June 2019 (UTC)
Those are just my points 2 and 3 in the diff by Vtgnoq7238rmqco you linked to. —Μετάknowledgediscuss/deeds 01:04, 29 June 2019 (UTC)

Requesting a way to automatically transclude a header on Reconstruction pages

I've started the task T226846 on Phabricator requesting a way to automatically add the notice at the top of every Reconstruction page, so we don't have to manually add {{reconstruction}}. This would probably involve Extension:PageNotice. Previous discussions on this topic: Wiktionary:Beer parlour/2017/September#Proposal: install mw:Extension:PageNotice, Wiktionary:Grease pit/2017/June#Citations at citations, Wiktionary:Grease pit/2018/September#{{reconstruction}}. — Eru·tuon 18:21, 28 June 2019 (UTC)

Additional Form of Romanization of Mandarin Chinese

In 2002, the government of the Republic of China (Taiwan) approved the usage of the so-called "Tongyong Pinyin" system. Until at least 2008, the system was used throughout the island of Taiwan. I was taught low/mid-level Mandarin Chinese with a book that used Tongyong Pinyin and Bopomofo. To the vast majority of Chinese people in mainland China and around the world, the Tongyong Pinyin romanization system is meaningless and obsolete. But to the people in southwestern Taiwan, this system can still be used in some contexts. I have seen Tongyong Pinyin on printed documents at the Immigration office in Taipei (northern Taiwan). Tongyong Pinyin is much more commonly used than Gwoyeu Romatzyh in my experience. There's no reason to ignore Gwoyeu Romatzyh or Tongyong Pinyin. For these reasons, I would like to find out how to add Tongyong Pinyin to the zh-pron box under Mandarin. If you can show me how to do it, I will do it myself. I can't claim to be very familiar with the system, but I have a reliable source for matching Hanyu Pinyin syllables to Tongyong Pinyin syllables: http://www.pinyin.info/romanization/tongyong/basic.html. If you don't trust me to do it, then I would ask you to add it yourself. Minority and historical perspectives should be included in an appropriate way. Any help would be appreciated. --Geographyinitiative (talk) 04:51, 29 June 2019 (UTC)

@Geographyinitiative: I'd support including Tongyong Pinyin. I've also had textbooks from my Chinese school that show Tongyong Pinyin alongside Zhuyin (and if I remember correctly, Hanyu Pinyin). This should be automated, though, so it will require some fiddling around with code. We can add this to our list of tasks. — justin(r)leung (t...) | c=› } 05:55, 29 June 2019 (UTC)
@Justinrleung: Can it be added only to monosyllabic entries and only in expanded mode, just like Wade-Giles? We had discussions about overcrowding of romanisations. Only the mainstream Hanyu Pinyin and Zhuyin should show by default. Gwoyeu Romatzyh is even less popular and known than Wade-Giles, so GR should also be hidden by default. Anatoli T. (обсудить/вклад) 07:03, 29 June 2019 (UTC)
@Atitarev: I would definitely have it only in expanded mode. That said, I don't think it'd be overcrowded to show them all, even in multisyllabic entries. It'd be nice to have Wade-Giles and Tongyong Pinyin in all entries. — justin(r)leung (t...) | c=› } 07:09, 29 June 2019 (UTC)
@Justinrleung: Many romanisations are so limited in their usage - in terms of time and territory. Anyway, if you add TP, please add WG as well. Anatoli T. (обсудить/вклад) 07:15, 29 June 2019 (UTC)
@Justinrleung: Yes, and thank you. I will add it there. But if you have a moment, I would like to do this immediately- it seems like a job of mindless copy-pasting for an idiot that I could do no problem. What page could I go to to add this? --Geographyinitiative (talk) 06:18, 29 June 2019 (UTC)
@Geographyinitiative: Sorry to crush your dreams of mindless copy-pasting, but ideally it should be automatically generated given the Hanyu Pinyin, just like Gwoyeu Romatzyh is automatically generated now. — justin(r)leung (t...) | c=› } 06:23, 29 June 2019 (UTC)
How can I do that? --Geographyinitiative (talk) 06:34, 29 June 2019 (UTC)
@Geographyinitiative: I don't think you know how to. There needs to be some code added to MOD:cmn-pron to make this happen. — justin(r)leung (t...) | c=› } 06:43, 29 June 2019 (UTC)
You're correct, I don't know how to do it. I will try to figure it out. --Geographyinitiative (talk) 07:01, 29 June 2019 (UTC)
Working on it. This is what I have so far:

function export.py_tongyong(text)
    -- Hanyu Pinyin initial -> Tongyong Pinyin initial ('' is the zero initial)
    local tongyong_initial = {
        ['b'] = 'b', ['p'] = 'p', ['m'] = 'm', ['f'] = 'f',
        ['d'] = 'd', ['t'] = 't', ['n'] = 'n', ['l'] = 'l',
        ['g'] = 'g', ['k'] = 'k', ['h'] = 'h',
        ['j'] = 'j', ['q'] = 'c', ['x'] = 's',
        ['z'] = 'z', ['c'] = 'c', ['s'] = 's', ['r'] = 'r',
        ['zh'] = 'jh', ['ch'] = 'ch', ['sh'] = 'sh',
        [''] = '',
    }

    -- Hanyu Pinyin final -> Tongyong Pinyin final ('' is the empty final)
    local tongyong_final = {
        ['yuan'] = 'yuan', ['iang'] = 'iang', ['yang'] = 'yang', ['uang'] = 'uang', ['wang'] = 'wang', ['ying'] = 'ying', ['weng'] = 'wong', ['iong'] = 'yong', ['yong'] = 'yong',
        ['uai'] = 'uai', ['wai'] = 'wai', ['yai'] = 'yai', ['iao'] = 'iao', ['yao'] = 'yao', ['ian'] = 'ian', ['yan'] = 'yan', ['uan'] = 'uan', ['wan'] = 'wan', ['üan'] = 'yuan', ['ang'] = 'ang', ['yue'] = 'yue', ['wei'] = 'wei', ['you'] = 'you', ['yin'] = 'yin', ['wen'] = 'wun', ['yun'] = 'yun', ['eng'] = 'eng', ['ing'] = 'ing', ['ong'] = 'ong',
        ['yo'] = 'yo', ['ia'] = 'ia', ['ya'] = 'ya', ['ua'] = 'ua', ['wa'] = 'wa', ['ai'] = 'ai', ['ao'] = 'ao', ['an'] = 'an', ['ie'] = 'ie', ['ye'] = 'ye', ['uo'] = 'uo', ['wo'] = 'wo', ['ue'] = 'yue', ['üe'] = 'yue', ['ei'] = 'ei', ['ui'] = 'uei', ['ou'] = 'ou', ['iu'] = 'iou', ['en'] = 'en', ['in'] = 'in', ['un'] = 'un', ['ün'] = 'yun', ['yi'] = 'yi', ['wu'] = 'wu', ['yu'] = 'yu',
        ['a'] = 'a', ['e'] = 'e', ['o'] = 'o', ['i'] = 'i', ['u'] = 'u', ['ü'] = 'yu', ['ê'] = 'e', [''] = 'ih',
    }

    -- optional -r suffix
    local tongyong_er = {
        ['r'] = 'r', [''] = '',
    }

    -- tone number -> tone mark (first tone is unmarked)
    local tongyong_tone = {
        ['1'] = '', ['2'] = 'ˊ', ['3'] = 'ˇ', ['4'] = 'ˋ', ['5'] = '˙', ['0'] = '˙',
    }

    -- TODO: the logic that applies these tables to `text` is not written yet
end

--Geographyinitiative (talk) 03:23, 30 June 2019 (UTC)

@Geographyinitiative: You should probably put it in a module (something like MOD:User:Geographyinitiative/tongyong) for your own testing (and link us to it to avoid clutter on this discussion). I'm not sure if this is the best way to handle the conversion, though, since Tongyong Pinyin is so similar to Hanyu Pinyin. I'll take a crack at this later. — justin(r)leung (t...) | c=› } 03:33, 30 June 2019 (UTC)
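
Since Tongyong Pinyin differs from Hanyu Pinyin in only a handful of places, a rule-based conversion may be simpler than a full lookup table. The following is only a rough Python sketch of that idea, not the code used in MOD:cmn-pron (which is written in Lua). It handles toneless syllables only, the function and table names are hypothetical, and the mappings are limited to the differences listed in the table posted above, plus the assumption that the "ih" final applies after zh, ch, sh, r, z, c and s.

```python
# Hypothetical helper names; toneless Hanyu Pinyin syllables only.

# Initials that differ between the two systems (all others are identical).
INITIAL_MAP = {"zh": "jh", "q": "c", "x": "s"}

# Finals that differ, taken from the table posted in this thread.
FINAL_MAP = {
    "ui": "uei", "iu": "iou", "iong": "yong",
    "ü": "yu", "üe": "yue", "üan": "yuan", "ün": "yun",
}

# Whole syllables that differ, also from the table in this thread.
WHOLE_MAP = {"weng": "wong", "wen": "wun"}

# Assumption: "i" after these initials is the apical vowel, written "ih".
APICAL_INITIALS = {"zh", "ch", "sh", "r", "z", "c", "s"}
# After j, q, x a written "u" stands for "ü" in Hanyu Pinyin.
PALATAL_INITIALS = {"j", "q", "x"}
SINGLE_INITIALS = "bpmfdtnlgkhjqxzcsr"


def split_syllable(syllable):
    """Split a Hanyu Pinyin syllable into (initial, final)."""
    for digraph in ("zh", "ch", "sh"):
        if syllable.startswith(digraph):
            return digraph, syllable[2:]
    if syllable and syllable[0] in SINGLE_INITIALS:
        return syllable[0], syllable[1:]
    # Syllables starting with a vowel, y or w have no initial here.
    return "", syllable


def hanyu_to_tongyong(syllable):
    """Convert one toneless Hanyu Pinyin syllable to Tongyong Pinyin."""
    if syllable in WHOLE_MAP:
        return WHOLE_MAP[syllable]
    initial, final = split_syllable(syllable)
    new_initial = INITIAL_MAP.get(initial, initial)
    if initial in APICAL_INITIALS and final == "i":
        return new_initial + "ih"
    if initial in PALATAL_INITIALS and final.startswith("u"):
        final = "ü" + final[1:]
    return new_initial + FINAL_MAP.get(final, final)


for example in ("zhong", "qi", "xue", "shi", "dui", "weng"):
    print(example, "->", hanyu_to_tongyong(example))
```

Tone marks and any special cases not in the table above would still need to be handled separately.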

July 2019


C'mon guys, it took us 12 days to revert a page move from español to Español. That's really lame. --I learned some phrases (talk) 21:23, 3 July 2019 (UTC)

I agree. Idea: most pages should be move-protected (admins only), seeing as they should never be moved. Any "basic" word in our best-covered languages (various online wordlists can supply these) or page with multiple languages on it ought to be protected this way. —Μετάknowledgediscuss/deeds 00:54, 4 July 2019 (UTC)
You may recall that I recommended this a while back, but no one with a bot thought it was worth the trouble. Every now and then I remember to do this when I visit an eligible page, so there are a few hundred done, at least. I added "Well-attested spelling, should not be moved" to the protection-reason menu, so it's really quite easy for any admin to do this while they're doing other stuff on a page- if everybody does a little bit, we can get a lot done.
My philosophy regarding this kind of thing is that you don't need to make everything vandal-proof (if that's possible): the idea is to do lots of little things to make vandalism more of a chore and less rewarding. Extreme countermeasures just increase the emotional rewards to getting around them- subtle and boring is the way to go. Chuck Entz (talk) 04:00, 4 July 2019 (UTC)
This was not a move out of vandalism but out of ignorance. Looking forward to little things we can do to make ignorance more of a chore and less rewarding :).  --Lambiam 18:05, 4 July 2019 (UTC)
It seems like some Wikipedias have a feature where edits must be verified before being shown. Could we enable that here? —Suzukaze-c 18:08, 4 July 2019 (UTC)
My experience with editing on such a Wikipedia was that my edits were reflexively reverted. Can we do some cost–benefit analysis? How serious is the problem of ill-considered edits? The level of actual vandalism seems to be low, compared to the English Wikipedia. The MediaWiki software allows any wiki – if its users want it – to turn on various protective features. There is the feature of semi-protection, which can be applied on a page-by-page basis to high-risk pages – mainly intended for temporary use. Then there is Flagged Revisions requiring edits by unconfirmed editors to be reviewed, with a more selective variant called Pending Changes used on the English Wikipedia. While these will reduce the volume of bad edits, they may also have a chilling effect on productive new editors.  --Lambiam 23:24, 4 July 2019 (UTC)
We simply don't have enough resources to do it right, and having the feature installed makes it look like we're endorsing any edit that we allow to display. Wikipedias' content is mostly in one language, while ours is in hundreds and potentially thousands, and our patrollers are qualified in only dozens. Most edits would have to be either passed with no scrutiny of content or left in limbo for long periods of time. Chuck Entz (talk) 02:39, 5 July 2019 (UTC)

Changing "dtp" language name to "Kadazandusun"

The ISO code "dtp" now refers to Kadazan Dusun as of 2016 according to Ethnologue. It reflects the widely used standard name for the language that has been official since 1995. The official spelling for the language is Kadazandusun actually. I would like to request a change from "Central Dusun" to "Kadazandusun" in Wiktionary. --Tofeiku (talk) 13:40, 5 July 2019 (UTC)

Ngrams suggests you are correct that "Kadazandusun" (and to a lesser extent "Kadazan Dusun") is more common than "Central Dusun" or the other names Wikipedia mentions (Bunduliwan, Boros Dusun); a Google Scholar search is even more lopsidedly in favour of Kadazandusun (with "Kadazan-Dusun" also quite common). The various references on the language which Glottolog lists seem to use either "Kadazan-Dusun" or just "Kadazan". I support a rename but it will require updating quite a few pages, which I don't think I have time to do right now. - -sche (discuss) 02:53, 11 July 2019 (UTC)
The native spelling of the language is Kadazandusun, but it's up to what the common spelling is in English. Also, Kadazan only refers to Coastal Kadazan, and Dusun refers to the Dusun dialect only and not Kadazandusun. The Wikipedia article should be moved too. --Tofeiku (talk) 07:56, 15 July 2019 (UTC)


Yugoslavia ceased to exist and as a consequence, Serbo-Croatian also became defunct. Still, if you type bs (Bosnian), hr (Croatian) or sr (Serbian), the end result is automatically Serbo-Croatian. How could we correct this error? Rajkiandris (talk) 04:49, 6 July 2019 (UTC)

This is not some new revelation: it's been discussed and debated here several times. Serbo-Croatian as an official language may be defunct, but the fact remains that the standard forms of Bosnian, Croatian and Serbian are all derived from the same dialect and are all mutually intelligible and similar to the point that it's more practical to treat them as one language, and Serbo-Croatian is the best name for such a language. Chuck Entz (talk) 06:15, 6 July 2019 (UTC)
Further reading: Tomasz Kamusella, The Politics of Language and Nationalism in Modern Central Europe.  --Lambiam 13:44, 6 July 2019 (UTC)
@Chuck Entz I have been thinking that it may make sense to treat Kajkavian and Chakavian as a separate language. As can be seen at w:South Slavic languages#Comparison, Kajkavian is extremely similar to Slovene, while Chakavian is in between it and Shtokavian. It makes little sense to call all of these varieties "Serbo-Croatian", Kajkavian especially, while distinguishing Slovene. It would be valuable from an accentological point of view to include Chakavian, which is quite archaic in this respect. —Rua (mew) 11:40, 7 July 2019 (UTC)
Bad idea. Often one cannot even distinguish Macedonian and Bulgarian, as in the quote at чу̀тура (čùtura), which fits neither of the two modern languages. The editor Константинъ Миладиновъ is according to the Bulgarian Wikipedia a Bulgarian and according to the Macedonian Wikipedia Macedonian. Also I have heard Kajkavian. It is like Berlinern. Probably extremely similar to Slovene like some accents of Russian are extremely similar to Ukrainian. Accentology is a weak argument, it bears little semantic load, and I do not see how splitting languages could help pursuing accentology. Fay Freak (talk) 13:23, 7 July 2019 (UTC)
I don’t like what @Rajkiandris proposes. It is an error to type “bs”, “hr”, “sr”. One does not type “at” either to find German, so those people who type “bs”, “hr”, “sr” are in error. Everyone knows that this is the same language. It happens often in Balkanized parts of the world that one language is known under multiple names. Confusion arises for remote parts of Africa but for Serbo-Croatian it is patent that it is the same language. Political unity is irrelevant, there is heavy exchange, and if I meet someone in Germany I can easily speak Yugoslav without knowing whether it would be Bosnian or Serbian or Croatian: people call themselves Yugoslav here. Have they not got used to Yugoslavia having broken up? No, they are conscious that a differentiation is over-differentiation. And probably someone speaking Kajkavian would act the same. One can come into the situation of speaking Serbo-Croatian without knowing whether it is Bosnian or Serbian or Croatian, and in most texts it is not distinct which it is: I find texts with words, I see they are Serbo-Croatian, I add words as Serbo-Croatian, I cannot see that it is Croatian or Serbian or Bosnian despite understanding everything, why would I need to undergo the hardship of examining whether a text in Serbo-Croatian is Bosnian or Croatian or Serbian? That’s why it is treated as one language. One should see easily by the text that something is in a language. If you need someone, the place of publication, or diacritics (“an accentological point of view”) to tell you that it is, it isn’t a separate language. Fay Freak (talk) 13:23, 7 July 2019 (UTC)
Your initial premise is a non-sequitur; Serbo-Croatian was standardized before Yugoslavia existed, and (at least as far as its Shtokavian dialects go) existed as a distinct abstand language long before its standardization. The death of a political entity doesn’t magically make a language extinct. — Vorziblix (talk · contribs) 21:16, 8 July 2019 (UTC)

Unified Japanese: a new proposal

Unified Japanese, the format that treats Classical Japanese and Modern Japanese under a single ==Japanese== header, has been proposed before, but past proposals were unsuccessful because they failed to distinguish between regular phonological developments (such as 会ふ → 会う) and morphological changes (such as 変ふ → 変える). I propose that we apply Unified Japanese only to the former case: if the 文語形 and the 口語形 differ only by 仮名遣い, we unify them under the modern spelling:



会う (intransitive)

  1. to meet; to encounter
Conjugation of 会う (五段活用) in Modern Japanese
Conjugation of 会ふ (四段活用) in Classical Japanese
Conjugation of 安布 (四段活用) in Old Japanese

On the other hand, if the difference between the 文語形 and the 口語形 is a morphological one, we give them separate pronunciation sections (while still merging definitions). Note that just as the 口語形 can be spelled in historical orthography, the 文語形 can also be spelled in modern orthography and pronounced in Modern Japanese, so we still benefit from the entry layout of Unified Japanese:



変える (transitive)

  1. : to change; to alter; …
  2. , , , : to exchange; to replace; …
Conjugation of 変える (下一段活用) in Modern Japanese
Conjugation of 変ふ (下二段活用) in Classical Japanese
Conjugation of ??? (下二段活用) in Old Japanese



  1. Premodern shūshikei of 変える (kaeru).

What do you think about such a proposal?

By the way, in the examples given above, the kana and rōmaji are moved from the POS headers to the pronunciation section. I think this is a step long overdue: it is more logical (think of kanji entries with multiple etymology sections), and it greatly simplifies the layout of entries (especially kango, which tend to have multiple POS).

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 05:44, 7 July 2019 (UTC)

I think the example "変ふ" is not an entry but rather a soft redirect, like doth for do. -- Huhu9001 (talk) 06:13, 7 July 2019 (UTC)
Ah yes, you're perfectly right. I wanted to express that "変ふ" should have its own pronunciation section, but I didn't make myself clear. --Dine2016 (talk) 06:26, 7 July 2019 (UTC)

@Dine2016: Could you succinctly state the disadvantages, if any, of this new proposal which I do support? --Backinstadiums (talk) 09:29, 7 July 2019 (UTC)

@Backinstadiums: (1) All stages of the language are treated under the modern spelling, which is anachronistic. For example, Old Japanese apu, Classical Japanese afu, and Modern Japanese au are treated under the modern spelling 会う, but the spelling 会う did not exist during the time of Old Japanese, or before the sound changes concerning ɸ during the time of Early Middle Japanese. On the other hand, Unified Chinese works because Traditional Chinese is applicable to the modern dialects, Middle Chinese, and to an extent Old Chinese. (2) Students of Classical Japanese may benefit more from a Classical Japanese dictionary covering only Old Japanese, Early Middle Japanese and later elements incorporated into this classical written language, instead of a historical dictionary which covers all stages under the modern form in modern spelling. Compare the situation in Japan: even though there is the popular 広辞苑, there are still many specialized 古語辞典. (3) In the first situation (e.g. 会う), the modern form and the classical form are both unified under the modern spelling; in the second situation (e.g. 変える), the modern form is lemmatized in modern spelling and the classical form (変ふ) is lemmatized in historical spelling. And if you conjugate both forms to, for example, the ren'yōkei, the modern form (変え) and the classical form (変へ) would again be unified under the modern spelling. This is a great inconsistency (although there will be soft-redirects when unified). This problem can be solved if we follow Japanese monolingual dictionaries and lemmatize wago under the modern kana spelling, even for classical forms:
い・ず いづ [1] 【出づ】 [1]
かど・う かどふ 【〈勾引〉ふ・拐ふ】 [2]
か・う かふ 【替ふ・換ふ・代ふ・変ふ】 [3]
Then everything will be in modern spelling, which is fairly close to modern pronunciation. --Dine2016 (talk) 10:09, 7 July 2019 (UTC)
I have a few concerns about this.
  • Pronunciations and conjugations
Thanks to the 1603 Vocabvlario da Lingoa de Iapam or Nippo Jisho, we have the Japanese of the time transcribed into the Portuguese spellings of the time, giving us a rough approximation of the sound values. These were sometimes substantially different from modern conventions. Consider the modern verb 買う (kau, to buy). The Nippo Jisho entry is here, right-hand-column, second entry down.
Modern 1603
終止形 / Terminal /kau/ /kɔː/
連用形 / Continuative, Stem /kai/
過去形 / Past Tense /katːa/ /kɔːta/
Or consider the modern verb 替える (kaeru, to exchange, to replace), seen here in the Nippo Jisho, right-hand column, second entry down.
Modern 1603
終止形 / Terminal /kaeru/ /kajuru/
連用形 / Continuative, Stem /kae/ /kaje/
過去形 / Past Tense /kaeta/ /kajeta/
Note here that the Terminal form (the so-called "dictionary" or lemma form) differs in 1603 from both the modern かえる (/kaeru/) and the ancient / pre-Ashikaga or Muromachi period かふ (ancient reading */kapu/, pre-1600s reading /kafu/, pre-modern reading /kɔː/, modern reading /kau/).
  • How far back to go
I think it's a mistake to include Old Japanese, for a few reasons. Linking through to the OJP entry is not a problem, but including OJP conjugations in the modern JA entry is too much detail -- we have the OJP language code, and we're already starting to build out our OJP content, so there's no good reason not to put the details in an OJP entry.
I also think it's a mistake to use man'yōgana spellings for OJP lemmata, such as the 安布 example above to spell canonical 会ふ (to meet, to encounter, ancient reading */apu/, pre-1600s reading /afu/, pre-modern reading /ɔː/, modern reading /au/). Man'yōgana spellings were wildly variable, sometimes changing even within a single poem. Also, so far as I know, there isn't any consensus view of what the "most common" man'yōgana spelling would be for a given word. Native Japanese sources generally list OJP terms under the modernized kanji and/or kana spellings. I think we should follow suit.
If we are to include Classical Japanese in our modern Japanese entries, we must explain somewhere prominently and clearly that this is Classical Japanese as found in XXX usage (replacing XXX with whatever time period we decide to target). For instance, if we include Classical Japanese as used today, that differs from the Classical Japanese recorded in the 1603 Nippo Jisho. That difference is (so far as I've studied to date) mainly in pronunciation, but it's an important distinction and we would need to point that out.
Nota bene: I'm not opposed to some key parts of this proposal, particularly 1) unifying pre-modern and modern terms as much as possible, and 2) using kana as the lemma spellings for wago (native-Japanese terms), given the structural constraints of the MediaWiki platform that make it impossible to replicate the functionality of native-Japanese electronic dictionaries (where a single entry may have multiple indexed spellings, any of which will get the user the desired entry). My points above are to argue that, should we unify, we need to be clear about scope (how much to unify, how far back to go), and about how we present the information to users (differences in pronunciation, conjugation, etc.). ‑‑ Eiríkr Útlendi │Tala við mig 17:41, 9 July 2019 (UTC)
Thanks for your replies.
Pronunciations and conjugations: Yes, you're right. The pronunciation section of 買う should be like this:
I have removed the ambiguous "Classical Japanese" and added specific stages like "Early Middle Japanese" (800-1200) and "Late Middle Japanese" (1200-1600). Similarly, the conjugation section should contain four tables, the table for Modern Japanese listing the terminal form as kau and the past form as katta, and the table for Late Middle Japanese listing the terminal form as /kɔː/ and the past form as /kɔːta/.
Strictly speaking, "Classical Japanese" refers to the classical written language and does not correspond to any particular stage. So Classical Japanese was used up until World War II, while Early Middle Japanese was used between 800 and 1200. And as you noted before, there is a modern pronunciation of Classical Japanese where 買ふ is pronounced kō, different from Early Middle Japanese where 買ふ was pronounced kafu, and different from Modern Japanese (the spoken language) where it's pronounced kau.
Similarly, the conjugation section of 替える should link to Early Middle Japanese 替ふ and Late Middle Japanese 替ゆ, the former being “Old Japanese and Early Middle Japanese shūshikei of 替える (kaeru).”, and the latter being “Late Middle Japanese shūshikei of 替える (kaeru).” Cf. the 替ゆ entry in 精選版 日本国語大辞典.
How far back to go: I don't think the presence of the OJP code is a problem. Even with Unified Chinese (zh), we still have code for the sublanguages like Middle Chinese (ltc), Old Chinese (och) and Mandarin (cmn), Cantonese (yue), etc. which can be used in templates like {{bor}}. So codes are no problem. As for where to build content, searching insource:/\|m_kana=/ reveals that we have Man'yōshū quotations under the ==Japanese== header of , , , etc. Given that we have Old Japanese content under both ==Japanese== and ==Old Japanese==, I suggest that we move the latter to the former, in order to show the historical continuity of senses and conjugations, and in line with large kokugo dictionaries like the KDJ.
Using kana as the lemma spellings for wago: Yes. If we go for Unified Japanese, then it's better to use the kana spelling instead of kanji-kana majiribun as the lemma spelling of wago. Because kanji and okurigana usage may change over time, the most common spelling today may not be the most common spelling used over history, but kana is consistent throughout. As for using modern kana orthography (e.g. lemmatizing Modern Japanese 替える at かえる, Late Middle Japanese 替ゆ at かゆ, but Early Middle Japanese and Old Japanese 替ふ at かう), that's for consistency (for example, the etymological relationship between 買う and 替ふ is clear if both are lemmatized at かう). --Dine2016 (talk) 07:30, 10 July 2019 (UTC)
I fear that I do not know enough about historical stages of Japanese to have a strong opinion on the matter at the moment. —Suzukaze-c 09:47, 10 July 2019 (UTC)

@Dine2016: "And as you noted before, there is a modern pronunciation of Classical Japanese where 買ふ is pronounced kō, different from Early Middle Japanese where 買ふ was pronounced kafu, and different from Modern Japanese (the spoken language) where it's pronounced kau"

What is that modern pronunciation called? Neoclassical? --Backinstadiums (talk) 13:24, 10 July 2019 (UTC)

I don't think it has a name, though it is taught in many Classical Japanese textbooks published in Japan. --Dine2016 (talk) 05:05, 11 July 2019 (UTC)

@Dine2016: According to Prof. Victor Mair,

From my colleague Linda Chance, who is a specialist on Classical Chinese, the technical term for this is ハ行転呼音・はぎょうてんこおん.

It refers to the fact that from sometime in the Heian period the "ha" line changed to the same pronunciation as the "wa" line, but the "ha" line spellings continued in use. (Interesting examples--if you write these in modern Japanese with 'u' for 'fu,' 惟うに is still pronounced omō ni, but 失う becomes "ushinau" (except in some dialects).) This "modern pronunciation" is potentially centuries old. We read classical texts this way because we can't retrieve that original early Heian pronunciation. --Backinstadiums (talk) 14:19, 12 July 2019 (UTC)

@Backinstadiums: Thank you for your research, but ハ行転呼音 only accounts for the change of "ふ → う". For example, 今日 is originally pronounced けふ (as shown by the historical spelling) so by ハ行転呼音 it becomes けう, but it's now pronounced きょう, which means that there is another sound change which changed けう into きょう. In fact, if you compare historical spelling and modern spellings, you'll find that after ハ行転呼音 changes ふ into う, this う fused with the preceding vowel to make a long sound:
Historical spelling | After ハ行転呼音 | Modern spelling
あう | あう | おう
いう | いう | ゆう
えう | えう | よう
おう (ou), おふ (ofu) | おう (ou) | おう (ō)
I suspect the sound change that turns the second column into the third column is called “/Vu/ monophthongization”.
For Modern Japanese verbs ending with the vowel combination あう or おう, the う is treated as a separate element from the verb stem. For example, 思う is pronounced omo-u instead of omō, and 会ふ stopped at あう instead of evolving into おう. The "Neoclassical" pronunciation may be just a hypercorrection, an over-application of “/Vu/ monophthongization” to classical verbs ending with (あ)ふ or (お)ふ. Another possibility is that the "Neoclassical" pronunciation is a descendant of Late Middle Japanese. As Eirikr noted above, 買う was pronounced in 1603 as /kɔː/, so in Late Middle Japanese the verb-final う was not treated as a separate element from the verb stem. The Neoclassical pronunciation may simply have followed that. --Dine2016 (talk) 16:02, 12 July 2019 (UTC)
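
The two-step development described above (ハ行転呼音 followed by the fusion of a vowel + う into a long sound) can be illustrated with a small Python sketch. This is only an illustration under simplifying assumptions: it works on romanized input rather than kana, step 1 covers only ふ → う after a vowel (the one case discussed in this thread), and the function names are made up for the example.

```python
import re

# Step 2 rules from the table in this thread, in romanization:
# あう -> おう (ō), おう -> おう (ō), えう -> よう (yō), いう -> ゆう (yū).
VU_RULES = (("au", "ō"), ("ou", "ō"), ("eu", "yō"), ("iu", "yū"))


def ha_row_shift(romaji):
    """Step 1 (ハ行転呼音), limited here to 'fu' -> 'u' after a vowel."""
    return re.sub(r"(?<=[aiueo])fu", "u", romaji)


def monophthongize(romaji):
    """Step 2: fuse a vowel + u sequence into a long vowel."""
    for old, new in VU_RULES:
        romaji = romaji.replace(old, new)
    return romaji


def reading_of_historical_spelling(romaji):
    """Apply both steps, as in the reading of けふ as きょう."""
    return monophthongize(ha_row_shift(romaji))


for word in ("kefu", "kafu", "omofu"):
    print(word, "->", ha_row_shift(word), "->", reading_of_historical_spelling(word))
```

Applying only `ha_row_shift` gives the modern spoken verb forms (kau, omou), while applying both steps gives the textbook reading of the classical forms (kō, omō), which matches the contrast drawn in the comment above.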

@Wyang What do you think of the proposal above? I really hope someone with a bot can carry out changes to the Japanese entry layout, for example moving the reading to the pronunciation section. --Dine2016 (talk) 06:23, 14 July 2019 (UTC)

As before, I don't think wago should be lemmatised on kanji-containing forms for modern Japanese. But I'm not informed enough about Old/Classical/Middle Japanese to know whether this proposal is the most appropriate solution. I really encourage you to create a bot and test it out; it is not complicated. Wyang (talk) 08:43, 14 July 2019 (UTC)

Pali transliteration of Niggahita and velar nasal

This relates to the transliteration of non-Roman text to the Roman script.

The issue is that the choice in writing between a niggahita and the velar nasal ('nga') is not always made the same way as it is when writing unlocalised Pali in the Roman script. I would like confirmation that I am applying the correct principle.

My principle is that where the writing system uses distinct symbols for the two, the transliteration should reflect which character was used.

The Burmese and Tai Tham scripts have a special form of nga that sits above the normal layer of base characters. It is called 'kinzi' for the Burmese script, and 'mai kang lai' for the Tai Tham script. I am not distinguishing between them on one hand and ordinary nga on the other in transliteration. A complication is that some writing styles use mai kang lai where the usual Roman spelling would write a niggahita (ṃ) before non-plosives. -- RichardW57 (talk) 00:36, 9 July 2019 (UTC)

Standard German IPA

I'm finding that there's a lot of variation in the IPA transcriptions for German words, and some standard should be settled on so that it's consistent across entries. This is mainly for the case of syllable-final r's: should they be transcribed as /ɐ/~/ɐ̯/ or as /ʁ/? Both can be seen. I think either the most widespread vocalic pronunciation /ɐ/ should be used in all these cases, or the more phonemic /ʁ/ should be used as a more unifying transcription, allowing it to represent both those dialects which do not reduce it to a vowel in this position and those that do, the latter just applying the allophonic pronunciation (which could be included alongside in square brackets if desired).

My feeling is maybe to go with the latter, but I think it should be discussed. It is sort of like how for French entries one pronunciation is usually listed unless there is an unpredictable regional pronunciation in Canada or Belgium or Louisiana etc.; the point is that the other accents' pronunciations are predictable given the base transcription.

I also sometimes see /ʀ/ being used both syllable-initially and -finally. I would say that only /ʁ/ should be used, though, as the trill is a regional variant.

Finally, one last option, I suppose, would be to treat German more like English and list two or more pronunciations qualified by region (northern and southern? northern and Austro-Bavarian?). I feel like this could be over-the-top though. Please let me know your thoughts and let's decide on some sort of standard so there can be consistency! 2WR1 (talk) 06:20, 9 July 2019 (UTC)

A small prior discussion of this is at Wiktionary talk:About German/Archive_1#R, where one standard was proposed. Probably 'non-standard' transcriptions will continue to be entered, no matter what we choose, by people who either learned different standards or are basing their additions on (/copying from) works using different standards (de.Wikt vs the Duden, etc). - -sche (discuss) 02:59, 11 July 2019 (UTC)
I have a feeling that some of the inconsistency is due to a policy change at de.wikt which wasn't applied here (/ʁ/ instead of /ʀ/), so a lot of the /ʀ/ we have in our IPA were probably copied from before the policy change. We could just run a bot to apply the same changes here. Jberkel 21:41, 11 July 2019 (UTC)
@-sche @Jberkel There have been standards established for things like English and French (i.e. /aɪ/ instead of /aj/ etc., /ɹ/ instead of /r/) and those are followed pretty well. I think something should just be decided on so there is a standard, and if people don't always follow it precisely, it can be fixed up easily. Maybe a module like the fr-pron one should be made; in that case a standard would really be needed. I think there are arguments in different directions, but it should be discussed and decided; it feels a bit sloppy to have it be inconsistent. It would be a good idea, as a start at least, to remove all instances of /ʀ/ though. 2WR1 (talk) 02:02, 14 July 2019 (UTC)
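
For the bot run suggested above (replacing /ʀ/ with /ʁ/ in German pronunciations), the core text substitution might look roughly like the Python sketch below. It is not an existing bot: it assumes pronunciations are written in the form {{IPA|de|/.../}}, the function name is hypothetical, and fetching and saving pages is left out entirely.

```python
import re

# Assumption: German pronunciations use the form {{IPA|de|/.../}}.
# Only text inside such a template call is touched.
IPA_DE_TEMPLATE = re.compile(r"\{\{IPA\|de\|[^{}]*\}\}")


def replace_uvular_trill(wikitext):
    """Replace ʀ with ʁ inside German IPA templates, leaving other text alone."""
    return IPA_DE_TEMPLATE.sub(
        lambda match: match.group(0).replace("ʀ", "ʁ"),
        wikitext,
    )


sample = "* {{IPA|de|/ʀoːt/}}\n* {{IPA|fr|/ʀu/}}"
print(replace_uvular_trill(sample))
```

Restricting the replacement to the German template keeps the change from touching transcriptions of other languages on the same page, where /ʀ/ may be intended.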

Visibly Untrue Pali Etymologies

A lot of Pali etymologies contain "From {{inh|pi|sa|...}}". The problem with this text is that the template (plus modules) does not expand "sa" to "Proto-Indian Indo-Aryan", but expands this to a hyperlink to "Sanskrit", which defines "Sanskrit" with the normal meanings of the term. These words are not inherited from Sanskrit in the normal sense of the word 'Sanskrit'.

How should we fix it so that what is presented to the reader is not a lie? I'm charitably assuming that anything parsing the links should understand that 'sa' as the source parameter of {{inh}} refers to a reconstructed language.

The best I've come up with is to use "Cognate with {{inh|pi|sa|...}}" instead. -- RichardW57 (talk) 07:24, 9 July 2019 (UTC)

Can you type a qualifier in front? From (reconstructed / Proto / ?) {{inh|pi|sa|...}}. Or should we create a new etymology-only code for this? DTLHS (talk) 16:11, 9 July 2019 (UTC)
Perhaps that is the answer, e.g. "From Pre-Sanskrit याति (yāti), from Proto-Indo-European *yeh₂-" currently at yāti#Pali. Another possibility would be to prefix something like "From recent ancestor of ". Presumably there was a good reason for not wanting to have an explicit intermediate stage between Sanskrit and Proto-Indo-Aryan. As Wiktionary makes it difficult for European dilettanti to look up Sanskrit words, I have been wondering if I should add a template for this case in which the editor only has to enter the IAST transliteration. There's also the case where the common ancestor is clearly different to Sanskrit.
For old enough words, should we also trace the ancestry back to PIE in the Pali entry, or expect the interested user to click on to the Sanskrit entry?

It’s a longstanding convention, which implicates the grammarians’ conversion schemas (which themselves have somewhat stylized Pali and the other “high” Prakrits beyond the actual MIA vernaculars) and generally makes life simpler, except when forms that cannot be synchronically derived pop up, which is not uncommon. Sanskrit indiscriminately collapsing thorn clusters into >ks is one of the most irritating. Hölderlin2019 (talk) 18:22, 14 July 2019 (UTC)

Russian consonant voicing assimilation

Many Russian entries, especially those of words with long consonant clusters, don't seem to have the consonants assimilated to their actual pronunciation. For example, currently the pronunciation for исправлять is [ɪsprɐˈvlʲætʲ] instead of [ɪzbrɐˈvlʲætʲ]. Note that at the same time the vowels are phonetic not phonemic. Adding to the confusion, voicing assimilation sometimes is reflected in the IPA, for example совсем [sɐfˈsʲem], and нож [noʂ].

The pronunciations given are all correct, as spoken. Better go listen to more Russian. Fay Freak (talk) 00:20, 11 July 2019 (UTC)

Priority of terms already used in definitions

For example, at second hand is used in the definition of apud, or on-topic in germane's, therefore adding such terms is especially pressing --Backinstadiums (talk) 17:46, 11 July 2019 (UTC)

If they are entryworthy.
  1. on-topic at OneLook Dictionary Search suggests that most dictionaries find that it has no meaning apart from on + topic, using a standard technique (a hyphen insertion) to prevent alternative readings (ie, a prepositional phrase).
  2. Looking up second hand should fully address any dictionary user's uncertainty about the meaning of at second hand. Unfortunately we don't seem to have an entry as good as MWOnline (4 definitions) for it.
DCDuring (talk) 19:23, 11 July 2019 (UTC)
@DCDuring: thanks for replying. In any case, they're to be dealt with before the rest as they're already being used in entries, even more so if just to complicate matters --Backinstadiums (talk) 19:57, 11 July 2019 (UTC)
I think you did not understand the reply. Consider the phrase “at room temperature”, used in the definition of water. We have no entry for this phrase, for a very good reason: we do have entries for at and for room temperature. For the rest it is a matter of X + Y = Z.  --Lambiam 10:58, 12 July 2019 (UTC)
There is another way to deal with them, which is to either amend where the red link is pointing (point all of "at second hand" to "second hand") or modify the bracketing ("[[at second hand]]" to "at [[second hand]]"). Cleaning up a red link is not always creating an entry, if the entry should never exist then the link should point to an entry or entries which should exist or do exist. - TheDaveRoss 12:29, 12 July 2019 (UTC)
@Lambiam, TheDaveRoss: Correct me if I am wrong, but the preposition at governs the noun secondhand, whose entry does not show any nominal meaning. Regarding second hand, the only noun meaning reads: On a clock or watch, the hand or pointer that... which is not the meaning intended for at second hand. Then how is it that X + Y = Z? --Backinstadiums (talk) 13:28, 12 July 2019 (UTC)
Correct. To repeat myself: "Unfortunately we don't seem to have an entry as good as MWOnline (4 definitions) for [ second hand ]." DCDuring (talk) 15:01, 12 July 2019 (UTC)
  • I have created an entry for at second hand because I could not make at work with synonyms for the noun second hand. I would welcome better wording at the second definition of the noun [[second hand]] that overcame this problem. DCDuring (talk) 20:43, 12 July 2019 (UTC)

Use of the heading "Synonyms" on the pages of unbound morphemes.

By happenstance, I have recently noted that the heading "Synonyms" is often included on the pages of unbound suffixes, and want to take this opportunity to highlight what I believe to be the impropriety thereof. The term "synonym" only applies to lexemes; only a lexeme may be synonymous with another lexeme. I understand an unbound suffix, however, to be a morpheme, rather than a lexeme, and a morpheme cannot represent a synonym. I am of the thought, then, that the term "analogue" is a better one for describing two morphemes such as two suffixes, which are near in meaning or effect, and that the heading "Analogues" is preferable to "Synonyms" on the pages of unbound morphemes. I would like to begin a discussion here, to test whether there can be any consensus regarding the use of "Analogues" as opposed to "Synonyms" on such pages as a matter of policy. I am not a Wiktionarian (yet), meaning that I have no Wiktionary account. My name is Michael, and I look forward to reading the thoughts of all you Wiktionarians about this. —This unsigned comment was added by (talk) at 18:46, 11 July 2019 (UTC).

We have the header "Coordinate terms", if that's what you mean. DTLHS (talk) 18:51, 11 July 2019 (UTC)
  • I suspect that the strict formal limitation of synonym to apply only to fully independent lexemes is not well known by most English-language readers, our target audience. It's certainly not a restriction of usage that I'm acquainted with, as an educated native speaker of English. Conversely, I think that most English-language readers are familiar with the sense of synonym as in "these things have roughly the same meaning". I also think that few readers will understand what is meant by an "Analogues" header.
As such, I cannot support the suggested change: it will likely confuse users. ‑‑ Eiríkr Útlendi │Tala við mig 19:53, 11 July 2019 (UTC)
Many affixes have meaning and are lexemes. As I understand it, inflectional affixes are not lexemes. IOW, I don't think morphemes and lexemes are disjoint categories, as the complaint above seems to imply. DCDuring (talk) 20:43, 11 July 2019 (UTC)
-er is synonymous with more, yet one is an inflectional affix and the other is an independent word. —Rua (mew) 21:29, 11 July 2019 (UTC)
To confirm, what I hear you @ saying is that synonym can only apply to independent words, whereas @DCDuring, Rua, you seem to be saying that synonym can and should apply to affixes as well as independent words. Is this a correct restatement? ‑‑ Eiríkr Útlendi │Tala við mig 21:42, 11 July 2019 (UTC)
Yes. I should have said I agree with your user-oriented arguments, which are more important than the definitional matters, about which I could be wrong, eg, about -er. I don't view comparative, superlative, diminutive, and natural gender affixes as on all fours with case, grammatical gender, number, tense, mood, and aspect inflectional affixes, but I am not speaking from a position of multilingual learning. DCDuring (talk) 01:21, 12 July 2019 (UTC)
The meaning of lexeme may seem clear when considering a single language, but viewed across the spectrum of human language it becomes fuzzy. In Turkish, basically the same entity can be a stand-alone word one instant but turn effortlessly into a suffix the next one (e.g., ile-le). To me, a “name” is any term of some language to which we can assign some meaning, and a synonym is then, to me, any other name in that language with the same or very similar meaning. From this point of view that name will smell as sweet, whether it is a single word, a phrase, or a morpheme.  --Lambiam 10:48, 12 July 2019 (UTC)
DCDuring is correct in noting that morphemes and lexemes are not disjoint as categories. Morphemes, however, fall into two general types: those which may stand alone and so are called "roots", and those which depend upon combination with another morpheme to express an idea, then called "affixes". "Roots" may indeed join with lexemes in being referred to as synonyms, but I think that affixes may not, since they serve only a grammatical function. Of course, the suffixes which are the instant topic of conversation represent the second type. Even so, the argument that general user accessibility is more important than a strict adherence to definition, especially within a generalized resource such as Wiktionary, has great validity, and so probably ends the debate. I was simply unsure of whether the use of the header term had been considered in such instances, and was not at all thinking like a lexicographer. Thanks, guys.

Wikidata feedback

Dear Wiktionary community, you are the only community that I am aware of that asked for Wikidata access to be enabled on its Wiktionary. I am currently preparing a presentation that I will give at the next Wikiconvention francophone, which will be held in Brussels at the beginning of September. I want to talk about the relationship between Wiktionary and Wikidata. I have the feeling that it started in a bad way, at least from the French point of view.

So I would like to hear how you feel about Wikidata in general and, more specifically, about your experience with Wikidata. How have you used Wikidata on the English Wiktionary so far, and how would you like to use Wikidata here in the future, if you want to? If some of you contribute to the lexicographic data on Wikidata, I would also be interested in your feedback.

Thanks in advance. I am eager to read you :D Pamputt (talk) 22:07, 13 July 2019 (UTC)

My only concrete experience was that it enabled a large number of images to be added to taxonomic name entries, without captions; sometimes in simple error; and almost always not in accord with the notion of trying to select images of type species for entries for genera. The Wikidata project box is displayed much too prominently in view of the limited value of Wikidata to ordinary dictionary users. DCDuring (talk) 03:06, 14 July 2019 (UTC)
@DCDuring thank you for your comment. Could you give a link to a page where this Wikidata project box is used? Pamputt (talk) 07:19, 14 July 2019 (UTC)
See Special:WhatLinksHere/Template:wikidata. DCDuring (talk) 12:48, 14 July 2019 (UTC)
Also Special:WhatLinksHere/Template:Wikidata entity link.
I suppose that if the reward from following the links were sufficient, we would redesign the templates. I would prefer {{Wikidata entity link}} for my purposes. DCDuring (talk) 12:55, 14 July 2019 (UTC)
One of the more frequent uses of Wikidata in entries: The data tables for most languages (7764 out of 8069 as I write this) and language families include the Wikidata item. This in turn is used by the Language:getWikipediaArticle() function in Module:languages, the EtymologyLanguage:getWikipediaArticle() function in Module:etymology languages, and the Family:getWikipediaArticle() function in Module:families to retrieve the name of the Wikipedia article, so that etymology templates such as {{derived}} can display linked language or language family names. — Eru·tuon 05:15, 15 July 2019 (UTC)
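The lookup described above can be pictured with a small Python sketch: a language's data table holds a Wikidata item ID, and that ID is used to find the name of the Wikipedia article. This is only an illustration of the idea, not the actual Lua code in Module:languages; the dictionaries LANGUAGE_DATA and SITELINKS, the placeholder code "xx" and the function get_wikipedia_article are hypothetical stand-ins for the real data modules and for Wikidata itself.

```python
from typing import Optional

# Hypothetical stand-in for the per-language data tables.
# The entry "xx" has no Wikidata item, like the minority of languages mentioned above.
LANGUAGE_DATA = {
    "de": {"name": "German", "wikidata_item": "Q188"},
    "en": {"name": "English", "wikidata_item": "Q1860"},
    "xx": {"name": "Example language"},
}

# Hypothetical stand-in for the Wikidata sitelinks to English Wikipedia.
SITELINKS = {
    "Q188": "German language",
    "Q1860": "English language",
}


def get_wikipedia_article(code: str) -> Optional[str]:
    """Return the Wikipedia article name for a language code, or None if unknown."""
    data = LANGUAGE_DATA.get(code)
    if data is None:
        return None
    item = data.get("wikidata_item")
    if item is None:
        return None
    return SITELINKS.get(item)


if __name__ == "__main__":
    print(get_wikipedia_article("de"))
```

What the real functions do when a language has no Wikidata item is not covered here; this sketch simply returns None in that case.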