Wiktionary:Beer parlour


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!


May 2019

Using an arrow in synonyms to mean see

For some time, I've been using an arrow (→) in the synonyms template to mean "see". I think it's pretty neat and can be picked by the readers reasonably fast. Today, an admin prevented me from using the arrow in buřt, so let's discuss.

An example appearance is in this revision of buřt, approximately:

  1. fatso, butterball
    Synonym: → tlusťoch

An alternative is this:

  1. fatso, butterball
    Synonym: see tlusťoch

I think the arrow is neater.

Thoughts? --Dan Polansky (talk)

Using arrows in synonyms

User:Dan Polansky invented some new notation for synonyms, using an arrow to indicate to the user that there are more synonyms on another page. In a sense, this is analogous to our current use of "see Thesaurus page", which is officially supported by code in Module:nyms. The new notation can be seen in diff. I have removed it for the moment, for reasons that are both technical and merit based:

  1. The arrow is included in the Czech language tag, which means that the arrow is considered part of the Czech term, which is obviously not the case. Some screen readers may handle this by reading out the Czech word for "arrow", and we definitely do not want that. Punctuation and other symbols that are not part of the term should not be included in the language tag.
  2. This is not a standard practice, and is basically a creative misuse of {{syn}} that's not officially supported by Module:nyms.
  3. There is no explanation at all for what the arrow means. The other place where we regularly use arrows, in descendants sections, there is a mouseover text. That's not great either, though. When we direct users to a Thesaurus page, we include the word "see", which is much clearer. If we allow this practice, then we should definitely use "see", of course placed outside the language tagging so that it's not interpreted as Czech.
  4. Given that we have Thesaurus pages to hold collections of synonyms, redirecting the user to another normal entry is a bad practice. The other entry may have multiple senses each with their own synonyms, and the user then has to hunt for the sense that the first entry was trying to refer to.

What do others think of this? —Rua (mew) 11:03, 1 May 2019 (UTC)
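To make point 1 concrete, here is a minimal Python sketch of the markup difference. It is only an illustration: the function name `render_synonym_pointer` and the exact HTML are assumptions, not what Module:nyms or {{syn}} actually emits. The idea is that only the term itself sits inside the language-tagged element, while the pointer word stays outside it.

```python
from html import escape


def render_synonym_pointer(term: str, lang_code: str, label: str = "see") -> str:
    """Return HTML in which only the term is inside the language-tagged span.

    Hypothetical helper for illustration; not the output of Module:nyms.
    """
    tagged = f'<span lang="{escape(lang_code, quote=True)}">{escape(term)}</span>'
    # The label ("see") stays outside the span, so it is not treated as Czech.
    return f"{escape(label)} {tagged}"


print(render_synonym_pointer("tlusťoch", "cs"))
```

With this shape, a screen reader would treat "see" as interface text and only the linked term as Czech.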

I already opened #Using an arrow in synonyms to mean see. ---Dan Polansky (talk) 11:05, 1 May 2019 (UTC)
Let me just note that the use of "see", "see also" or the like is a very useful practice that I have used for multiple years (I think) in the traditional Synonyms sections. It works reasonably well; it allows picking one page as a central collecting location of synonyms without a need of a separate Thesaurus page. The downsides mentioned above are minor, from my experience, and the upside is huge.
As for what this discussion is about, the title of the thread points to the use of arrow rather than raising the subject whether the "see" technique should newly be prohibited. Where there is thesaurus page, the "see" technique can be used with the thesaurus page instead of another mainspace page. --Dan Polansky (talk) 11:12, 1 May 2019 (UTC)
Please don't. The synonyms on the individual "arrow" page might change, get moved to the thesaurus or whatnot. It's confusing (duplicate concepts, where am I supposed to look? Thesaurus? Entry? Both?) and requires more bookkeeping. – Jberkel 11:26, 1 May 2019 (UTC)
Czech entries have almost no thesaurus entries so the problems do not arise. It is a scheme alternative to Thesaurus, admittedly, but that is not necessarily a bad thing, I think. Using that technique for English would be confusing, but not for Czech. For Czech, there is no bookkeeping overhead. --Dan Polansky (talk) 11:38, 1 May 2019 (UTC)
An example for ease of reference: tlustoprd is an entry created by me in June of 2014 that uses the technique and points the reader to tlusťoch. --Dan Polansky (talk) 12:00, 1 May 2019 (UTC)
"Czech entries have almost no thesaurus entries so the problems do not arise." Why not? Sounds like this could be a solution. – Jberkel 18:03, 1 May 2019 (UTC)
Sure thesaurus would be a solution. I already have a solution, one that works, is widely deployed in Czech entries, and is simpler than the thesaurus. I have no desire to switch the deployed solution to the thesaurus: the solution is simple, neat, and works well, as far as my experience shows me. The solution does not prevent transition to the thesaurus later. The reader can be happy. Who is not happy is CodeCat/Rua. I am not happy with CodeCat/Rua interfering with my expanding the useful content for our readers. I do not appreciate having to deal with what to me look like pseudo-problems. --Dan Polansky (talk) 14:29, 6 May 2019 (UTC)

PIE Tʰ > T /_s#

In our PIE reconstructions, we recognize Szemerényi's law and Stang's law and have incorporated them into our PIE inflection templates. Does anyone have any objections to including the other word-final rule of Tʰ > T /_s#?[R0 1][R0 2][R0 3] --{{victar|talk}} 12:16, 1 May 2019 (UTC)
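For readers unfamiliar with the notation, Tʰ > T /_s# says that an aspirated stop loses its aspiration when it stands directly before a word-final s. A minimal Python sketch of the rule as a string rewrite follows; the function name is made up for illustration, and the schematic input "*-bʰs" is a placeholder rather than an attested reconstruction. It is not how the PIE inflection templates are implemented.

```python
import re

# Match the aspiration mark only when the next segment is a word-final s.
_ASPIRATION_BEFORE_FINAL_S = re.compile(r"ʰ(?=s$)")


def deaspirate_before_final_s(form: str) -> str:
    """Apply Tʰ > T /_s# to a single form (hypothetical helper)."""
    return _ASPIRATION_BEFORE_FINAL_S.sub("", form)


# Schematic ending: the aspirate stands before word-final s, so it changes.
print(deaspirate_before_final_s("*-bʰs"))
# Forms cited in this thread: the s is not word-final, so nothing changes.
print(deaspirate_before_final_s("*wobʰseh₂"))
print(deaspirate_before_final_s("*wóbʰsos"))
```

Because the rule is restricted to word-final position, medial clusters such as the one in *wobʰseh₂ are untouched, which is why that form cannot by itself confirm or refute the rule.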

We should probably stick to more mainstream scholarship on Wiktionary, and not stray too far into radical new ideas. Is this a widely recognised rule, and not one limited to one particular university or school of thought? —Rua (mew) 17:14, 1 May 2019 (UTC)
It's not a fringe theory and is required for explaining the lack of Bartholomae's law and voicing of word-final consonant clusters, but most of my sources are Irano-centric, so it would be helpful if those familiar with other branches could comment and give counter-examples, if they exist. --{{victar|talk}} 18:25, 1 May 2019 (UTC)
Well, Celtic, Germanic and Latin are of no help at all here because they merge all three stop series before a voiceless obstruent. I don't know enough about Balto-Slavic to be sure, because Winter's law is the only direct reflex of the voiced-aspirate distinction. And that's about all the branches I know anything about. —Rua (mew) 19:56, 1 May 2019 (UTC)
And of course, helpfully, the primary source of obstruent + s sequences is the aorist, which has a lengthened vowel of PIE origin and thus is ineligible for a Winter's law distinction in Balto-Slavic... —Rua (mew) 20:01, 1 May 2019 (UTC)
@JohnC5 can confirm this exists in Ancient Greek as well. I've asked him for an example when he has a moment. --{{victar|talk}} 20:02, 1 May 2019 (UTC)
@Rua: Lithuanian vapsvà < *wobʰseh₂ (English wasp) seems good for BSl, but we also have Old Prussian wobse, so I dunno. AG certainly had a productive deaspiration process before *s found throughout the verbal system (see the fut. and aor. stems of γράφω (gráphō)). I think that this sound change may be too common to necessitate reconstruction. —*i̯óh₁n̥C[5] 06:34, 2 May 2019 (UTC)
@JohnC5, Rua: PII retains Tʰs both initial and medially, as seen in *wóbʰsos (wasp) > *wábʰsas > *wábzʰas > PII *wábžʰas > PIr *wábžah > YAv. 𐬬𐬀ß𐬰𐬀𐬐𐬀(vaßzaka). Looks like AG isn't going to be of any help to us if it deaspirates stops before sibilants everywhere. --{{victar|talk}} 07:17, 2 May 2019 (UTC)
Or maybe it's exactly the evidence we need. On the other hand, it appears to be a synchronic rule of Greek, given that deaspirated stops become voiceless and devoicing of aspirates is a Greek-specific change. If it were a PIE rule, then you'd expect voiced stops to result. I don't know what Greek does with voiced stop + s sequences though. —Rua (mew) 10:40, 2 May 2019 (UTC)
References
  1. ^ Lipp, Reiner (2009) Die indogermanischen und einzelsprachlichen Palatale im Indoiranischen: Neurekonstruktion, Nuristan-Sprachen, Genese der indoarischen Retroflexe, Indoarisch von Mitanni (Indogermanische Bibliothek; 3) (in German), volume 1, Heidelberg: Winter, page 212
  2. ^ Klein, Jared S.; Joseph, Brian D.; Fritz, Matthias, editor (2017–2018), “Chapter V: Indic”, in Handbook of Comparative and Historical Indo-European Linguistics: An International Handbook (Handbücher zur Sprach- und Kommunikationswissenschaft [Handbooks of Linguistics and Communication Science]; 41.2), Berlin; Boston: De Gruyter Mouton, →ISBN, § The phonology of Indic, page 332
  3. ^ Kapović, Mate (2017), “Proto-Indo-European morphology”, in The Indo-European Languages (Routledge Language Family Series), 2nd edition, London, New York: Routledge, page 359

Reflex and orthographic changes to Proto-(Indo-)Iranian

There are two reflex and orthographic changes to Proto-Indo-Iranian and Proto-Iranian I'd like to discuss:

  1. When I started cleaning up and adding Proto-(Indo-)Iranian entries a while back, I went with the orthographic choices of *ĉ and *ĵ, which mirrored the en.wiki entry for Proto-Indo-Iranian. Since then, however, that orthography has fallen out of favor in published academic works, which mostly prefer *ć and *ȷ́, respectively, for both Proto-Indo-Iranian and Proto-Iranian.[R 1][R 2][R 3][R 4] I'm really not troubled either way, but it might look better for us to use academic standards, which, incidentally, also more closely echo the orthography the project uses for Proto-Indo-European (ḱ and ǵ).
  2. In Iranian, the spirantization of aspirated stops *pʰ, *tʰ, and *kʰ to *f, *θ, and *x, respectively, is not seen universally, as evidenced in Sakan, Balochi, Parachi, and some dialects of Kurdish. It has been suggested that Proto-Iranian retained said aspirated stops.[R 4] Alternatively, all these languages experienced a fortition of fricatives, which led to a back-mutation. Any thoughts either way?

Are there any objections to either of these recommendations? If we went through with either, the use of a bot to complete the task would be ideal. Pinging: @JohnC5, Tropylium, AryamanA, Vahagn Petrosyan, Bhagadatta, Calak, Kwékwlos. --{{victar|talk}} 16:21, 1 May 2019 (UTC)

My preference is definitely for ć and j́. We use ś for Sanskrit, which is a reflex of ć, so using the same diacritic shows that continuity. We also use the same symbol for their Balto-Slavic cognates ś and ź, and as you mentioned their PIE ancestors ḱ and ǵ. So that fits much better. —Rua (mew) 17:19, 1 May 2019 (UTC)
I also prefer and *ȷ́. Standard Kurdish orthography ignores aspirated consonants, but usually use character to indicate aspiration.--Calak (talk) 17:24, 1 May 2019 (UTC)
I agree, that would be a better option. But for PIA we should write *c instead of *ć. Kwékwlos (talk) 17:40, 1 May 2019 (UTC)
Academic works on PIA prefer *ć and *ȷ́ over *c and *j, which serves to inform the reader that they have different phonetic values than those of Sanskrit, so no, I would disagree with that change. --{{victar|talk}} 18:30, 1 May 2019 (UTC)
I'm fine with these changes. —*i̯óh₁n̥C[5] 06:22, 2 May 2019 (UTC)
@Victar, JohnC5, Rua: I too consent to the proposed changes. I would also like to make an additional proposition: to show the Proto-Indo-Aryan descendant of Indo-Iranian *ȷ́ as *ź (the fricative) and not as *ȷ́ (the affricate) as we currently do. The rationale is that, just as Indo-Iranian *ć produced Indo-Aryan *ś, the voiced counterpart of *ć (i.e., *ȷ́) produced the voiced counterpart of *ś, which is *ź (representing the /ʑ/ sound) and distinct from the *ȷ́ (/d͡ʑ/) in PIA which comes from IIR *ǰ. Although both IIR *ȷ́ and *ǰ ended up as Sanskrit ज (ja), Mr Kobayashi believes that the distinction was preserved in the intermediary stage and Old Indo-Aryan passed through some kind of "affricate filter" that merged both the fricative and the affricate into the affricate. -- Bhagadatta (talk) 05:37, 3 May 2019 (UTC)
  • @DTLHS: Would you possibly be able to run a bot to replace *ĉ and *ĵ with *ć and *ȷ́ in all iir-pro and ira-pro entries and links to them? --{{victar|talk}} 07:05, 5 May 2019 (UTC)
    I'll see what I can do. DTLHS (talk) 05:10, 6 May 2019 (UTC)
    @DTLHS: That would be much appreciated, thanks. --{{victar|talk}} 00:10, 7 May 2019 (UTC)
    Done, please watch for double redirects. DTLHS (talk) 05:09, 8 May 2019 (UTC)
    Thank you so much, @DTLHS! I hope it wasn't too much of a pain. --{{victar|talk}} 16:09, 8 May 2019 (UTC)
    @DTLHS, would it be possible to run your bot to include {{desc}} entries as well? --{{victar|talk}} 13:01, 14 May 2019 (UTC)
    I don't know what you mean. DTLHS (talk) 15:37, 14 May 2019 (UTC)
    @DTLHS: Well I just noticed this and I thought maybe {{desc}} wasn't included. --{{victar|talk}} 16:36, 14 May 2019 (UTC)
    I was only modifying links to pages that actually exist. DTLHS (talk) 18:06, 14 May 2019 (UTC)
    I see. --{{victar|talk}} 19:34, 14 May 2019 (UTC)
  • @DTLHS, would it be possible to run a ĉ -> ć, ĵ -> ȷ́ replace on this list that @Erutuon was kind enough to generate? --{{victar|talk}} 20:52, 4 June 2019 (UTC)
    Also, I was discussing it with others, and I think plain *c and *j are better suited for Proto-Iranian [ira-pro] (note: still *ć and *ȷ́ for [iir-pro]). @Erutuon generated a list for that as well. You would have to run your script for moving pages; it would be nice if they could be moved without a redirect this time around. --{{victar|talk}} 21:02, 4 June 2019 (UTC)
    Remind me in a week. DTLHS (talk) 15:21, 5 June 2019 (UTC)
    @DTLHS --{{victar|talk}} 17:55, 14 June 2019 (UTC)
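For reference, the character replacement requested in this thread can be sketched in a few lines of Python. This is not DTLHS's bot; the names `REPLACEMENTS` and `convert_title` are invented for illustration, the sample title is a made-up placeholder rather than a real entry, and the Proto-Iranian row reflects the later suggestion of plain *c and *j, which is an assumption about the final outcome. Note that ȷ́ has no precomposed form, so it is written as dotless j plus a combining acute.

```python
import unicodedata

# Explicit code points, to avoid confusing look-alike characters.
C_CIRCUMFLEX = "\u0109"   # ĉ (old notation)
J_CIRCUMFLEX = "\u0135"   # ĵ (old notation)
C_ACUTE = "\u0107"        # ć
J_ACUTE = "\u0237\u0301"  # ȷ́ = dotless j + combining acute accent

# Target notation per language code, as discussed in the thread (assumed).
REPLACEMENTS = {
    "iir-pro": {C_CIRCUMFLEX: C_ACUTE, J_CIRCUMFLEX: J_ACUTE},
    "ira-pro": {C_CIRCUMFLEX: "c", J_CIRCUMFLEX: "j"},
}


def convert_title(title: str, lang_code: str) -> str:
    """Rewrite old palatal notation in a reconstruction title (hypothetical helper)."""
    # Normalize first so that c + combining circumflex is also caught.
    normalized = unicodedata.normalize("NFC", title)
    return normalized.translate(str.maketrans(REPLACEMENTS[lang_code]))


# Made-up placeholder title, not a real reconstruction.
print(convert_title("*a\u0109a", "iir-pro"))
```

A real run would also have to move pages and update links, which is where the double redirects mentioned above come from.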
References
  1. ^ Lipp, Reiner (2009) Die indogermanischen und einzelsprachlichen Palatale im Indoiranischen: Neurekonstruktion, Nuristan-Sprachen, Genese der indoarischen Retroflexe, Indoarisch von Mitanni (Indogermanische Bibliothek; 3) (in German), volume 1, Heidelberg: Winter
  2. ^ Martínez García, Javier; de Vaan, Michiel (2014) Introduction to Avestan (Brill Introductions to Indo-European Languages; 1), Brill, →ISBN
  3. ^ Skjærvø, Prods Oktor (2017), “Avestan and Old Persian Morphology”, in Kaye, Alan S., editor, Morphologies of Asia and Africa, Winona Lake, IN: Eisenbrauns
  4. ^ Kümmel, Martin Joachim (2014), “The development of laryngeals in Indo-Iranian”, in The Sound of Indo-European, volume 3, Opava

Japanese soft-redirection necessitates kana-centric approach for wago

{{ja-see}}, the Japanese soft redirection template, is meant to be put under ==Japanese== or ===Etymology x===. This means that soft-redirection entries do not have POS headers, which affects the table of contents of those pages.

For example, please take a look at the three etymology sections of した. In the table of contents, only Etymology 3 has "Pronunciation" and "Verb" subheaders, while Etymologies 1 and 2 are "bare", having no POS subheaders. This makes navigating by POS inconvenient. (I frequently look up English words in Wiktionary, and I rely heavily on the POS links in the table of contents.)

Moreover, there is no reason why the three etymology sections should look different. All are wago words, but just because the first two have kanji spellings, does not mean they should be "second-class citizens" meriting a single line in the table of contents, while the verb form remains a full-fledged entry. By contrast, please take a look at わっぱ, which looks much better.
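The "bare" sections described above can be detected mechanically. The following Python sketch scans wikitext headings and reports level-3 Etymology sections that have no level-4 subheader. The function name is invented, and the sample text is a simplified stand-in modelled on the description of した, not the actual page source.

```python
import re

# A heading line is N equals signs, a title, then the same N equals signs.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def bare_etymology_sections(wikitext: str) -> list[str]:
    """Return titles of level-3 Etymology sections with no level-4 subheader."""
    bare, current, has_child = [], None, False
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if not match:
            continue
        level, title = len(match.group(1)), match.group(2)
        if level <= 3:
            # A new section at level 3 or above closes the previous one.
            if current is not None and not has_child:
                bare.append(current)
            is_etymology = level == 3 and title.startswith("Etymology")
            current = title if is_etymology else None
            has_child = False
        elif level == 4 and current is not None:
            has_child = True
    if current is not None and not has_child:
        bare.append(current)
    return bare


SAMPLE = """==Japanese==
===Etymology 1===
{{ja-see|下}}
===Etymology 2===
{{ja-see|舌}}
===Etymology 3===
====Pronunciation====
{{ja-pron|acc=0}}
====Verb====
{{ja-verb form}}
"""

print(bare_etymology_sections(SAMPLE))
```

On the sample, the two soft-redirect sections are reported as bare, which matches the table-of-contents problem described above.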

There is no way for {{ja-see}} to automatically generate the POS headers at the proper level (L3 or L4) because it doesn't know its position in the page. Therefore if we want to have soft-redirection at all, the "inequality" mentioned above is inevitable. This adds one more argument for the kana-centric approach for wago. By choosing the kana spelling as the lemma entry,

  1. Imagine how clean would look and how easy it would maintain if it's put in the format of !
  2. We emphasize that the kanji is an encoding of the spoken word, rather than that the spoken word is the decoding of the kanji. I'm not sure which direction western learners of Japanese take, but linguistics suggests the former direction. (The latter direction, if needed, could be built in the ===Kanji=== section.)
  3. In addition, we reduce the chance of {{ja-spellings}} and {{ja-kanjitab}} appearing together for a kanji lemma entry, which would take up too much space. We also solve the symmetry problem that <かえる: non-lemma, 帰る: full-fledged entry, 還る: non-lemma, 返る: non-lemma, 反る: non-lemma> looks imbalanced while <かえる: full-fledged entry, 帰る: non-lemma, 還る: non-lemma, 返る: non-lemma, 反る: non-lemma> looks better.

In any case, the Japanese soft-redirection system works best with the kana-centric approach for wago. There is at least one issue we haven't discussed before: whether to use kana for compounds like 繰り返す. But at least I think we have agreed on using kana for the "core wago vocabulary" – roughly, those appearing as kun readings of kanji.

I propose that we make a proposal to update Wiktionary:About Japanese#Lemma entries. If it passes, we can update the core wago entries to use the new format. If not, we can remove the soft-redirection system (at least for wago entries) and revert to the plain old {{alternative spelling of}}, which would restore the proper POS headers.

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 11:37, 2 May 2019 (UTC)

Support. —Suzukaze-c 06:37, 4 May 2019 (UTC)
Support redirection of "core wago vocabulary" which have numerous kanji forms to their kana forms. However, I think redirection of wago 複合語 entries such as 山登り, 切り倒す, 繰り返す is not necessary because their kana forms やまのぼり, きりたおす, くりかえす are not that recognizable. KevinUp (talk) 00:47, 5 May 2019 (UTC)
Oppose: I prefer using most common spellings. The problem of English Wiktionary is rather existence of too many rare or archaic readings. It’s good to move nonstandard readings to each hiragana page. — TAKASUGI Shinji (talk) 15:35, 5 May 2019 (UTC)
@TAKASUGI Shinji I find this a good compromise. Can you put it into a guideline for use on WT:AJA? I'm not good at writing in English.
By the way, I think the default entry layout of the English Wiktionary also contributes to the problem. WT:EL requires that etymology and pronunciation be put before the headword, even for languages like Japanese where the reader must first locate the headword and then get information on the word, which is a great distraction. By contrast, the entry layout of the Japanese Wiktionary is much more suitable: etymology and pronunciation are subordinate to the headword, so the reader can easily navigate in an ocean of homographs. If the English Wiktionary allowed such a format, then the table-of-contents problem would no longer exist:
Extended content
==Japanese==

===Noun===
{{ja-see|下}}

===Noun===
{{ja-see|舌}}

===Verb===
{{ja-verb form}}

# {{inflection of|する||perfective|lang=ja}}

====Pronunciation====
{{ja-pron|acc=0}}
--Dine2016 (talk) 06:14, 6 May 2019 (UTC)
@Dine2016, the sample structure in the expansion section is problematic -- した has pitch accent 0, while 下 and 舌 have pitch accent 2. This suggests that Pronunciation would have to come before the POS. ‑‑ Eiríkr Útlendi │Tala við mig 18:32, 8 May 2019 (UTC)
@Eirikr: I don't think so. The full format would be
Extended content
==Japanese==

===Noun===
====Etymology====
====Pronunciation====

===Noun===
====Etymology====
====Pronunciation====

===Verb===
====Etymology====
====Pronunciation====
so it's still organized by word, just with the words beginning with POS headers and definitions. The problem is rather what to do with words having multiple POS sharing the same etymology and pronunciation, and single kanji which are usually POS-fluid. The Oxford English Dictionary solves this problem by having headwords in the form “xxx, adj. and n.”, but on Wiktionary ===Adjective and noun=== doesn't look good. --Dine2016 (talk) 06:21, 9 May 2019 (UTC)

Hello everyone, I created the vote at Wiktionary:Votes/pl-2019-05/Lemmatize Japanese wago words at kana spellings. What do you think about the wording?

@Atitarev, Cnilep, Eirikr, KevinUp, Korn, Suzukaze-c, TAKASUGI Shinji --Dine2016 (talk) 14:56, 5 May 2019 (UTC)

@Dine2016 If possible, can you reword the "general rule" in the proposed vote to the following categories to make things clearer?
  1. 漢語 (kango), i.e. Sino-Japanese terms that have kana readings based on 音読み (on'yomi)
  2. 外来語 (gairaigo), i.e. Japanese loanwords that are usually written using katakana.
  3. 和語 (wago) words or 大和言葉 (Yamato-kotoba), i.e. native Japanese words that have kana readings based on 訓読み (kun'yomi). These usually have multiple kanji spellings.
  4. Derivatives or compound words such as お巡りさん (omawarisan), 山登り (yamanobori), 繰り返す (kurikaesu).
  5. Special consideration:
    1. Words that have reading patterns based on 重箱読み (jūbakoyomi) or 湯桶読み (yutōyomi) (mixture of on'yomi and kun'yomi readings)
    2. Words with irregular readings such as 当て字 (ateji). Examples include common words such as お父さん (otōsan, father) and 友達 (tomodachi, friend).
If I'm not mistaken, categories 1,2,4 are mostly not affected by this vote while category 3 (native Japanese words) will be the one that is greatly affected. I'm not sure about category 5 but for me, I would prefer for the most common spelling to be used. I think the entry at 寿 () () (sushi) can stay while the entry at  () (sushi, sushi) can be moved to its kana form すし (sushi). Note that both are wago terms with the same kana reading and the only difference is in its reading pattern. KevinUp (talk) 02:41, 6 May 2019 (UTC)
@KevinUp: Thanks for your reply. I think 寿 () () (sushi) should also be moved to the kana form because (1) it is wago and (2) its word formation is not clear from the kanji. The word has three kanji spellings (, and 寿司) so lemmatizing at kana would be better. Lemmatizing at kana also makes its relationship with  () (sushi) clear. The word formation of  (とう)さん (otōsan, father) and  (とも) (だち) (tomodachi, friend) is clear from their kanji spellings, so they are better lemmatized at the most common kanji spellings. --Dine2016 (talk) 05:47, 6 May 2019 (UTC)
Thanks for the explanation. So if the word formation of a wago compound can be determined from its kanji spelling, then the main lemma form shall be the kanji spelling. I get it now. KevinUp (talk) 09:39, 6 May 2019 (UTC)
@KevinUp: This is off topic but absolutely important in a topic related to the Japanese language. Could you please stop linking romaji as if they are words? Thanks. —Anatoli T. (обсудить/вклад) 08:48, 6 May 2019 (UTC)
Okay, I've corrected the links. It seems that kango and ateji have their own English entry. I added the links only for this discussion and of course I don't do this when dealing with actual entries. KevinUp (talk) 09:39, 6 May 2019 (UTC)

I have a question. Can {{ja-see}} exactly determine, out of several definitions in another page, which is intended for the current page? In case I had not made it clear, suppose ははは means "1. apple; 2. peach". It can be written 派派派 when meaning "peach". Can a {{ja-see}} in page 派派派 state it clearly that 派派派 means "peach" but not "apple"? -- Huhu9001 (talk) 06:00, 6 May 2019 (UTC)

@Huhu9001: Yes, please see 暗い and 貴方 for examples. --Dine2016 (talk) 06:36, 6 May 2019 (UTC)
@Dine2016: It seems the code differentiates them by etymology. Does that still work when the two senses share the same etymology? -- Huhu9001 (talk) 07:35, 6 May 2019 (UTC)
@Huhu9001: Yes, that's what {{ja-def}} is for. --Dine2016 (talk) 08:59, 6 May 2019 (UTC)

Only lemmatize rare or archaic readings at kana spellings

Hello everyone, I have added an option to the proposed vote to lemmatize only rare or archaic readings at kana spellings, per Shinji's comment above. Under this option, 水 would retain only みず (mizu) (and the on'yomi すい (sui)) as full-fledged entries, with the み (mi) and もい (moi) readings moved to the hiragana entries even though they're frequently spelled 水 in compounds. Similarly, 童 would retain only わらべ (warabe), and 私 would retain only わたし (watashi) and わたくし (watakushi). What do you think about this approach?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): also pinging @KevinUp --Dine2016 (talk) 06:19, 7 May 2019 (UTC)

is an example of what the new approach would look like. --Dine2016 (talk) 15:39, 7 May 2019 (UTC)
I think the entry unfortunately looks like a bit of a dog's breakfast, and the apparently arbitrary inclusion or exclusion of material impairs the usability. If we lemmatize wago at the kana spellings, it would be much more consistent and clear to users if we do that across the boards. The ware reading may be spelled more commonly as in modern texts, but the spelling is also in evidence, and it's listed in modern dictionaries. It's unclear to me why we wouldn't lemmatize at われ. Likewise for the wa reading, I'd expect to find the entry at instead of .
Ultimately, what we're doing in all of this is attempting to overcome the limitations imposed by the back-end structure on the one hand, and our usage practices on the other. Electronic Japanese dictionaries appear to use a biaxial indexing system, where a user can enter either the reading or the spelling, and be presented with a list of hits showing the intersection between that reading and all matching spellings, or that spelling and all matching readings. The MediaWiki back-end can't handle that as-is.
Technically speaking, it might be possible to make clever use of transclusion to reproduce the behavior of electronic Japanese dictionaries, allowing users to go to われ and see full information for all terms that have that reading, or go to and see full information for all terms that have that spelling -- without having to jump through the hoops of clicking through soft-redirect links. However, if this is even possible, any implementation of this approach would require changes in how we create and edit entries. ‑‑ Eiríkr Útlendi │Tala við mig 18:28, 8 May 2019 (UTC)
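The biaxial lookup described above amounts to keeping two indexes over the same set of (spelling, reading) pairs. A minimal Python sketch follows; the function name is invented and the data is limited to pairs already mentioned in this discussion. It says nothing about how this could be done with transclusion on MediaWiki, which remains the open question.

```python
from collections import defaultdict


def build_indexes(pairs):
    """Build reading->spellings and spelling->readings indexes (illustrative only)."""
    by_reading = defaultdict(list)
    by_spelling = defaultdict(list)
    for spelling, reading in pairs:
        by_reading[reading].append(spelling)
        by_spelling[spelling].append(reading)
    return dict(by_reading), dict(by_spelling)


# (spelling, reading) pairs taken from examples in this discussion.
PAIRS = [
    ("帰る", "かえる"),
    ("還る", "かえる"),
    ("返る", "かえる"),
    ("反る", "かえる"),
    ("下", "した"),
    ("舌", "した"),
]

by_reading, by_spelling = build_indexes(PAIRS)
print(by_reading["かえる"])  # every spelling that shares this reading
print(by_spelling["下"])     # every reading recorded for this spelling
```

Either axis can serve as the entry point, which is the behaviour that soft redirects approximate only in one direction.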
@Eirikr: Yes, you're right. Take a look at and you'll see the problem caused by redirecting kanji to kanji. Etymology 1 is really two words, but splitting them into two etymology sections would require an extra mechanism to specify "which word in the lemma entry is intended" for each {{ja-see}}. This could be avoided by having kana as main hubs for wago. More importantly, having kana as wago paves the way for unified Japanese (cf. User:荒巻モロゾフ/draft). --Dine2016 (talk) 06:21, 9 May 2019 (UTC)
I think that if option 2 was implemented, it would be much harder to implement a unified Japanese approach for Japanese lemmas. Since Wiktionary is an etymological dictionary, I would prefer to see native Japanese words being lemmatized at their kana forms and Sino-Japanese terms lemmatized at their kanji forms. Korean has a category for Category:Native Korean words. Does such a category exist for Japanese words? KevinUp (talk) 06:36, 10 May 2019 (UTC)

@Eirikr I find my original proposal (Option 1) a failed attempt to limit the number of terms affected. For example, the word-formation of 青い is clear from its kanji spelling, yet it should obviously be lemmatized at あおい. I have changed Option 1 to lemmatizing all wago at kana; please take a look at the wording in the proposed vote and provide feedback. (For example, should proper nouns use the kana spelling?) --Dine2016 (talk) 15:47, 9 May 2019 (UTC)

Also pinging @KevinUp, Suzukaze-c. --Dine2016 (talk) 15:52, 9 May 2019 (UTC)

Yes, the current wording is much better now. With regards to proper nouns, I think the same rule can be applied, so 日本 (Nihon) would be lemmatized at its kanji form while 富士 (Fuji) would be lemmatized at ふじ instead.
However, for Japanese names, particularly given names, I would like to see all given names lemmatized using hiragana only (see also Wiktionary:Beer parlour/2019/February#Kanji compounds for Japanese given names). KevinUp (talk) 06:36, 10 May 2019 (UTC)
@KevinUp Why are you now suggesting to lemmatise 日本 (Nihon) and 富士 (Fuji) at kana, if they are not wago?! --Anatoli T. (обсудить/вклад) 01:46, 11 May 2019 (UTC)
@Atitarev: If you look at the etymology for 富士 (Fuji), the term is actually derived from Old Japanese. As for 日本 (Nihon), I mentioned that it would be lemmatized at its kanji form, not at its kana form. KevinUp (talk) 08:59, 11 May 2019 (UTC)

かとう is an example of how inflected forms would look like under option 1. --Dine2016 (talk) 11:30, 10 May 2019 (UTC)

Broadly speaking, I think that structure looks good.
At a finer-grained level, I don't agree with the "infinitive" POS label. For adjectives in particular, the -く and -う forms function as adverbs; there's nothing particularly infinitive about them. ‑‑ Eiríkr Útlendi │Tala við mig 23:16, 10 May 2019 (UTC)
I think it's good. —Suzukaze-c 07:34, 12 May 2019 (UTC)
As for inflections, I have thought that omitting kanji spellings would be easier to manage (as with いった). いった is an inflection of いう・いく・いる, and 云った is an inflection of 云う. But perhaps this isn't applicable anymore if we use {{ja-see}}. —Suzukaze-c 07:36, 12 May 2019 (UTC)

The vote has started. I would appreciate it if you could clarify your positions at your earliest convenience, so that I could know what steps to take next. @KevinUp, TAKASUGI Shinji, Eirikr, Atitarev, Huhu9001 --Dine2016 (talk) 04:57, 24 May 2019 (UTC)

Wikidata Lab XV: Lexicographic Data

The fifteenth Wikidata Lab will take place on May 23, 2019. This time the event will address lexicographic data within Wikidata, also known as lexemes. The event is free and open to all, but because places are limited, prior registration is required. The event will be recorded and made available on the NeuroMat YouTube channel.

This is the fifteenth activity in a series of trainings for the integration of the projects Wikidata and Wikipedia (and now Wiktionary!). The presentations, photographs and impact reports of the first fourteen activities are available for consultation at Wikidata Lab I, II, III, IV, V, VI, VII, VIII, IX, X, XI, XII, XIII and XIV, respectively.

The "Wikidata Lab XV: Lexicographic Data" will take place at CEPID NeuroMat, at the Universidade de São Paulo, on May 23 (Thursday), from 12:30 to 19:30 UTC.

The activities will be conducted by the Wikimedian Léa Lacroix, in English. The event is offered by the group Wiki Movement Brazil and by CEPID NeuroMat, with support from the Foundation for Research Support of the State of São Paulo (FAPESP).

More information: visit the event page and sign up. Ederporto (talk) 15:18, 2 May 2019 (UTC)

The event will take place in São Paulo, Brazil, a bit far for me. Also, the page announced as the “event page” characterizes its aim as “training for the integration of Wikidata with the Portuguese Wikipedia”, so it is not clear that this is of interest to this project.  --Lambiam 18:45, 2 May 2019 (UTC)
Hi, @Lambiam, as stated before, the event will be recorded (and maybe streamed!), so you will have full access to the presentation. As for the event page at the ptwiki, our main target is our local community, but since the presentation will be in English and involves lexicography, we believe this community could make good use of it; that's why I announced it here. Ederporto (talk) 02:03, 3 May 2019 (UTC)

Polities

What's the difference between:

For example, where does prefecture belong? Ultimateria (talk) 20:03, 2 May 2019 (UTC)

All three are categories of categories only, so the entry prefecture belongs in none of them. But we also have Category:en:Administrative divisions for “English terms related to administrative divisions” and Category:en:Political subdivisions for “English terms for political subdivisions, such as provinces, states or regions”. Some guidance or a mini-tutorial on when to apply which seems in order. On Wikipedia, Political subdivision redirects to Administrative division, so that is no help. The intersection of Category:en:Administrative divisions and Category:en:Political subdivisions contains county, dependency, département (is that English?), oblast, prefecture, province, region, and state. The terms categorized as administrative divisions are types of (sub)divisions, whereas the terms categorized as political subdivisions tend to be the names of specific entities that are instances of such (sub)divisions. But, as the overlap shows, the distinction has not been rigorously maintained.  --Lambiam 16:53, 3 May 2019 (UTC)
Somewhat surprisingly, the terms municipality and voivodeship are in neither of these categories. I also see we have no entries at all for crown dependency and special administrative region. --Lambiam 17:08, 3 May 2019 (UTC)
I meant which English category does prefecture belong in. I would like to populate these categories in other languages, but as you point out, some guidance is needed. Could we merge them into a single category to contain instances of subdivisions (in the form of categories) and types of subdivisions (in the form of entries)? There's nothing intuitive about the current division. Ultimateria (talk) 00:37, 4 May 2019 (UTC)
Rather than merging them, it may be more useful and helpful to pick more evocative names for these categories, like Terms for administrative divisions and Names of political subdivisions. Renaming will also involve a bit of separating the sheep from the goats. There are currently twelve languages (Arabic, Chinese, Czech, German, English, Middle English, Esperanto, Ottoman Turkish, Polish, Serbo-Croatian, and Zhuang) for which we have a category “L2:Administrative divisions”. Each of those also has a category “L2:Political subdivisions”, but there are in total 144 of such categories. For languages using an alphabet and orthography that distinguishes minuscules (for common nouns) from Majuscules (for proper nouns), I think it is a safe bet that the minuscule minority are terms, not names.  --Lambiam 13:54, 4 May 2019 (UTC)
These are already set categories, which follows from their name, so they are already known to contain "terms for" things. The confusion arises because we need to differentiate types of administrative divisions from names for specific administrative divisions. However, this kind of confusion isn't specific to this case: there is also Category:en:Celestial bodies, which could conceivably contain both planet and Jupiter, and really any other case where there are names for specific instances of something. More confusing is that we have both Category:English names and Category:en:Names. By our own category naming scheme, the first is a word-type category, in which the category name relates to the word itself (it contains words that are names), while the second is a set-type category, in which the category name relates to the referent (the words refer to names but are not themselves names). It appears, though, that these two categories are used indiscriminately without any rhyme or reason. That should probably be sorted out. —Rua (mew) 20:31, 4 May 2019 (UTC)
So if we have two categories, one for types of boats and one for names for specific boats, wouldn’t it be a good idea to name these categories such that the names are suggestive of which is intended for which category, rather than using, say, Category:Ships and Category:Boats?  --Lambiam 07:33, 5 May 2019 (UTC)
We already have Category:en:Named roads, perhaps there is a case for a Category:en:Named boats? —Rua (mew) 10:25, 5 May 2019 (UTC)
And Category:en:Named administrative divisions to end the confusion?  --Lambiam 11:42, 5 May 2019 (UTC)
That seems like a good idea. Although we may not want to have Category:en:Named administrative divisions next to Category:en:Political subdivisions. We should use the same term, one with "named" and one without. —Rua (mew) 11:54, 5 May 2019 (UTC)

Lack of editors from certain countries

Just pondering: why do we have so few editors from certain ("First World") countries that we might expect: e.g. almost no Germans or Swedes (correct me if I'm wrong)? Does this suggest that their local Wiktionaries (de.wikt, sv.wikt...) are very good, or that these nationalities have less interest in working in English, or what? We seem to get much more of the Romance-language contingent, even in some cases from South America. Equinox 02:48, 5 May 2019 (UTC)

On what are you basing this assertion? I think we have several editors from Germany, they just aren't actively working on the German language. DTLHS (talk) 02:50, 5 May 2019 (UTC)
Yeah, you're right, I meant "lack of editors doing certain languages". I won't change the heading now because it would be rewriting history, but if you're not bothered, feel free to fix the question. Equinox 03:06, 5 May 2019 (UTC)
We have a number of very active IPs working on German. We did actually lose at least one good German editor over our refusal to treat any transparent multi-multi-whole-word German compounds as SOP, though their main focus was on other languages. As for Swedish, there are some familiar names at Category:User sv-N- both old and new. Chuck Entz (talk) 06:25, 5 May 2019 (UTC)
It would be interesting to get some stats on this. Reported languages vs actual edits, changes over time, IP contributions etc. I know that on the English Wikipedia some research has been done, analyzing contribution patterns of L2 English speakers. Can't find the link right now (using a search engine to find content about Wikipedia and not on Wikipedia is tricky…) – Jberkel 08:05, 5 May 2019 (UTC)

Bopomofo or Zhuyin?

Hey all. I was considering whether Template:zh-pron should call the 注音符號 system 'Zhuyin' or 'Bopomofo'. Here's the vitriolic discussion that happened about this in 2008 on Wikipedia: w:Talk:Bopomofo#Requested move (2008). Here's the recent discussion I have had at Talk:邊. I would say change it to Bopomofo, but that's just me. I feel the term 'Zhuyin' has just not reached the level of acceptance that 'Pinyin' has. No need for rudeness or anything. It's just my opinion. See that talk page for more details on some of the back and forth. --Geographyinitiative (talk) 08:41, 5 May 2019 (UTC)

I agree with several of Justinrleung's points and I think we should keep the name as is. I disagree with you about its currency or level of usage. 注音符號/注音符号 (zhùyīn fúhào) or 註音/注音 (zhùyīn), in English Zhuyin, is the formal term, known in China as well. Bopomofo (ㄅㄆㄇㄈ) is an informal Taiwanese (only) name, for which not even a phonetic transcription exists in Mandarin. --Anatoli T. (обсудить/вклад) 09:48, 5 May 2019 (UTC)
Thank you for your reply. Let us be clear about what is English and what is Mandarin Chinese. Why would you say "Zhuyin" is the formal term for 注音符號 in English? Only because it is the formal term in Mandarin Chinese? Use of "Zhuyin" in English works is probably prominent in situations where English is a second language. The system has not been promoted in Mainland China for decades, and some might assume that 'Zhuyin' or 'Zhuyin Fuhao' is just the best way to render it for English. I do not accept this as a valid reason to change the English term to 'Zhuyin' from what it is, 'Bopomofo'. (Isn't the term 'Bopomofo' influenced by Hanyu Pinyin? If it were based on the romanizations used in Taiwan, wouldn't it be written something like "Pop'omofo"?) It is 100% irrelevant to this discussion whether or not "a phonetic transcription exists in Mandarin" for the term 'Bopomofo': this is English I'm talking about, not Mandarin Chinese. Also, ISO and Unicode call it 'Bopomofo'. The Taiwanese Phonetic Symbols are called 'Bopomofo Extended'. Also, see all the usages I mentioned on the other page. I definitely used 'Bopomofo' when talking about these symbols in English in the USA in 2006 or so. My goal in promoting this change is to help ENGLISH speakers who use Wiktionary to look up stuff about Chinese more readily understand what the symbols they are looking at are. They are probably only passingly familiar with the word 'Bopomofo', and will likely be completely unaware of the term 'Zhuyin'. All I'm asking is to conform to English language common usage. Mencius not Mengzi. Yangtze not Chang Jiang. Bopomofo not Zhuyin or Zhuyin Fuhao. (Also, if you are going to find examples against my argument, a use of 'Zhuyin Fuhao' doesn't count as a use of 'Zhuyin' to me.) --Geographyinitiative (talk) 10:44, 5 May 2019 (UTC)
Since Zhuyin/Bopomofo is not inherently a concept known to most native English speakers, it is hard to judge whether Zhuyin or Bopomofo is more formal in English; the formality would naturally be borrowed from Chinese, and there is some evidence for this (all emphasis mine):
  • Unicode Demystified: A Practical Programmer's Guide to the Encoding Standard, p. 377: "This alphabet is nicknamed 'Bopomofo'"
  • Learning Chinese: Linguistic, Sociocultural, and Narrative Perspectives, p. 293, footnote for bopomofo: "The colloquial term for 注音符號 (zhùyīn fúhào)"
While it is true that Zhuyin is probably a less common term than Bopomofo, this does not dismiss the validity of Zhuyin as an English word. Even though the use of Zhuyin by Chinese/Taiwanese users of English should not be dismissed, there is evidence of its use by non-Chinese/non-Taiwanese, as demonstrated by the quotes I have placed at Zhuyin. We're not forcing people to change Bopomofo to Zhuyin; it would simply be a choice of our dictionary. I would still choose Zhuyin over Bopomofo because it is the more formal option of the two.
(As a tangent, Bopomofo Extended refers to the Unicode block containing some extra symbols, including symbols only used in the Taiwanese Phonetic Symbols; the name does not refer to Taiwanese Phonetic Symbols, which also use symbols in the main block. The Bopomofo Extended block also contains symbols used in Hmu and Ge, which have nothing to do with Taiwanese.) — justin(r)leung (t...) | c=› } 07:33, 6 May 2019 (UTC)

Template:de-noun, Template:feminine noun of and confusion of sex and gender

The user Rua changed Template:de-noun to display "male .." and "female .." instead of "masculine .." and "feminine .." and changed Template:feminine noun of into a template displaying "female equivalent of ..". While doing so the user confused sex (or: natural gender, biological gender) and gender (or: grammatical gender), or the user lacks knowledge of the German language. The user's edits are incorrect, led to incorrect entries and the old versions of the templates need to be restored:
Ankläger m, Herausgeber m, Hersteller m for example are masculine but not necessarily male; Anklägerin f, Gegnerin f, Herausgeberin f, Herstellerin f, Sammlerin f for example are feminine but not necessarily female. Examples can be seen in the entries. Rua's edit led to wrong statements like "male Ankläger", "female Anklägerin" and "female equivalent of Hersteller", which is even kind of ridiculous when the entry clearly shows that it is incorrect, as in Herstellerin: "female equivalent of Hersteller ... von der Herstellerin, der Firma Karl Zeiß ...". In it the word Herstellerin f is feminine but the referent is a sexless and not female Firma f. In general words with -in f are always feminine but the referent isn't necessarily female (it can be, but doesn't have to be), and similarly words with -er m are masculine but the referent isn't necessarily male. --Majbef (talk) 14:36, 5 May 2019 (UTC)

Just because Anklägerin is not necessarily female doesn't mean it's not the noun used for a female referent. The template was renamed because there are languages that have pairs of nouns for male-female referents without having grammatical gender, languages that have gender but do not distinguish male and female in their gender system, and specific words where grammatical gender disagrees with natural gender. English is an example of the first, Dutch and Scandinavian examples of the second, and Old English wīf an example of the third. For such languages, the terms "masculine" and "feminine" are inappropriate. See also WT:RFDO#Male and Female categories. —Rua (mew) 14:46, 5 May 2019 (UTC)
It obviously does mean that your edits are misleading as they make wrong implications. They are also wrong as Herstellerin for example isn't only female as it now states ("female equivalent of Hersteller") because of your edit. And they are also wrong as the -in-terms aren't the only terms used for female referents: The -er-terms can refer to females too, especially as ..er m pl referring to both sexes, ..er beiderlei Geschlechts referring to both sexes, männliche und weibliche ..er referring to both sexes, and weibliche ..er referring to the female sex only.
For genderless languages or differences as in case of wer m and wīf n other parameters/templates would be needed instead of changing templates and making entries wrong. --Majbef (talk) 15:09, 5 May 2019 (UTC)
If Herstellerin is wrong, then how would you improve it without messing with the templates? —Rua (mew) 15:18, 5 May 2019 (UTC)
It is wrong as can be seen by the definition (clearly stating a sex) and the example (referring to a sexless entity), so it's not "If ..., then how ..." but "As ..., how ...". You messed with the templates, so you should answer that and correct that entry and all other entries you made wrong. The simplest way I see would be to distinguish between "feminine equivalent of .." of the original template, which fits here, and a new "female equivalent of .." which could for example be used in stewardess pointing to steward. --Majbef (talk) 15:42, 5 May 2019 (UTC)
I asked you how you would improve it without messing with the templates, can you give an answer to that? If you only keep hammering on the template being "wrong" we're not going to get anywhere. Renaming the old template back would reintroduce the old problems which led to me renaming it in the first place. You need to explain better why the grammatical gender of these words matters when they are referring to things with no natural gender. —Rua (mew) 16:19, 5 May 2019 (UTC)
Personally I don't see an issue with using "feminine" in languages like English without gender, and would prefer "feminine" over "female". To me, "feminine" is ambiguous between natural and grammatical gender, while "female" can only imply natural gender. But if the terminology is an issue, I think the correct thing is to create separate templates {{female equivalent of}} and {{feminine equivalent of}}. Benwing2 (talk) 16:34, 5 May 2019 (UTC)
The question that begs answering, though, is whether the cases of grammatical equivalent genders require the existence of equivalent neuter forms as well in languages that have a neuter gender. German is one such language, and it was exactly this objection I raised in the RFDO discussion I linked. If a word like Herstellerin is required to be used in reference to something with no natural gender but feminine grammatical gender, then what happens when that noun is neuter instead? Or cases where the two types of genders mismatch? Is Weib a Hersteller or Herstellerin? Would a neuter noun similar to Firma be a Hersteller or Herstellerin? —Rua (mew) 16:43, 5 May 2019 (UTC)
"feminine equivalent of" was correct, and you made it and entries wrong instead of correcting it. So changing it back would fix the errors you introduced.
As Klägerin f shows, the term can also be used in reference to a neuter noun (Unternehmen n) or a masculine noun (Verlag m) (Maybe because of some kind of constructio ad sensum, connecting it with Firma f or Unternehmung f). But it doesn't matter: In both cases (Unternehmen + Hersteller and Unternehmen + Herstellerin), "Herstellerin: feminine equivalent of Hersteller" is correct, while "Herstellerin: female equivalent of Hersteller" is obviously incorrect as a Herstellerin f isn't necessarily female. --Majbef (talk) 22:01, 5 May 2019 (UTC)
The example you just gave shows otherwise, that grammatical gender is irrelevant for what Klägerin can refer to. I think I'll wait for a German native speaker with more Wiktionary experience to give input, as I have trouble making any sense of your arguments. —Rua (mew) 22:09, 5 May 2019 (UTC)
The example does show that "Herstellerin: female equivalent of Hersteller" is incorrect and that "Herstellerin: feminine equivalent of Hersteller" is correct. Herstellerin f is always feminine - regardless of it referring to a masculine, feminine or neuter noun. Hence it's a feminine equivalent to (the masculine) Hersteller m. The referent however isn't always female - it can also be sexless. Hence it's not a female equivalent to (the sexless, male and/or female) Hersteller m. Emphasising the sex, "Herstellerin: sexless or female equivalent of Hersteller" could be correct, but that's not what the template displays and what the template was for. --Majbef (talk) 22:23, 5 May 2019 (UTC)
That just shows that the current entry is incomplete and is missing senses. But you didn't give any suggestion for how to improve the entry otherwise, when I asked for it, so I don't know what you're after. Again, I'm going to wait for someone more experienced with German on Wiktionary who can explain the situation better. —Rua (mew) 22:36, 5 May 2019 (UTC)
The entries were correct and didn't miss any senses, you made them wrong or "incomplete". So it would be your task to fix them. Nonetheless I made two suggestions of how to fix them: Firstly, change the template back to when it was correct, and secondly, optionally create a new "female equivalent of"-template which can for example be used for English. --Majbef (talk) 22:49, 5 May 2019 (UTC)
We're just going in circles here. @Fay Freak can you offer anything? —Rua (mew) 22:52, 5 May 2019 (UTC)
Rua confused nothing. She solved a technical problem. The templates are there to show pairs in a typified fashion. As rightly noted, “Herstellerin” is the female equivalent of “Hersteller”. Adding “female Hersteller” to Herstellerin is confused and says nothing to the reader, at best, and possibly even confuses him, and it’s wrong since it is the male equivalent, according to our system of displaying information. If a reader does not know that a GmbH is optionally treated like a female in German, in accepting the -in suffix and other gendered forms of nouns (“die Beklagte” and “der Beklagte”), this is not the place to tell him this. Fay Freak (talk) 23:13, 5 May 2019 (UTC)
Duden/de.wikt just say "weibliche Form zu Hersteller". In German weiblich can mean both female and feminine, so the problem doesn't arise there. Technically "feminine of" seems to be correct in all cases, but I don't think "female of" would be massively confusing. Perhaps the template could be changed so that it outputs a different text based on the language? And why do we have {{masculine noun of}}? – Jberkel 00:00, 7 May 2019 (UTC)
Seems to demand a change too. Fay Freak (talk) 13:32, 7 May 2019 (UTC)
"“Herstellerin” is the female equivalent of “Hersteller”" isn't correct or "incomplete" as Rua called it: A Herstellerin f is feminine (gender) but not necessarily female (sex). And the example in the entry does clearly show that the definition is now(!) wrong - at least if the reader has some knowledge of German to understand the example and does know that a company is a sexless thing. "“Herstellerin” is the feminine equivalent of “Hersteller”" on the other hand is correct and complete, it doesn't lack any senses.
As for the adding part: It was a typo for m= (masculine), which would be correct. In case of f= (female) it would be Hersteller or Herstellerin and not only Hersteller.
As for the GmbH part: The entry GmbH is not the place to tell the reader that a GmbH f is sexless but can be referred to with feminine (not female!) words like Beklagte f. However, the entry Beklagte f is a place to tell the reader that the noun is feminine (not female!), or that the referent can be sexless or female (and not only female!). To tell the reader that a Beklagte f is only female (sex) is wrong (or "incomplete"). --Majbef (talk) 07:51, 11 May 2019 (UTC)
It didn’t say this, though. It’s your exaggerated interpretation. This is not what “female” means here: nowhere did the wording exclude that the word can be used for a GmbH. As I said, the way information is presented is typified. And if a reader does not know that such pairs can be employed depending on a natural gender conjectured from the grammatical gender of a company, there is no place, no entry to tell him. It is a general phenomenon that is to be read about in grammars but not here. However, the distinction at issue is basically a sex or natural gender distinction and not a grammatical gender distinction, as can be seen by there not being a “neuter equivalent”. All these words are used primarily for the roles of humans, hence the distinction follows natural gender. The distinction applied for corporations is secondary and optional. It might be incomplete to present the distinction as such a role split, but it is a distortion to mark it as a grammatical gender distinction. Fay Freak (talk) 01:46, 12 May 2019 (UTC)
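For reference, Jberkel's suggestion above (one template whose wording depends on the language) could be sketched roughly as follows. This is only an illustration in Python, not the actual template or module code: the function name, the language set and the choice of which wording goes with which language are all assumptions, and which wording is right for German is exactly what is disputed in this thread.

```python
# Hypothetical sketch: pick the display text by language code.
# Assumption: languages listed here have a masculine/feminine grammatical
# gender contrast and would show "feminine"; all others show "female".
LANGS_WITH_MASC_FEM_GENDER = {"de"}


def equivalent_label(lang_code: str) -> str:
    """Return the text a single template might display for a given language."""
    if lang_code in LANGS_WITH_MASC_FEM_GENDER:
        return "feminine equivalent of"
    return "female equivalent of"


print(equivalent_label("de"), "Hersteller")
print(equivalent_label("en"), "steward")
```

The alternative raised earlier, two separate templates ({{female equivalent of}} and {{feminine equivalent of}}), would leave the choice to the editor of each entry instead of to a language list.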

Japanese verb: both -suru verbs and irregular verbs are put into Category "type 3"

Why not separate them? -- Huhu9001 (talk) 05:02, 6 May 2019 (UTC)

Support Suzukaze-c 22:50, 6 May 2019 (UTC)
  • @Huhu9001, if memory serves, this is because English-language materials for teaching Japanese often classify する (suru) and 来る (kuru) as "type 3" or "group 3" or something similar. Example.
What would you suggest? And what verbs besides する (suru) and 来る (kuru) would be affected? ‑‑ Eiríkr Útlendi │Tala við mig 17:14, 7 May 2019 (UTC)
@Eirikr: This affects する and all suru verbs like 散歩. くる is not affected. -- Huhu9001 (talk) 03:13, 8 May 2019 (UTC)
Concerns that arise:
  • This deviates from English-language teaching materials, and is thus likely to confuse (some of) our readership.
  • This would result in 来る (kuru, to come) being our only expected "type 3" verb.
  • However, it appears that our "type 3" categorization has itself been ... irregular, as our "type 3" category contains many things typically excluded in common English-language teaching materials. I also see a few entries that are Classical, not modern Japanese (居り (ori), (su), そうず (sōzu), (sōrou)). See also w:Japanese irregular verbs.
My growing sense is that the current approach -- which only removes する verbs from "type 3" -- is mistaken and insufficient. する is indeed a truly irregular verb, as is 来る, and these two should be treated as such. This would mean that both are "type 3", according to common English-language materials for teaching Japanese. Creating a sub-category under Category:Japanese type 3 verbs for just the する verbs (of which there are many) would strike me as a better approach than removing する verbs from this category altogether.
Widening the scope, there are various other words that could be treated as "irregular verbs", some of which are already in our Category:Japanese type 3 verbs despite not being included in "type 3". Some are classed as 助動詞 (jodōshi, literally helper verb) in common Japanese grammars, although this is historically a grab-bag of grammatical oddments, including things like べし (beshi) that inflect as adjectives, and things like だ (da) that are essentially modern inventions cobbled together from disparate pieces. We also have the honorific verbs おっしゃる (ossharu), ござる (gozaru), なさる (nasaru), and the like that are mostly regular, only evincing some minor irregularity in one conjugation stem, where the expected -ri ending instead becomes -i.
Considering all of this, I really think that we need to have a broader conversation about how we (Wiktionary) want to treat the less-regular verb classes as a whole -- and whether this categorization system should apply just to the modern language, or to Classical and OJP as well. ‑‑ Eiríkr Útlendi │Tala við mig 17:56, 8 May 2019 (UTC)
@Eirikr: One solution to the current problem I suppose is to modify Module:ja-headword to remove all irregular verbs except くる from Category:Japanese_type_3_verbs, and then Category:Japanese_suru_verbs can be made a subcategory of it. Also it is better to have the label of type 3 verbs changed from "irregular" to "type 3".
I agree that it is better to separate "classical" or older conjugations from modern ones. But there seems to be no consensus on this. -- Huhu9001 (talk) 03:50, 10 May 2019 (UTC)
@Huhu9001: <nods/> A follow-on suggestion, then:
  • Do as you suggest and reduce the current "type 3" category to 来る and compounded verbs at the top level of the category, with the する verbs included as a sub-category within "type 3".
  • Also create a category for irregular verbs more broadly, under which "type 3" would itself be a sub-category. This broader category could include the honorific verbs with the -i endings where -ri would be expected, which verbs are used in the modern language, and the oddball copular verb だ (da), among others.
  • Talk further about how to handle Classical forms. (I see now that multiple editors appear to be treating Old Japanese as a separate language, for entry-structure purposes, which is great by me.)
いかがでしょうか (Ikaga deshō ka?, How would that be?) ‑‑ Eiríkr Útlendi │Tala við mig 23:13, 10 May 2019 (UTC)
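To make the nesting proposed above easier to picture, here is a minimal sketch in Python. It is not Module:ja-headword code: the function, the verb-class keys and the name "Japanese irregular verbs" for the broader category are assumptions for illustration only; "Japanese type 3 verbs" and "Japanese suru verbs" are the category names mentioned in this thread.

```python
# Hypothetical sketch of the proposed category nesting.
# Child category -> parent category.
PARENT = {
    "Japanese suru verbs": "Japanese type 3 verbs",
    "Japanese type 3 verbs": "Japanese irregular verbs",  # assumed name
}

# Where each (assumed) verb class would be placed directly.
DIRECT_CATEGORY = {
    "suru": "Japanese suru verbs",          # する and compounds such as 散歩する
    "kuru": "Japanese type 3 verbs",        # 来る and compounds
    "honorific": "Japanese irregular verbs",  # おっしゃる, ござる, なさる
    "copula": "Japanese irregular verbs",   # だ
}


def category_chain(verb_class: str) -> list[str]:
    """Return the direct category followed by all of its ancestors."""
    chain = [DIRECT_CATEGORY[verb_class]]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


print(category_chain("suru"))
print(category_chain("kuru"))
```

Under this layout a suru verb is still reachable from "type 3", which matches the common teaching materials, while the top level of "type 3" holds only 来る and its compounds.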
@Eirikr: So any idea on the categories of old conjugations? -- Huhu9001 (talk) 10:41, 17 May 2019 (UTC)

Remove all Unihan definitions and adjust reference header of CJKV translingual section from L4 to L3

By now, it is well established that there are many errors in the Unihan database such as non-existent readings and inaccurate definitions. The source of these definitions is not stated by Unihan and editors working on CJKV entries would occasionally send some of these definitions to RFV. I think it would be better to remove them altogether or at least hide them using comments: <!-- Unihan definition -->.

Note that CJKV translingual entries should not have definition lines, so editors would usually move these definitions to the appropriate language section. However, Unihan definitions tend to conflate Chinese and Japanese meanings together, especially those of flora and fauna, so I don't think it is a good idea to use these definitions.

On the other hand, the CJKV translingual section has references listed at L4 level under "Han character". I find this to be a little odd, since most other languages have references at L3 level. I would like to request that these references be adjusted to L3 level rather than L4 level for uniformity.

Of course, the following tasks: (1) hiding or deleting translingual definitions from Unihan and (2) adjusting reference headers from L4 to L3 would involve the use of bots. KevinUp (talk) 10:03, 6 May 2019 (UTC)
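As a rough idea of what such a bot pass might do to a page's wikitext, here is a minimal Python sketch. It is not an existing bot: the function name is made up, and it assumes that definition lines start with "# ", that the header in question is written exactly "====References====", and that wrapping the whole line in a comment is an acceptable form of the <!-- Unihan definition --> idea. Real entries would need more careful handling (sub-definitions, usage examples, other header spellings).

```python
import re

# Matches a level-2 header such as "==Translingual==" but not "===...===".
L2_HEADER = re.compile(r"^==([^=].*?)==\s*$")


def fix_translingual(wikitext: str) -> str:
    """(1) Hide definition lines and (2) promote ====References==== to L3,
    only inside the ==Translingual== section."""
    out = []
    in_translingual = False
    for line in wikitext.split("\n"):
        header = L2_HEADER.match(line)
        if header:
            # Entering a new language section; note whether it is Translingual.
            in_translingual = header.group(1).strip() == "Translingual"
        elif in_translingual:
            if line.startswith("# "):
                # Task (1): keep the text, but hide it in an HTML comment.
                line = f"<!-- Unihan definition: {line} -->"
            elif line.strip() == "====References====":
                # Task (2): L4 header becomes an L3 header.
                line = "===References==="
        out.append(line)
    return "\n".join(out)
```

Other language sections are left untouched, so definitions and headers under, say, ==Chinese== would not be affected by this pass.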

Support Support Support Suzukaze-c 22:47, 6 May 2019 (UTC)
Pinging @Bumm13, Dokurrat, Geographyinitiative, Justinrleung for comment: KevinUp (talk) 10:59, 10 May 2019 (UTC)
Support tentatively. I don't think it's a good idea to remove the definitions completely. They need to be moved with {{attention|zh|moved from Translingual, please verify}} to the Chinese section. The conflation happens, I think, because some editors just copied definitions from a Japanese dictionary; I'm not sure these errors occur often in the Unihan database. {{ping}} doesn't work without the signature in the same edit. --Anatoli T. (обсудить/вклад) 09:09, 10 May 2019 (UTC)
Yes, we could hide the translingual definitions using comments such as <!-- Unihan definition --> or using {{attention|zh|Unihan definition}} so that editors working on converting the old Mandarin/Cantonese sections to unified Chinese would still be able to see them. KevinUp (talk) 10:58, 10 May 2019 (UTC)
Responding to the first point, it sounds like a good idea to have the 'references' all on level 3, but I think the original reason for putting them in level 4 should be thoroughly understood before the change is made. Responding to the second point: Here's what I do. When I see definitions in the translingual section of a CJKV character and I am interested, I will check a dictionary to see if they are similar to the Chinese definitions. If they are, then I move them to the Chinese section. It is definitely a mistake to move them blindly into the Chinese section. But I don't know if hiding them is the answer either. I think leaving them there kind of alerts me that the entry is still in a primitive stage of development. If the Unihan definitions are hidden, then I have no base from which to start understanding the characters in the way that almost every other Chinese-English dictionary understands the characters. Not really vehement for or against these proposals. --Geographyinitiative (talk) 11:35, 10 May 2019 (UTC)
We do have several maintenance categories such as (1) Category:Mandarin Han characters, (2) Category:Cantonese Han characters, (3) Category:Requests for definitions in Mandarin entries, (4) Category:Requests for definitions in Cantonese entries, (5) Category:Requests for definitions in Chinese entries to keep track of Han character entries that are still in a primitive stage of development. I think the Unihan definitions are not that reliable when it comes to rare or archaic Han characters, so it would be better to hide them for the time being. KevinUp (talk) 12:29, 10 May 2019 (UTC)
I'd support hiding the definitions in translingual and putting an attention template. For the references section, I would also support the move to L3; I don't see a reason for it to be in L4. While we're at it, I would probably also go as far as changing it from "References" to "Further reading" because we don't actually have much in the translingual section that refers to the listed sources (probably except the Unihan database page). — justin(r)leung (t...) | c=› } 14:54, 10 May 2019 (UTC)
I think having straightforward definitions in the translingual section is a mistake, more often than not: no one ever speaks or writes in Translingual, just Chinese, Japanese, etc. There are definitely common semantic threads, but IMO the definitions should be more like those in root entries for Semitic languages (see ש־ל־ם‎ for an example). Chuck Entz (talk) 18:33, 10 May 2019 (UTC)

Change Proto-Slavic notation to consistently use haček for all cases of iotation

Right now, we mix two different notations for Proto-Slavic consonants which result from the process known as iotation (triggered by a following j). We use the haček for the letters č, ď, š, ť, ž, but a following j for the cases of lj, nj and rj. All of these were single phonemes in Proto-Slavic, and remain so in most of the modern languages. Derksen's notation uses č, š, ž with haček, ļ, ņ, ŗ with a comma below, and dj, tj with following j, which is even less consistent. Czech and Slovak orthography, on which the use of hačeks for Proto-Slavic is based in the first place, use the haček consistently for all of these cases, thus č, ď, ľ, ň, ř, š, ť, ž (although their ď and ť are not reflexes of the Proto-Slavic equivalents). I propose that we consistently use the haček for all consonants that result from iotation, thus renaming our existing lj, nj and rj to match the ľ, ň and ř of Czech and Slovak orthography. —Rua (mew) 17:46, 6 May 2019 (UTC)
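
The renaming proposed above can be illustrated with a minimal sketch. It assumes that every lj, nj and rj in a Proto-Slavic reconstruction denotes an iotated consonant, so that a plain string replacement is safe; the function name `to_hacek` and the sample forms are hypothetical and chosen only for illustration, not taken from any existing Wiktionary module.

```python
# Hypothetical mapping from the current digraph notation to the proposed
# haček notation for iotated l, n and r.
IOTATED = {"lj": "ľ", "nj": "ň", "rj": "ř"}


def to_hacek(term: str) -> str:
    """Rewrite lj, nj, rj in a reconstructed term as ľ, ň, ř.

    Assumption: each of these digraphs always stands for a single
    iotated consonant, so no context check is needed.
    """
    for digraph, letter in IOTATED.items():
        term = term.replace(digraph, letter)
    return term


if __name__ == "__main__":
    for form in ("konjь", "morje", "volja"):
        print(form, "->", to_hacek(form))
```

A real page move would also have to update links and leave redirects, which this sketch does not attempt.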

Support – Háčky are easy to type and easier to read than digraphs. Digraphs should be avoided when creating orthographies. Where digraphs exist now, they are carried over from a time when people were not creatively or technically able to do better; that is, for Proto-Slavic one used whatever was easy to print or to type. Fay Freak (talk) 13:29, 7 May 2019 (UTC)
Whether they are easy to type depends on the platform one uses. For me, the easiest way (via the Latin/Roman edit panel) is awkward.  --Lambiam 17:45, 7 May 2019 (UTC)
You have to use that panel anyway for the yers, nasal vowels and other hačecked letters, so adding three more is not going to make a difference. —Rua (mew) 18:06, 7 May 2019 (UTC)

Pinging editors who have recently edited Slavic entries: @Benwing2, Useigor, Bezimenen, Kwékwlos, Greenismean2016. —Rua (mew) 20:08, 9 May 2019 (UTC)

Support – More convenient. Kwékwlos (talk) 20:09, 9 May 2019 (UTC)
Support – Looks better to me. Although the "hard to type" argument holds some weight; hačeks are easy to type on the Mac keyboard using U.S. Extended aka ABC Extended, but I'm not sure about the PC. Benwing2 (talk) 00:18, 10 May 2019 (UTC)
But again, Slavic has so many other special characters, adding three more isn't going to matter. I can type all the hačeks and ogoneks on my keyboard (Linux US international) but for the yers I still have to use the special characters panel. —Rua (mew) 09:52, 10 May 2019 (UTC)

<s(s)> pronounced /z/[edit]

Joseph, deserve, dessert... since it is a lexical issue, it would be quite useful to create a category for <s(s)> pronounced as /z/, even if as a variant --Backinstadiums (talk) 03:19, 11 May 2019 (UTC)

Is it lexical? Why this particular situation, and not "X pronounced Y" for any other X-Y pair? English spelling and pronunciation allow all sorts of odd combinations of these; see ghoti. Equinox 03:20, 11 May 2019 (UTC)
@Equinox: This one clearly exceeds the rest by far and is not a digraph, except for <ss> --Backinstadiums (talk) 15:40, 12 May 2019 (UTC)

Template:nonlemma

User:Rua recently replaced the etymology section of housen with Template:nonlemma, and later informed me in an edit summary that it's "the standard practice". If so, I think it's a remarkably poor practice. The link in the template leads not to the lemma entry, but rather to Wiktionary:Lemmas. I think in general we should avoid linking from the mainspace to "behind-the-scenes" Wiktionary namespace pages, and this template in particular is quite confusing as a reader would expect the link to take them to the page that actually has the etymology. What do others think? —Granger (talk · contribs) 14:08, 11 May 2019 (UTC)

We shouldn't give, and in fact in general never have given, etymologies for every inflection of a lemma. The template {{nonlemma}} reflects that. Instead, the etymology is to be found on the lemma form, which is given in the definition and therefore does not need to be repeated in the etymology. —Rua (mew) 14:13, 11 May 2019 (UTC)
I agree that we don't and shouldn't give etymologies for every inflection. But where appropriate, it's fine to give etymologies for nonstandard, irregular, or otherwise interesting inflections. The edit that started this replaced a perfectly good etymology with a confusing and (IMO) unhelpful template. —Granger (talk · contribs) 14:25, 11 May 2019 (UTC)
You seem to think that I'm objecting to giving an etymology. What I'm objecting to is giving an etymology on a non-lemma. Note that the lemma house contains the exact same etymology, making it redundant to list it again on the non-lemma. —Rua (mew) 15:58, 11 May 2019 (UTC)
Please read my comments more carefully. I understand that you object to giving an etymology on a non-lemma. But you haven't given any reason for that objection other than the straw man of giving "etymologies for every inflection of a lemma".
What I'm objecting to is the confusing and non-user-friendly link in the template. In my opinion leaving out the etymology section in the housen entry would be preferable to adding such a confusing template. But I think the best options are (a) keeping the etymology as it was before this disagreement began, or (b) a short sentence that links to house (something like, "See house."). —Granger (talk · contribs) 23:16, 11 May 2019 (UTC)
This should be obvious from the definition line, which already links to house. We should not repeat the lemma in the etymology. —Rua (mew) 12:06, 13 May 2019 (UTC)
Of course it's obvious to you. I don't think it's obvious to a casual dictionary user, and what makes matters much worse is that the link doesn't go where one would expect it to. Even just removing the link from the template would be an improvement. —Granger (talk · contribs) 14:01, 13 May 2019 (UTC)
As a general rule, I leave nonlemma forms without an etymology. I don't see how the {{nonlemma}} template adds any useful information, regardless of whether it links to the lemma or to Wiktionary:Lemmas. The lemma is already found in the definition. Having to redundantly specify the lemma in the call to {{nonlemma}} will be a bit painful (esp. since nonlemma entries link to multiple lemmas) and IMO not terribly useful. Benwing2 (talk) 02:20, 12 May 2019 (UTC)
I use it so that the etymology section isn't empty. Leaving the section empty is ugly; it says to the user "this is where the etymology goes, but now we're not giving you any". Again, this is because our entry layout is broken, as I mentioned at the end of last month. The template is one of the kludges I use to deal with the broken layout, and was introduced in Wiktionary:Beer parlour/2016/May#Template:nonlemma, because another proposal to make the entry layout more sensible failed. The proper fix is to eliminate etymology sections where there is no etymology, but until then {{nonlemma}} is a good stopgap. It also signals to bots which etymology section to add inflections to. —Rua (mew) 11:18, 12 May 2019 (UTC)
I think I've only encountered this at non-lemma verb forms like, say, "flying", where it seems annoyingly redundant to repeat the material from "fly". It doesn't seem to add much, as Benwing2 remarks. Equinox 11:07, 12 May 2019 (UTC)
Yes, in entries like those I would suggest not including an etymology section at all, which seems to be what others in this discussion are advocating. In rare cases like housen I think it can be useful to give an etymology at the nonlemma (or at least a link to the lemma's Etymology section), but no etymology section at all would be better than such a confusing template. —Granger (talk · contribs) 12:51, 12 May 2019 (UTC)

There seems to be wide agreement that the template in its current form is not good. What's the next step? RFDO? Rewriting the template to be useful? Orphaning it by removing the Etymology section from articles that use the template? —Granger (talk · contribs) 12:03, 23 May 2019 (UTC)

  • I would not agree with the statement that no etymology should be given for a non-lemma form. It is fine to say that the etymology for "eating" is "eat" + "-ing". It would not be good to repeat the etymology of "eat" at "eating" though. Ƿidsiþ 12:39, 23 May 2019 (UTC)
    • @Widsith What would you propose we do with cases where the lemma has an ending but the nonlemma doesn't, like sluit or open? —Rua (mew) 13:25, 23 May 2019 (UTC)
      • I'm not proposing anything for such cases. Ƿidsiþ 13:54, 23 May 2019 (UTC)
    • If you have etymologies for inflected forms, you're basically turning the categories for the inflectional morphemes into categories for the inflections. I don't want all the present participles and gerunds to swamp out interesting cases like the -ing in building. Chuck Entz (talk) 13:38, 23 May 2019 (UTC)
      • Well, I mean that's the problem with categories in general, isn't it? You can hardly restrict their use to only those examples that you personally find interesting. Ƿidsiþ 13:54, 23 May 2019 (UTC)
        • In this case, you can. I think we should avoid etymologies in non-lemmas that only state the obvious, morphologically speaking. If you know that a form is the present participle of an English verb and you know anything about English grammar, you already know that it's got an -ing ending tacked on. Etymologically, it's SOP. I have no problem with providing etymologies in non-lemmas when there's something like suppletion that isn't predictable from straightforward application of well-known morphosyntactic rules, but I don't see the point in having a list of every English word that ends in "-ing". There's no useful information; it's just clutter. It would be like having a Derived terms section in the have entry with the compound-tense forms of every English verb. Chuck Entz (talk) 02:40, 24 May 2019 (UTC)
          • OK. Well, I disagree. If someone wants to specify these etymologies, they are clearly correct and I can't see any justification to remove them. Ƿidsiþ 10:48, 25 May 2019 (UTC)
  • I think that the etymology sections should be removed from any straightforward, obvious non-lemma entries. If altered to redirect to the most relevant entry, however, this template could be useful for non-lemmas in entries, such as abak in Polish, that contain both lemma and non-lemma (with obvious etymologies) definitions to avoid blank etymology sections. Another application for the template could be to direct the reader to the non-lemma entry with the etymology for non-obvious non-lemmas that are obviously related to another non-lemma, like tygodnie, tygodni, et cetera to tygodnia, whose lemma is tydzień. Maybe format it like this: {{nonlemma|lang|entry}}. By the way, I think the etymology of housen should stay, as the form is unusual in English and it would not be obvious whether it was inherited from an inflection of its predecessors or occurred due to some alternative dialectal construction of plural forms. İʟᴀᴡᴀ–Kᴀᴛᴀᴋᴀ (talk) (edits) 01:23, 1 June 2019 (UTC)
  • There seems to be consensus that this template is not useful in its current form, and no one has offered to rewrite it to make it useful. Therefore, I suggest going through and removing the "Etymology" section from all entries where the section only contains this template. —Granger (talk · contribs) 00:20, 6 June 2019 (UTC)

Changing {{doublet}} to take multiple terms

User:Metaknowledge recently suggested to me in an edit summary that {{doublet}} should allow multiple terms to be specified, so you can easily say "doublet of foo and bar" or "doublet of foo, bar and baz". I'm wondering what people think of this. Currently, to specify this, you have to say something like {{doublet|en|foo}} and {{doublet|en|bar|notext=1}}, which is ugly and gets uglier if you need to specify 3 or more terms. The most logical way to make the change is to use the structure {{doublet|LANG|TERM1|t1=TERM1-DEFN|alt1=TERM1-ALTTEXT|TERM2|t2=TERM2-DEFN|alt2=TERM2-ALTTEXT}}, which is a radical departure from the current structure, which looks like {{doublet|LANG|TERM|ALTTEXT|DEFN}}. (In truth, however, more than 95% of calls to {{doublet}} look like {{doublet|LANG|TERM}}, which wouldn't change.) What do people think? Benwing2 (talk) 02:30, 12 May 2019 (UTC)
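
To make the proposed parameter layout concrete, here is a minimal sketch that builds a call in the {{doublet|LANG|TERM1|t1=...|alt1=...|TERM2|...}} shape and the matching display text. The helper names `doublet_call` and `doublet_text` are hypothetical, and the sketch does not claim to reproduce how the actual template or its module is implemented.

```python
def doublet_call(lang, terms):
    """Build a multi-term {{doublet}} call in the proposed layout.

    `terms` is a list of (term, gloss, alt) tuples; gloss and alt may be
    None. Numbered parameters t1=, alt1=, t2=, ... follow each term.
    """
    parts = [lang]
    for i, (term, gloss, alt) in enumerate(terms, start=1):
        parts.append(term)
        if gloss:
            parts.append(f"t{i}={gloss}")
        if alt:
            parts.append(f"alt{i}={alt}")
    return "{{doublet|" + "|".join(parts) + "}}"


def doublet_text(terms):
    """Render 'doublet of foo, bar and baz' from a list of terms."""
    if len(terms) == 1:
        return "doublet of " + terms[0]
    return "doublet of " + ", ".join(terms[:-1]) + " and " + terms[-1]


if __name__ == "__main__":
    print(doublet_call("en", [("foo", None, None), ("bar", "a gloss", None)]))
    print(doublet_text(["foo", "bar", "baz"]))
```

With a single term and no gloss or alternative text, the call stays identical to today's {{doublet|LANG|TERM}}, which is the case said above to cover more than 95% of uses.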

I support this idea because I've added {{doublet}} a lot and encountered quite a few entries where this could be used. In Appendix:English doublets, there are quite a few lines in the various tables with triplets and quadruplets, and even some quintuplets and sextuplets. These have to be linked with separate templates at the moment; I usually use {{doublet}} and then {{m}}. It would be nice to be able to link them in the same template. — Eru·tuon 02:38, 12 May 2019 (UTC)
Good. Don’t know though when one shall say “triplet” or if templates should be able to do that or if one should categorize triplets, quadruplets. Seems like for Arabic for now I have only used the word “triplet” once, on عُقَار(ʿuqār). Or maybe triplets being unlikely is even more a reason to collect them. Fay Freak (talk) 12:19, 12 May 2019 (UTC)
This is implemented, and I've switched the format of all uses. Benwing2 (talk) 02:05, 13 May 2019 (UTC)
Created a list of probable doublet lists that can be merged into a single instance of {{doublet}}. — Eru·tuon 03:21, 13 May 2019 (UTC)
@Erutuon Thanks. The list looks good, I'll get to work on it. Benwing2 (talk) 13:49, 13 May 2019 (UTC)
@Erutuon Done. Benwing2 (talk) 00:24, 14 May 2019 (UTC)

will be needed to replace

Is the following construction grammatically correct (meaning "will need to be replaced")? Each such device will be needed to replace sometime in the near future. If so, is such a structure specified in the entry for need? --Backinstadiums (talk) 16:42, 12 May 2019 (UTC)

It's wrong. Equinox 16:44, 12 May 2019 (UTC)
I'd say wrong. DCDuring (talk) 17:54, 12 May 2019 (UTC)
Ugly syntax; sounds as if it were written by a non-native speaker. SemperBlotto (talk) 05:33, 13 May 2019 (UTC)

hour meaning time outside poetic contexts

For example in a sentence such as "She couldn't avoid calling him to share the news, despite the hour". Yet hour specifies that meaning as poetic --Backinstadiums (talk) 18:52, 12 May 2019 (UTC)

In your sentence, it means "despite how late it was" (i.e. she called even though it's rude to call at night, when people are sleeping). That's not the same as the poetic sense. Equinox 19:04, 12 May 2019 (UTC)
Well, I suppose that's what you're saying, but really it's "despite the clock time", the actual hour, sense 1. Equinox 19:04, 12 May 2019 (UTC)
In this, your hour of need, you should have recourse to WT:TR or WT:ID or, yea, Google "the hour of" (Books, Groups, Scholar). "I call on the colectivos; the hour of resistance has arrived, active resistance in the community,"
Hour ("time") seems to me a bit rhetorical, literary, but it can be found in newspapers and in religious works ("now and at the hour of our death. Amen"). It is very common in book titles. DCDuring (talk) 19:47, 12 May 2019 (UTC)

Thesaurus - translations to other languages or not?

There is a bit of a disagreement going on at Thesaurus:juoppo - it is a Finnish word and has Finnish synonyms, but recently User:Paradoctor added German words there as well, despite the fact that juoppo is not a German word. The policy does not seem to, based on a quick look, clearly indicate whether this is allowed or not. — surjection?〉 20:23, 12 May 2019 (UTC)

For the record, my personal opinion is that the German words should be under a Thesaurus page for a German word. — surjection?〉 20:24, 12 May 2019 (UTC)
Wiktionary:Thesaurus#Multilingualism: "English Wiktionary Thesaurus is multilingual"
In light of that, I don't see how one could reasonably exclude synonyms in other languages, when they exist in them.
Mainspace lists meanings in other languages, so why should the thesaurus not list synonyms in other languages? Paradoctor (talk) 20:28, 12 May 2019 (UTC)
English Wiktionary itself is multilingual too. That doesn't mean you see German synonyms in Finnish entries. They are in German entries. —Rua (mew) 20:31, 12 May 2019 (UTC)
We don't put German synonyms in Finnish dictionary entries, so the same treatment applies to Thesaurus pages as well. Doing otherwise would ultimately result in a huge criss-cross of redundant synonyms with every language pointing to synonyms in every other language. We limited translation to English entries for the same reason, so English is to be the "synonym hub": all terms have translations into English, and all English terms have translations into all other languages. —Rua (mew) 20:30, 12 May 2019 (UTC)
Yeah, German words go under their own headers and not under Finnish. Fay Freak (talk) 20:37, 12 May 2019 (UTC)
That being said, it might become a thing to link to Thesaurus pages of other languages with related meaning, provided that the format is neat. This is from a marketing standpoint, for leading the user further within the website, and it isn’t too farcical if we aim at polyglots. If someone is interested in Ukrainian synonyms he might be interested in Russian synonyms, if someone is interested in Catalan words he might be interested in Spanish words, if someone is interested in Ottoman words he might be interested in Persian and Arabic words. There are no human resources to maintain thesauri that way, though, so this will stay a thought for the next years. We have more problems in attracting than in keeping users. Fay Freak (talk) 02:26, 13 May 2019 (UTC)
You have to consider what it looks like when every language links to the equivalent Thesaurus page of every other language. Then someone adds a new Thesaurus page to the list of one of the pages, but not the others. It will all go horribly out of sync after a while and become an unmaintainable mess. We disallow Translation sections for all languages other than English for this reason, too. Allowing them would create that same inconsistent unmaintainable many-to-many relation. —Rua (mew) 12:03, 13 May 2019 (UTC)
So we gonna link to Thesaurus pages in other languages only on English thesaurus pages? 🤔 Fay Freak (talk) 00:41, 14 May 2019 (UTC)
Only English entries have Translations sections. Translations of synonyms are translations too. —Rua (mew) 08:53, 14 May 2019 (UTC)

Pali Verb Roots

Do Pali verb roots merit entries in different scripts? The usual lemma form of Pali verbs is the 3s of the present active. I can't find any evidence of native traditions of *Pali* verb roots, and the PTS dictionary attempted to work in terms of equivalent Vedic Sanskrit roots. If Pali roots do actually merit entries, I think they are an abstraction for the benefit of Wiktionary users, and should therefore only be in the Latin script. RichardW57 (talk) 21:18, 12 May 2019 (UTC)

Literary Chinese or Korean

Following an extremely lengthy conversation at User talk:B2V22BHARAT#囧, I feel compelled to ask: what does the community feel about the following quotations? Are they considered literary Chinese or Korean? Should they be placed under the Korean section or under the Chinese section? Also, what are our criteria for hanja? Is hanja confined only to those used in the Korean language (한국말 (han-gungmal))? Are characters used in literary Chinese texts written by Korean scholars considered hanja as well?

(1) Literary Chinese quotations written by Korean scholars added by me to Korean hanja entries:

  1. 1818, Jeong Yakyong, 정약용(丁若鏞), “율기(律己) (yulgi)”, in 목민심서(牧民心書) (mongminsimseo):
    . . . . (Gan-gichuryul. On-giansaek. Isunibang. Jeungminmuburyeorui.)
    (please add an English translation of this quote)
  2. 1567, 선조소경대왕실록 (宣祖昭敬大王實錄), 즉위년 십월 (卽位年 十月):
    , ,
    , , . (Gyeongjihageo, jeokjaewang-wangmanggeukjijung, migeupchalji.)
    The noble official descended, heads towards solemnness that is boundless in the middle, not able to investigate it.
  3. 1412, “十二年 春正月, 3月 30日”, in 태종공정대왕실록 (太宗恭定大王實錄) [Veritable Records of King Taejong Gongjeong], published 1431:
    , 一處訊問
    , . (Myeongjip Gugyeong、Gwangmi、Yeong-u、Choebaek deung, ilcheo sinmun.)
    He ordered for the arrest of Gugyeong, Gwangmi, Yeong-u, Choebaek and others, for them to be interrogated together.

(2) Literary Chinese quotations added to 박#Korean: (Note the usage of {{zh-x}} in a Korean entry):

Etymology 3

Sino-Korean word from

Proper noun

박 • (Bak) (hanja )

  1. A surname​.
    c. 1280 AD, Il-yeon, 일연(一然) (iryeon)), “기이 권제1(紀異卷第一), 신라시조 혁거세왕(新羅始祖 赫居世王) (gii gwonje1, sillasijo hyeokgeosewang(新羅始祖 赫居世王))”, in Samguk yusa, 삼국유사(三國遺事) (samgugyusa):
    사내는 박〔瓠〕과 같이 생긴 알에서 나왔고, 향인들은 박을 박(朴)이라 이르므로, 그 성을 박으로 하였다.(The boy hatched from an egg that looked like a gourd, and countrymen called gourd as Bak, so his last name became Bak.)
    Sanaeneun bak〔瓠〕gwa gachi saenggin areseo nawatgo, hyang-indeureun bageul bagira ireumeuro, geu seong-eul bageuro hayeotda.(The boy hatched from an egg that looked like a gourd, and countrymen called gourd as Bak, so his last name became Bak.)
    (please add an English translation of this quote)
  2. bark of a tree
    , [Classical Chinese, trad.]
    , [Classical Chinese, simp.]
    From: Shuowen Jiezi, circa 2nd century CE
    Pǔ. Mù pí yě. Cóng mù, bǔ shēng [Pinyin]
    (please add an English translation of this example)
  3. A historical surname used by the Yi () people.
    九月西 [Classical Chinese, trad.]
    九月西 [Classical Chinese, simp.]
    From: Chen Shou, Records of the Three Kingdoms, circa 3rd century CE
    jiǔyuè, bā qī xìng yí wáng Pú hú, cóng yì hóu Dù huò jǔ bā yí, cóng mín lái fù, yú shì fēn bā jùn, yǐ Hú wéi bā dōng tài shǒu, Huò wèi bā xī tài shǒu, [Pinyin]
    (please add an English translation of this example)
    [MSC, trad.]
    [MSC, simp.]
    From: 通志畧
    Yí dí dà xìng. Yǒu pǔ shì yǔ pǔ tóng [Pinyin]
    (please add an English translation of this example)
  4. big, suddenly, beautiful
    [MSC, trad.]
    [MSC, simp.]
    From: 博雅
    Pǔ. Dà yě cù yě lí yě [Pinyin]
    (please add an English translation of this example)
  5. combined notes of and , the sound is . The meaning is
    [MSC, trad. and simp.]
    From: 玉篇
    Pǔ mù qiè, yīn pū. Běn yě [Pinyin]
    (please add an English translation of this example)
  6. there were many Kim and Park in the country, but they did not married with other last names.
    · [MSC, trad.]
    · [MSC, simp.]
    From: 舊唐書
    guó rén duō jīn pǔ liǎng xìng yì xìng bù wéi hūn [Pinyin]
    (please add an English translation of this example)
  7. the king's last name was Kim, the noble man's last name was Park, the people had no last name, but only a name.
    ,, [MSC, trad.]
    ,, [MSC, simp.]
    From: 新唐書
    wáng xìng jīn, guì rén xìng pǔ, mín wú shì yǒu míng [Pinyin]
    (please add an English translation of this example)

Pinging Chinese editors @Atitarev, Bumm13, Dine2016, Dokurrat, Geographyinitiative, Justinrleung, Suzukaze-c, Tooironic for comment: KevinUp (talk) 11:07, 13 May 2019 (UTC)

I think they are Chinese quotes: they have no bearing on the Korean language "as she is spoke". Perhaps they could go in the Etymology section if they have a heavy Korean 'flavor.'
It reminds me that 舎利子#Japanese quotes the Heart Sutra, but I feel that is an exception. I don't think I would like to see quotations of the original Analects in a Japanese entry. —Suzukaze-c 02:43, 14 May 2019 (UTC)
@KevinUp: I have no idea about Korean, but 日本国語大辞典 has Classical Chinese citations, shown in the original Chinese (with shinjitai) and annotated with 返り点. --Dine2016 (talk) 04:53, 24 May 2019 (UTC)
There are also several Japanese works compiled by Japanese scholars in the 8th century AD using Literary Chinese such as Kojiki (古事記) and Nihon Shoki (日本書紀). I'm not sure how these works are annotated in Japan but Wikisource Chinese has the original text [1] [2] and it is written using the Literary Chinese language.
Vietnam also has Literary Chinese works (mainly poetry) by writers such as Lê Thánh Tông (1442-1497) and Nguyễn Khuyến (1835–1909) who wrote literary Chinese poems (thơ chữ Hán) as well as Vietnamese poems in Nôm script (thơ Nôm). KevinUp (talk) 12:30, 24 May 2019 (UTC)
An interesting topic to discuss would be whether Korean, Japanese and Vietnamese scholars were bilingual, i.e. were they fluent in spoken Chinese (any of its varieties) as well as their native language? I don't think this is the case because Korean, Japanese and Vietnamese scholars were borrowing the Han script mainly for writing purposes such as recording their own history. Furthermore, the characters would have been pronounced in a way that conforms to the phonology of their native language, e.g. the Japanese language didn't have consonant endings so some monosyllabic words were split into two syllables. KevinUp (talk) 12:30, 24 May 2019 (UTC)
In Lion-Eating Poet in the Stone Den (施氏食獅史), the modern Chinese linguist Yuen Ren Chao (趙元任) pointed out that Classical Chinese texts can be read and understood in written form but do not make sense when they are recited in Mandarin, because the pronunciation of written Chinese has changed a lot over the centuries. KevinUp (talk) 12:30, 24 May 2019 (UTC)
For now, the solution that I can think of is to use {{quote-book|lzh}} (lzh is the language code for Category:Literary Chinese language) to quote Literary Chinese works written by Korean, Japanese and Vietnamese scholars until the collapse of the Qing dynasty around the early 20th century. (Korean and Vietnamese scholars continued to produce works in Literary Chinese until this time, unlike Japan, which stopped writing texts in Literary Chinese centuries before.) Quotations using {{quote-book|lzh}} would then be transliterated using Sino-Xenic readings of the respective languages. KevinUp (talk) 12:30, 24 May 2019 (UTC)
@Justinrleung Would you mind checking this quote regarding the origins of the Korean surname (, gim)? It's written in Literary Chinese by Yi Ik (李瀷), a Korean scholar from the 18th century.
I would prefer not to see {{zh-x}} being used in a Korean entry because it's a bit awkward to see Simplified Chinese characters and Pinyin appearing in a Korean entry, but I would like to seek a second opinion. KevinUp (talk) 12:30, 24 May 2019 (UTC)

Cognate vs related

I strictly distinguish cognates from related terms. To me, cognate implies common descent: a word that is inherited from the same ancestral term as another. This is what the word "cognate" means when you analyse it in Latin: co-gnate, born together, from the same parent word. Two terms that merely share a root or are derived from a common term aren't cognates, they are related. Yet I see countless instances of people using "cognate" for terms that are not cognate, but related. This is a misuse of terminology in my view, similar to using {{inh}} for a Proto-Indo-European root: nothing can be inherited from a root, because it's not even a word, therefore words with the same root aren't necessarily cognates. I would like to ask if others are willing to pay attention to the cognate-related distinction when editing entries, and fix it if needed. —Rua (mew) 19:59, 13 May 2019 (UTC)

AFAIK, the correct use of "cognate" does include all terms derived from the same root. The term normally used to indicate descent from a given ancestral term is "reflex". I would not support trying to narrow the meaning of "cognate" to mean "reflex". Benwing2 (talk) 23:48, 13 May 2019 (UTC)
In my understanding, parallel formations can also be cognate; though sharing a root is not enough. If Turkish and Azerbaijani have formed the same word, a word using sensu strictissimo cognate parts, around the same time for a technological innovation, this is a cognate. So to say SOP cognate. It is the regular outcome of the tie of inheritance. It would be sesquipedalian to constantly point out that they are parallel formations.
So X is inherited into languages A and B, and Y is too, X and Y are cognates, and when both languages form XY by the same motivation, it will be a cognate. The whole is not more than the sum of its parts.
Or “cognate” has multiple meanings, why not. Long not used the word polysemy.
Even if it were not correct to say “cognate” in my case, there wouldn’t be any other template (apart from {{noncog}} which is identical). Do we need {{parallel formation}} now? Lol. Fay Freak (talk) 00:38, 14 May 2019 (UTC)
I use {{cog}} even when the forms are not cognate. We abuse the term "cognate" so much that there's no point in using the template correctly. I disagree that parallel formations are cognates. They are not "born together". It's like parallel evolution: not everything with wings is a bird. —Rua (mew) 08:51, 14 May 2019 (UTC)
@Rua Do you use {{cog}} even with words that are completely unrelated etymologically? Please don't do that; that's what {{noncog}} is for. Benwing2 (talk) 14:58, 14 May 2019 (UTC)

Questions on eliminating {{yi-inflected form of}} and {{lb-inflected form of}}

I am currently using code like the following to replace calls to {{lb-inflected form of}} for Luxembourgish, and {{yi-inflected form of}} for Yiddish.

==Luxembourgish==

==[[grouss]]:==

===Adjective===
{{head|lb|adjective form|head=groussen}}

# {{inflection of|lb|grouss||str//wk|nom//acc|m|s|;|wk|dat|m//n|s|;|str//wk|dat|p}}

===Adjective===
{{head|lb|adjective form|head=grousser}}

# {{inflection of|lb|grouss||str//wk|dat|f|s}}

===Adjective===
{{head|lb|adjective form|head=grousst}}

# {{inflection of|lb|grouss||str//wk|nom//acc|n|s}}

===Adjective===
{{head|lb|adjective form|head=groussem}}

# {{inflection of|lb|grouss||str|dat|m//n|s}}


==Yiddish==

==[[שאָטנדיק]]:==

===Adjective===
{{head|yi|adjective form|head=שאָטנדיקן‎}}

# {{inflection of|yi|שאָטנדיק||acc//dat|m|s|;|def//postpositive|dat|n|s}}

===Adjective===
{{head|yi|adjective form|head=שאָטנדיקער‎}}

# {{inflection of|yi|שאָטנדיק||nom|m|s|;|dat|f|s}}

===Adjective===
{{head|yi|adjective form|head=שאָטנדיקע‎}}

# {{inflection of|yi|שאָטנדיק||def|nom//acc|n|s|;|nom//acc|f|s|;|all-case|p}}

===Adjective===
{{head|yi|adjective form|head=שאָטנדיקס‎}}

# {{inflection of|yi|שאָטנדיק||postpositive|nom//acc|n|s}}

Questions:

  1. The inflection tables for Yiddish adjectives say "postpositive or nominalized". Is it enough to say just "postpositive" in the call to {{inflection of}}, or should I say "postpositive/nominalized" or "postpositive and nominalized"?
  2. The inflection tables for Luxembourgish adjectives say "without article" and "with article", which I have rendered as "strong" and "weak" respectively, as in German. Is this correct?

Benwing2 (talk) 00:37, 14 May 2019 (UTC)

(Notifying Metaknowledge, Wikitiki89): @Qehath @BigDom Benwing2 (talk) 00:39, 14 May 2019 (UTC)
I'm still not all that happy about replacing something that points the reader toward the more informative main entry anyway with something that I won't just be able to type out easily if I want to create an entry. (There's also the lingering problem that it only applies to Standard Yiddish, rather than the dialects.) But I don't see anything incorrect with how you lined it out above. —Μετάknowledgediscuss/deeds 17:37, 14 May 2019 (UTC)
@Metaknowledge I am planning on adding accelerators to make it easy to generate pages like these. Benwing2 (talk) 00:52, 15 May 2019 (UTC)
Alright, guess I can live with it. It would also be nice if you could bot-create them for all the existing adjectives with tables, which would obviate much of that work. —Μετάknowledgediscuss/deeds 00:57, 15 May 2019 (UTC)
@Metaknowledge OK, I'll see about doing that. BTW is פֿארגאַנגענער misspelled? The base form פֿאַרגאַנגען has an extra patakh under the first aleph. Benwing2 (talk) 01:28, 15 May 2019 (UTC)
Moved, good catch. —Μετάknowledgediscuss/deeds 02:34, 15 May 2019 (UTC)
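The bot-creation idea discussed above could be sketched roughly as follows. This is only an illustration in Python, not Benwing2's actual bot or accelerator code: the function name `inflected_form_entry` and its parameters are hypothetical, and the tag lists are simply copied from the Luxembourgish examples earlier in this thread.

```python
def inflected_form_entry(lang, lemma, form, tag_sets, pos="Adjective"):
    """Build the wikitext for one inflected-form section.

    tag_sets is a list of tag lists; each inner list is one set of
    inflection tags, and the sets are separated by ";" in the template call.
    (Hypothetical helper, for illustration only.)
    """
    # Join the tags of each set with "|", and the sets themselves with "|;|".
    tags = "|;|".join("|".join(tag_set) for tag_set in tag_sets)
    return (
        f"==={pos}===\n"
        f"{{{{head|{lang}|{pos.lower()} form|head={form}}}}}\n"
        "\n"
        f"# {{{{inflection of|{lang}|{lemma}||{tags}}}}}\n"
    )


if __name__ == "__main__":
    # Reproduces the "groussen" section shown above.
    print(inflected_form_entry(
        "lb", "grouss", "groussen",
        [
            ["str//wk", "nom//acc", "m", "s"],
            ["wk", "dat", "m//n", "s"],
            ["str//wk", "dat", "p"],
        ],
    ))
```

A real bot would presumably read the forms and tags from the existing inflection tables rather than take them as hand-written lists.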

The following user names are in the block list. Is it applicable to Wiktionary?

This is a list of page titles which are blocked from creation/editing on Wikimedia wikis. Is it applicable to Wiktionary?

   https://meta.wikimedia.org/wiki/Title_black

Please see the following link:

https://meta.wikimedia.org/w/index.php?title=Title_blacklist&diff=18052597&oldid=18052591

My question is this: an admin blocked the user name "Cruzir", so does that block apply to my user name on Wiktionary?

If my user name is allowed, I will continue; otherwise I will change my user name.

As far as I can see the user name “Cruzir” has not been blocked and is available, unlike the name “Cruizir”, which is already registered but has been blocked because of persistent abuse.  --Lambiam 12:53, 14 May 2019 (UTC)
You are correct, User:Lambiam. Are the user names listed in the above "Title_black" prohibited on Wiktionary?
Yes, this is the list of blocked users for Wikimedia Foundation wikis, which include Wikipedia and Wiktionary.

In the following link, an admin blocked Cruizir and Bonadea. These two users are on the block list and are prohibited on all wikis.

https://meta.wikimedia.org/w/index.php?title=Title_blacklist&diff=18052597&oldid=18052591

Elexis?

I just got an email from the EU project ELEXIS announcing all sorts of free access to their lexicography tools for those from EU institutions. That would seem not to include en.wikt but may be interesting for those who also contribute to fr.wikt, de.wikt, it.wikt, etc. DCDuring (talk) 15:49, 14 May 2019 (UTC)

What kind of tools do they have that we don't? --I learned some phrases (talk) 07:31, 23 May 2019 (UTC)

The apostrophe in Zealandic

Zealandic (Zeêuws) is part of a wider Low Franconian dialect group which has historically lost the consonant /h/, the beginnings of which can already be found in Old Dutch. Modern Zealandic has no standardised spelling, but guides often suggest the use of an apostrophe in places where Zealandic has no h, but where it is found in standard Dutch. This is done purely for the sake of recognition, so that people who know standard Dutch can more easily guess the corresponding Dutch word. For Zealandic itself, the apostrophe has no significance: the word is vowel-initial like any other. The suggestion only applies to words that have recognisable counterparts in Dutch; otherwise there's just an initial vowel and no apostrophe.

I'm unsure if we should lemmatise words with this apostrophe or without it. From a phonological point of view, the apostrophe makes no sense, it's only written because of comparison with Dutch and only where such a comparison can be made, which is subjective and inconsistent. Even the Wikipedia article w:zea:Zeêuws seems to be inconsistent with it, with a mishmash of spellings. I see both Ollands without it and Neger'ollands with it, both and and 'and, both oôg and 'oôg. My preference would be for the apostrophe-less forms to be standard, with the apostrophe forms to be treated as alternative forms. —Rua (mew) 12:22, 15 May 2019 (UTC)

My preference would be to lemmatise with h, as one does with any Romance language that has lost /h/. Now that is recognizable. Fay Freak (talk) 13:27, 15 May 2019 (UTC)
But nobody writes that way. —Rua (mew) 13:59, 15 May 2019 (UTC)
What's more common? Is there any community besides you working on it? If you can find both apostrophe and non-apostrophe forms for one word, it seems reasonable to lemmatize the non-apostrophe form.--Prosfilaes (talk) 14:54, 15 May 2019 (UTC)
I haven't been able to find out which is more common, but the fact that both variants of several words are used in a single Wikipedia article suggests that it's likely about even. —Rua (mew) 16:22, 15 May 2019 (UTC)
Technical approach: lemmatize at both, backed by a single entry page. Rough proof-of-concept at [[User:Eirikr/'and]], [[User:Eirikr/and]] for the "front-end" pages, and [[User:Eirikr/'and-and]] for the "back-end" where the data would live. On the "front-end" pages, clicking any of the "edit" links next to the headers takes the user to the "back-end" page, similar to how it works for our sectionalized forum pages like the Tea Room. ‑‑ Eiríkr Útlendi │Tala við mig 19:02, 15 May 2019 (UTC)
That's way too complicated. —Rua (mew) 08:47, 16 May 2019 (UTC)
These sound very much like apologetic apostrophes in Scots. Based on what has been said above, I would suggest lemmatizing the form without apostrophes. In theory, one benefit to lemmatizing the forms with apostrophes would be that anyone who wants to know how the word would be spelled without them would find removing them straightforward (right?), whereas knowing where to add them if an entry doesn't use them is not straightforward—but if the apostrophized forms are listed as alt forms in the un-apostrophized entries, then that issue is eliminated. - -sche (discuss) 06:33, 16 May 2019 (UTC)

National politics categories

It seems odd to claim that Brexiteer is limited to British English or that Jamaika-Koalition is limited to the German of Germany; the terms are used by any speaker of that language, but only pertain to the particulars of one nation's politics. We would certainly have enough entries for Category:en:UK politics and Category:de:German politics, as well as many others. What do you all think about this scheme? —Μετάknowledgediscuss/deeds 20:51, 16 May 2019 (UTC)

We've never correctly addressed the distinction between topic and register/context/language-variety. I hope you can come up with something, which might require a parallel category structure and careful cleanup of membership in the existing structure. DCDuring (talk) 23:37, 16 May 2019 (UTC)
I have a similar issue with the label (African American Vernacular), which is applied to terms that are found predominantly in speech by African Americans, but that are not specific to the AAVE sociolect – they may as well be used informally by black people with a high socioeconomic status, such as professors. I feel the distinction between context (in this case, a restricted class of speakers) and variety (nevertheless basically standard English) should also be carefully maintained.  --Lambiam 00:18, 17 May 2019 (UTC)
Categorizing these seems fine. Not sure on the name; Category:en:Stars is for names of stars whereas Category:en:UK politics wouldn't be a list of names of politics (would it?), but we also have Category:en:Politics, so maybe it's fine. It won't affect people trying to use {{lb|en|UK}} to indicate that a word refers to a British topic, though. (We even had "Category:German English" because people tagged {{l|en|GDR}} with {{lb|en|Germany}}...) I don't know how to stop that besides periodic checks of all dialect categories. (Some users have proposed adding a second, topical type of labels, to list "topics": e.g. "eye" might be tagged "anatomy", and "pronoun" tagged "grammar".* I don't know if that's a good idea.)
Regarding "AA-but-not-AAVE", probably it should have its own label. It still at least seems to be in the same category of phenomenon as the label "AAVE" (and "Appalachian", "Cockney", and "UK", etc), i.e. it's indicating who uses the term, which seems different from labelling "Brexit" as "British" just because it pertains to Britain. (Comparable to having a label for "AA speech of any level of formality", we have labels like "LGBT" that encompass not only a diversity of levels of formality but also aren't even limited to particular nations...)
- -sche (discuss) 06:56, 17 May 2019 (UTC)
Would (African-American community) be a good label?  --Lambiam 16:34, 17 May 2019 (UTC)
*I see it already is, but that seems weird, since it's not limited to discussions about grammar unless one argues the very fact of using it makes the discussion become about grammar, but in that case is saying "there's an airplane!" making a discussion (and the word) about aviation? - -sche (discuss) 07:02, 17 May 2019 (UTC)
If someone applies the label entomology or lepidopterology to a definition of brown that refers to a butterfly, is that a topical or a context label? I'd argue that the term can be used 'correctly' by almost any person referring to a lay observation of nature, so it shouldn't be considered a context label. I don't recall that there was any consensus, let alone a vote, not to use topical labels. I do not advocate topical labeling and would prefer that they be banned, but there were reasonable arguments in favor. DCDuring (talk) 12:13, 17 May 2019 (UTC)

I've had this same question with foods. See pozole; it's not, as the entry currently suggests, a "Mexican Spanish" word. It's just a dish eaten in Mexico. The English definition labeled as "American English" makes even less sense. It's just a geographic coincidence, it's not as if the Australians have a different name for the same soup. In these cases I usually remove the label and write the country where it's eaten in the definition. Ultimateria (talk) 15:32, 17 May 2019 (UTC)

FWIW, that (removing the erroneous label and writing the country into the definition) is the correct approach. Is the use of the word to mean a drink limited to Honduras, or is the drink merely Honduran? - -sche (discuss) 15:46, 17 May 2019 (UTC)
I think the name of the drink is actually pozol. It is not confined to Honduras (see Pozol on Wikipedia), but Hondurans consider it an authentic Honduran tradition ([3],[4]).  --Lambiam 17:54, 17 May 2019 (UTC)
  • Seeing no opposition and a great deal of unrelated discussion, I have gone ahead and added some national politics categories to get us started. —Μετάknowledgediscuss/deeds 06:18, 19 May 2019 (UTC)

Sources

Would it not make sense to cite specific sources for definitions, as well as cite quotations of sources? -ApexUnderground (talk) 06:05, 17 May 2019 (UTC)

That depends on the language. But generally for English we are not attempting to be a tertiary source. The source is the editor who created the definition. DTLHS (talk) 06:06, 17 May 2019 (UTC)
The sources are the citations. Definitions are based here on usage, not on other authorities. Ƿidsiþ 12:36, 23 May 2019 (UTC)

Talk pages consultation: Phase 2

  • We have generally discouraged reliance on talk pages to initiate discussions. Do we want to change that practice, make it official, continue the practice unofficially, or ??? Is there some boilerplate that we could have, perhaps as a filter warning, to inform someone starting a new talk page to go elsewhere? If there is any interest in this we should participate in the process, I suppose. DCDuring (talk) 20:35, 17 May 2019 (UTC)
  • When a user goes to a not-yet-existing talk page of an article, they are shown an info box:
    Talk pages of individual entries are not usually monitored by editors,
    and messages posted here may not be noticed or responded to.
    You may want to post your message to the Tea Room or Information desk instead.
    We could add that to every article talk page.  --Lambiam 13:37, 18 May 2019 (UTC)
    How do I forget these things?
I suppose that a user could post to an existing talk page, expecting it to be monitored. The notice for new talk pages would be useful for such users. But it is easy to miss such messages. I wonder whether such a message can and should appear at the edit window whenever one is opened on a talk page. DCDuring (talk) 15:52, 18 May 2019 (UTC)
The message MediaWiki:Editnotice-1 appears at the edit window of new and existing article talk pages. --Vriullop (talk) 09:48, 19 May 2019 (UTC)
Thanks. That I don't remember having read it shows how easy it is to filter such notices out. I must have noticed it at one time, either when I was new or when it was new. DCDuring (talk) 16:32, 19 May 2019 (UTC)
Uh oh, I hope this won't be the revenge of the spurned Liquid Threads upon us all. Equinox 16:10, 18 May 2019 (UTC)
How many are still using Liquid Threads? DCDuring (talk) 17:32, 18 May 2019 (UTC)
On the occasions where I used them (because I had no choice) I became even more confused than I already usually am.  --Lambiam 00:14, 19 May 2019 (UTC)
I would happily vote to ban LT from Wiktionary. —Μετάknowledgediscuss/deeds 00:17, 19 May 2019 (UTC)
I was using LT for a while. TBH, the only reason (a good one, IMO) was to annoy my fellow Wiktionarians. --I learned some phrases (talk) 17:25, 19 May 2019 (UTC)
Same here, I'd vote for a ban too. Canonicalization (talk) 16:34, 20 May 2019 (UTC)

Clown world

My friend's six-year-old was given this test: [5]. I don't think I knew "digraph" was a word until I was an undergraduate. Equinox 01:44, 19 May 2019 (UTC)

There's no reason six-year-olds can't learn it, though. It's no harder than a lot of words that six-year-olds learn. —Mahāgaja · talk 15:47, 19 May 2019 (UTC)
But I don't see that digraph has any purpose in the test question, except to make the answer slightly less obvious. I suppose the six-year-olds need to know not to waste time on a word that doesn't contribute useful information. DCDuring (talk) 16:28, 19 May 2019 (UTC)
I still very vaguely remember being given "church" and "photograph" (which both conveniently begin and end with the digraph) as examples of ch and ph. I think there's a lot to be said for teaching fragments of Latin and Greek, as it really helps with spelling (you don't expect an f in an Ancient Greek word) and with decoding medical terminology and generally making sense of "hard words". Anyhow... knowing the jargon isn't the important bit. Hey! if you know the Greek bits then you can work out what a digraph is anyway. Equinox 21:01, 22 May 2019 (UTC)

Bad definitions in functional words

Functional words are usually defined by a synonym, yet it remains unspecified which of the synonym's meanings (concessive, adversative, etc.) is intended. For example, the third definition of as long as is just "while, since", but a starting learner wouldn't be able to infer the common causal meaning both terms share. --Backinstadiums (talk) 15:57, 20 May 2019 (UTC)

We could improve our definitions, no question. But Wiktionary is not aimed at a starting learner of English. A starting learner will not benefit from an English-language resource anyway. —Μετάknowledgediscuss/deeds 17:03, 20 May 2019 (UTC)
Still, if a definition has multiple senses, only one of which applies, we should aim at disambiguating it – for example by giving a sequence of definitions whose intersection is no longer ambiguous. While I’m writing this, sense 3 has been simplified to “Since”, which in turn has two current senses as a conjunction: “From the time that” and “Because”. Neither is a particularly good substitute in the example sentence As long as you're here, you may as well help me with the garden. The first is a clear misfit, but Because you're here, you may as well help me with the garden also doesn’t sound right. I’d like to define it as “Since, seeing as”, but we don’t have the latter.  --Lambiam 15:13, 21 May 2019 (UTC)
I think I understand your concern, but OneLook dictionaries don't have seeing as or seeing as how as entries. They are highly informal. MWOnline has 1 "provided that" and 2 "inasmuch as, since". Usage examples help. DCDuring (talk) 18:56, 21 May 2019 (UTC)
I believe Cambridge and Merriam–Webster are OneLook dictionaries.  --Lambiam 21:57, 21 May 2019 (UTC)
Apparently the OED also has it, as a colloquial variant of seeing that.  --Lambiam 22:02, 21 May 2019 (UTC)
Thanks. I must have mistyped into their search box. I think of seeing as as informal, but MWOnline doesn't think so, though Cambridge thinks of it as nonstandard. DCDuring (talk) 21:20, 22 May 2019 (UTC)
Cambridge applies the nonstandard label only to the variant seeing as how. Just seeing as is labelled “informal”. Seeing as how sense 3 of as long as is also rather informal, using an informal synonym in its definition is reasonable. Cambridge and OED appear to agree that seeing that is more formal. BTW, the meaning of these collocations is not covered in the entries see or seeing, nor anywhere else that I can see.  --Lambiam 23:56, 22 May 2019 (UTC)
Also, you could try creating categories or, better, appendices that group the conjunctions into categories with implications for the definitions and use them to achieve some kind of consistency and correctness in the definitions. The modern grammars might help in that noble endeavor. DCDuring (talk) 19:00, 21 May 2019 (UTC)

"RFD failed"

On the RFD pages, people often close discussions with the note "RFD failed", which I understand from context to mean that the entry should be deleted. However, this seems the wrong way round. If the RFD, i.e. request for deletion, has failed, does not that mean that the entry should be kept? Or is this just me? Mihia (talk) 20:56, 22 May 2019 (UTC)

GASP! Way to rock the boat. I suppose you're right but it's nice to have parity in how we talk about RFVs and RFDs. Perhaps we should say "entry passed" or "entry failed" in both cases? Maybe we need more goddamn templates. Equinox 20:59, 22 May 2019 (UTC)
I just say "deleted" and "kept". —Rua (mew) 21:39, 22 May 2019 (UTC)
That is much better. On Wikipedia the closer usually writes: “The result was keep/delete.”  --Lambiam 00:05, 23 May 2019 (UTC)
I have always been confused about what "failed" meant on RFD pages. "Deleted" and "kept" are much easier to understand. — Eru·tuon 00:40, 23 May 2019 (UTC)
I have a vague memory that when I used to close RFVs and RFDs (which I haven't done for years) I used to put things like "deleted" and was scolded because this didn't indicate whether it was unilateral or processual. Equinox 00:43, 23 May 2019 (UTC)
The “The result was keep/delete” wording on Wikipedia is clearer in that respect. — Eru·tuon 02:00, 23 May 2019 (UTC)
The idea was traditionally that the entry itself failed or passed, rather than the request. I don't care what people say as long as they help close RFDs and RFVs, which very few people seem to do nowadays. —Μετάknowledgediscuss/deeds 15:40, 25 May 2019 (UTC)
Can anyone do that, or is it only admins? Mihia (talk) 00:22, 28 May 2019 (UTC)
Generally only admins, because failing things may require deleting pages. —Μετάknowledgediscuss/deeds 03:18, 28 May 2019 (UTC)

Nonconcatenative morphemes

I think they deserve entries. They're morphemes. They're tricky to deal with though; we can't just give "palatalize final consonant" a main namespace entry, but they shouldn't be hidden in appendices. A past discussion on Semitic transfixes was pretty inconclusive. Is anyone else on board with incorporating nonconcatenative morphemes into Wiktionary? If so, how do we do it? Julia 01:50, 23 May 2019 (UTC)

I'm disinclined to include them. They're morphemes, but they aren't things that readers are going to want to look up in a dictionary. The same goes for the zero morpheme that "marks" the plural of words like sheep or the past tense of words like put. Readers aren't going to want to look up a suffix -∅ for those forms, or a suffix for certain Irish genitive singulars and nominative plurals (e.g. báid), or a prefix ʀᴇᴅ- for Malay plurals, Ancient Greek perfects, or anything else languages use reduplication for. —Mahāgaja · talk 10:25, 23 May 2019 (UTC)
I'm in favour of including nonconcatenative morphemes, but I'm not sure how to include them either. We should keep in mind that the basic goal of Wiktionary, as stated by WT:CFI, is to include things that someone comes across and wants to know what it means. Someone might come across a suffix as part of a word, then realise that it's a suffix, and look that up. But with nonconcatenative morphology it's much less clear what the user would be looking for and what the most intuitive place is that they'd find it. If the user doesn't know where to find it, they might be inclined to look for a word that has it instead and see what the etymology contains. Thus, I think that should be our focus: presenting nonconcatenative morphology in the etymology section in a way that lets the user find out more on that particular mode of derivation. At that point, it doesn't really matter where the information is located, so an Appendix seems workable. —Rua (mew) 13:50, 23 May 2019 (UTC)
Yeah, an appendix would definitely be better than nothing. With good etymology and linking templates I think it would be easy to implement. Julia 21:03, 25 May 2019 (UTC)

Google have broken/retired their Usenet search

Searches in Google Groups no longer find anything from Usenet by default. You have to search for a specific desired Usenet group first, by group name, and then search within that group. Naturally this is utterly crippling for Usenet searching and makes it quite impractical, since you won't know which groups to look in, and doing it one at a time is impossible anyway. Equinox 15:33, 25 May 2019 (UTC)

You can still search Usenet using regular Google. Just search google:site:groups.google.com "cromulent" to find cites for cromulent. That said, it is certainly inconvenient. —Μετάknowledgediscuss/deeds 15:38, 25 May 2019 (UTC)
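For anyone who wants to script the workaround above, here is a minimal Python sketch that builds the search URL. The function name `usenet_search_url` is hypothetical, and the `https://www.google.com/search?q=` endpoint is an assumption about what the google: interwiki prefix resolves to; the query string itself is the one given in the comment above.

```python
from urllib.parse import urlencode


def usenet_search_url(term):
    """Return a regular Google search URL restricted to groups.google.com.

    Hypothetical helper: the base URL is assumed, not taken from the
    discussion; only the site: query comes from the comment above.
    """
    # Quote the term so that Google searches for the exact word or phrase.
    query = f'site:groups.google.com "{term}"'
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(usenet_search_url("cromulent"))
```

This only constructs the link; whether the results cover all of Usenet still depends on what Google has indexed.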

Cites in different languages

At the English entry for black pill, we have a citation in Swedish! And people defended this at its RfV. Cites of English words in other languages can only be mentions. Do we really need to have a vote to disallow this, or is it obvious enough that I can just delete the cite? Julia 20:33, 25 May 2019 (UTC)

It depends. A quote in another language is inherently a mention, so for English and other WDLs it's invalid as a cite for CFI. For LDLs, it's not that common to have decent lexical resources in the languages themselves, so a reliable source discussing the term would be welcome. In this case it's just someone discussing it on some website, and the term is English, so it's utterly useless. Chuck Entz (talk) 20:57, 25 May 2019 (UTC)
Yes, I forgot about LDLs. Per CFI: “For terms in extinct languages, one use in a contemporaneous source is the minimum, or one mention is adequate...For all other spoken languages that are living, only one use or mention is adequate.” It doesn't explicitly say that you can use a mention from different-language sources, which we do allow (cf. tons of Classical Nahuatl entries). Do you think that should be included, or can it be inferred well enough? Julia 21:33, 25 May 2019 (UTC)
Discussing a term, I must warn, is not a “mention” generally. It often is a report of use. The use-mention distinction was incorporated into the CFI to shed ghost words, these “dictionary mentions” where a dictionary editor could have just copied it over mindlessly, and protologisms which barely live outside of lists of words yet are only intended to be used. An instance of witnessing a word use does not fall out of our use-mention axis, here the rule must be teleologically reduced. For example the quote: “Eine kleine Minderheit schnalzt beim Begriff Kaviar mit der Zunge und denkt an eine viel billigere, viel leichter zu beschaffende Delikatesse – an Kot.” at Kaviar – this is typical journalistic style of reporting a use and meseems this is “use enough”, enough use shines through. This is not where the danger of ghost words or protologisms is. This is how slang terms often shine through which a great part of society avoids to use. In fact, avoiding to use a term is a use. Fay Freak (talk) 22:35, 25 May 2019 (UTC)
A use in another language can be a use like it can be a mention (ha! I have called it use hence). Nowadays it is easier than ever to just pick up foreign discourses. Also, I suggested to implement code-switching templates. Of course sb who consults an English dictionary does not want Swedish quotes as if they were English. That's not my ideal. But separated this is interesting extra information. The real problem here is not votes but the insecure criteria according to which a word passes through languages, and lacking technical refinement to represent uncertainty. It’s rather how should it be displayed, not whether it should be displayed, for by itself these quotes are not bad, so it depends on whether the technicians have written some good code so one can place it inoffensively – I mean quotes always come at a premium, like any corpus is limited and particularly any editor’s access to sources, so editors take what they have. Remember the strifes about APP? Much hate broke out for this being alleged to be Chinese. I suggested to keep the Chinese quotes at the English and this was a good compromise. Not to solve whether it is Chinese. Similarly, I wouldn’t ever create that word as Swedish. There must come much in addition to it being used in Swedish for it to be Swedish. These material criteria are currently hard grasped by editors, let alone formulated. Somebody has put much effort into quoting English arrastão and now nothing can be seen of that because the deficiency of our moulds let it slip through, sad, I think we can have English quotes without English section. Fay Freak (talk) 22:35, 25 May 2019 (UTC)
I did a quick Google Translate of the Swedish quote; it seems like the first instance is a mention, and then two uses in Swedish. The quote doesn't say anything about it being an English word. I don't see how this proves English usage. Julia 22:46, 25 May 2019 (UTC)
It does, because what else? How likely is it thought out for that article, or only picked up from Sweden? Nobody will try to prove baby-foot and Handy so, this is a strawman, the counter-examples are sought. Unless we have reason to believe that the term is invented only in this language, the presumption is that it isn’t, unless perhaps we have a situation where a foreign language is constantly used to invent new words (like Latin in Europe today, or Arabic in Ottoman Turkish, but even then it depends on the concrete use; Swedish does not use English that way, the “fake anglicisms” must be a real outside case). But here the whole points to a foreign context, so all indications are against it. All these talks about incels, redpills, blackpills, is the English internet and not the Swedish. Not yet heard “redpill” or “incel” for example in German, though I do encounter “rotgepillt”. Just reckon the likelihoods. That’s what one constantly has to do in investigating language. Fay Freak (talk) 23:02, 25 May 2019 (UTC)
A quote in Swedish does not count as a use in English. Handy is a good comparison. —Granger (talk · contribs) 00:18, 26 May 2019 (UTC)
It’s not about “counting”. It’s about what is a likely witness. “Three quotes” is just a procedural rule of thumb. Even when you have a quote in the target language it might still not be good, it might be a varyingly probable mistake or manuscript corruption or one is not sure “if this is still English” (Pidgin etc.). One still has to discern what is what. So one does not know if one has three or one has two or four, because they have varying quality. Usually it provides proof, it is in a text in a different language with the shape of the target language. No again, Handy is a strawman. One has to learn to make use of the verisimilars. I am averse to this black-white think.
WT:CFI literally speaks of a guideline. Guideline wherefor? For our cognizance of the terms. Like a judge does not just count the number of witnesses but assesses their quality, so the CFI only wants that one verifies terms by convincing procedure. The records that suffice might be of varying permanence. We exclude Facebook, Instagram, and the like, because people there write unreliably and one needs a filter against protologisms, raids, and all the garbage of the unredacted wilderness, apart from their trait that they disappear often and hence induce link rot. The CFI did not separate strictly what English is, and it didn’t tell that we can’t prove to the community a word by a bunch of web quotes, in fact such an assertion is very contrary to the CFI. The idea is: Include words that exist. Then only just enough measures have to be taken to make it evident that a word has been around. A web quote, unlike a book not digitized, is particularly suitable to verify, because many people can see at the appropriate time that it is there and not made up.
WT:CFI speaks of a “more formal guideline” implying there are still the other guides aside from the form. Even viewing WT:CFI#Attestation strictly the CFI does not say that a term is only to be included if it is to be attested in that very sense. Under the general rule it still might pass because of being verified otherwise.
Why this all for “well-documented languages”, one might think? It is because the more well-documented a language is – the more space it takes in the world –, the more peripheral areas there are that call for inclusion. The “good documentation” of a language grows linearly with bad documentation of the same language, that is. The more documented it is, the more used it is, hence the more badly documented but equally real fringes it has, it has them like any “limited documentation language”. We have a lot missing from Africa or even Multicultural London English.
This interpretation of the CFI being, at least expressed, novel for most, I apologize for the pretension, but you might see that there is a greater satisfaction to achieve than suffering the deletion of words that you see being in the world.
I think we can also more often mark senses as dubious. If you object that a sense is not found except on the internet, but there it is found, the right thing to do might be to give it a cautionary background colour or separating, what CSS offers, to mark words that stand on the threshold, not being in and not being out.
So, we can also “include” words because they are likely but our current readings are uncertain. There was this case tracer. The meaning “The act or state of tracking or investigating something“ is still dubious. We have quotes, but hm. Irrespective of not counting enough evidence, I think this is a sense that should be included because of its possibility, because “it's likely that someone would run across it and want to know what it means” (as the CFI page says), and if somebody runs across the gloss, he can possibly add a quote that we could not find.
What do we do with reconstructed senses anyway? I mean if the word is attested but a sense is reconstructed. For example Aramaic זַיָּירָא(zayyārā) is attested as a pressing-tub for olives (or wine?) but Arabic زِيَار(ziyār, barnacle, twitch, pincers to fix a horse) is borrowed from it. So one can reconstruct a “possible meaning” “barnacle, twitch, pincers to fix a horse” for the Aramaic. (Good example?) I cannot exclude of course that this needs to be done for English, though of course rarely as compared to what one else does.
I have discovered whole semantic groups that are unfit for being “attested by three quotes”.
1. For instance, I found it a bit ridiculous how one wanted to “quote” that pomelo actually meant “grapefruit”: quotes from old times won’t make this possible because grapefruits and pomelos are used in the same contexts; only non-philological evidence can do it. Apparently editors settled the matter on such grounds. So, in general: lifeforms that are hard to distinguish.
2. Then, there is the field of diseases. For instance, the entry bejel, for a bacterial disease, lists a lot of synonyms and translations. Of course, from 19th-century sources alone we couldn’t find out that these names all have the same meaning. This is a study of its own. Hence one needs medical treatises which look at the symptoms of historical diseases and decide whether something is the same, and I have cited such a treatise as a reference.
3. It would be weird to quote for units of measure. Even for English-language historical measures, one might need to measure their capacity physically. You won’t find a quote from the 17th century telling you how much that weight is in kilograms. If you are lucky there are modern studies for that.
These are examples of how the whole “three quotes” thing needs to be seen flexibly according to the topic. Maybe I have talked specifically about a fourth group, “the internet’s language”. Different fields need different paths to be caught up with. Fay Freak (talk) 03:18, 26 May 2019 (UTC)
(edit conflict) A mention is circumstantial evidence, not direct evidence. It has zero standing under CFI. A discussion in English might fill out some of the details of the definition and context, and thus be marginally useful, although useless for verification. An untranslated chunk of Swedish in an entry for an English term might as well be lorem ipsum: something used to keep it from looking empty until actual content (a translation, at the very least) is added. Chuck Entz (talk) 03:37, 26 May 2019 (UTC)
So? Circumstantial evidence is also evidence. But this distinction is artificial, and does not address my point: avoiding the use of a term is itself a use. And it ignores the code-switching problem. The assertion “that is a mention” is dogmatic, not to say a swearword: “Anything I don’t count is a mention”.
An interpretation of this distinction after the CFI based on its telos of excluding ghost words and protologisms leads me to conclude that what you call mention is actually use.
To speak more clearly, I think “use” and “mention”, as used in WT:CFI, do not have the meaning they usually have outside of Wiktionary, hence the confusion. They are private language. So well, what you call a mention is a mention, but not according to the CFI definition of a mention, because its meaning is restricted according to the context and telos of the CFI. Because there are good mentions. Mentions that alone let you know the meaning. In fact we know many meanings that we consider language knowledge only through a chain of mentions. Mentions cannot be ousted.
And yet, how are details useless but useful? Either you recognize that one uses information apart from that conferred by uses, or not. Fay Freak (talk) 04:00, 26 May 2019 (UTC)
I don't have time to read a long comment this evening, but the basic point here is that a quotation in Swedish may count as a use in Swedish, but not as a use in English. —Granger (talk · contribs) 11:18, 26 May 2019 (UTC)
I agree with Chuck's first comment. A Swedish or Chinese (etc) text with a seemingly-English word embedded in it obviously does not attest the English term and is not suitable for illustrating its use. The fact that pseudo-loans (and pseudo-anglicisms, etc) exist demonstrates this quite clearly: a language can make up something that looks like a word in another language, but isn't. If a term exists in English, find people using it in English. - -sche (discuss) 08:45, 28 May 2019 (UTC)
  • One of my favorite such examples of "English" in a non-English context is No Smorking.
Context is key. ‑‑ Eiríkr Útlendi │Tala við mig 17:00, 28 May 2019 (UTC)

Proposal: Switch to tonal orthography for Slovene[edit]

@Benwing2 Slovene has two diacritic orthographies, the stress-based one and the tonal one. The tonal one can be converted automatically to the stress-based one, but the reverse is not true. The tonal orthography thus gives more information. At the moment, following WT:ASL and Appendix:Slovene pronunciation (which I originally wrote), the stress orthography is used in Wiktionary for all headwords, inflection tables and links. At the time, my argument was that the stress orthography is more common in dictionaries, and since not all Slovene dialects distinguish tones, the stress orthography applies to more of them. On the other hand, the standard practice for Dutch on Wiktionary is to include three genders, even though a majority of speakers only distinguishes two. The argument for Dutch is, again, that 3 genders gives more information, and the conversion to the two-gender system is automatic.

I'd like to propose that we change this practice for Slovene, by using the tonal orthography exclusively. As mentioned, it includes more information that would otherwise not be visible, especially tonal alternations in inflection paradigms. It is also crucial for etymologies, to the point that a separate {{desc/sl-tonal}} template was created, just to be able to show the tonal orthography in descendants (and not get people confused about the notation). Converting existing entries will take a while, and since the two notations overlap it's not always clear which of the two is currently present in the entry. —Rua (mew) 13:17, 28 May 2019 (UTC)

@Rua 100% in agreement with this change. Benwing2 (talk) 14:31, 28 May 2019 (UTC)
@Benwing2 Any idea how we can do it? The acute ´ and grave ` accents appear in both schemes, which makes it difficult to determine whether a particular word has been converted already. Perhaps a bot could list all forms that are currently ambiguous, and then we can work through that list. —Rua (mew) 17:28, 28 May 2019 (UTC)
I've created a bunch of tracking categories:
We should presumably start with the last category, which can be converted unambiguously to the tonal scheme. —Rua (mew) 18:26, 28 May 2019 (UTC)
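The ambiguity problem mentioned above (acute and grave occur in both schemes) can be sketched as a small classifier. This is only an illustration, not the logic behind the tracking categories: it assumes that the inverted breve, the double grave and the dot below occur only in the tonal scheme, and that the circumflex occurs only in the stress scheme. The function name classify and the four labels are made up for the example.

```python
import unicodedata

# Assumption: these combining marks only occur in the tonal scheme.
TONAL_ONLY = {"\u0311", "\u030F", "\u0323"}  # inverted breve, double grave, dot below
# Assumption: the circumflex only occurs in the stress scheme.
STRESS_ONLY = {"\u0302"}
# Stated in the discussion: acute and grave appear in both schemes.
SHARED = {"\u0301", "\u0300"}


def classify(word):
    """Return 'tonal', 'stress', 'mixed', 'ambiguous' or 'unmarked' for a word."""
    decomposed = unicodedata.normalize("NFD", word)
    # Collect every combining mark; the caron of č, š, ž is collected too
    # but never matches any of the sets above, so it is effectively ignored.
    marks = {ch for ch in decomposed if unicodedata.combining(ch)}
    has_tonal = bool(marks & TONAL_ONLY)
    has_stress = bool(marks & STRESS_ONLY)
    if has_tonal and has_stress:
        return "mixed"
    if has_tonal:
        return "tonal"
    if has_stress:
        return "stress"
    if marks & SHARED:
        return "ambiguous"
    return "unmarked"
```

Under these assumptions, a form with only an acute or grave lands in the "ambiguous" bucket, which is the list that would need working through by hand.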
Also, should we include the two additions Wiktionary made to the tonemic scheme, namely the letters ə and ł? They aren't used when writing full words with tonemic diacritics in other sources, and may confuse the reader into thinking they're real Slovene letters. On the other hand, while not related to tone and accent, they are a useful pronunciation guide. —Rua (mew) 17:32, 28 May 2019 (UTC)
Another problem: it's actually surprisingly hard to find thorough information on the tonemic system. I've found only one source that consistently uses the tonemic diacritics throughout, and it's a bit lacking in other areas. It seems that schools and such almost only teach the stress system and ignore the tonemes, as do grammars in general. I think the best we can do is provide the headwords with tonemes; providing them in inflection tables is going to be extremely difficult. —Rua (mew) 20:20, 28 May 2019 (UTC)
It seems like the right thing to do. How do you make sure the notations are not mixed and also, is there a definite source for the tonal notations? I have just searched for soprog in [6] and found three different notations from various sources: sopróg, soprọ̑g and sǫ́prog. soprọ̑g matches *sǫprǫgъ, so it must be right. --Anatoli T. (обсудить/вклад) 03:28, 29 May 2019 (UTC)
@Atitarev If you read those entries correctly, some of the notational variance disappears. The first entry, which says sopróg for SSKJ, is the non-tonal orthography; the tonal equivalent (ọ̑) for the stressed vowel is given in parens afterwards on the same line. Pravopis does the same thing. The feminine equivalent sopróga given in Pravopis has the tonal notation (ọ́; ọ̑) after it, which I think means that the nominative has tonal ọ́ but the genitive has tonal ọ̑, although I can't be sure about this. Pletersnik uses his own notation which can mostly be converted to the standard tonal notation but expresses additional distinctions found in dialects; I'm not sure what sǫ́prog is doing here, that must be some sort of dialectal form. Benwing2 (talk) 04:21, 29 May 2019 (UTC)
Thanks, Benwing2. I guess, the important questions are:
  1. Is there enough information online to switch completely to tonal notation? At first glance, I find Fran a little confusing because of many different results. Are there more sources?
  2. It seems tones change in inflected forms. Some entries show these changes, e.g. brlòg. Is there enough info for that?
  3. It seems it's easy to mix up both notations in some instances. Are (any of) you ready to commit and complete and support this for some time? If it's half done, we could have a mess instead. Correct me if I'm wrong.
  4. I guess casual Slovene users are less familiar or not familiar with this notation or they will use the stress-based. We need to make sure this is addressed.
Also @Rua. --Anatoli T. (обсудить/вклад) 04:44, 29 May 2019 (UTC)
Point two is what I was concerned about. With a lot of searching I found Jože Toporišič's "Slovenska slovnica", which gives quite a few more details about tone in inflectional patterns. The problem of course is that it's in Slovene, so it's taking me a bit to interpret. I'm planning to update w:Slovene declension based on it, when I figure it out. We can then work from that. —Rua (mew) 09:22, 29 May 2019 (UTC)
I've done some work, which can be seen at w:User:Rua/sandbox. It's still a work in progress, and there are most likely still some gaps. —Rua (mew) 16:51, 29 May 2019 (UTC)
@Benwing2, Atitarev I'm currently working on adding {{sl-tonal}} to all entries that don't have it yet. It will no longer have any use once the headword displays the tonal orthography, but I intend to turn it into an automatic IPA generating template instead, and rename it to {{sl-IPA}}. This will also be the way to track progress: as part of converting entries to the tonal orthography, we also change {{sl-tonal}} into {{sl-IPA}}. That way, we can see by the transclusions of the former which entries still need updating.
I am thinking that a bot can handle updating the headwords automatically in many cases. If the number of parameters on {{sl-tonal}} equals the number of headwords in the headword template, and if they match (if the tonal orthography converted to stress orthography equals the headword), then change the headwords to use the tonal orthography and also update the template name to {{sl-IPA}} to track progress. This will not update inflections, but that's ok because inflections need manual attention anyway. I am going to change the inflection tables to display some forms in the collapsed state, a format that some languages already use, e.g. kopen, eallit, kala. This keeps the inflection information in one place and means we don't have to fix the inflections in the headword, we can just remove them if an inflection template is in place. —Rua (mew) 10:09, 31 May 2019 (UTC)
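The check described above can be sketched roughly as follows. This is not the actual bot code; the function name can_auto_convert is hypothetical, and the tonal-to-stress conversion is deliberately left as a caller-supplied function (to_stress), since its exact rules are not spelled out in this discussion.

```python
def can_auto_convert(headwords, tonal_forms, to_stress):
    """Decide whether an entry's headwords may be replaced by its tonal forms.

    headwords   -- forms currently given in the headword template (stress orthography)
    tonal_forms -- parameters given to {{sl-tonal}}
    to_stress   -- assumed helper: converts a tonal form to stress orthography
    """
    # Condition 1: the number of tonal forms equals the number of headwords.
    if len(headwords) != len(tonal_forms):
        return False
    # Condition 2: each tonal form, converted to stress orthography,
    # equals the headword in the same position.
    return all(to_stress(tonal) == head
               for head, tonal in zip(headwords, tonal_forms))
```

An entry that fails either condition would simply be left for manual attention, like the inflections.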
I've now created {{sl-IPA}}, and it works nicely. :) From now on, when you convert entries to the tonal format, please change {{sl-tonal}} to this, or add it if it's not present. That way it's easy to keep track of what has been done. —Rua (mew) 13:26, 31 May 2019 (UTC)
I've run the bot and orphaned {{sl-tonal}} in favour of {{sl-IPA}}. Now, every entry that contains the latter template should have tonal orthography in the headword. There are still a fair number of pages that don't have the IPA template, and which don't have tonal orthography yet. I'll work on those. —Rua (mew) 15:11, 1 June 2019 (UTC)

WTNoD[edit]

Excuse me, what does "WTNoD" stand for? I accidentally got tripped by an abuse filter with a good-faith edit, and I saw this phrase in the filter description. 68.193.209.173 01:08, 29 May 2019 (UTC)

We've had problems with anonymous editors randomly removing big chunks of text from discussion pages like this one and various information pages in the Wiktionary namespace. This filter stops that. Unfortunately, it has no way to know when the text removed was originally added by the same person. The filter has already stopped some very destructive vandalism in the month it's been in place, so I'd really rather not disable it if I can avoid it. I've taken care of the text you wanted to remove, so the immediate problem is solved. If this happens again, one workaround would be to split your deletion into two edits so you don't go over the size threshold, or you can ask someone with an account to do it for you (I'd be happy to). Either that, or sign up for an account: you may think that editing as an IP is protecting your privacy, but in reality, anyone who wants to can tell in less than a minute what Internet Service Provider you're using and your approximate physical location to within a couple dozen miles or so. If you were logged in to an account, the only people with access to that information would be checkusers, and we would have to have a very good reason to look. At any rate, sorry for the inconvenience. Chuck Entz (talk) 03:38, 29 May 2019 (UTC)
@Chuck Entz: I am aware of the above. I do have an account, but I have recently decided to stop using it so that I could slowly depart from this community, partially out of boredom and partially out of real-life circumstances. 68.193.209.173 19:59, 30 May 2019 (UTC)

Which display is preferable - split or solid?[edit]

See also Wiktionary:Grease_pit#Tibetan_་_shouldn't_separate_words_in_headword_lines

I want to gauge people's preference, if it's OK, for these two words, e.g. in Khmer and Burmese: ខួរក្បាល (khuə kbaal) and ဦးနှောက် (u:hnauk). This question is applicable to Thai and other scripts.

Should the headword display them as follows, regardless of whether the etymology section exists?

  1. As solid: ខួរក្បាល (khuə kbaal) and ဦးနှောက် (u:hnauk). this revision and this revision.
  2. Split into components, with links to individual parts: ខួរក្បាល (khuə kbaal) and ဦးနှောက် (u:hnauk). this revision and this revision.

Calling @Mahagaja, Octahedron80 to participate. --Anatoli T. (обсудить/вклад) 06:11, 29 May 2019 (UTC)

  • I vote for the unlinked display in the headword line, just as English compounds written without a space are written without the component parts linked. —Mahāgaja · talk 06:18, 29 May 2019 (UTC)
    @Mahagaja: Thanks for the reply. Can you be more specific, why? I have no trouble reading and deciphering an English compound word, e.g. airwave (well, I know English, Roman letters and I know these components) but even acetylmannosaminyltransferase becomes a bit problematic. I would support displaying Hungarian agyhártyagyulladás split in two as agyhártyagyulladás --Anatoli T. (обсудить/вклад) 06:28, 29 May 2019 (UTC)
    First, because that's what the etymology section is for, and the etymology section can give more detail, such as which sense of the word is relevant to the compound, whether one element is obsolete, and so on. Secondly, because sometimes simply linking the visible parts of the word leads to wrong results. For example, Burmese ကျောင်းသား (kyaung:sa:) isn't ကျောင်း (kyaung:) plus သား (sa:), it's ကျောင်း (kyaung:) plus -သား (-sa:); and ပန်းသီး (pan:si:) isn't ပန်း (pan:) plus သီး (si:), it's ပန်း (pan:) plus အသီး (a.si:). And of course it is possible to write |head=[[ကျောင်း]][[-သား|သား]] or |head=[[ပန်း]][[အသီး|သီး]], but why go to that trouble, especially when all we're doing is copying what the Etymology section says, and in a less informative manner? —Mahāgaja · talk 06:47, 29 May 2019 (UTC)
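    For reference, the piped-link form of |head= shown above can be sketched as a small helper. The names wikilink and build_head are made up for illustration; nothing like this is claimed to exist in any Wiktionary module.

```python
def wikilink(target, display=None):
    """Build a wikilink; use the piped form only when the display text differs."""
    if display is None or display == target:
        return f"[[{target}]]"
    return f"[[{target}|{display}]]"


def build_head(parts):
    """Join (target, display) pairs into a value for the |head= parameter."""
    return "".join(wikilink(target, display) for target, display in parts)


# The first Burmese example from the comment above:
# the second component links to the suffix entry but displays without the hyphen.
head = build_head([("ကျောင်း", None), ("-သား", "သား")])
```

    The helper only produces the markup; choosing the right target for each part (suffix entry, non-combining form) still has to be done by hand, which is the point made above.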

I always unlink in the headword line, because the components can be described in the etymology instead. Besides, some compounds change in spelling when they join, which cannot be explained in the headword line, or sometimes a wikilink cannot be added at all. --Octahedron80 (talk) 08:26, 29 May 2019 (UTC)

@Mahagaja, Octahedron80: OK, thanks. I have already switched to not linking on new entries. Yeah, I should have mentioned the spelling changes. That makes it all the more important to show the (visible) combining forms in the header for Thai and Khmer while linking to the actual, non-combining entry but, anyway, I'll follow the decision. --Anatoli T. (обсудить/вклад) 11:43, 29 May 2019 (UTC)

June 2019

Addition of {{rootsee}} to general entries[edit]

Discussion moved from Wiktionary:Tea room/2019/June.

@Ankitdimania has been applying {{rootsee}} to general entries such as exclamation in the "Derived terms" section. Just wanted to check if this is appropriate, as I thought {{rootsee}} was intended for use on entries concerning roots only, such as "Reconstruction:Proto-Indo-European/kelh₁-". — SGconlaw (talk) 19:07, 2 June 2019 (UTC)

Thank you SGconlaw for getting general feedback on this. My intention of using {{rootsee}} template is that all words associated with a root are clubbed in one place and then can be used to show etymologically related words with just one edit. Also, any addition/subtraction is dynamic, i.e. if I add a new word to the category, it will get reflected in all the places where related words with root is used. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)
I'm also open to finding any better way to achieve this. Please help. Ankitdimania (talk) 00:34, 3 June 2019 (UTC)
I found one flaw in using rootsee though. The list is expanded by default, and as such, the page is long and difficult to comprehend. If the list could be unexpanded by default, or some other way, it would be easier to read the entry and then expand the list to find etymologically related words. Ankitdimania (talk) 00:37, 3 June 2019 (UTC)
I enjoy seeing this information, but I think it shouldn't be expanded by default. Same with cognates, I prefer a collapsed list. Ultimateria (talk) 02:32, 3 June 2019 (UTC)
Perhaps it makes sense for {{rootsee}} lists to be expanded in root entries? I don't know. — SGconlaw (talk) 03:53, 3 June 2019 (UTC)
  • I'd vote against extensive use of these, specifically any use in English and taxonomic entries. We already have intimidating and tedious lists of cognates in etymology sections wasting our normal users' time. I don't think we need more of this kind of thing. DCDuring (talk) 01:36, 3 June 2019 (UTC)
    Isn't this a multi-entry matter? As such, BP seems like the right place for it. DCDuring (talk) 01:38, 3 June 2019 (UTC)
    I've moved the discussion. — SGconlaw (talk) 03:53, 3 June 2019 (UTC)

Wiktionary:Votes/2016-07/Adding PIE root box. —Suzukaze-c 03:59, 3 June 2019 (UTC)

I'm glad somebody remembered this. This seems like an open-and-shut case. The removal should begin. DCDuring (talk) 08:21, 3 June 2019 (UTC)
So, for clarity's sake, the suggestion is that the previous vote on PIE root boxes suggests that PIE root descendants should similarly not be added to entries via {{rootsee}}? Because the vote didn't actually touch on the current issue. — SGconlaw (talk) 10:38, 3 June 2019 (UTC)
I would interpret it that way, though the wording was absurdly narrow. As worded, it would not forbid a yellow and red display of each PIE-derived related term at different random locations in the entry flashing at seizure-inducing intervals. DCDuring (talk) 12:03, 3 June 2019 (UTC)
Thank you for feedback on this. Also, if we find this information still relevant, I can update the {{rootsee}} template to have the box collapsed by default. Alternatively, we can just put links similar to "English terms derived from the PIE kelh₁" — Ankitdimania (talk) 17:42, 3 June 2019 (UTC)
Yes, we already have the category pages with precisely this information. Or is it just a matter of presenting the information directly in the entry? – Jberkel 05:06, 4 June 2019 (UTC)
I support the use of this template, as it's bound to give a more complete picture of all related terms. Moreover, it avoids duplication of related terms across entries. —Rua (mew) 19:10, 3 June 2019 (UTC)
I checked how to collapse the box, we can change the depth value in https://en.wiktionary.org/w/index.php?title=Template:rootsee&action=edit, to 0. The usage is documented in CategoryTree. I can test and make the update if it is acceptable. Ankitdimania (talk) 03:22, 4 June 2019 (UTC)
  • What is the point of presenting this information in lieu of lists curated by humans? Some of the items included are just silly, eg, blends that don't contain the root.
In any event the extensive information is available to anyone who cares at the entry for the PIE term. There is already a category link in the entries. If it is too exhausting for those few who are interested in PIE cognates, a link to the reconstructed PIE term could be inserted in an Etymology section. DCDuring (talk) 13:58, 4 June 2019 (UTC)
  • To give an example how this information is structurally better, I have a recent edit as an example vs pugnacious#Related_terms (here we can see it takes less effort, is quite compact and is dynamically updatable at other places like pugilism's related entries).
While other approaches give a bit of pain, e.g.
1.) Human curated list is not always up to date, or extensive. A related list is present in one place, but not in other places. Some words are added in a few entries but skipped in other related entries, etc.
2.) Also, Human curated list will require more manual effort to add a new word across all related pages.
3.) Category link in the entries are at the bottom of the page (sometimes after a long scroll through other language's entries, which is not intuitive). e.g. pen is interestingly related to feather and pinion, which is esoteric due to the long scroll on pen's entry. We can, though, add the category link in related entries itself and that would be preferable to me.
Link to reconstructed PIE terms in Etymology section is a good middle ground here. Another benefit here is that the etymology section would have to be a bit more detailed, which would be nice.
Also, I agree that the list is a bit intimidating and tedious, but listing weird entries such as blends or composites give beautiful insights into the relationship of words. e.g. Insights by Norman Lewis are an interesting read on this. I'm still in favor of listing the {{rootsee}}, just the list should be collapsed to give user an option to expand it if relevant to him/her. With a collapsed list, user can just skip the section and it's not intimidating anymore.
Please LMK, how would you want to structure the page? Ankitdimania (talk) 19:25, 9 June 2019 (UTC)
By reverting to the human-curated material. What is the problem with simply having a link to the PIE root and having all the {{rootsee}} there or on subpages of there or hidden under the direct derivatives in each language? The romance of the "beautiful insights into the relationship of words" is of no appeal except to the amateur linguists. I am concerned that some of them apparently have no sense of responsibility for making Wiktionary useful to normal users and are instead using this project to indulge themselves. DCDuring (talk) 22:56, 9 June 2019 (UTC)
The related terms for calyx are: apocalypse, calyx and occult. Not very helpful to show that the entry is related to itself. – Jberkel 06:54, 12 June 2019 (UTC)
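The self-reference could in principle be filtered out when the list is built from the category. A minimal sketch in Python, not in the template's own code, with a hypothetical function name and the calyx list above as sample data:

```python
def related_terms(entry, category_members):
    """List the terms sharing a root with `entry`, leaving out the entry itself."""
    return sorted(term for term in category_members if term != entry)


# Sample data taken from the calyx example above.
members = ["apocalypse", "calyx", "occult"]
shown_on_calyx = related_terms("calyx", members)
```

Whether {{rootsee}} can do this at all depends on what the underlying category listing allows, which this sketch does not address.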

Poll[edit]

For ease of reference, what follows is a list of editors who support and do not support the use of {{rootsee}} in ordinary entries. Please add your names to the poll after you have participated in the above discussion to your satisfaction. — SGconlaw (talk) 09:43, 10 June 2019 (UTC)

Support
Do not support
Abstain

A proposal for WikiJournals to become a new sister project[edit]

Over the last few years, the WikiJournal User Group has been building and testing a set of peer reviewed academic journals on a mediawiki platform. The main types of articles are:

  • Existing Wikipedia articles submitted for external review and feedback (example)
  • From-scratch articles that, after review, are imported to Wikipedia (example)
  • Original research articles that are not imported to Wikipedia (example)

Proposal: WikiJournals as a new sister project

From a Wikipedian point of view, this is a complementary system to Featured article review, but bridging the gap with external experts, implementing established scholarly practices, and generating citable, doi-linked publications.

Please take a look and support/oppose/comment! Evolution and evolvability (talk) 04:24, 3 June 2019 (UTC)

Request for rights[edit]

Hi there. I came here to request autopatrolled rights. I used to edit here as user:Diego Grez-Cañete, but no longer have access to that account. I also used to be an autopatroller and rollbacker, but lost these rights a long time ago. The autopatrolled right would allow me to create entries faster, as I am forbidden to create more than two or three entries per minute. My interest, atm, is to create entries for gentilicios of Chile. Nothing that can't be cited. Thanks in advance. --Cuatro Remos (talk) 19:01, 3 June 2019 (UTC)

Done. If you have retired the User:Diego Grez-Cañete account, please update your current user account so that it does not redirect to it. Thanks. — SGconlaw (talk) 19:13, 3 June 2019 (UTC)
Thank you. Have done so. --Cuatro Remos (talk) 19:26, 3 June 2019 (UTC)

User Stephen G. Brown[edit]

User:Stephen G. Brown hasn't been active since the 10th of Feb this year - one of our most active long-time editors who contributed in a big number of languages and scripts. I was connected with him outside Wiktionary. He hasn't responded to any contacts. It makes me worry about his health. --Anatoli T. (обсудить/вклад) 04:22, 4 June 2019 (UTC)

I hope he's alright. I searched (briefly) for obituaries of people with that name and didn't spot any (except one from 2018, clearly not him). - -sche (discuss) 05:13, 4 June 2019 (UTC)
I always wondered why somebody with such an outstanding command of languages across multiple continents would waste his time here. Maybe he got a hobby. It's our loss. Equinox 06:56, 4 June 2019 (UTC)
Last WP contribution also February. DCDuring (talk) 14:07, 4 June 2019 (UTC)

{{ja-spellings}} doesn't work well with wago at kanji[edit]

As is shown by the vote, many editors do not support lemmatizing all wago at kana entries. This means that a large number of wago would be lemmatized at kanji, such as 戦う, and consequently require both {{ja-spellings}} and {{ja-kanjitab}}:

{{ja-spellings|たたかう|h=たたかふ|戦う|闘う}}
{{ja-kanjitab|たたか|yomi=k}}

This complicates the entry layout because floating elements are laid right to left, as shown at 敷居. It would be possible to make them stack vertically by using the floatright class, but as Eirikr explains, this causes other problems.

Moreover, the kanji spellings in {{ja-spellings}} are shown in a larger size than the kana spellings. This works fine if wago are lemmatized at kana and the reader wants to look up by kanji, but not if wago are lemmatized at kanji and the reader wants to look up by reading (kana).

Therefore {{ja-spellings}} doesn't work well with wago entries at kanji. Given that a lot of wago entries would be lemmatized at kanji, I would like to remove the template and propose the following scheme instead:

  1. Move the kanji spellings to {{ja-kanjitab}}. Extend {{ja-kanjitab}} to accept "alternative kanji spellings", like this:
    {{ja-kanjitab|たたか|yomi=k|alt=闘う}}
    
    Kanji in this term
    たたか
    Grade: 4
    kun’yomi
    Alternative spelling 闘う


    {{ja-kanjitab|たたか|yomi=k}} // followed by a {{ja-see}}
    
    Kanji in this term
    たたか
    Grade: S
    kun’yomi


    {{ja-kanjitab|alt=然して}}
    
    Alternative spelling 然して
  2. Move the modern and historical kana spellings to {{ja-pron}}. This might not be feasible at the moment and we can keep them first in the headword templates, but in the long run we can modify {{ja-pron}} to accept both modern and historical kana spellings:
    見違ふ (literary form of 「見違える」; creating the entry at 「みちがふ」/「みちがう」 is also OK)

What do you think of this approach?

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 16:15, 6 June 2019 (UTC)

(1) I still think that option 1 of the vote is conceptually nicer and simpler. But if we must, perhaps this is alright. Or we could revive the "Alternative spellings" header, keeping more in line with standard entry format (although we do have things like لباس#Persian).
(1, 2) As long as it doesn't get too confusing. —Suzukaze-c 19:05, 6 June 2019 (UTC)
@Suzukaze-c: Um..., actually I never liked headers. They're fine for Wikipedia, but using them in Wiktionary entries is like writing XML like
<key level="1">Note</key>

<key level="2">To</key>
<value>George</value>

<key level="2">From</key>
<value>John</value>

<key level="2">Heading</key>
<value>Reminder</value>

<key level="2">Body</key>
<value>Don't forget the meeting!</value>

for what's usually

<note>
    <to>George</to>
    <from>John</from>
    <heading>Reminder</heading>
    <body>Don't forget the meeting!</body>
</note>
--Dine2016 (talk) 16:17, 13 June 2019 (UTC)
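The difference between the two layouts also shows when the data is read back. The following is only a sketch using Python's standard xml.etree.ElementTree; the <doc> wrapper around the flat version is an assumption added here so that the fragment has a single root and can be parsed at all.

```python
import xml.etree.ElementTree as ET

NESTED = """<note>
    <to>George</to>
    <from>John</from>
    <heading>Reminder</heading>
    <body>Don't forget the meeting!</body>
</note>"""

# The flat "header-style" version, wrapped in a hypothetical <doc> root.
FLAT = """<doc>
<key level="1">Note</key>
<key level="2">To</key>
<value>George</value>
<key level="2">From</key>
<value>John</value>
<key level="2">Heading</key>
<value>Reminder</value>
<key level="2">Body</key>
<value>Don't forget the meeting!</value>
</doc>"""


def read_nested(xml_text):
    """Each child element already names its own field."""
    note = ET.fromstring(xml_text)
    return {child.tag: child.text for child in note}


def read_flat(xml_text):
    """Fields must be rebuilt by pairing each key with the value that follows it."""
    result = {}
    pending_key = None
    for element in ET.fromstring(xml_text):
        if element.tag == "key":
            pending_key = element.text.lower()
        elif element.tag == "value" and pending_key is not None:
            result[pending_key] = element.text
            pending_key = None
    return result
```

Both functions recover the same four fields, but the flat version needs bookkeeping (the pending key) that the nested version gets from the structure itself.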

CFI issue[edit]

I raised a question about inclusion of hyphenated compounds on the CFI talk page here, but now I look again at that page, there seems to be surprisingly little activity, so I am mentioning it here too just in case no one ever sees it in that place. Mihia (talk) 22:03, 6 June 2019 (UTC)

Vote: Language code into reference template names[edit]

FYI, I created Wiktionary:Votes/2019-06/Language code into reference template names.

Let's postpone the vote as much as discussion needs, if at all. --Dan Polansky (talk) 08:00, 7 June 2019 (UTC)

Template:archaic synonym of[edit]

According to the deletion message, this was deleted per WT:RFDO. But where is the deletion discussion? There's nothing on the talk page and nothing among the pages that link to it either. —Rua (mew) 14:23, 9 June 2019 (UTC)

I assume MK deleted it during a deletion spree. Who cares though, right? --I learned some phrases (talk) 14:12, 10 June 2019 (UTC)

{{top3}} in descendants section (e.g. Proto-Slavic)[edit]

(Notifying Rua, Wikitiki89, Atitarev, Benwing2, Guldrelokk, Bezimenen, Jurischroeer, Greenismean2016, Chignon):

Is it useful or useless? Compare descendants with and without {{top3}}. In my view, the advantage is that it shortens long narrow lists by filling the empty space on the right side, thus making the entry easier to read. —Игорь Тълкачь (talk) 15:20, 9 June 2019 (UTC)

I think it looks ugly, especially when the columns don't line up with the three Slavic subgroups, which is often the case. We should take into account mobile users, for which the single list is better than the columns. Also, just from a customary point of view, single lists are most definitely the norm on Wiktionary, with columns only used in a few cases. —Rua (mew) 15:21, 9 June 2019 (UTC)
From what I see: 1) It's a problem of some browsers: correct in Google Chrome, incorrect in Firefox and Internet Explorer, unknown in Opera. 2) In the mobile version columns are single. 3) It's hard to count in the Main namespace, but in the Reconstruction namespace there are ~118 cases (e.g. *ćwíšah, *xātun): ~80 (Iranian), ~14 (Turkic), 9 (Germanic), 6 (Algonquian), 5 (Semitic), ~5 (other). Anyway it's just extrapolating from other sections (e.g. Translations, Derived terms, ...). —Игорь Тълкачь (talk) 15:27, 10 June 2019 (UTC)
Correct in Opera. —Игорь Тълкачь (talk) 22:37, 11 June 2019 (UTC)
Exception (incorrect in all 4 browsers): More brokenness in {{top3}}, {{mid3}}. —Игорь Тълкачь (talk) 15:17, 16 June 2019 (UTC)
On mobile, the list reverts to a single column. It would be nice if {{mid3}} actually worked to force breaks. @Erutuon --{{victar|talk}} 02:58, 12 June 2019 (UTC)
@Victar: Columns are overridden by .derivedterms, .term-list { -moz-column-count: 1 !important; -ms-column-count: 1 !important; -webkit-column-count: 1 !important; column-count: 1 !important; } in MediaWiki:Mobile.css. User:DTLHS added that rule in these edits. I'm not confident that mobile screens can always show three columns, so would rather not make a decision in the matter. — Eru·tuon 03:06, 12 June 2019 (UTC)
Yeah, that was added at my behest. I pinged you about {{mid3}} though. --{{victar|talk}} 03:08, 12 June 2019 (UTC)

I see little response here, so I have now tried to count the users who added/removed {{top3}} in Proto-Slavic (the list below is incomplete):

This discussion is not new: the earliest was probably on 2015/04/15, but it didn't draw any objections. Four years have passed, and now I notice that Rua (2019/04/15) has started removing {{top3}}. Such actions can lead to numerous edit conflicts, because {{top3}} is used in 1000+ Proto-Slavic entries. —Игорь Тълкачь (talk) 16:43, 16 June 2019 (UTC)

@Rua, Useigor: User talk:Wikitiki89/2018 § top3, mid3, bottom in Proto-Slavic entries. I think I support Rua's proposal to remove the template, because it's broken all too often for me. (I'm Chignon) Canonicalization (talk) 11:37, 17 June 2019 (UTC)

Proposal: Make Latin the primary script for Serbo-Croatian[edit]

@Ivan Štambuk, Crom daba, Vorziblix At the moment, we duplicate a huge amount of information by having the exact same entry at both the Latin and the Cyrillic spelling. For English, we eventually relented and made colour link to color. I think the same should be done for Serbo-Croatian: the Cyrillic spelling should be defined as an alternative spelling of the Latin spelling (or "Cyrillic spelling"), and all information that is already present on the Latin page, such as etymology and pronunciation, should be removed from the Cyrillic page. Descendants and translations should be given only in Latin script, so no more cumbersome nesting. The reason I think Latin should be the primary script is that it's used in all four countries, and appears to be favoured in everyday use even in those that use both. —Rua (mew) 18:29, 11 June 2019 (UTC)

LOL, never going to happen. --{{victar|talk}} 03:09, 12 June 2019 (UTC)
It might work, it's all up to the community. We're almost there with dual re-transliterations into the other side - Roman/Cyrillic and vice versa but more needs to be done.
  1. Cyrillic to Roman converts one-to-one, but there are cases where the Roman-to-Cyrillic conversion needs to be decided manually.
  2. All inflection tables need to display both Roman and Cyrillic.
  3. Consider using new Serbo-Croatian language-specific templates like {{sh-l}} with automated conversions; compare with {{zh-l}}, e.g. 中國/中国 (Zhōngguó), which displays traditional Chinese, simplified Chinese and the transliteration with only the traditional Chinese 中國 in the input.
  4. Cyrillic entries shouldn't be deleted, IMO but be converted to soft-redirects.
  5. We should also address how we display translations; there's a lot that language-specific templates can do that {{t+}} or {{t}} can't, e.g.:
宮崎県 (みやざきけん) は九州 (きゅうしゅう) 地方 (ちほう) の南東部 (なんとうぶ) に位置 (いち) する県 (けん)。
Miyazaki ken wa Kyūshū chihō no nantōbu ni ichi suru ken.
Miyazaki prefecture is situated in the south-east part of the Kyūshū region.
The above Japanese example doesn't have any Roman script. I can go on talking about Chinese, Thai, Korean, Khmer templates. --Anatoli T. (обсудить/вклад) 06:14, 12 June 2019 (UTC)
I think it's a good idea. Trying to keep two separate entries for every S-C lemma and nonlemma form synchronized is absurd. To Anatoli's points:
  1. Sure, there may be times when a manual Cyrillicization needs to override the automatic one. Ought to be trivial.
  2. Agreed.
  3. Agreed.
  4. Well, duh.
  5. See point 1 above.
It feels like a lot of work, but reducing unnecessary duplication will be worth it. —Mahāgaja · talk 15:02, 12 June 2019 (UTC)
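To illustrate point 1 above (Cyrillic to Latin is one-to-one, while Latin to Cyrillic sometimes needs a manual decision), here is a minimal Python sketch. It is not the code of any existing Wiktionary module: the names CYR_TO_LAT, to_latin, to_cyrillic and the overrides parameter are hypothetical, only lowercase letters are handled, and the word injekcija is an illustrative choice of a case where "nj" is two separate letters rather than the digraph.

```python
# Hypothetical sketch, not an existing Wiktionary module.
# The 30 lowercase Cyrillic letters and their Latin equivalents.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}
# Reverse table; its keys include the digraphs "lj", "nj" and "dž".
LAT_TO_CYR = {lat: cyr for cyr, lat in CYR_TO_LAT.items()}


def to_latin(word):
    """Cyrillic to Latin: every letter maps unambiguously."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in word)


def to_cyrillic(word, overrides=None):
    """Latin to Cyrillic: digraphs are matched greedily by default.

    `overrides` is an assumed mechanism for the manual exceptions:
    a dict mapping a Latin spelling to its correct Cyrillic spelling.
    """
    if overrides and word in overrides:
        return overrides[word]
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in LAT_TO_CYR:  # "lj", "nj" or "dž" read as one letter
            out.append(LAT_TO_CYR[pair])
            i += 2
        else:
            out.append(LAT_TO_CYR.get(word[i], word[i]))
            i += 1
    return "".join(out)
```

With these definitions, to_latin("чутура") and to_cyrillic("čutura") round-trip cleanly, whereas to_cyrillic("injekcija") needs an entry in overrides to yield инјекција instead of the greedy ињекција. Keeping the exceptions in a plain lookup table is one possible design; a per-entry template parameter would serve the same purpose.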
I’m inclined to agree with this proposal; making duplicate entries for every term is frustrating, and maintaining them so they stay synchronized is next to impossible. Latin script predominates even in Serbia (at least outside of official contexts). (It’s worth noting, though, that Cyrillic is also "used in all four countries", though its usage share in each has been rapidly declining over the past two centuries.) Of course, we’d still have duplication between entries for ekavian/ijekavian variants, but a two-way duplication is a decided improvement over a four-way one. — Vorziblix (talk · contribs) 15:15, 12 June 2019 (UTC)
Support (my main concern is duplication of content, so I'd be fine too with making the Cyrillic spellings the lemmas). @Victar, are you opposed to the proposal, or do you simply think it won't garner enough support? Canonicalization (talk) 16:08, 12 June 2019 (UTC)

@Fay Freak Canonicalization (talk) 17:41, 12 June 2019 (UTC)

It would be easier to create entries. And indeed the inflection tables need more stuff done automatically. All that saves time. It does not depend on new linking templates, though those are possible. Changing existing Cyrillic templates to display the new order is however critical in so far as they are already out of sync. What happens if some noob has added additional information to the Cyrillic entry that is lacking from the Latin entry? Ivan Štambuk had some machine to detect whether entries are exact mirrors. But if there isn’t the parallelism and this is detected, I am afraid, a human must clean up and move because no machine can decide.
Or what about some kind of gadget that could convert a Latin entry into a Cyrillic one, and a bot that, after some delay, applies changes made on one side only to the other?
At some point I already suggested on Wiktionary (I’d need to search where) having a template on Cyrillic pages (or the reverse) that fetches the content or the whole language section of the other page (like {{desctree}}), so that the scripts appear to be treated equally and one finds everything on every page, but without twice the work. This looks the best to me. Ideally one could perhaps even write only one line: instead of putting ==Serbo-Croatian== one calls a template on that line that invokes a module. For why write {{spelling of|sh|Cyrillic|čutura}} plus header plus L3 and L2 and possibly altforms and inflection there too if you can get the whole with one line and then it even displays everything there? Least work for editors, greatest gain for readers because they find everything at every spelling and virtually synchronously. It would be fun to expand Serbo-Croatian on Wiktionary with such an architecture. That’s the utmost concentration.
Coding is needed for every alternative. Fay Freak (talk) 23:02, 12 June 2019 (UTC)
@Fay Freak: Perhaps labeled section transclusion (LST) could be used to grab the Serbo-Croatian section instead of a Lua function. That would save Lua memory, but maybe not much work otherwise; lots of entries would have to be edited (both the source and target of transclusion) and templates would still have to be made to display the right script on each page. — Eru·tuon 23:31, 14 June 2019 (UTC)
Not to forget the table templates. Those that I have created a month ago like the playing cards one have a based |sc= parameter whereby the display switches the script according to the script code. Others like the colours template just display all scripts. The list templates have subpages for all. This must be regularized for all Serbo-Croatian tables or list templates for the module to get it. Fay Freak (talk) 23:22, 12 June 2019 (UTC)

(in response to what has been posted so far) I'm not proposing to do it all in one go, or to make huge sweeping changes to our infrastructure for SC. The proposal is just to codify our intent to convert Cyrillic spellings to alternative forms of the Latin ones, and to remove the nested structure that is currently present in descendants and translations. This could be done on a page-by-page basis, whenever someone happens to come across it, as long as we know what direction we're going to be moving in on this. I don't really see the need for special-purpose templates for SC, let alone page-copying stuff like what Fay Freak proposes. In proposing Latin as the primary script, I meant that when we link to a SC term, we link only to the Latin script form, which in turn lists the Cyrillic script for those interested. —Rua (mew) 10:21, 13 June 2019 (UTC)

But it is preferable if one can treat the display equivalently, so that one can type in a Cyrillic term without needing to follow a soft redirect just to use the dictionary like a dictionary. Linking Latin and Cyrillic forms at the same time is the least problem. What you propose is also a decrease in usability – why convert Cyrillic spellings to alternative forms of the Latin ones if they are successfully parallel at the time being? Why force people to click on Latin links if we could also link both, and even easily via {{sh-l}}? “I don't really see the need.” Fay Freak (talk) 11:57, 13 June 2019 (UTC)
The same argument could be made in opposition to the proposal entirely, because we're removing definitions and etymologies from the Cyrillic pages. They are "successfully parallel at the time being" too, after all. Why treat them equally in some respects but not in others? —Rua (mew) 12:50, 14 June 2019 (UTC)
I don’t think the entries at the time being are even successfully parallel; in the absence of Ivan Štambuk’s watchful eye a noticeable number have drifted apart, and a good many lack a Cyrillic or Latin counterpart entirely. — Vorziblix (talk · contribs) 16:20, 14 June 2019 (UTC)
But I did not talk about the whole, but about those which are parallel (why—if); in fact I warned that the conversion is manual work for those that are not equal. If a Cyrillic page is made a soft redirect, this means a usability decrease. Readers want Cyrillic main entries. But we want to repeat less and we want to save attention from the synchronization. So the idea is to have the whole at the Cyrillic pages but in an automated fashion. Only the page as displayed will be copied; in the source code we won’t have to do anything but insert a template which fetches the one page from the other.
Oh, I see; sorry, I misinterpreted a bit. I agree that your proposal (automated fetching) would definitely be ideal if it’s technically workable. I must admit I have no idea whether or not that’s the case. If not, though, I’d still support alt-form-style conversion of entries as preferable to the current system. — Vorziblix (talk · contribs) 23:13, 14 June 2019 (UTC)
I don’t understand what “Why treat them equally in some respects but not in others?” is supposed to mean. I have said it already: different interests. If we treat the scripts unequally only to save work, we only have to do so to the extent that it saves work. What I have outlined is a milder measure, and it will be more agreeable. The hard fans of Cyrillic will say: That’s a great measure, you save work but it does not look like the Cyrillic script is inferior or something. Then I am confident that in the future it will never be in question “why we have done that”. It is best if the end user does not see the problems the creator of the application had. Now the end user can choose arbitrarily which alphabet he types in and the system exerts no pressure to use Latin. It would be sad to sacrifice the Cyrillic script because of an occasional danger of asynchronicity and some saved repetitions only because we have no module to get the most out of all. And if we have, it lightens the attitude of any editor who is pro-Cyrillic. Fetching all through a template with a module is danker than creating alternative forms. Think about the editors we possibly lose because they have a personal dislike of being subjected to such a ranking of Latin. If it is done differently, the resistance will be less. Forever – I think it is the best possible solution. Think about our marketing claims: “In Wiktionary you can type in Latin and Cyrillic and it will be equal.” Fay Freak (talk) 18:32, 14 June 2019 (UTC)

"heading" label[edit]

Examples of the "heading" label at draw#verb:

  1. (heading) To move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
    ...
    ...
  2. (heading) To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.
    ...
    ...

To me, this "heading" label seems superfluous almost to the point of being confusing. I am tempted to remove it where I see it. What do other people think? Does anyone think the label is useful? Mihia (talk) 20:02, 15 June 2019 (UTC)

  • I don't know what it is supposed to be saying. Feel free to remove it. SemperBlotto (talk) 20:05, 15 June 2019 (UTC)
I agree it is unclear. It's an attempt to group related senses together; my suggestion would be either to provide a definition that is a gloss, or a non-gloss definition, as appropriate, like this:
  1. Senses meaning to move or develop something.
    1. To sketch; depict with lines; to produce a picture with pencil, crayon, chalk, etc. on paper, cardboard, etc.
    2. To deduce or infer.
    ...
    ...
  2. To exert or experience force.
    1. (transitive) To drag, pull.
    2. (intransitive) To pull; to exert strength in drawing anything; to have force to move anything by pulling.
    ...
    ...
SGconlaw (talk) 22:26, 15 June 2019 (UTC)
AFAICR, it was an invention of @-sche intended to allow grouping of senses where there is no single definition that the contributor can think of that could stand in that location. It is necessary to make it clear to a user that what is on such a line is NOT a definition, even though it is positioned where one would expect a definition. Usually someone comes up with some more helpful label or non-gloss definition than "(heading)", along the lines Sgconlaw suggests. MWOnline and other dictionaries have groups of subsenses that do not have a sense-level definition. It is an artifact of wikiformat ("#" and "##") that we cannot duplicate their numbering scheme. DCDuring (talk) 22:38, 15 June 2019 (UTC)
It is an empirical question whether italics, even with the good wording Sgconlaw uses, are a sufficient indication that the content of the definition line should not be read as a definition. Sadly we don't have reliable means of running an experiment. DCDuring (talk) 22:45, 15 June 2019 (UTC)
I find subsenses confusing, headings or not, but the consensus seems to be that they should get used more (a while ago: Wiktionary:Beer parlour/2015/May#ELE: explicitly ban nested subdefinitions/subsenses? Or allow in rare cases?). In any case, they should at least get mentioned in WT:EL. – Jberkel 23:28, 15 June 2019 (UTC)
I am not keen on the "Senses meaning ..." suggestion. If we are going to use this format, we should just make the heading line read as a broad definition, in my opinion. Mihia (talk) 00:45, 16 June 2019 (UTC)
(@DCDuring's comment above) you are probably thinking of times I've converted entries to use subsenses :) but I always use coherent gloss or non-gloss definitions for the "super-sense"; "heading" labels are not my doing and I remove them when I see them. - -sche (discuss) 01:06, 16 June 2019 (UTC)
@Mihia: I would say give a broad definition wherever possible, but in some cases you may find that a non-gloss definition beginning with “Senses meaning […]” may be more appropriate, so I wouldn’t rule it out. For example, in some entries it seems appropriate to use NGDs like “Nautical senses” or “Senses relating to animals”. — SGconlaw (talk) 02:40, 16 June 2019 (UTC)
More extreme measures may be required for this entry. The highest level groups seem to me to be too abstract. For this word MWOnline has no more than five definitions in any of their groups of definitions, many of which have no master definition. They have nearly 50 definitions, compared to our 39. If we want to extirpate this kind of definition structuring, User:ReidAA, active from early 2013 to late 2015 did (some of?) them and used "structuring" in his edit summaries, AFAICT. DCDuring (talk) 03:32, 16 June 2019 (UTC)
I think we should usually keep the subsenses / top-level senses, and only remove "{{lb|en|heading}}". - -sche (discuss) 04:22, 16 June 2019 (UTC)
That's what we should do in the ambulance, but when we get this particular patient to the hospital, we can't just send it home. DCDuring (talk) 05:06, 16 June 2019 (UTC)
  • OK, well, notwithstanding the other issues, there does not seem to be support for the "heading" label, so I have removed it. Mihia (talk) 17:16, 16 June 2019 (UTC)

Partial blocks deployment to Wiktionary[edit]

Hello Wiktionary contributors,

The Wikimedia Foundation Anti-Harassment Tools team is continuing to make improvements to Special:Block with the addition of the ability to set a partial block.

While no functionality will change for sitewide blocks, Special:Block will change to allow for the ability to block a named user account or IP address from:

  • Editing one or more specific page(s)
  • Editing all pages within one or more namespace(s)

Additionally, changes are being made to the design of the user interface for Special:Block to enable admins to set partial blocks.

Until now, partial blocks have only been deployed on Wikipedias. Since Wikipedia administrators found partial blocks useful and there are no serious known issues or bugs, our team is planning to introduce partial blocks into more Foundation wikis. We think it is important to find any bugs that might exist for Wikisource, Wiktionary, Commons, Wikidata, etc. that might not be on Wikipedias, so we are going to deploy to a few of these wikis next week with our software developers ready to respond to any issues that may arise.

Currently it is scheduled to SWAT deploy to English Wiktionary on Monday, June 17, 2019.

Let me know if you have any questions or thoughts about introducing partial blocks on Wiktionary. For the Anti-Harassment Tools team. SPoore (WMF) (talk) 22:21, 15 June 2019 (UTC)

We always welcome useful hand-me-downs.
Why is this specifically an Anti-Harassment matter? Is the idea that we can partially implement IBANs by not letting alleged harassers post on individual user talk pages and on Wiktionary discussion space? DCDuring (talk) 23:07, 15 June 2019 (UTC)
Yes, there are times when a full site block might not address the issue as well as other editing restrictions might. One of our working hypotheses is that some users are not given a full site block because it is too harsh. So, partial block is a more targeted option. This page lists some uses.
Additionally, partial blocks are being used to block IP contributors and vandals from one or a few pages to prevent collateral damage to other good users. Also, I can share documentation with you that shows how other wikis are changing their local block policy and writing help pages about setting a partial block. SPoore (WMF) (talk) 12:41, 17 June 2019 (UTC)
Partial blocks are now deployed. Let us know if you notice any issues or have questions.
Here is a description of the use of partial blocks. Also, here is a page that the Italian Wikipedia created about partial blocks. This wiki might want to update its policies accordingly with something similar. SPoore (WMF) (talk) 20:56, 17 June 2019 (UTC)
  • Do we want to make policies first or use these partial blocks and develop policies as needed? DCDuring (talk) 21:16, 17 June 2019 (UTC)
I put something informational on WT:Blocking policy#Partial blocks. Does it need a vote? DCDuring (talk) 21:34, 17 June 2019 (UTC)

RP pronunciation[edit]

Is the RP pronunciation of o'er wrong, or is RP pronunciation different from the one in the Cambridge dictionary, /əʊə/? --Backinstadiums (talk) 15:30, 17 June 2019 (UTC)

Collins and Longman both give both pronunciations: the one that rhymes with more and the one that rhymes with mower. We should too. —Mahāgaja · talk 18:01, 17 June 2019 (UTC)

Languages that "use English"[edit]

Wikipedia has articles on:

  • hak:Mauritius
  • nan:Mauritius

These two links seem to imply something about the nature of Hakka and Min Nan dialects. They are using the "English" spelling as the name of their page for that nation in their Wikipedias. So IS 'Mauritius' a Hakka word? Is it a Min Nan word? If not, is there ANY place on this website where we would link to hak:Mauritius and nan:Mauritius?

--Geographyinitiative (talk) 04:43, 18 June 2019 (UTC)

You're making the mistake of assuming that a Wikipedia in a language is necessarily an accurate reflection of that language. Chinese is a macrolanguage, which means that the dominant lect tends to be used for many topics rather than the people's native lects. In languages such as these without an extensive corpus of writings in every possible subject, it's often impossible to find an authentic native word for everything that requires an article, so Wikipedia editors tend to make stuff up or borrow it from other languages. Of course, that's not unlike the kind of borrowing that happens at some time in the history of every language, but in the case of Wikipedias, the words tend not to be used by actual speakers who aren't writing Wikipedia articles. Not only that, but sometimes authentic words do exist that Wikipedia editors don't know about, so you have made-up words taking the place of real ones. I can't tell you how many times we've had to revert people who add bad translations in languages they don't know, "borrowed" from Wikipedias in those languages. Chuck Entz (talk) 05:57, 18 June 2019 (UTC)
That's a pain of many languages but it also reflects the lack of language policies, especially when there is no such thing with mostly spoken dialects. Even Vietnamese, which has a rather peculiar situation with foreign place names, has a native word for Mauritius, it's Mô-ri-xơ, which we want in the dictionary, even if they often "borrow" the English name for country names, e.g. "Mauritius" (which will still be pronounced "Mô-ri-xơ") and many others. I think it's best not to add the "borrowed" spelling. 毛里求斯 (Máolǐqiúsī) has the Min Nan form, even if Min Nan Wikipedia uses "Mauritius". --Anatoli T. (обсудить/вклад) 06:17, 18 June 2019 (UTC)