Wiktionary:Beer parlour/2020/December


Linking to specific meaning item within a Definition

I have in the past set an anchor, e.g. {{anchor|ExampleMeaning}}, and linked to it using [[ExampleDefinition#ExampleMeaning|LinkedWordInMyWikqArticle]], but I don't know whether such changes are accepted as kosher within the Wiktionary community. Will such anchors set in Wiktionary articles be deleted by rigorous editors? Is there an alternative preferred method of linking to a specific meaning within the general definition from another wiki page, e.g., a Wikiquote article? I generally link to the specific part of speech, but would like the capability of narrowing down to the specific meaning which applies to the word I am linking from. Is this a valid programming or community issue? Currently, it seems that the meanings are floating, i.e., the order of the meanings can be changed by an editor, so the anchors, if retained with the specific definition, will also float with that specific meaning.

ELApro (talk) 22:00, 1 December 2020 (UTC)[reply]
A current practice is to add {{senseid}} in front of the definition and then use the |id= parameter of various templates to link to it, e.g. {{l|en|fragment|id=internet}} => fragment. – Jberkel 22:26, 1 December 2020 (UTC)[reply]
Doesn't that template only apply within Wiktionary? How could it be applied for an external link from within a Wikiquote article?–ELApro (talk) 17:44, 7 December 2020 (UTC)[reply]
Yes, it's designed to be used within Wiktionary, but you could build templates in other Wikis to generate the links, in the format used by {{senseid}} (#Language-ID). – Jberkel 17:57, 7 December 2020 (UTC)[reply]
Sure, but having a template is good practice to avoid hardcoding the specific fragment format. In theory, the links created by {{senseid}} could change. If it's just for one-off linking there's no point creating a template of course. – Jberkel 18:34, 7 December 2020 (UTC)[reply]
Thanks. That will work great for existing senseID tags, but if I need to create one, is there a "List" of "standard" senseid category identifiers that I can reference, so I'm not adding a senseID that will need to be replaced?-ELApro (talk) 23:33, 7 December 2020 (UTC)[reply]
No, it's freeform. If there's already a (unique) label attached to the sense I often use that (in the example above, “internet”). You can also use Wikidata ids (QXXX…); for the fragment example this would be fragment identifier (Q1440450). – Jberkel 00:02, 8 December 2020 (UTC)
I would add that there's no reason to avoid duplication as long as you aren't duplicating a sense ID on the same page. Even then, the worst thing that could happen is you link to the wrong place on the correct page. Chuck Entz (talk) 04:16, 8 December 2020 (UTC)[reply]
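
For linking from another wiki, as discussed above, the URL can be assembled by a small helper instead of being hardcoded. The following is a minimal Python sketch, not Wiktionary's own code: the function name sense_link, the base URL, and in particular the fragment layout (language name, then a ":_" separator, then the id) are assumptions made for illustration, since the thread only describes the format as "#Language-ID". Keeping the separator as a parameter reflects Jberkel's point that the links created by {{senseid}} could change.

```python
from urllib.parse import quote

# Assumed target wiki; adjust for other language editions.
BASE_URL = "https://en.wiktionary.org/wiki/"


def sense_link(page, language, sense_id, separator=":_"):
    """Build a URL pointing at a {{senseid}} anchor on a Wiktionary page.

    The fragment layout (language + separator + id) is an assumption;
    check the anchors actually generated by {{senseid}} before relying on it.
    """
    title = quote(page.replace(" ", "_"))
    fragment = quote(f"{language}{separator}{sense_id}".replace(" ", "_"), safe=":")
    return f"{BASE_URL}{title}#{fragment}"


# Example: the "internet" sense of "fragment", and the same sense by Wikidata id.
print(sense_link("fragment", "English", "internet"))
print(sense_link("fragment", "English", "Q1440450"))
```

If the anchor format ever changes, only the separator (or this one helper) needs updating, which is the same argument made above for wrapping the link in a template.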

What does "Template:rfdef" mean?

What do the letters of the "{{rfdef}}" template mean? I'm curious. And which is the similar template for requesting an etymology? --Vivaelcelta (talk) 00:41, 2 December 2020 (UTC)

@Vivaelcelta:
{{rfdef}}: request for definition, as I've understood it anyway. Along similar lines, request for etymology == {{rfe}}. ‑‑ Eiríkr Útlendi │Tala við mig 01:13, 2 December 2020 (UTC)[reply]
@Vivaelcelta: Also, please note that WT:Information desk is the better place for such questions. The Beer parlour is for discussions about policy and the like. Andrew Sheedy (talk) 05:10, 2 December 2020 (UTC)[reply]
@Andrew Sheedy: Thank you. --Vivaelcelta (talk) 00:01, 4 December 2020 (UTC)[reply]

Derived terms templates

There seem to be two competing and visually incompatible sets of templates for the derived terms section in common use. One set is template:der-top and template:der-bottom which hides everything under a banner with a "hide/show" toggle on the right. (See for example steigen#Derived terms.) Another is template:der3 (really just a redirect to template:col3) which hides all but 9 of the entries with a "show more/less" toggle at the bottom. (See for example sprechen#Derived terms.) I personally prefer the top/bottom style because it looks neater (imo), there's usually no need to view this list at all when you're looking up a word, and when you do need to view the list having just nine visible doesn't help much. On the other hand with der3 template you can just put the language code at the top and not have to have a separate label template for each entry. Another issue is that der3 automatically sorts the entries, and I can see the advantages of that but I can also imagine there might be circumstances where you might want to have the entries in a customized order. So is one of these formats preferred over the other? I'd like to know so that going forward if I end up adding a derived terms section myself I'll know which one to use. Also, since these lists can be very long, would it be helpful to break them up into subsections? For example as prefixed forms/suffixed forms/multiword forms. --RDBury (talk) 22:23, 3 December 2020 (UTC)[reply]

I've noticed more editors adding the col templates than der-top for a while now. I prefer col for the following reasons:
  • It takes advantage of horizontal space, especially in long lists (see kyrka). The fact that der-top defaults to 2 is a mistake in my opinion.
  • Without typing L templates, it takes much less time to create a list, and it's easier to read in the edit window.
  • Automated alphabetization. I don't know when you'd want to bypass it, but you can wrap everything in L templates to avoid sorting.
  • Short lists aren't hidden, like with der-top, so they don't add to page length, and they don't need to be clicked to read. The need to expand is added after a reasonable amount of terms.
I can't find an example, but I've seen {{sense}} used to separate tables. If you do distinguish terms by sense, you'll have lots of shorter lists, which is more user friendly with col (much less clicking). And as a random anecdote, I find the col lists more inviting in their design. It took me a long time to embrace them though. Ultimateria (talk) 19:18, 5 December 2020 (UTC)[reply]
Thanks. Just so you know, there are der-top3, der-top4 and der-top5 templates for higher numbers of columns, so I don't think the horizontal space issue matters either way. For common words there will often be some idioms in the list as well so fewer columns may be necessary. Personally, I prefer the list to be hidden until needed because it avoids clutter, the same reason conjugation/declension tables are hidden until needed. I also find partially hidden lists more confusing than the fully hidden lists. For example if I'm looking for "entsprechen" on the list for "sprechen", first I look at the list, find that "entsprechen" isn't there, then notice that it's between the bottom of the first column and the top of the second column, then notice the expand button, click on it, and finally discover "entsprechen". It only takes an extra second or two, but that's enough to interrupt a train of thought. I agree with you about the L templates though. Anyway, I gather the upshot is that it's up to the person adding the list. In which case I suppose the best policy is to leave existing lists alone unless changing the format is absolutely necessary; follow the path of least disruption in other words. --RDBury (talk) 08:51, 6 December 2020 (UTC)

New automated nesting proposal for translations

Proposal 1

Ottoman Turkish (currently split):

* Turkish: {{t+|tr|Türkiye}}
*: Ottoman Turkish: {{t|ota|تركیه|tr=Türkiye}}

The Ottoman Turkish transliteration is capitalised above just to match the modern Turkish (it's not a statement, not an opinion)

Support

  1. Support --Anatoli T. (обсудить/вклад) 03:38, 4 December 2020 (UTC)[reply]
Why not the other way around, if I may ask? Allahverdi Verdizade (talk) 10:32, 13 January 2021 (UTC)[reply]
@Allahverdi Verdizade: Just following the Greek/Ancient Greek, French/Old French, etc. conventions. E.g. Greek at Greece#Translations. Modern Turkish is most likely to have translations.
* Greek: {{t+|el|Ελλάδα|f}}
*: Ancient: {{t|grc|Ἑλλάς|f}}
--Anatoli T. (обсудить/вклад) 10:46, 13 January 2021 (UTC)[reply]
Well, then Support Allahverdi Verdizade (talk) 10:48, 13 January 2021 (UTC)[reply]
  2. Support Vox Sciurorum (talk) 14:01, 9 February 2021 (UTC)
  3. Support Eru·tuon 18:56, 2 June 2021 (UTC)

Oppose

Abstain

With three supports and no opposes is this now policy, or is the verdict not enough people care? Vox Sciurorum (talk) 18:49, 2 June 2021 (UTC)[reply]

Proposal 2

Mongolian (manual) version #1:

* Mongolian: {{t+|mn|Монгол}}
*: Uyghurjin: {{t|mn|ᠮᠣᠩᠭ᠋ᠣᠯ}}

Mongolian (manual) version #2:

* Mongolian:
*: Cyrillic: {{t+|mn|Монгол}}
*: Uyghurjin: {{t|mn|ᠮᠣᠩᠭ᠋ᠣᠯ}}

Can we have a new language codes for the traditional Mongolian script, but distinct from cmg?

Support

  1. Support version #2 --Anatoli T. (обсудить/вклад) 03:38, 4 December 2020 (UTC)[reply]

Oppose

  1. Procedural oppose to "a new language codes" as the mechanism for nesting things that we're not distinguishing as separate languages (with their own L2 language headers in entries, etc). I'm fine with nesting the scripts on different lines, like for Serbo-Croatian. - -sche (discuss) 03:10, 14 January 2021 (UTC)[reply]
  2. Agree with User:-sche; nest them, but a separate language code seems unnecessary. — Eru·tuon 18:55, 2 June 2021 (UTC)[reply]

Abstain

Proposal 3

Nest Indonesian (currently separate):

* Malay: {{t+|ms|Malaysia}}
*: Indonesian: {{t+|id|Malaysia}}
*: Jawi: {{t+|ms|مليسيا}}

Support

  1. Support --Anatoli T. (обсудить/вклад) 03:38, 4 December 2020 (UTC)[reply]

Oppose

  1. Oppose. Indonesian has branched off and become a separate language from Malay. Even though a lot of words are shared, or inherited from Malay to Indonesian, there are differences in vocabulary and even false friends, similar to how Afrikaans has branched off from Dutch. There are also differences in pronunciation and spelling. Despite being mutually intelligible to some extent, I still believe there are too many differences between the two languages for either of them to deserve being nested under the other. There are words in Indonesian that don't exist in Malay (e.g. deputi, krusial, imut), and there are words in Malay that don't exist in Indonesian (e.g. beg, besen, pentadbiran) (I mention these examples not based on whether their Wiktionary entries exist, but based on whether each exists on both KBBI Daring (Indonesia's official dictionary site) and Pusat Rujukan Persuratan Melayu (Malaysia's official dictionary site)). This, I believe, is unlike Serbo-Croatian. For purposes of nesting in translations, I wouldn't nest Afrikaans under Dutch (and at the moment that is not the case), so I think Indonesian shouldn't be nested under Malay. Just my two cents (and I'm Indonesian, so I may or may not be biased). --Bismabrj (talk | contribs) 07:55, 28 December 2020 (UTC).[reply]
    EDIT 1: Also, you can see on standard deviation, under Translations, that the translations for Malay and Indonesian are considerably different. As an Indonesian, I don't even know what sisihan or piawai means, let alone their equivalents in Indonesian, so I wouldn't nest Indonesian under Malay there. --Bismabrj (talk | contribs) 10:37, 1 January 2021 (UTC).[reply]
    EDIT 2: Oops, my bad, turns out piawai is an Indonesian word, though with a different meaning (another false friend). I apologize for that as I've never heard that word used before, it's probably a word rarely used in Indonesian compared to Malay. The point I tried to make on Edit 1 is that Indonesian and Malay can have different compounds that don't look similar at all, and even (this is something I haven't mentioned before) have loanwords with different etymologies (for example, Indonesian vocabulary from Dutch compared to Malay vocabulary from English), which would cause some etymology misrepresentation problems when Indonesian is always nested under Malay (as if all existing Indonesian words are inherited from Malay, but turns out, no, Indonesian does have loanwords from other languages too without passing through Malay). An example of this problem would be with Indonesian partai from Dutch vs. Malay parti from English, both listed as translations of political party. --Bismabrj (talk | contribs) 01:23, 14 January 2021 (UTC)[reply]
@Bismabrj: Would you support nesting Jawi script under Malay, if Indonesian is left as is (unchanged, separate)? --Anatoli T. (обсудить/вклад) 10:56, 13 January 2021 (UTC)[reply]
@Atitarev: Jawi script under Malay? I don't really have a stance on that. I wouldn't mind if it happens, but I also wouldn't mind if it doesn't happen. What I would like to point out is that that would be a transliteration rather than a translation, so maybe it would be better to somehow put the Jawi transcription next to the term instead of below, or something like that. I'm not sure. --Bismabrj (talk | contribs) 01:23, 14 January 2021 (UTC)[reply]

Abstain

Proposal 4

Can we have a new language code for Malay Jawi (Arabic script)? --Anatoli T. (обсудить/вклад) 03:38, 4 December 2020 (UTC)[reply]

Hmm, if we're not going to distinguish two script forms of a language by giving them entirely separate L2 headers, I think they don't need their own language codes, do they? What is it that you want the code in order to do—is it to make the different scripts nest on different lines? In that case, we should be able to use whatever code causes Cyrillic vs Roman/Latin Serbo-Croatian to nest on different lines as a model for getting other things to nest. - -sche (discuss) 05:57, 8 December 2020 (UTC)[reply]
@-sche: Sorry for the late reply. I know what you mean but I find it difficult to add and maintain nestings in translations. Even "Cyrillic" gets messed up for Serbo-Croatian by the translation-adder if there are already Cyrillic nestings for Mongolian or Old Church Slavonic. The tool doesn't work with |sclb=. For language codes such as nb or nn, cmn, the nesting is automatic and doesn't create any errors. Perhaps a different solution can be created later. Anyway, this proposal is about nesting, not new language codes. --Anatoli T. (обсудить/вклад) 10:53, 13 January 2021 (UTC)
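
The automatic nesting described above for codes such as nb, nn and cmn could, in principle, be driven by a lookup table keyed on language code. What follows is a minimal Python sketch under stated assumptions: the tables NESTING and LANGUAGE_NAMES and the function render_translations are hypothetical names invented here, and nothing below reproduces the translation-adder or any Wiktionary module. It only shows how the layouts in Proposal 1 would fall out of such a table.

```python
# Hypothetical table: nested language code -> (parent language name, nested label).
NESTING = {
    "ota": ("Turkish", "Ottoman Turkish"),
    "grc": ("Greek", "Ancient"),
}

# Hypothetical table of top-level language names.
LANGUAGE_NAMES = {"tr": "Turkish", "el": "Greek"}


def render_translations(entries):
    """Turn (code, template) pairs into translation lines, nesting where configured."""
    top = {}     # language name -> templates shown on the parent line
    nested = {}  # parent language name -> [(label, template), ...]
    for code, template in entries:
        if code in NESTING:
            parent, label = NESTING[code]
            nested.setdefault(parent, []).append((label, template))
            top.setdefault(parent, [])  # make sure the parent line exists
        else:
            top.setdefault(LANGUAGE_NAMES[code], []).append(template)

    lines = []
    for name in sorted(top):
        own = ", ".join(top[name])
        lines.append(f"* {name}: {own}" if own else f"* {name}:")
        for label, template in sorted(nested.get(name, [])):
            lines.append(f"*: {label}: {template}")
    return lines


# Input order does not matter; the nested line always follows its parent.
for line in render_translations([
    ("grc", "{{t|grc|Ἑλλάς|f}}"),
    ("el", "{{t+|el|Ελλάδα|f}}"),
]):
    print(line)
```

Because the nesting decision depends only on the code, script variants that share one language code (the Mongolian and Jawi cases above) cannot be told apart by this approach, which is the difficulty the proposals are trying to work around.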

template:tlb - position, improvement, replacement?

So this template has been bothering me for a while now. In certain cases, with languages that add a lot of info to their headword template (e.g. Latin), placing it after the headword ensures that no one ever sees it. It's the part of the dictionary that most people simply skip unless looking for something specific. As such, people avoid using template:tlb altogether and just put {lb} before the definitions. However, sometimes there are many definitions, and all of these usages are obsolete or colloquial, for example. So when editing the entry ужо I've decided to experimentally put {tlb} before the headword template. Is this desirable? Can you offer better ways to solve the issues? I question the utility of the template in its current use. Brutal Russian (talk) 18:32, 5 December 2020 (UTC)

I initially created this template's predecessor, T:term-context (counterpart to T:context), to handle the situation where an English word was specific to American or British spelling, where it felt excessive and potentially confusing to label every sense of the word "(British)" or "(British spelling)" when only the spelling and not the sense was restricted. (And imagine stacking labels on terms that also had regionally-specific senses, like "(British, British spelling)".) I do think that, for English, labelling spelling issues somewhere other than on every sense-line of highly polysemous (lemma) words is an improvement, but there are a lot of other situations where it's unclear which template to use (I've been surprised at how it's taken off), or at least, where to place whichever template is used—since this template also enables distinguishing "terms with obsolete senses" from "obsolete terms", etc. I agree it often gets lost in the "noise" of a long headword line. I'd love if we could come up with a better way to handle "American vs British spelling" cases and/or "all senses are obsolete and should be categorized as such, vs only some senses are". - -sche (discuss) 20:57, 5 December 2020 (UTC)
Looking at some random entries where the template is used, the usage does seem to be all over the board. For example in abalienation there is only one definition so label would be better, in aband it's used in both definitions (yikes!), in color it's used as intended, and in consent it's used as a valence label (i.e. intransitive) which afaik would normally go in the definition lines no matter how many there are. Another issue is that the headword template usually also adds parentheses, so you end up with two sets of parentheses in a row and that seems awkward. (You can see this in the template documentation.) I don't see anything wrong with putting tlb before the headword template, at least in most cases when it's being used as intended. It would be more consistent with the way the label template is used, it would avoid the double parentheses, and yes it would make the information more visible. --RDBury (talk) 09:27, 6 December 2020 (UTC)
PS. The head template allows additional labels as well, so really the tlb template should not be used in combination with it. I'm thinking grammatical information should go in the headword template when possible, see for example an. --RDBury (talk) 09:55, 6 December 2020 (UTC)[reply]
Except {{tlb}} is rarely used for grammatical information but rather for usage information, and it would require many language-specific head templates to be made much more complicated in order to pass labels, and it still wouldn’t address this template’s being overlooked.
The solution to {{tlb|-}} being overlooked near the head is to use {{tlb|-}} more often near the head. I have deployed it in Arabic entries so systematically that one looks thither automatically, or others got used to seek and put it there (because there wasn’t much labelled or labellable obsolete before me). Fay Freak (talk) 20:15, 6 December 2020 (UTC)[reply]
I was thinking more just the vanilla head template, not all the language specific ones, hence "when possible". In the entries I looked at it was used for grammar about half the time, but it was not a random sample. Perhaps a separate "Grammar" line under the definition, with its own template, would solve a number of problems. Many German verbs have complicated grammatical features that are awkward to fit into the definition line, and I'm sure German is not the worst offender in this respect. --RDBury (talk) 07:27, 8 December 2020 (UTC)
As an alternative proposition, if we do decide to move it, it could also go on a separate line after the headword and before definitions, which has the advantages of both being immediately noticeable and cluttering up the headword line less than a pre-headword position. — Vorziblix (talk · contribs) 05:22, 7 December 2020 (UTC)[reply]
As long as we're spitballing: years ago, someone (Ruakh?) pointed out that some dictionaries "highlight" the text of labels with a background colour to set them off and make them more visible, and different colours and/or different kinds of brackets () [] could distinguish different categories of label, e.g. grammatical info vs dialects, or topics (saying that plant relates to "botany") vs restrictions (saying a term is only used by botanists in their jargon). Such things could also be used to distinguish labels that apply to a sense vs those that apply to a whole word, or to make term-labels more prominent, as long as the information was still accessible for colorblind or blind (screenreader) users (which it should be, since the text would still be there). - -sche (discuss) 05:44, 8 December 2020 (UTC)[reply]
Yes, DWDS is a good example of this. Blue for definitions and links, green for label-type information, dark red for what we call "non-gloss definitions". (See [1] for a page that uses all of these.) It also uses font size as well as color in meaningful ways. I don't approve of all their layout choices, but overall it's done tastefully imo. Color can open a can of worms with regard to accessibility though: Will colorblind readers be confused? Will automated text readers be confused? The Wikiverse traditionally sticks to black & white and I'm thinking accessibility worries are part of the reason. --RDBury (talk) 06:51, 8 December 2020 (UTC)

2020 Coolest Tool Award Ceremony on December 11th

Creating section headers for old discussions on the talk pages of entries

Hello, I'm wondering if anyone would object to me moving discussions like the one currently at the top of Talk:empathy into sections with names like "Untitled discussion from October 2011" or "Untitled discussion from 2011". The main reason I am interested in doing this is to make talk pages easier to navigate and the table of contents more accessible. I am posting about this here because this is usually the place where community standards (AKA "policy") are set. Best. —The Editor's Apprentice (talk) 21:57, 7 December 2020 (UTC)[reply]

I often do this manually myself, for the same reason, but I'm not sure that "Untitled discussion" would help much. It's better to give them meaningful names for the table of contents. Equinox 22:39, 7 December 2020 (UTC)[reply]
I agree; I do this too, and yes, if there's an obvious meaningful title, use it... but if not I have indeed used generic titles like proposed. - -sche (discuss) 05:48, 8 December 2020 (UTC)[reply]
Thank you both. I think this is sufficient agreement given the scale of the change so I will go forward in line with what y'all have suggested. Best. —The Editor's Apprentice (talk) 17:39, 8 December 2020 (UTC)

Descendants of Old Church Slavonic

Moved from Talk:заповѣдь#Descendants
User @Atitarev has reverted my contribution with the comment: "Only Bulgarian is considered a descendant of OCS". In many other articles I saw other South Slavic languages listed as descendants of OCS, so I need a reference for this statement. --Mladifilozof (talk) 21:50, 8 December 2020 (UTC)[reply]

@Mladifilozof:: At Module:languages/data2 "bg" has ancestors = {"cu"}. So if you try using {{bor|bg|cu|TERM}} it would be incorrect, {{inh|bg|cu|TERM}} should be used instead.
Calling @Benwing2, Bezimenen, Rua: could you please send a link for the discussions re this decision? @Mladifilozof has been adding a lot of OCS contents, so we need that formalised somehow (if it's required).
BTW, Linking to edit is better done this way: diff.--Anatoli T. (обсудить/вклад) 22:11, 8 December 2020 (UTC)[reply]
What he saw may have been old uses of {{desc}} without |bor=1. Discussions include Wiktionary:Beer parlour/2020/February § Bulgarian as descendant of Old Church Slavonic and Wiktionary:Beer parlour/2019/September § I want to add Church Slavonic terms. Well it is known on which regiolects the Old Church Slavonic language is based, some from which Modern Bulgarian and Macedonian partially descend. I don’t know why Macedonian is not reckoned a descendant of Old Church Slavonic. Fay Freak (talk) 03:43, 9 December 2020 (UTC)[reply]
What he saw was probably User:Ivan Štambuk’s old entries from back when we didn’t have many Proto-Slavic entries and instead vaguely conflated Proto-Slavic with OCS. There are a lot of these left over from the old days. They should all be changed, and the descendants moved to Proto-Slavic. — Vorziblix (talk · contribs) 17:07, 11 December 2020 (UTC)[reply]
Else, does Mladifilozof consider the Freising manuscripts “Old Church Slavonic”? If not, then it is evident why Slovene does not descend from Old Church Slavonic, similarly Serbo-Croatian. Fay Freak (talk) 03:54, 9 December 2020 (UTC)[reply]
@Fay Freak, Vorziblix, Benwing2: I don't support the theory that South Slavic languages descend from a common "Old South Slavic". They are all from different dialects, even Macedonian is apparently from a close but different dialect than Bulgarian, otherwise, it would be made another descendant of OCS.
In practical terms, Slovene, Serbo-Croatian, Bulgarian/Macedonian (as two close languages) are as close to each other as Polish to Slovene or Russian to Serbo-Croatian. They are mutually understandable (to some degree) or have as many similarities to each other as most unrelated Slavic languages have to each other. Unlike descendants of Old East Slavic, Czech/Slovak, Bulgarian/Macedonian, the rest of major Slavic languages probably don't have a common ancestor and "South Slavic" and "West Slavic" are merely territorial names, they don't have a linguistic base, even if territorial proximity and in some cases common history also had an effect on their linguistic similarities. --Anatoli T. (обсудить/вклад) 23:41, 20 December 2020 (UTC)
@Atitarev, Fay Freak, Vorziblix, Mladifilozof I am late to this discussion but yes only Bulgarian should be considered a descendant of Old Church Slavonic, at least in the standard recension. Macedonian seems to have descended from a slightly different dialect (most notably, one that did not confuse the two yers). I agree with Fay Freak that cases where other languages (including Russian!) are given as descendants are due to a missing |bor=1 in {{desc}}. Serbo-Croatian and Slovenian are definitely not descendants; they were in fact geographically isolated from Bulgarian/Macedonian for several centuries (probably by Proto-Romanian speakers, I think), which explains why they evolved in such different directions. Benwing2 (talk) 02:07, 27 December 2020 (UTC)[reply]
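
The rule Anatoli states earlier in this thread (use {{inh}} when the source language is listed as an ancestor, {{bor}} otherwise) can be sketched as a lookup. This is a minimal Python illustration, not the Lua in Module:languages/data2: the ANCESTORS table holds only the one entry quoted in the discussion, the function name etymology_template is hypothetical, and treating every non-ancestor as a borrowing is a deliberate simplification.

```python
# Illustrative subset only; the real data lives in Module:languages/data2.
# Per the discussion, "bg" has ancestors = {"cu"}.
ANCESTORS = {"bg": {"cu"}}


def etymology_template(lang, source, term="TERM"):
    """Pick {{inh}} if `source` is a listed ancestor of `lang`, else {{bor}}.

    Simplification: only direct ancestors are checked, and anything that is
    not an ancestor is treated as a borrowing.
    """
    name = "inh" if source in ANCESTORS.get(lang, set()) else "bor"
    return "{{" + "|".join([name, lang, source, term]) + "}}"


print(etymology_template("bg", "cu"))  # Bulgarian from Old Church Slavonic
print(etymology_template("ru", "cu"))  # Russian is not listed as a descendant
```

Under this reading, a descendants list that shows Russian or Serbo-Croatian under an OCS entry without |bor=1 in {{desc}} is inconsistent with the ancestor data, which matches the explanation given by Fay Freak and Benwing2.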

Jeju entries for Hangul letters

There are some Jeju entries for Hangul letters: Category:Jeju letters. With the potential caveat of and , which are now deprecated in the modern standard language, are these useful? Note that Jeju was historically not a written language. There are virtually no texts in it by native speakers, and the official, general-purpose orthography was promulgated only in 2014. In my opinion these are likely only to distract readers (by virtue of coming first in the entry) from the actually meaningful section with etymologies and everything, which are of course the Korean ones.--Karaeng Matoaya (talk) 11:37, 10 December 2020 (UTC)[reply]

I don't think that Jeju not being called 'Southernmost Korean' is a valid argument against the letters of Jeju having their own entries. RichardW57 (talk) 17:46, 10 December 2020 (UTC)[reply]
I suggest that the proper solution is to make the part that discusses the origin and usage of the letters as letters translingual; translingual entries are preceded only by English entries. Jeju combining jamo may then be treated on the same footing as Welsh etc. letters - for which there is a vote in interminable discussion to remove from the pages housing lemmas. RichardW57 (talk) 17:46, 10 December 2020 (UTC)[reply]

Desysopping on Inactivity

Per this vote, shouldn't we desysop users like Polyglot (talk • contribs), Timwi (talk • contribs), Dvortygirl (talk • contribs), and others? — This unsigned comment was added by Imetsia (talk • contribs) at 19:51, 10 December 2020 (UTC).

Yes. Polyglot hasn't used his/her admin tools since May 2014, Timwi hasn't used his since February 2006 (!), and Dvortygirl hasn't used hers since March 2013. —Mahāgaja · talk 22:00, 10 December 2020 (UTC)[reply]
@SemperBlotto, Chuck Entz, Surjection? Imetsia (talk) 18:13, 19 December 2020 (UTC)[reply]

Definition: Gloss or sentence

This discussion has mostly been sparked by the recent edits of User:PadshahBahadur. Simple definitions in Wiktionary have always had two formatting possibilities: giving a simple gloss ([[cat]]) or a sentence (A [[cat]].).[1] I believe it is time to establish some boundaries to the editors' freedom to change a well-established gloss to a sentence and vice versa, either by establishing when to use which type of definition or by agreeing not to change the style of other editors.

What I personally see as a downside to formatting (noun) definitions as sentences is the possible incorrect or inaccurate depiction of the definition as portrayed by the following example: the word cat is neither definite nor indefinite by itself, while the definition A cat. suggests this is not the case. The opposite occurs with languages without articles: Russian кот is both definite and indefinite. Thadh (talk) 21:29, 10 December 2020 (UTC)[reply]

Non-English definitions are often one-expression glosses, without a terminating dot. English definitions are usually formatted with a dot. Only certain proverbial expressions are actually defined as sentences. This seems to reflect the preferences of the relevant communities of editors, AFAICT. DCDuring (talk) 23:24, 10 December 2020 (UTC)[reply]
I think an exception can be made for non-English definitions when there is no simple English gloss, but rather an explanation of the term is required. I don't have an example off-hand, but I know I've seen them and have probably written some myself. In those cases, I think sentence formatting for a non-English entry is warranted. —Mahāgaja · talk 10:30, 11 December 2020 (UTC)[reply]
An example of such an exception in my mind is the third sense of doties or falsettone. I think this situation is generally poorly explained on the current style guide page and might try to re-write it in the future in order to clarify general practices, which may need to be surveyed. —The Editor's Apprentice (talk) 20:24, 11 December 2020 (UTC)[reply]
There is something about this in Wiktionary:Style guide#Types of definitions. Two types are distinguished: “full definitions”, which are in sentence case and end with a full stop (e.g., “The larva of a butterfly or moth; leafworm.”), and “simple glosses”, which should not be capitalized and should not end with a full stop (e.g., “caterpillar”). The former is said to be the preferred form for definitions of English terms, the latter for other languages. Next to these two main types, there are also “non-gloss definitions” for cases that cannot be defined by another phrase (in English) with the same meaning. Cases like Italian falsettone are indeed poorly explained. Is the proposal to turn the type that is currently preferred to one that is prescribed? Turning the preferred type into a non-preferred type should certainly be discouraged, but the converse (within reason – each rule has cases where it does not fit) should IMO be encouraged.  --Lambiam 23:34, 11 December 2020 (UTC)[reply]
At this point I am disinterested in trying to create prescriptions, especially since doing so requires organizing a community vote, and I don't think anyone else is proposing creating them in this discussion. The main thing that I'm personally thinking about is making the phrasing clearer and checking in with others when I notice any ambiguities. —The Editor's Apprentice (talk) 04:26, 12 December 2020 (UTC)[reply]
I personally like full-sentence definitions and add entries using both these and uncapitalised gloss definitions. I would certainly oppose an initiative to only restrict full definitions to English. Whether they should be allowed or not should in my opinion be left to the community of editors active in a language. Massively changing the style of entries without community consensus is certainly poor form, however, and doing so in languages one doesn't even know is also poor judgement. Really, changing other editors' style should be avoided in absence of a good reason. For various languages, in particular many in East Asia and Southeast Asia, nouns are not marked for grammatical number; changing those to full-definition style without community input is especially ill-considered, because that style might suggest to readers that those nouns are necessarily singular. ←₰-→ Lingo Bingo Dingo (talk) 14:11, 12 December 2020 (UTC)
I'm definitely not a fan of short definitions formatted in this style. I think having a single word with a period is silly, there is no way you are pretending that it's a full sentence so why try? I also think the use of the article is completely superfluous, it adds nothing to the definition and makes it longer than needed. —Rua (mew) 18:14, 13 December 2020 (UTC)[reply]

Community Wishlist Survey 2021[edit]

SGrabarczuk (WMF) 15:03, 11 December 2020 (UTC)

We forgot to wish for more memory. – Jberkel 15:43, 11 December 2020 (UTC)[reply]
The Wiktionary wishlist items look mostly like ways of adding mass quantities of content. They might make sense for many languages with few entries, but are likely to require lots of review. One example is a tool to vastly simplify the addition of pronunciations. That would probably lead to many legitimate regional pronunciations as well as many spurious files that we might consider vandalism. We would need to develop some means of sorting out legitimate pronunciations from others and also reduce the visual clutter of the legitimate new content at the top of L2 sections where we have pronunciations.
My instinct is to push back, both by oppose votes and comments about the need for accompanying tools to facilitate review of such content. DCDuring (talk) 17:13, 11 December 2020 (UTC)[reply]
It seems a bit perverse that all Wikimedians get to vote on what is supposed to be good for Wiktionary (or Wiktionaries?). Transwikied “dictdefs” from Wikipedia were also not a welcome gift.  --Lambiam 23:40, 11 December 2020 (UTC)[reply]
I agree with User:Jberkel that we forgot to wish for more memory per page (or at least, to raise the maximum allocated memory per page). There are still entries in the category wikt:CAT:E that have errors reading "Lua error: not enough memory". This proposal from last year's Community Wishlist Survey could have been re-proposed: meta:Community_Wishlist_Survey_2020/Wiktionary/More_Lua_memory_for_Wiktionary (and there, one can read more about the issue). Maybe next year. Let's hope we won't forget. --Bismabrj (talk | contribs) 14:27, 27 December 2020 (UTC)[reply]
Disallowing RFDs nominated by IPs sounds like a very good idea. DonnanZ (talk) 17:20, 16 December 2020 (UTC)[reply]

Inflected reflexive Italian infinitives[edit]

Domandarmi is defined as "first-person singular infinitive of domandarsi". By definition you can't have a first person infinitive. There must be a less confusing way to phrase this construct, domandare (to ask) + -mi (myself). Vox Sciurorum (talk) 16:01, 11 December 2020 (UTC)[reply]

hwc-en Issue[edit]

Recently, “hwc” (Hawaiian Creole English) was reinstated on Wiktionary. However, many Hawaiian Creole English words have been placed under “en” (English).

Here’s an example: hammajang

This word is clearly from Hawaiian Creole, though because Hawaiian Creole had no recognition until earlier this year, it was placed as an English entry. I’m curious if I should replace the English entry and only put the Hawaiian Creole entry on the page, or just leave both at the same time.

It’s a confusing decision for me to make since Hawaiian Creole and English sit on a continuum, and speakers of both languages (including me) code-switch depending on the situation.

Also, out of curiosity, have there been situations like this before where a new language code sort of messes things up? — Okonomiyaki39 (talk) 03:39, 12 December 2020 (UTC)[reply]

It's not too hard: don't remove the English section if it's used in English, like the quote at hammajang#English. Also, I assume you're the same person as Coastaline/Haimounten/Haimaunten. Please stop creating new usernames like this; it is disruptive and confusing for others. —Μετάknowledgediscuss/deeds 04:13, 12 December 2020 (UTC)[reply]
Okay but is “hammajang” within the realms of English? If one were to travel to Hawaii, no one would see “hammajang” as an English word, or at least not a standard one. I don’t believe it’s an English loanword just because we found a book quote that uses it.
Furthermore, I talked about the account changes with Chuck Entz and he didn’t seem to have a problem with it provided I’m not using new accounts to evade bans/kicks. But you’re right, it can get disruptive. I apologize.
But anyways, I’ll take your advice and just leave the English entry. Thanks for responding. — Okonomiyaki39 (talk) 18:45, 12 December 2020 (UTC)[reply]
Yeah, like you say, it's tricky because HWC~EN is a continuum. But this isn't the only such situation (we also have this issue with Jamaican Creole and Scots, etc), and it comes down to evaluating whether the surrounding text is English or the other language. In this case, the rest of the sentence "I can't think straight, my thoughts are all [...], but I say, 'When is his funeral?'" definitely looks like standard English and does not look like Hawaiian Creole texts like the HWC Bible quoted in the earlier discussion of HWC. And since the word is not set off in italics or quotation marks or anything else that would suggest it was being set off as a foreign term / code-switching, it looks like a solid citation of English usage of the word. (If the word were RFVed, we'd need two more such examples.) It gets trickier if there is a sentence where e.g. the grammar is English but there are a lot of HWC words. - -sche (discuss) 20:12, 12 December 2020 (UTC)[reply]

As we're in the 2020s already, when emojis are used so widely, having a link between a word and an emoji seems like a good idea. After seeing a link at aubergine, I got inspired to add more, but stopped because it was probably a poorly thought-out plan. What do y'all think about it? La más guay (talk) 23:16, 15 December 2020 (UTC)[reply]

Er, no thanks. DonnanZ (talk) 17:17, 16 December 2020 (UTC)[reply]
I'm not a great fan, but we already have several. We probably won't stop you. SemperBlotto (talk) 17:20, 16 December 2020 (UTC)[reply]

In some languages, which normally express colors with adjectives or nouns, there are also some stative verbs (or adjective verbs) expressing that something is, looks, shows, or appears a certain color (as opposed to making or becoming that color, i.e. dynamic color verbs), for example Proto-Indo-European *h₁rudʰéh₁ti, Latin rubeo, French verdoyer, German grünen, Russian сине́ть/​голубе́ть/​пестре́ть/​черне́ть/​зелене́ть (sinétʹ/​golubétʹ/​pestrétʹ/​černétʹ/​zelenétʹ), Esperanto blui/​ruĝi/​verdi, Hungarian kéklik/​zöldell/​sárgállik, Palauan bekerkard, Navajo łichííʼ, Marshallese kilmir, Chickasaw homma, Afar qasa, Ainu フレ, and possibly also some Korean terms. If we create a category for them, I suppose in Category:Colors and perhaps in Category:Stative verbs by language as well, shall we name it Category:Stative verbs for colors, Category:Adjective verbs for colors, or something else? Adam78 (talk) 14:06, 16 December 2020 (UTC)[reply]

Example of what you can do with Wikidata compared to Wiktionary[edit]

Continuing this old discussion, @pamputt what about pronunciation and examples for every single sense and form? That seems pretty nice to me too. It's easy with SPARQL to get a list of forms that do not have a pronunciation yet and fix it. A caveat in Wikidata is that we cannot legally add citations from works still in copyright; I'm not sure if you do that here or not.--So9q (talk) 13:59, 19 December 2020 (UTC)[reply]

I would guess that most of our citations are from works still in copyright, given that our coverage includes contemporary words, senses, and phrases that may have only existed for the past few decades (e.g., ghost#Verb (sense 10), twerk#Verb, olinguito, Netflix and chill, hit it and quit it). bd2412 T 15:44, 19 December 2020 (UTC)[reply]
The vast majority of our citations are from the 20th and 21st centuries, followed by the 19th century. It falls off precipitously from there. DTLHS (talk) 16:40, 19 December 2020 (UTC)[reply]

AWB access request[edit]

I'll be working on a new version of the Middle English conjugation templates ({{enm-conj}} and {{enm-conj-wk}}; there'll probably be a {{enm-conj-st}}, {{enm-conj-irreg}}, and possibly others). It'd be helpful to have use of AWB so I can avoid the arduous task of replacing all the templates manually. Hazarasp (parlement · werkis) 11:38, 23 December 2020 (UTC)[reply]

@Hazarasp: AWB access granted. — Eru·tuon 22:11, 23 December 2020 (UTC)[reply]
When I try to use AWB, it says "Hazarasp is not enabled to use this"; I think I need to be added to the list at Wiktionary:AutoWikiBrowser/CheckPage#Users. Hazarasp (parlement · werkis) 04:14, 24 December 2020 (UTC)[reply]
@Hazarasp: Okay, I guess the version that will use Wiktionary:AutoWikiBrowser/CheckPageJSON hasn't actually been released? I've added you to the old page as well. — Eru·tuon 09:05, 24 December 2020 (UTC)[reply]

@Mxn, PhanAnh123 I am planning to do a bot run to convert {{vi-hantu}} into {{vi-readings}}. I want to make sure (a) I do it correctly, (b) there are no objections. I think the following should work:

  • {{vi-hantu|READING|rs=SORT}} --> {{vi-readings|reading=READING|rs=SORT}} with links removed from the readings.
  • If |chu=Nom is present, either I will leave them alone or use {{vi-readings|nom=READING|rs=SORT}} (what do you think?). BTW there are 30 pages with |chu=Nom in them; that is few enough that they can be converted by hand if needed. They are as follows: , , 𩂏, 𪽵, 𠀗, 𠀖, 𥿗, 𠀲, 𠁂, 𠈋, 𥇹, 𡴉, 𥐆, 𢖮, 𣅶, 𡪇, 𩈩, 𢢲, 𤣡, 𢞂, 𠫏, 𪛇, 𪛅, 𪶾, 𫠢, 𫡁, 𫡂, 𢠄, 𬔗, 𤞼
  • Entries with |pos= probably need to be converted by hand. Luckily there's only one:
  • I'll also check to make sure all converted entries are single-character, and leave alone any that are multi-character.

Comments? Benwing2 (talk) 02:23, 27 December 2020 (UTC)[reply]
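
The conversion rules listed above could be sketched roughly as follows. This is a Python illustration only, not the bot script that was actually run: the function name convert_vi_hantu, the regular expressions, and the decision to map |chu=Nom to |nom= are assumptions made for the example, and named parameters other than |rs=, |chu= and |pos= are simply dropped here.

```python
import re

# Matches a whole {{vi-hantu|...}} call that contains no nested templates.
TEMPLATE_RE = re.compile(r"\{\{vi-hantu((?:\|[^{}]*)?)\}\}")
# Matches [[target]] or [[target|label]] and captures the displayed text.
LINK_RE = re.compile(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]")


def convert_vi_hantu(title, wikitext):
    """Return (new_wikitext, warnings) for one page (hypothetical helper)."""
    warnings = []
    # Only single-character pages are converted; anything else is left alone.
    if len(title) != 1:
        warnings.append(f"Length of page title is {len(title)} > 1, skipping")
        return wikitext, warnings

    def replace(match):
        original = match.group(0)
        # Remove links from the readings before splitting on "|".
        body = LINK_RE.sub(r"\1", match.group(1))
        positional, named = [], {}
        for param in body.split("|")[1:]:
            if "=" in param:
                key, value = param.split("=", 1)
                named[key.strip()] = value.strip()
            else:
                positional.append(param.strip())
        # Entries with |pos= are left for conversion by hand.
        if "pos" in named:
            warnings.append(f"Saw pos=, skipping: {original}")
            return original
        reading = positional[0] if positional else ""
        if not reading:
            warnings.append(f"Empty reading, skipping: {original}")
            return original
        # Assumption: |chu=Nom readings go to |nom=, all others to |reading=.
        name = "nom" if named.get("chu") == "Nom" else "reading"
        parts = [f"{name}={reading}"]
        if "rs" in named:
            parts.append(f"rs={named['rs']}")
        return "{{vi-readings|" + "|".join(parts) + "}}"

    return TEMPLATE_RE.sub(replace, wikitext), warnings
```

Calling convert_vi_hantu with a one-character title and a template such as {{vi-hantu|[[READING]]|rs=SORT}} returns the rewritten text together with a list of warnings for anything that was skipped.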

I don't think there is any reason to oppose these proposals. In the case of |chu=Nom, I think the latter option would be preferable. PhanAnh123 (talk) 02:59, 27 December 2020 (UTC)[reply]
@Mxn, PhanAnh123 I went ahead and converted the templates. There were a few that couldn't be converted automatically (see below); could one of you fix them up? Thanks!
  • Page 21 : WARNING: Empty reading, skipping: {{vi-hantu}}
  • Page 22 : WARNING: Empty reading, skipping: {{vi-hantu}}
  • Page 35 : WARNING: Empty reading, skipping: {{vi-hantu|rs=卜00}}
  • Page 36 : WARNING: Empty reading, skipping: {{vi-hantu|rs=卩00}}
  • Page 38 : WARNING: Empty reading, skipping: {{vi-hantu|rs=厂00}}
  • Page 48 : WARNING: Empty reading, skipping: {{vi-hantu}}
  • Page 1290 : WARNING: Empty reading, skipping: {{vi-hantu|hanviet=đùm|rs=手12}}
  • Page 1291 : WARNING: Empty reading, skipping: {{vi-hantu|hanviet=đan, đản|rs=手12}}
  • Page 1651 : WARNING: Empty reading, skipping: {{vi-hantu|rs=木15}}
  • Page 2586 : WARNING: Empty reading, skipping: {{vi-hantu}}
  • Page 2736 : WARNING: Empty reading, skipping: {{vi-hantu|rs=艸05}}
  • Page 3386 : WARNING: Saw pos=, skipping: {{vi-hantu|đồng, đòng|pos=noun|rs=金06}}
  • Page 3808 國會: WARNING: Length of page title is 2 > 1, skipping
  • Page 3918 User:Bumm13/templates: WARNING: Length of page title is 21 > 1, skipping
  • Page 3959 𠀳: WARNING: Empty reading, skipping: {{vi-hantu|rs=一07}}
  • Page 3964 User:Kc kennylau/沙盒: WARNING: Length of page title is 19 > 1, skipping
  • Page 4023 常川: WARNING: Length of page title is 2 > 1, skipping
  • Page 4057 User:Dixtosa/ja: WARNING: Length of page title is 15 > 1, skipping
  • Page 4059 㗂西: WARNING: Length of page title is 2 > 1, skipping
Benwing2 (talk) 01:38, 28 December 2020 (UTC)[reply]
@Benwing2 Thanks for taking care of unifying the templates. I think I've addressed all the main-namespace transclusions above, copying from the Vietnamese Wiktionary which uses {{R:WinVNKey:Lê Sơn Thanh}} as its source. Now Category:Vietnamese Han characters with unconfirmed readings has over 6,200 more entries to research. At least now we have a good grasp on how much of a mess Unihan made of Vietnamese! – Minh Nguyễn 💬 03:43, 5 January 2021 (UTC)[reply]
@Mxn Thanks for that. I have changed {{vi-hantu}} to prominently display a "deprecated" message when it's used, and also in its doc page. (I didn't delete it because it is in the page history of several thousand pages.) Hopefully that will scare people off from using it in the future. Benwing2 (talk) 03:54, 5 January 2021 (UTC)[reply]

Merging language variety data[edit]

Currently there are at least four places where data on language varieties can be found:

  1. Module:etymology languages/data;
  2. Module:labels/data/subvarieties;
  3. Module:el:Dialects, Module:fr:Dialects and similar;
  4. On category pages associated with particular varieties, such as Category:Gascon or Category:Early Modern Korean.

I would like to consolidate this as much as possible. My current plan is to leave Module:etymology languages/data as-is and consolidate the remainder, avoiding duplicated data. Some discussion points:

  1. Any objections?
  2. Where should the data go? Module:labels/data/subvarieties is the most complete source of data but I think splitting by language is probably better. I'm thinking a format of either Module:varieties/fr, Module:varieties/data/fr or maybe Module:fr:Varieties. I'd like to avoid "dialect" because it's a loaded term; "variety" is already used in the data in Module:languages/data2 and such, as well as in Wikipedia articles such as Varieties of Arabic, Varieties of Chinese, Varieties of French, etc.

Benwing2 (talk) 02:42, 27 December 2020 (UTC)[reply]

Some lects that are intermediate between two languages are hard to pin down as being a variety of specifically one of the two, something that the proposed organization appears to require. I do not know whether this Buridanic problem will arise in lexicographic practice, but an example is found in the Bavarian to Alemannic transition zone, basically the drainage basin of the Lech, which includes Augsburg.  --Lambiam 12:52, 27 December 2020 (UTC)[reply]
What will this do to memory use on large pages? Vox Sciurorum (talk) 09:09, 28 December 2020 (UTC)[reply]

Japanese sort keys vs. Chinese sort keys[edit]

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Suzukaze-c, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233): I'm looking at removing the manually-specified Japanese sort keys in {{charactercat}} and using the automated ones in Module:zh-sortkey. In the vast majority of cases, the manually-specified sort keys are the same as the automated ones. See the sample below, which represents about 1500 characters; only 29 of them had a manual sort key different from the automated ones. Most of the time, the manual and automated sort keys have the same radical and differ by one or two strokes, but in a few cases the radical is different. Could some Japanese speakers take a look at the sample below and let me know if these differences are significant? I suspect they're not, but I want to make sure. Thanks! Benwing2 (talk) 23:45, 27 December 2020 (UTC)[reply]

  • Page 609 Category:Japanese terms spelled with 将: WARNING: Japanese/Okinawan category with manual sort key 寸07 != automatic 寸06: {{charactercat|ja|将|sort=寸07}}
  • Page 639 Category:Japanese terms spelled with 視: WARNING: Japanese/Okinawan category with manual sort key 見04 != automatic 示07: {{charactercat|ja|視|sort=見04}}
  • Page 649 Category:Japanese terms spelled with 故: WARNING: Japanese/Okinawan category with manual sort key 攴04 != automatic 攴05: {{charactercat|ja|故|sort=攴04}}
  • Page 689 Category:Japanese terms spelled with 奈: WARNING: Japanese/Okinawan category with manual sort key 凵03 != automatic 大05: {{charactercat|ja|奈|sort=凵03}}
  • Page 829 Category:Japanese terms spelled with 衛: WARNING: Japanese/Okinawan category with manual sort key 行10 != automatic 行09: {{charactercat|ja|衛|sort=行10}}
  • Page 851 Category:Japanese terms spelled with 黙: WARNING: Japanese/Okinawan category with manual sort key 黑11 != automatic 火11: {{charactercat|ja|黙|sort=黑11}}
  • Page 1003 Category:Japanese terms spelled with 充: WARNING: Japanese/Okinawan category with manual sort key 儿03 != automatic 儿04: {{charactercat|ja|充|sort=儿03}}
  • Page 1009 Category:Japanese terms spelled with 章: WARNING: Japanese/Okinawan category with manual sort key 立06 != automatic 音02: {{charactercat|ja|章|sort=立06}}
  • Page 1016 Category:Japanese terms spelled with 堅: WARNING: Japanese/Okinawan category with manual sort key 土09 != automatic 土08: {{charactercat|ja|堅|sort=土09}}
  • Page 1038 Category:Japanese terms spelled with 描: WARNING: Japanese/Okinawan category with manual sort key 手08 != automatic 手09: {{charactercat|ja|描|sort=手08}}
  • Page 1077 Category:Japanese terms spelled with 晩: WARNING: Japanese/Okinawan category with manual sort key 日08 != automatic 日07: {{charactercat|ja|晩|sort=日08}}
  • Page 1099 Category:Japanese terms spelled with 盛: WARNING: Japanese/Okinawan category with manual sort key 皿07 != automatic 皿06: {{charactercat|ja|盛|sort=皿07}}
  • Page 1103 Category:Japanese terms spelled with 蒸: WARNING: Japanese/Okinawan category with manual sort key 艸10 != automatic 火10: {{charactercat|ja|蒸|sort=艸10}}
  • Page 1182 Category:Japanese terms spelled with 幹: WARNING: Japanese/Okinawan category with manual sort key 十11 != automatic 干10: {{charactercat|ja|幹|sort=十11}}
  • Page 1184 Category:Japanese terms spelled with 聴: WARNING: Japanese/Okinawan category with manual sort key 耳11 != automatic 耳12: {{charactercat|ja|聴|sort=耳11}}
  • Page 1215 Category:Japanese terms spelled with 誠: WARNING: Japanese/Okinawan category with manual sort key 言07 != automatic 言06: {{charactercat|ja|誠|sort=言07}}
  • Page 1257 Category:Japanese terms spelled with 践: WARNING: Japanese/Okinawan category with manual sort key 足06 != automatic 足05: {{charactercat|ja|践|sort=足06}}
  • Page 1288 Category:Japanese terms spelled with 巡: WARNING: Japanese/Okinawan category with manual sort key 巛04 != automatic 辵03: {{charactercat|ja|巡|sort=巛04}}
  • Page 1334 Category:Japanese terms spelled with 禅: WARNING: Japanese/Okinawan category with manual sort key 示09 != automatic 示08: {{charactercat|ja|禅|sort=示09}}
  • Page 1438 Category:Japanese terms spelled with 餅: WARNING: Japanese/Okinawan category with manual sort key 食08 != automatic 食06: {{charactercat|ja|餅|sort=食08}}
  • Page 1450 Category:Japanese terms spelled with 級: WARNING: Japanese/Okinawan category with manual sort key 糸03 != automatic 糸04: {{charactercat|ja|級|sort=糸03}}
  • Page 1454 Category:Japanese terms spelled with 獄: WARNING: Japanese/Okinawan category with manual sort key 犬10 != automatic 犬11: {{charactercat|ja|獄|sort=犬10}}
  • Page 1481 Category:Japanese terms spelled with 睡: WARNING: Japanese/Okinawan category with manual sort key 目08 != automatic 目09: {{charactercat|ja|睡|sort=目08}}
  • Page 1524 Category:Japanese terms spelled with 免: WARNING: Japanese/Okinawan category with manual sort key 儿06 != automatic 儿05: {{charactercat|ja|免|sort=儿06}}
  • Page 1582 Category:Japanese terms spelled with 墨: WARNING: Japanese/Okinawan category with manual sort key 土11 != automatic 黑03: {{charactercat|ja|墨|sort=土11}}
  • Page 1633 Category:Japanese terms spelled with 忘: WARNING: Japanese/Okinawan category with manual sort key 亠04 != automatic 心03: {{charactercat|ja|忘|sort=亠04}}
  • Page 1772 Category:Japanese terms spelled with 些: WARNING: Japanese/Okinawan category with manual sort key 二06 != automatic 二05: {{charactercat|ja|些|sort=二06}}
  • Page 2040 Category:Japanese terms spelled with 嘩: WARNING: Japanese/Okinawan category with manual sort key 口10 != automatic 口12: {{charactercat|ja|嘩|sort=口10}}
  • Page 2086 Category:Japanese terms spelled with 繭: WARNING: Japanese/Okinawan category with manual sort key 糸12 != automatic 糸13: {{charactercat|ja|繭|sort=糸12}}
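
The distinction drawn above (same radical with a small stroke difference versus a different radical) could be checked mechanically. Here is a minimal Python sketch, assuming every sort key is a radical followed by a stroke count, as in the sample; the helper names split_key and classify are hypothetical.

```python
import re

# A sort key in the sample looks like a radical followed by digits, e.g. "寸07".
KEY_RE = re.compile(r"^(\D+)(\d+)$")


def split_key(key):
    """Split a sort key such as '寸07' into ('寸', 7)."""
    match = KEY_RE.match(key)
    if match is None:
        raise ValueError(f"unexpected sort key format: {key!r}")
    return match.group(1), int(match.group(2))


def classify(manual, automatic):
    """Describe how a manual sort key differs from the automatic one."""
    manual_radical, manual_strokes = split_key(manual)
    auto_radical, auto_strokes = split_key(automatic)
    if manual_radical != auto_radical:
        return "different radical"
    # Signed difference: manual stroke count minus automatic stroke count.
    return f"same radical, stroke difference {manual_strokes - auto_strokes:+d}"
```

For instance, the pair 寸07 / 寸06 from the first log line falls in the same-radical group, while 見04 / 示07 is reported as a different radical.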
Radical classifications depend on each dictionary, especially for Japanese shinjitai. Some of them are differences between Chinese simplified characters and Japanese shinjitai at the same code point, such as:
  • : 寸07 (all characters containing instead of 𪧷)
  • : 行10 (all characters containing )
  • : 土09 (all characters containing )
  • : 手08 (all characters containing ) — it should be 手08 also in Mainland China while it is 手09 in Taiwan
  • : 足06 (all characters containing instead of (jiān))
  • : 示09 (all characters containing instead of )
TAKASUGI Shinji (talk) 01:35, 28 December 2020 (UTC)[reply]
@TAKASUGI Shinji Thanks. I think what you're saying is that at least some of the manually specified sort keys that differ from the Chinese ones are correct for Japanese. Are any of them incorrect? Also do you know of a machine-readable source for Japanese radical classifications? One thing we could do is fix Module:zh-sortkey to produce the correct Japanese sort keys when the language is specified as Japanese. The function in question (makeSortKey) already takes a language as its second parameter, but currently ignores it. Benwing2 (talk) 01:46, 28 December 2020 (UTC)[reply]
奈 must be 大05, 黙 must be 火11, and 聴 must be 耳11 (just like ) everywhere. — TAKASUGI Shinji (talk) 05:40, 28 December 2020 (UTC)[reply]
Isn't
require("Module:zh-sortkey").makeSortKey(pagename, "ja")
doing this currently? -- Huhu9001 (talk) 11:29, 28 December 2020 (UTC)[reply]
@Huhu9001 It doesn't. It accepts a language parameter but ignores it; you get the same output regardless of language. I can make it produce Japanese-specific output when called with "ja", but I need a machine-readable source of Japanese radical/stroke analyses for all Unicode characters (or at least the relevant subset of them). Do you know of such a source? Benwing2 (talk) 04:52, 29 December 2020 (UTC)[reply]
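
The idea of a language-aware sort key function could look roughly like this. It is a Python sketch rather than the Lua in Module:zh-sortkey, and it does not reflect that module's real data or output format: the two tables hold only a few keys copied from the sample log above, the function name make_sort_key is hypothetical, and whether each manual key is actually right for Japanese is the open question in this thread.

```python
# Illustrative data only, copied from the sample log in this discussion.
AUTOMATIC_KEYS = {"将": "寸06", "衛": "行09", "奈": "大05"}
# Assumption: these manual keys are treated as Japanese-specific overrides.
JAPANESE_OVERRIDES = {"将": "寸07", "衛": "行10"}


def make_sort_key(text, lang=None):
    """Build a sort key, preferring Japanese overrides when lang is 'ja'."""
    overrides = JAPANESE_OVERRIDES if lang == "ja" else {}
    pieces = []
    for char in text:
        if char in overrides:
            pieces.append(overrides[char])
        else:
            # Fall back to the automatic key, or the character itself.
            pieces.append(AUTOMATIC_KEYS.get(char, char))
    return "".join(pieces)
```

With a complete, machine-readable table of Japanese radical and stroke data in place of the small override dictionary, the same fallback structure would let all other languages keep the current automatic keys.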

Inflections of multi-word verb phrases[edit]

An anonymous editor frequently adds inflections to multi-word verb phrases, such as at bang one's head against a brick wall, which now reads:

bang one's head against a brick wall (third-person singular simple present bangs one's head against a brick wall, present participle banging one's head against a brick wall, simple past and past participle banged one's head against a brick wall)

To me this seems unnecessarily laborious, and I am tempted to just delete them. Didn't we have a discussion about this before? I can't seem to find it now. What did we decide? Mihia (talk) 01:56, 28 December 2020 (UTC)[reply]

@Mihia If there was a previous discussion, I missed it. I added support to {{en-verb}} to make it easy to add inflections to multiword phrases, and went ahead and fixed all the existing phrases missing inflections to include them. You'll notice that the headword is just declared as {{en-verb|*}}, so it isn't actually laborious to include them. So I am fine with what the IP is doing. Benwing2 (talk) 02:22, 28 December 2020 (UTC)[reply]
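
To illustrate the shape of the output under discussion, here is a deliberately naive Python sketch that inflects only the first word of a phrase. It is not how {{en-verb}} works internally; the function name is hypothetical, and the rules cover only regular verbs whose spelling does not change (no consonant doubling, no final-e dropping, no irregular forms).

```python
def inflect_first_word(phrase):
    """Naively inflect the first word of a multiword verb phrase."""
    verb, _, rest = phrase.partition(" ")
    tail = f" {rest}" if rest else ""
    return {
        "third-person singular simple present": f"{verb}s{tail}",
        "present participle": f"{verb}ing{tail}",
        "simple past and past participle": f"{verb}ed{tail}",
    }
```

Applied to "bang one's head against a brick wall", this yields the three long forms quoted in the headword line above, which shows why generating them is cheap even if displaying them is contested.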
Thanks, I don't mean that it is laborious to code them, as I have indeed noticed the "*" syntax, but just that the output appears laborious and unnecessary in the article, in my personal opinion. I deliberately use the "head" template to suppress inflections in long multi-word cases such as this, but then someone comes along and puts them in. Mihia (talk) 02:28, 28 December 2020 (UTC)[reply]
Is it possible to link only to the verb forms?
  • bang one's head against a brick wall (third-person singular simple present bangs one's head against a brick wall, present participle banging one's head against a brick wall, simple past and past participle banged one's head against a brick wall)
TAKASUGI Shinji (talk) 02:53, 28 December 2020 (UTC)[reply]
Why just have the verb inflection? What about noun plurals (heads) and (walls)? And what about other determiners or, at least, articles? DCDuring (talk) 03:34, 28 December 2020 (UTC)[reply]
@TAKASUGI Shinji Yes, this is possible. @DCDuring Not sure I see the point of your sarcasm. Benwing2 (talk) 04:05, 28 December 2020 (UTC)[reply]
Even with the (clever) template changes to make them easier to code, one could argue that they shouldn't then have separate pages created for them. I also feel they're a bit unnecessary. Equinox 11:54, 28 December 2020 (UTC)[reply]
Another potential issue with these: why is the past tense of hunt where the ducks are "hunted where the ducks are" and not "were"? Equinox 15:19, 28 December 2020 (UTC)[reply]
Perhaps because the ducks are still there? -- Huhu9001 (talk) 15:40, 28 December 2020 (UTC)[reply]
They might not be. I'm saying that the past tense could be either (and for me "were" is more natural). Equinox 15:43, 28 December 2020 (UTC)[reply]
@Benwing: What makes verb inflection so interesting a variation of forms in such predicates when variation in, say, number, determiners, prepositions gets neglected? If users need their hand to be held with respect to verb inflection, they must really need help with all the other variations attestable for such an expression. If they DON'T need help with these other variations, then why do we think they need help with verb inflection, especially when the verb inflection is just a click away. Variation in word selection (determiners, prepositions) is not available so conveniently. DCDuring (talk) 21:17, 28 December 2020 (UTC)[reply]
@Equinox, Huhu9001 Fixed tense of hunt where the ducks are. @DCDuring Verbs are a lot more complex in English than are any other parts of speech, so I think it's justified to show the inflections explicitly. I also don't really get your point about other variations being "neglected"; bang one's head against a brick wall even comes with a usage note indicating some common variations. Benwing2 (talk) 06:27, 29 December 2020 (UTC)[reply]
The problem with the partial inflection of elaborate predicates is not limited to this one. Having hundreds of character-long inflection (multi-)lines makes it harder to keep one's bearings when looking at the entry. Most of these elaborate predicate entries don't have usage notes discussing the range of variation. I wonder whether entering one of the variations other than inflection would even find our main entry for such a long predicate. And the ease of applying the template means that few are ever likely to consider whether the inflected forms occur other than rarely or even at all. In short I find these long inflection lines aesthetically ugly, distracting to users, likely to facilitate misleading impressions of the common usage of idioms, and unhelpful with respect to the actual range of variation. I could imagine a more helpful inflection line that showed possible or likely variations for each component term when the user hovered the cursor over the element shown in the lemma inflection line. This has the flavor of something that is added because it was (relatively) easy, not because there is a user-based need. DCDuring (talk) 17:39, 29 December 2020 (UTC)[reply]
@Benwing2: I have re-added the alternative past tense form “hunted where the ducks are”, with a quotation of it. J3133 (talk) 15:26, 29 December 2020 (UTC) Also “hunting where the ducks were”. J3133 (talk) 15:35, 29 December 2020 (UTC)[reply]
@Equinox: I believe such potential issues in general, regarding there being more than one possible variation of a particular inflection, are solved by the amazingly detailed documentation at Template:en-verb/documentation#Multiword_expressions_with_irregular_verbs, and I think the last three example sentences of that section really show that this is okay (and so, @Mihia, it seems that inflections to multi-word phrases are a normal thing on Wiktionary): (1) rock and roll (verb); (2) reap what one sows; and (3) know which side one's bread is buttered on. Turns out, each verb in the phrase can be inflected separately as needed, and that's not a problem. Just my two cents though, I'm relatively new here, I might be mistaken. Whether or not it is necessary, beats me, because somehow it is and also isn't. (Also, to be honest, I have no idea about there being a previous discussion about inflections to multi-word phrases. I would like to read that discussion if anyone still remembers where it is.) --Bismabrj (talk | contribs) 16:33, 29 December 2020 (UTC)[reply]

Reorganisation of Slavey (den)[edit]

I propose that either the languages North Slavey (scs) and South Slavey (xsl) be placed as descendants of Slavey (den) [This follows how e.g. Cree languages are classified as of now on Wiktionary], or - perhaps even better - that Slavey also be re-organized as a proto-language or language family (since obviously any lemma of Slavey is either North Slavey [i.e. Bearlake, Hare and Mountain] or South Slavey). The only current lemma of Slavey (teh) should be moved to either NS or SS teh- in any case, following the provided source, so that shouldn't be an issue. Thadh (talk) 18:07, 28 December 2020 (UTC) If anyone knows any active Athabaskan contributors, feel free to ping them on my behalf[reply]

It sounds like you're saying that Slavey, North Slavey, and South Slavey are not three distinct languages, but only two. In that case, either (1) Slavey should be deprecated as a language and instead be made a language family, or (2) North Slavey and South Slavey should be deprecated as full languages and instead be made regional dialects of Slavey (perhaps also etymology-only codes if necessary). For me, one important factor in making the decision is how similar NS and SS are orthographically. If we had entries for, say, all Swadesh-list words in both lects, what percentage of them would be spelled the same? If there's a large overlap, I'd be inclined to treat Slavey as a single language with two dialects. If there's relatively little overlap, I'd be inclined to treat Slavey as two languages. —Mahāgaja · talk 08:47, 30 December 2020 (UTC)[reply]
From what I understand, Slavey consists of four major dialects (Bearlake, Hare, Mountain and (South) Slavey), which in turn have significant internal variation. Rice (1989) states that the outer ends of the dialect chain may not be mutually intelligible, which is an important factor here. I don't see much difference per se between South Slavey and the other dialects, so I wouldn't oppose splitting the three Northern dialects. I don't see much merit in merging the dialects however, because the variation seems large enough (especially for more complex words than dene (man)) to create a far too great amount of alternative forms and/or synonyms (as is the case with e.g. North Frisian) to be productive. Thadh (talk) 12:45, 30 December 2020 (UTC)[reply]
That does indeed sound like an argument for changing den to a family and recognizing only scs and xsl, or even for deprecating scs and creating three ad hoc codes for Bearlake, Hare, and Mountain. However, since we apparently don't have any entries for North Slavey yet at all, that doesn't seem to be a particularly urgent priority. I notice that w:Slavey language mentions "Slavey proper" and distinguishes it from Bearlake, Hare, and Mountain, so I assume that "Slavey proper" means South Slavey? —Mahāgaja · talk 13:45, 30 December 2020 (UTC)[reply]
Slavey Proper is the same as South Slavey. I could add quite a lot of North Slavey lemmas following {{R:den:Rice:1989}}, but it may not be a good idea yet if the language is to be split. Furthermore, if I or anyone else am to add the etymologies and Proto-Athabaskan reconstructions, the question of Slavey being a family is more critical (for now, I have added it as if it were at Proto-Athabaskan *tuˑ, but that may need changing). Thadh (talk) 13:59, 30 December 2020 (UTC)[reply]
Speaking of the word for 'water', the only entry in CAT:Slavey lemmas is teh. Which dialect is that? It isn't listed at Proto-Athabaskan *tuˑ. As for whether Slavey is a family, is there any doubt that the NS and SS dialects are more closely related to each other than they are to other North Athabaskan languages? Or could this be a case like Northern Paiute and Southern Paiute, where the conventional English names are misleading because each have closer relatives than each other? —Mahāgaja · talk 15:08, 30 December 2020 (UTC)[reply]
NS and SS are undoubtedly closer to each other than to others, since they form a continuum and are often considered one language (namely Slavey).
As for teh, see my first message: no such word is mentioned in the presented source, rather the prefix given is teh- (related to water), shared by all dialects and used in multiple derivatives, such as Bearlake North Slavey tehwáa (mink) and South Slavey tehk’ái (muskrat). Thadh (talk) 16:44, 30 December 2020 (UTC)[reply]
Then I see no reason not to make den a family. —Mahāgaja · talk 20:55, 30 December 2020 (UTC)[reply]

Operation of "supermajority" voting rule

According to the vote at Wiktionary:Votes/2019-03/Defining_a_supermajority_for_passing_votes:

A vote passes if the ratio of supports to the sum of supports and opposes reaches 2/3 or more. A vote where that ratio does not reach 50% should be closed as "failed"; a vote that has at least 50% but less than 2/3 should be closed as "no consensus".

Suppose that we presently do X, but this is disputed, and, in fact, 15 people want to stop doing it, while 10 want to continue. Suppose we vote on whether to "Stop doing X". 15 supports, 10 opposes, no consensus, so no change, and we carry on doing it. Correct? On the other hand, suppose we vote on whether to "Continue doing X". 10 supports, 15 opposes, what happens then? Mihia (talk) 18:22, 28 December 2020 (UTC)[reply]
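
The arithmetic in the two hypothetical votes can be checked with a minimal sketch of the quoted rule. The function name `close_vote` and the handling of a vote with no supports or opposes are assumptions made for illustration; the rule as quoted covers neither. Exact fractions are used so that a ratio of exactly 2/3 is not lost to floating-point rounding.

```python
from fractions import Fraction


def close_vote(supports: int, opposes: int) -> str:
    """Classify a vote under the quoted supermajority rule.

    Hypothetical helper: the name and the zero-vote behaviour are
    assumptions, not part of the rule as voted on.
    """
    total = supports + opposes
    if total == 0:
        # Assumption: the quoted rule does not say what happens here.
        raise ValueError("no supports or opposes were cast")
    ratio = Fraction(supports, total)
    if ratio >= Fraction(2, 3):
        return "passed"
    if ratio >= Fraction(1, 2):
        return "no consensus"
    return "failed"


# "Stop doing X": 15 supports, 10 opposes, ratio 3/5
print(close_vote(15, 10))  # no consensus
# "Continue doing X": 10 supports, 15 opposes, ratio 2/5
print(close_vote(10, 15))  # failed
```

The same 25 voters with the same preferences produce "no consensus" under one wording and "failed" under the other, which is the asymmetry the question is about.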

I came across this question here already. Can't remember who asked. It was something about a "negative vote" or "reverse polarity vote." People like me would call out the person setting up the vote on it. We'd say the status quo must never be threatened by default. We'd even go as far as changing the wording of the vote. Additionally, who starts a vote to find out if everything should stay the same? — Dentonius 19:49, 28 December 2020 (UTC)[reply]
There is no rule that I am aware of to say that a vote must be of any particular polarity. If this was intended then it should have been stated as part of the supermajority rule. There is presently no basis that I am aware of on which to "call out" someone who, let's say, in the situation of dispute about doing X, supports doing X and creates a vote on "Continue to do X" in the hope of resolving the dispute by its passing. Unless someone can point out some aspect of this that I have missed, it seems clear to me that the "supermajority" rule is fundamentally flawed and needs to be revised (possibly so as to also somehow stipulate polarity of vote, or require a majority to change the status quo, if that was the original intention, though this could also create an issue of who decides what is the status quo.) Mihia (talk) 20:00, 28 December 2020 (UTC)[reply]
I don't think it's flawed. I think it's perfectly fine. It makes it harder for people to impose their will by forming small collectives. Besides, in your recently concluded vote, didn't I comment on this aspect of your vote? Didn't we correct it together? I'd say the system works. — Dentonius 20:06, 28 December 2020 (UTC)[reply]
With respect, I think you fail to see or understand the problem. Mihia (talk) 20:07, 28 December 2020 (UTC)[reply]
I think you're making up a problem that could easily be resolved by rewriting a vote before it goes live. You can, of course, create a bureaucratic vote to clarify this issue, but I don't really see the point of it. —Μετάknowledgediscuss/deeds 20:09, 28 December 2020 (UTC)[reply]
"resolved" to whose benefit? The person who wants to do X or the person who wants not to do X? I must repeat again the "unless I am missing something" caveat, but no, in the absence of that, I definitely am not making up a problem. The asymmetry of the supermajority rule, without any stipulation about the "direction" of "support" or "oppose", makes the rule purely nonsensical. Mihia (talk) 20:14, 28 December 2020 (UTC)[reply]
I understand what you're saying Mihia. But here are a few questions for you: (1) What kind of person would abuse it? (2) What would it take for such a thing to go unnoticed? (3) Would such a person even have support from the general community? — Dentonius 20:18, 28 December 2020 (UTC)[reply]
(1) It's not a question of "abuse". It is something that can happen even unwittingly, just on the vagaries of how someone phrases a vote. Look at my example. (2) Apparently not much, since the "supermajority" vote happened nearly two years ago, and no one has apparently noticed the "fatal flaw" until I did. Mihia (talk) 20:21, 28 December 2020 (UTC)[reply]
Dentonius is asking what it would take for the community not to notice that a vote was written with an unexpected polarity. —Μετάknowledgediscuss/deeds 20:31, 28 December 2020 (UTC)[reply]
Which vote in my original example is "unexpected polarity"? Mihia (talk) 20:38, 28 December 2020 (UTC)[reply]
The example is an abstract generic one. In other words, it is not an example. An example could be an attempt to formalize the lemming principle by something like “Do not delete terms that are included in at least three major dictionaries”. However, if such a vote were to be proposed in this negative form, it would politely be pointed out to the proposer that they should instead formulate a positive proposal to change the wording of WT:CFI. Rather obviously, I hope, the failure of the proposal as originally negatively formulated would not mean we should turn Wiktionary into a repository of obscure words hardly attested in any language by deleting all terms found in at least three major dictionaries. In general, a proposal to be voted on should have the form of “Effectuate some (concretely specified) change”, and not “Leave things as they are”. The failure of a proposal should always mean, “No change – leave things as they are”. --Lambiam 03:24, 30 December 2020 (UTC)[reply]
@Lambiam: Apologies for the delay in getting back to this thread. "Do not delete terms that are included in at least three major dictionaries" is not a case of my original example. In my example, X is something that we presently do. We do not presently "delete terms that are included in at least three major dictionaries". You say that the failure of a proposal should mean no change, and I agree that this is very presumably the intention or assumption of the supermajority rule, but this is unstated and unexplained, and it is by no means always as easy or obvious as you may be assuming to spot "wrong" polarity items, or even to know what "wrong" means, e.g. where there is no clear or consistent status quo (e.g. different people doing different things). If you prefer a concrete example, please see Wiktionary:Votes/2020-10/Use_of_"pronunciation_spelling"_label. Are the options in that vote written with the correct or incorrect polarity? Originally, Dentonius complained that the "continue doing what some people are already doing" polarity of some options was an attempt to game the supermajority rule (which it wasn't, not intentionally), but in the end we all voted on the "continue doing" options, and no one (else) said that there was any problem. Mihia (talk) 18:13, 28 February 2021 (UTC)[reply]
This, or a similar issue, has been discussed before and come up directly in at least one pair of competing polarity votes, namely on allowing-vs-banning entries for romanized Russian, where people who thought it was allowed felt a vote to ban it would need to pass and people who thought it was not allowed thought a vote to allow it would need to pass, and each thought the opposite vote was a "continue to do [allow / not allow] this thing we already do [allow / not allow]" vote. The people on the "losing" side of what they see as an attempt at rules-gaming could behave according to the fact that if we do already do X then the failure to get consensus to "continue doing X" is not per se a successful vote to "stop X" and hence does not prevent them continuing to do X; but as in the case of romanized Russian, the most likely time this will come up without being "fixed" during drafting is when there is disagreement over what the status quo is, what is currently in fact already being done / allowed / banned. In that case, no change to the rules to require that votes "propose a change rather than a continuation" would help, would it?, since the disagreement is over which thing is a change. (And there's no impartial, omniscient robot who knows and enforces all our rules, the people who enforce the rules are all the site's editors ... if enough think something already allowed or banned respectively, and behave accordingly, especially if they're longtime users / admins, then ... that's not necessarily solvable via more rules, is it?) - -sche (discuss) 23:24, 28 February 2021 (UTC)[reply]
Anyway, if it is agreed that we presently do X, I'd say failure of a vote to "continue doing X" is effectless, it's =/= passage of a vote to ban X. Compare how we had to explicitly pass Wiktionary:Votes/2020-02/De-sysop votes to pass by simple majority in order to desysop people by simple majority; a reason for that vote is that prior to it, if someone created a vote "Doe should continue being an admin" and it got 45% support and 55% oppose, it wouldn't have desysopped Doe. - -sche (discuss) 01:25, 1 March 2021 (UTC)[reply]
Apparently, "Doe should continue being an admin" should not be allowed as a proposal under the supermajority rule, since a "support" vote is not a vote to change the status quo. Nonetheless, in the "pronunciation spelling" example above, we did vote on a proposal that was effectively to continue doing what was already happening. The whole thing is highly unsatisfactory. At minimum it needs to be explicitly stated, along with the rule, that proposals must be worded so that "support" votes are votes to change the status quo. This leaves two other issues to possibly also be mentioned in the rules: what happens if editors cannot agree on what is the status quo, and what happens if there is no viable status quo (e.g. present situation is an inconsistent muddle). An example of the latter case would be where many people do X, and many people do Y, but X and Y are incompatible and mutually exclusive, where it is impossible to say which of "Do X, implying don't do Y" or "Do Y, implying don't do X" is of the "correct" polarity for the supermajority rule to reasonably apply. Mihia (talk) 18:23, 1 March 2021 (UTC)[reply]

Please take a look at this proposed vote. It aims to settle exactly what attestation requirements apply to appendix-only constructed languages, by setting a lower standard (one use) so that we don't have to lose that content, but can also avoid unused words. —Μετάknowledgediscuss/deeds 22:36, 30 December 2020 (UTC)[reply]

I would have suggested a different compromise between the criteria for WDLs and LDLs: either a mention in an official/authoritative source or three durable uses. The current proposal allows nonce creations singly attested on Usenet but prevents 'official' vocabulary from being included (note that "not durably used" does not equal "not used anywhere"). I don't think it is a good idea to exclude any of the 120+ official words of Toki Pona but include obscure fan coinages with fewer than three uses. I think a more restricted approach is okay in an extreme case like when a conlang's academy has published a dictionary of 10,000+ unused words, but I prefer to treat those as special cases. ←₰-→ Lingo Bingo Dingo (talk) 14:27, 31 December 2020 (UTC)[reply]
@Lingo Bingo Dingo: How would you recommend avoiding the "special cases" with your proposed rule? Can you find any official Toki Pona words that would be excluded? (I can't.) Why is 'official' vocabulary that's never used (i.e. nonce creations by a dictionary-writer) more valuable than fan coinages that are actually used? —Μετάknowledgediscuss/deeds 18:48, 31 December 2020 (UTC)[reply]
Heuristically it shouldn't be difficult to determine which languages have huge official wordlists; action can be taken on that basis (ideally by the editors active in a language).
Which Toki Pona words can be durably attested? When I search for '"akesi" toki pona' I don't see anything that seems durable.
Official vocabulary is more likely to be found and used by language users than a durably used coinage on a deserted newsgroup. The content of our appendices is somewhat odd stuff anyway (for instance there is one for unattested protologisms), so I don't mind it if the conlang appendices are less descriptive than the mainspace should be. Again, use and durable use are not identical. ←₰-→ Lingo Bingo Dingo (talk) 15:09, 1 January 2021 (UTC)[reply]
@Lingo Bingo Dingo: Re “Which Toki Pona words can be durably attested?”: Is Sonja Lang’s book Toki Pona: The Language of Good not durably archived? J3133 (talk) 16:00, 1 January 2021 (UTC)[reply]
@J3133 It probably is durable, but the uses should not be example sentences from a textbook. Use in longer stories like reading exercises is probably fine. ←₰-→ Lingo Bingo Dingo (talk) 20:01, 1 January 2021 (UTC)[reply]
@Lingo Bingo Dingo: The problem you raise with durable vs nondurable use is valid (and it's why we have e.g. English COLAtard but not some Twitter slang). Besides the exceptions that would need to be identified, a problem with your approach that has just occurred to me is the uncertainty in US law concerning whether conlangs can be copyrighted. If we copy a dictionary wholesale, this could be an issue, but if every word is supported by an actual use, I think we should be safe, though IANAL. —Μετάknowledgediscuss/deeds 20:32, 1 January 2021 (UTC)[reply]

Besides the problem of durability (which is, I think, broader than the conlangs problem and therefore a separate issue), there is the issue of who will track down quotes. It has been mentioned several times before that a great many if not most of the entries in some constructed languages may not survive RFV, but nobody bothers. So maybe instead of requiring the existence of durably archived quotes (as we do now), we should require those quotes to be present here on Wiktionary. That way there is no need for lengthy (and often interminable) RFVs, and entries without quotes could be deleted right away. (Although I guess some grace period should be allowed.) MuDavid 栘𩿠 (talk) 03:11, 4 January 2021 (UTC)[reply]

Switching from existence of cites to presence of cites would be a radical change in the way we do things here. If there is a problem with RFV, it is that relatively few people are willing to help out. I think the practice of sending entries to RFV rather than simply deleting them is fundamentally sound. —Μετάknowledgediscuss/deeds 04:15, 4 January 2021 (UTC)[reply]