Wiktionary:Beer parlour




Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives
2002
December
2003
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019



February 2019

Proposal: Separate namespace for entries in Category:Chinese terms written in foreign scripts

The main issue: Various discussions from the past:

Chinese loanwords written in foreign scripts were originally limited to technical terms such as α粒子 (ā'ěrfā lìzǐ) and σ鍵/σ键 (xīgémǎ-jiàn), but the advent of globalization has introduced terms such as 卡拉OK (kǎlā'ōukèi), NG (ēnjī) and man#Chinese into the Chinese language. Many of these are in colloquial use but appear to be unregulated.

As a dictionary that aims to describe all words of all languages, it would be useful to include such entries, particularly entries such as fighting#Chinese, which have a different meaning from what one would usually expect.

However, of late, this has turned into a rather contentious issue. The main arguments were: (1) Chinese terms should be written in Chinese script, not foreign script; (2) Chinese terms written entirely in foreign scripts are code-switched. KevinUp (talk) 02:48, 1 February 2019 (UTC)

Prelude: It seems that KTV#Chinese, which had passed RFV in 2014, was recently removed from Wiktionary for being "not Chinese". I wish to point out that KTV#Chinese was among the 39 pioneer entries listed in 现代汉语词典 (Xiandai Hanyu Cidian, 3rd edition, 1996) under its appendix for lemmas that begin with the Latin script (西文字母开头词语).

The following was listed after the definition for KTV:
K,指卡拉OK;TV,英 television 的缩写  ―  Kēi, zhǐ kǎlāOK; TV, yīng television de suōxiě.  ―  K refers to karaoke while TV is an abbreviation of English television.
I'm not sure whether "K (kēi)" is an abbreviation of Chinese 卡拉OK (kǎlā'ōukèi) or Japanese カラオケ (karaoke), but it should be consistent with the "K" used in 唱K (chàngkèi). Note that we already have an entry for K#Chinese.

On the other hand, the appendix for lemmas that begin with the Latin script in 现代汉语词典 (Xiandai Hanyu Cidian) has expanded from 39 entries (3rd edition, 1996) to 239 entries in the 6th edition (2012). Of the 239 entries, 226 entries were capitalized, while only 7 entries (e化 (yìhuà), e-mail, hi-fi, pH值, Tel, vs, Wi-Fi) were not fully capitalized (the remaining 6 entries contained Greek α, β, γ). For comparison, the original 39 entries found in the 1996 edition are listed below: KevinUp (talk) 02:48, 1 February 2019 (UTC)

Of these, I found that the following entries were not found in the 6th edition (2012)

The volatile nature of such entries (note the removal of Internet#Chinese in the 6th edition) prompted me to come up with the following proposal:

Proposed solution: A separate appendix will be created for Chinese loanwords (外來語/外来语 (wàiláiyǔ)) that are written, either partially or fully, in foreign script. These "entries" will have full etymology, pronunciation, etc., similar to what we have for English snowclones such as "X is the new Y" and "have X, will travel", which are listed in a separate appendix. The {{zh-see}} template will then be used to redirect entries such as 卡拉OK#Chinese to a separate namespace such as Appendix:Chinese terms written in foreign scripts/卡拉OK or Appendix:Foreign words used in Chinese/卡拉OK; the choice of name is up to the community to decide. KevinUp (talk) 02:48, 1 February 2019 (UTC)

Comments

I don't like the idea of moving these to an Appendix; the Appendix has poor findability. I stand by what I wrote in the 2018 BP thread. —Suzukaze-c 03:13, 1 February 2019 (UTC)

(I am partial towards User:Fay Freak's idea of adding "code-switching quotes". —Suzukaze-c 03:27, 1 February 2019 (UTC))
I agree with Suzukaze-c for the most part. Also, putting all these words of varying degrees of acceptance into Chinese (e.g. 卡拉OK vs. part-time) in the appendix seems to sweep everything under the rug and would not be dealing with the core of the issue. — justin(r)leung { (t...) | c=› } 03:30, 1 February 2019 (UTC)
It seems that "Appendix:Snowclones/X is the new Y" has the proper categorization (Category:English lemmas, Category:English phrases). The only difference is the title of the page looks different. Yes, it's a bit hard to search for "X is the new Y", but for Chinese entries we'll use {{zh-see}} so it is still searchable.
The reason for moving such entries into an appendix is simple: Chinese entries are generally not written using foreign scripts, unless it is a transliteration. This does not solve the issue of whether or not an entry is part of code-switching. For me, code-switching holds true for overseas communities, but most of the people in mainland China do not speak much or any English at all. KevinUp (talk) 04:05, 1 February 2019 (UTC)
Why attach so much significance to whether an entry is in the appendix namespace? Are ordinary users supposed to somehow know what that means? As you say the categories are the same. DTLHS (talk) 04:07, 1 February 2019 (UTC)
One reason is that we want people to view Wiktionary as a serious project. It does feel awkward to have iPhone#Chinese among Danish, French, Portuguese, Spanish, etc. Another reason is that some of these lemmas come and go, e.g. Internet#Chinese, which was found in the 3rd edition (1996) of 现代汉语词典 but removed in the 6th edition (2012). If we have a separate namespace we can better monitor such entries. I'd like to mention that KTV#Chinese was recently removed without any formal discussion (despite passing RFV in 2014), and the etymology of KTV#English contains errors (MTV = Movie TV?). KevinUp (talk) 04:27, 1 February 2019 (UTC)
I say add usage notes to all entries of this type explaining the situation or just link to an explanatory page, which explains how the situation is controversial. 'iPhone' is used by Chinese people all the time. I don't care if it is considered Chinese or not, but there's no reason for Wiktionary to ignore the fact that Chinese people use that word in amongst Chinese speech in just the same kind of way that 'taco' is used freely in English. --Geographyinitiative (talk) 04:54, 1 February 2019 (UTC)
Turns out some scholars disagree with the inclusion of such entries in w:Xiandai Hanyu Cidian#Controversies. See also news report here.

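《人民日报》高级记者傅振国说:“《现汉》第6版在‘正文’中收录了英语缩略词等词汇之后,等于将汉语汉字的标准规范擅自改变为英语等外语可以进入汉语,英文可以代替汉字。”
Translation: Fu Zhenguo, a senior reporter at People's Daily, said: "By including English abbreviations and similar vocabulary in the 'main text', the 6th edition of Xiandai Hanyu Cidian has in effect arbitrarily changed the standard norms of the Chinese language and Chinese characters, so that English and other foreign languages may enter Chinese and English may replace Chinese characters."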

KevinUp (talk) 12:38, 1 February 2019 (UTC)
It would be serious enough if there were “code-switching quotes”, foreign language quotes in English sections displayed as “code-switching ▼” instead of “quotations ▼”; in the citation namespace perhaps we change the {{citations}} template so the wording is not “English citations” but “Citations for English”, and perhaps with an extra parameter for subsections like “Citations for English [x] in [language y]”. The findability is good. One cannot expect anyway that a term in a text one searches is found on Wiktionary in the language of the text. If I have a word in Slovenian a section in Serbo-Croatian will usually help, and Persians make use of the Arabic sections for Persian texts. If something is in Latin script in a Chinese text, one expects people to search it as English more than as Chinese. So you editors just need to get over the novelty of this view.
It also works trans-script btw – instead of ridiculous Sanskrit-as-English entries we can put quotes for the terms used in esoteric English texts on the citation pages of the Sanskrit entries. This is like Serbo-Croatian entries in Cyrillic can contain quotes in Latin script since one intends to have mirrored Latin and Cyrillic entries, and like one does not always quote every alternative form on its own page when it would be more useful to centralize to showcase the meaning for example and when variants depend on readings of manuscripts (zancha actually contains a quote for zanca, for culullus the readings are all uncertain …), and Azerbaijani is now using {{spelling of}} like on اۆز‎, and if you quote from audio-records you can’t quote spellings anyway – but I might be too liberal here and you don’t make this second step though assenting to this code-switching quoting. Point is editors need to open their minds for the first step. Sweeping normal quotes into the appendix is caitiff. Fay Freak (talk) 13:35, 1 February 2019 (UTC)
I think this would be a better way of dealing with the current situation (to have foreign language quotes in English sections displayed as “code-switching ▼” and “Citations for English [x] in [language y]”). This is usually encountered for proper nouns (personal names, placenames, etc). The Vietnamese Wikipedia often uses the original Latin spelling without converting it to the Vietnamese alphabet. KevinUp (talk) 21:05, 1 February 2019 (UTC)

What about creating a dummy "language"/language code/language header for cross-linguistic terms, that is, terms that are used in a given language but don't really belong to that language? We already have "und", which displays as "undetermined". We even have entries: see Category:Undetermined language. We would have to set some ground rules so that we wouldn't be basically duplicating our coverage of a term for every language that might use it in running text, and we would still have to weed out translingual terms and genuine borrowings. Figuring out what to do about script support might be tricky, though. Chuck Entz (talk) 05:07, 1 February 2019 (UTC)

The scope of this is a bit too wide. We're currently looking at Chinese loanwords that retained part of their foreign script and whether or not such entries can be considered Chinese lemmas. KevinUp (talk) 12:38, 1 February 2019 (UTC)
If there are two languages to ascribe a word to, its language is no more “undetermined” than the etymology of a word is “unknown” ({{unknown}}) if we have two etymologies we are not sure how to choose between. It’s not undetermined, it is underdetermined, a thing that usually isn’t a problem in language. ΖΩΑΠΑΝ is an example of what a word of undetermined language is. If a word is sorted as of undetermined language or an etymology as unknown, there is hope that at some point the language is determined or, respectively, the etymology is resolved (we even categorize pages with und language links). For the state of things was known somewhen to someone. In the code-switching examples it is no issue to leave it unresolved. They have arisen in a state of ambiguity. Fay Freak (talk) 13:47, 1 February 2019 (UTC)

Another solution

As suggested by User:Fay Freak, I think the inclusion of foreign language quotes in English sections which are displayed as “code-switching ▼” instead of “quotations ▼” as well as “Citations for English [x] in [language y]” in the Citations namespace would be a better solution. @Justinrleung, Suzukaze-c, any further comments? KevinUp (talk) 03:42, 2 February 2019 (UTC)

I totally oppose including a quotation in which someone uses one English word in what is otherwise a Chinese sentence under an ==English== section, no matter the formatting. If the sense used in the quotation exists in English, use English quotations to cite it; if the sense doesn't exist in English, then it shouldn't be in an ==English== section and it also strongly suggests the string does deserve a ==Chinese== (or whatever) section. (I would make exceptions for extinct languages attested in Greek manuscripts and things of that sort. But using code-switching to attest, or provide as quotations to illustrate the use of, a WDL? No.) - -sche (discuss) 23:24, 2 February 2019 (UTC)
From a wholesome view, this is language and thus to be covered. It is an extra one can have. I do not deem it likely that Wiktionary can overflow with code-switching quotes in any fashion that could significantly give offence. Note that one has the offence already. One has the quotes already but sorted in a crude fashion. This is only about showing the quotes as what they are: Multilingual text. Fay Freak (talk) 22:56, 3 February 2019 (UTC)
Yeah, some users might oppose having foreign language quotations, because it feels weird to see Greek or Chinese text popping up within an English entry. Perhaps a new template similar to {{seemoreCites}} in the English section for code-switched/foreign language quotes would be more appropriate. KevinUp (talk) 04:35, 4 February 2019 (UTC)
Why add code-switching quotations at all? Our mission is to define all words, not to record all quotations. We can cite and define English words using English quotations. AFAICT the only reason to bring up a Chinese- or German- or whatever- language quotation in which one "English" word has been embedded, is if the "English" word (or sense) can't be attested via English quotations . . . in which case, shoehorning it into an ==English== section in any fashion is wrong, on a basic WT:CFI level, and furthermore strongly suggests the word is in fact a word in the language of the surrounding quotation. - -sche (discuss) 05:26, 4 February 2019 (UTC)
Yes, it's not our mission to record such quotations. If we were to create a section for "code-switched quotations", this will have to be restricted to lemmas written in a nonnative script. The main reason for having this section is to prevent entries such as iPhone#Chinese or iPhone#Vietnamese from appearing. KevinUp (talk)
Why/how would preventing/banning iPhone#Chinese require adding code-switching quotations? Anyone who runs across a code-switching quotation and wants to know what any of the words in it means can look up each of them, and find Chinese entries (with Chinese quotations) defining the Chinese words and iPhone#English (with English quotations) defining the English word, enlightening them on the meaning of all of the words in the quote. I don't see why the code-switching quote itself would need to be recorded. - -sche (discuss) 06:29, 4 February 2019 (UTC)
Good point. The thing is, there's currently a loophole in our system. It is not impossible to find quotations for APP#Chinese, の#Chinese or iPhone#Chinese (no quotations yet), but are these words considered actual lemmas in their respective languages? What can we do to prevent users from creating such entries? Guidelines are needed to identify whether quotations provided to lemmas written in a nonnative or nonstandard script qualify as code-switching. KevinUp (talk) 07:41, 4 February 2019 (UTC)
Consider also Talk:hiam. I also used google:"ai swee mai mia" and google:"ai sui mai mia" as the basis for creating 愛媠莫命, but really I could not find usage of the same phrase in Chinese characters, excluding Standard Chinese calques. —Suzukaze-c 06:11, 4 February 2019 (UTC)
Interesting. Now we're looking at Min Nan terms code-switched into English. I disagree with the creation of hiam#English. If it's Singlish or Singaporean English, it should at least be found here: http://eresources.nlb.gov.sg/newspapers/ (Singapore newspaper archive). I think it's Min Nan code-switched into English, because non-Min Nan speakers might not be able to catch its meaning (Singapore is a fairly diverse society).
As for 愛媠莫命/爱媠莫命, I think we can create POJ entries such as ài-súi-mài-miā rather than poorly transcribed "ai swee mai mia" or "ai sui mai mia". Hokkien is often transcribed without tone marks by the locals, but when read it's still pronounced exactly like Hokkien. KevinUp (talk) 07:41, 4 February 2019 (UTC)

Citations namespace

This seems perfect for the citations namespace instead of an appendix. It's linked automatically from the entry, and there's no need to specifically label it as a particular language. DTLHS (talk) 23:31, 2 February 2019 (UTC)

The important lexicographic information concerning the Chinese entry, like pronunciation or the measure word it takes, cannot be placed on the citations page, so there would be a loss of information. Also, citations pages are standardly labelled by language using {{citations}}. —Μετάknowledgediscuss/deeds 19:04, 3 February 2019 (UTC)
Yeah but foreign terms which haven’t passed are arbitrarily adapted to the sound system of a language. This has also been shown for “APP” in Chinese, and is well-known of code-switching in general: As is that when multilinguals switch languages it is not unusual to take over the pronunciation of one language when one is in the other, and even conscious speech assumed there lack standards. Even the pronunciation of “passed” words is rather arbitrary, dependent on educational background and also intentionally ridiculized.
I have suggested the citation namespace, but with proper earmarking of multilingual quotes. Fay Freak (talk) 22:44, 3 February 2019 (UTC)
Support the use of the citations namespace for entries such as APP#Chinese, which have the exact same definition as the corresponding English entry (app). As for pronunciation, there isn't a proper guideline for that. Xiandai Hanyu Cidian states the following in its appendix for lemmas that begin with the Latin script:
在漢語中西文字母一般是按西文的音讀的,這裡就不用漢語拼音標注讀音。 [MSC, trad.]
在汉语中西文字母一般是按西文的音读的,这里就不用汉语拼音标注读音。 [MSC, simp.]
Zài hànyǔ zhōng xīwén zìmǔ yībān shì àn xīwén de yīn dú de, zhèlǐ jiù bùyòng hànyǔpīnyīn biāozhù dúyīn. [Pinyin]
Translation: In Chinese, Western letters are generally read based on their pronunciation in the Western language. Here, there is no need to mark the pronunciation in Hanyu Pinyin.
I hope we can make a decision on this soon, i.e. (1) entries in Category:Chinese terms written in foreign scripts such as size#Chinese, part-time#Chinese, iPhone#Chinese, which have the same meaning as in English, are to be moved to the citations namespace; (2) only entries such as man#Chinese, fighting#Chinese, NG#Chinese, which have a meaning different from what one would expect from the usual English definition, are to be included in the Chinese section. KevinUp (talk) 04:25, 4 February 2019 (UTC)
The loss of pronunciation information in entries such as size#Chinese (Cantonese: saai1 si2) is regrettable, but that information belongs to 晒士/嘥士, not size#Chinese. Until a lemma has been properly lemmatized into Han script (e.g. cheese → 芝士 (zhīshì)), its pronunciation is often unclear and varies from speaker to speaker.
Then again, I have no idea why APP#Chinese is pronounced like an initialism in mainland China, but I think this information can be included in the usage notes of APP#English instead. KevinUp (talk) 04:25, 4 February 2019 (UTC)

A specific low memory template for compounds of Japanese kanji, Korean hanja, Vietnamese Han characters

The page for 水 (shuǐ) is currently out of Lua memory. Even after memory-consuming templates such as {{Han etym}} were removed, the same problem persisted. This may have something to do with Module:columns. I think we may need to rely on an older version, such as Module:columns/old. I found that {{der-top3}} uses less memory compared to {{der3}}.

Also, a few months back, a user was confused by the many derived terms in the Japanese section (Wiktionary:Tea_room/2018/August#者,_difference_between_derived_terms_under_Kanji_vs._under_suffix?), so whatever template used for compounds or derived terms needs to have a customizable title ({{der3}} doesn't have a title anymore). KevinUp (talk) 02:48, 1 February 2019 (UTC)

@KevinUp:

Test title

I just made a der3 with a custom title right here. Or was there a discussion about deprecating it ASAP? mellohi! (僕の乖離) 19:00, 1 February 2019 (UTC)

Yes, the customized title of {{der3}} has been deprecated. Prior discussion can be found at Wiktionary:Beer parlour/2018/November#Titles of morphological relations templates. Take a look at wine#Derived terms. The unboxed title looks out of place. KevinUp (talk) 21:06, 1 February 2019 (UTC)
Ah, you weren't referring to the unboxed titles. Speaks of me being out of the loop. mellohi! (僕の乖離) 21:09, 1 February 2019 (UTC)
No worries. Basically, the Lua memory used for {{zh-der}} (Chinese compounds) and {{der-top3}} (JKV compounds) needs to be reduced. KevinUp (talk) 03:42, 2 February 2019 (UTC)

Update: 水 seems to have enough memory now. @Erutuon, do you know which template/module was using up the memory? I just realized {{der-top3}} does not use Lua memory. KevinUp (talk) 04:49, 4 February 2019 (UTC)

Update 2: By using {{der-top3}} instead of {{der3}} and substituting {{ja-r}} with {{ja-r/multi}} and {{ja-r/args}}, Lua memory in 水 is now reduced to 44.59 MB. KevinUp (talk) 09:24, 11 March 2019 (UTC)

Kanji compounds for Japanese given names

Previous discussion: Wiktionary:Tea room/2018/August, User talk:Shāntián Tàiláng#Given name request

I'm interested to know what the community thinks about the creation of kanji compounds such as 亜実利 that are only used in given names. There are up to 148 possible kanji combinations listed at あみり. Are we going to create entries for all of these? Readings for Japanese given names (known as nanori) are often arbitrary and there are no strict rules on which kanji to use.

When I look at pages such as Category:Japanese terms spelled with 実 read as み, most of the entries appear to be given names. To isolate actual kanji compounds, one would have to search for incategory:"Japanese terms spelled with 実 read as み" -incategory:"Japanese proper nouns" intitle:実 [1] to obtain the 14 entries of 実 with reading み that are not proper nouns.
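
For anyone who wants to repeat this kind of search for other kanji and readings, here is a minimal sketch of how the query could be assembled. It assumes Python, and it assumes that the search page is reachable at https://en.wiktionary.org/w/index.php with a search parameter; the function names are made up for illustration and are not part of any Wiktionary tool.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed location of the search page; adjust if this does not match the wiki.
SEARCH_BASE = "https://en.wiktionary.org/w/index.php"


def build_query(include_category: str, exclude_category: str, title_part: str) -> str:
    """Combine the three filters used above into one search string."""
    return (
        f'incategory:"{include_category}" '
        f'-incategory:"{exclude_category}" '
        f"intitle:{title_part}"
    )


def build_search_url(query: str) -> str:
    """Percent-encode the search string and attach it to the assumed base URL."""
    return f"{SEARCH_BASE}?{urlencode({'search': query})}"


query = build_query(
    "Japanese terms spelled with 実 read as み",
    "Japanese proper nouns",
    "実",
)
url = build_search_url(query)

# Round-trip check: decoding the URL should give back the original search string.
decoded = parse_qs(urlparse(url).query)["search"][0]

print(query)
print(url)
```

Swapping in another kanji and reading only requires changing the three arguments to build_query; the number of results still has to be read off the search page itself.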

We could perhaps include only the top 5000 kanji compounds used for given names. I think listing the possible kanji forms for Japanese given names at hiragana pages such as あみり is good enough. To find out how to pronounce a person's name written in kanji, we could just use the search box, or check the nanori readings listed at individual kanji pages.

On an unrelated note, most South Koreans still use hanja for their given/personal names, but we don't have any entries for hanja given names. There are only 3 entries in Category:Korean given names, compared to the 6229 entries we have in Category:Japanese given names, while Category:Chinese given names was recently deleted, because most Chinese given names are sum of parts, and any combination is possible as long as it's not a lewd word.

So the question is, should we continue to create such entries, or should we limit this to something like the 5000 most popular kanji compounds (I have no idea where to find this)? KevinUp (talk) 03:42, 2 February 2019 (UTC)

For me, all Japanese given names should be lemmatized at the hiragana form, with the kanji spellings being soft redirects to the hiragana lemmas, where an exhaustive list of kanji spellings can be added. Subsequently, Category:Japanese given names and all its subcategories should be purged of any kanji spellings of given names, leaving only the hiragana lemmas left. mellohi! (僕の乖離) 04:45, 2 February 2019 (UTC)

Pinging @Eirikr, Poketalker, Dine2016, 荒巻モロゾフ, Suzukaze-c over here for their thoughts. mellohi! (僕の乖離) 04:45, 2 February 2019 (UTC)

At the very least, it might be necessary to separate out entries for personal names, which can be created without limit. Any that have no actual usage should be deleted.--荒巻モロゾフ (talk) 06:30, 2 February 2019 (UTC)
My only opinion is that we should not rely on the EDICT names dictionary, and should do at least the minimum effort to make sure that a name or its kanji spelling is actually used. —Suzukaze-c 23:44, 4 February 2019 (UTC)
I’d like to rename all Japanese first name entries to hiragana. One can very freely choose kanji for a given pronunciation. — TAKASUGI Shinji (talk) 00:44, 5 February 2019 (UTC)
In response to the original question, speaking as a beginner reader of Japanese, I would like as many kanji personal "first" names and family names as possible to be look-uppable. Sometimes for beginners it may not even be clear that a kanji compound is a personal name. Generally, if I see incongruous characters, e.g. for topographical features or "beautiful flower"-type meanings, then I tend to guess that a personal name is meant, but sometimes it is not obvious for beginners. Mihia (talk) 01:18, 15 February 2019 (UTC)
In Japanese texts, especially those that are beginner friendly, given names are often written with the suffix さん (-san). In addition, Japanese personal names (full names) often consist of four to five character kanji compounds (usually two characters for the surname, followed by two or three characters for the given name), so it is not that hard to identify a Japanese personal name while reading more advanced texts. I don't think it is practical to have as many kanji personal "first" names as possible, because many different kanji variations are possible for the same Japanese given name written in hiragana. Family names, on the other hand, are fixed when it comes to kanji choice, due to strict rules in the koseki system, so kanji compounds for family names can be included as they fulfil our attestation requirements. KevinUp (talk) 03:05, 17 February 2019 (UTC)
Erm, well, thanks, I know さん, and that names are character compounds! The thing is that these characters usually have literal meanings too. When these are obviously incongruous to the subject matter it is not too bad, but this isn't always the case. Also there are no capital letters to help, of course. Mihia (talk) 00:04, 20 February 2019 (UTC)

Taxonomic names in individual languages

In Dutch, but I'm sure also in other languages, there are terms for taxonomic clades as well as members of them, that are different from the scientific/Latin-based translingual names. For example, the normal term for Felidae is katachtigen and for Mustelidae it's marterachtigen. These are plural forms of nouns, and the singulars katachtige and marterachtige refer to individuals of these groups.

There doesn't seem to be any kind of category tree for such names currently. We have a big set of categories for the translingual taxonomic names, but they don't seem to have equivalents in other languages, only translingual. Given that both the group and its members are part of a single lemma in Dutch, how should these be categorised? Only marterachtigen refers to a group, but it's not a lemma, so it shouldn't have any categories. Should the lemma have something like {{lb|nl|in the plural}} ''[[Mustelidae]]'' as a second definition? —Rua (mew) 13:27, 2 February 2019 (UTC)

In English, too, a mustelid is a member of the Mustelidae, and the whole group could be referred to by the plural of that word. But if you say (and I tend to agree) the plural isn't a lemma—if its use to refer to the Mustelidae isn't so lexical it needs to be given as a definition on the page marterachtigen / mustelids—why isn't it sufficient to define the singular as "a member of the family Mustelidae"? Is it not comparable to how "humans" in the plural can mean "humanity" / "humankind", but we probably don't need to add a sense to "human" (or "humans") that says "(in the plural) Humanity / humankind", or a sense at "elf" for "(in the plural) elfkind", "dog" "(in the plural) dogkind", etc? As far as categorization, what would you suggest would be needed beyond putting marterachtige in Category:nl:Mustelids the way mustelid is in Category:en:Mustelids? - -sche (discuss) 17:04, 2 February 2019 (UTC)
I don't really think mustelid should be in Category:en:Mustelids, based on WT:Beer parlour/2018/December#Should set-type categories also contain their namesake?. But that aside, we have a lot of categories specific to taxonomic names, but only in Translingual, not in any specific language. My question was more related to whether we should replicate this structure in all languages that have terms referring to species/taxonomic groupings (like English and Dutch, as you showed). That is, should mustelid be in a to-be-created Category:en:Taxonomic names or Category:en:Taxonomic names (family), the way that Mustelidae already is? —Rua (mew) 18:23, 2 February 2019 (UTC)
Aha, I see what you mean. Hmm...if marterachtige(n) / mustelid(s) is categorized as Category:foo:Taxonomic names (family), would witbandgierzwaluw and black swallow-wort be categorized as Category:foo:Taxonomic names (species)? Would birds / vogels be categorized as a taxonomic name for a class? And then, would cohosh also be in Category:en:Taxonomic names (species) although it refers to two species? I guess I'm not opposed to that, though the birds/vogels (clearly just a common name/word) and cohosh (ambiguous / two species) examples seem like evidence these aren't truly taxonomic (unambiguous) names. (It seems related to the question of whether mul taxonomic names can have translations, to which the de jure answer may be no but the de facto answer, looking at Navajo, for example, is yes. On that note, I suppose marterachtigen and mustelids should be added to Mustelidae#Translations...) - -sche (discuss) 23:17, 2 February 2019 (UTC)
Perhaps languages written in non-Latin scripts can give answers. How are taxonomic names rendered in Russian or Chinese? I cannot read Chinese, but w:zh:鼬科 is the interwiki for w:Mustelidae and it has a name in Chinese characters, with the 学名 (scientific name) given after it in Latin letters. w:ru:Куньи is likewise in Russian, and gives the scientific name but labels it "Latin". Would "scientific name" and "taxonomic name" be the same thing? What is the term for native-language equivalents of taxonomic names, like 鼬科 (鼬科) and куньи (kunʹi)? Should we give them their own categories or just place them in the regular lifeform set categories? —Rua (mew) 21:07, 3 February 2019 (UTC)
“Would ‘scientific name’ and ‘taxonomic name’ be the same thing?” In the context of discussing a taxon: yes. See 学名 and scientific name.  --Lambiam 02:11, 4 February 2019 (UTC)

Format of custom header text in new {{der4}}

@Erutuon: Can you please change the formatting of the custom header text in the new {{der4}}? It has the same bold text as Derived terms and this does not make it obvious to readers that the multiple tables under Derived terms still belong to this section. They seem like separate sections. I tried to get used to it but every time I see it, I find it confusing. See fárad. I'd prefer text in italics and parentheses, with closing colon, e.g. (Compound words): Thanks. Panda10 (talk) 20:07, 3 February 2019 (UTC)

A change like that needs consensus, though admittedly I didn't get input on what the header text in {{der4}} and similar templates should look like when I chose the style. But you can change it to the style you propose just by adding the following to your common.css:
.term-list-header {
	font-style: italic;
	font-weight: inherit; /* remove this line if you would like the header to still be bolded */
}
.term-list-header:before {
	content: "(";
}
.term-list-header:after {
	content: "):";
}
Eru·tuon 20:40, 3 February 2019 (UTC)
@Erutuon: I really appreciate the script but I'm not sure if modifying my common.css is the correct solution. I think it is better to see the entries as a Wiktionary reader would see it. Panda10 (talk) 21:46, 3 February 2019 (UTC)
@Panda10: Well, okay. I agree that the current style is confusing. I don't like the combination of parentheses and colon myself, but if others like it, I can implement it. In the meantime I should probably make the header not use inline CSS though. — Eru·tuon 22:01, 3 February 2019 (UTC)
Why not just hard-format it in a more satisfactory way? Let a thousand flowers bloom and then pick from among them. DCDuring (talk) 00:22, 4 February 2019 (UTC)
Well anyway, DTLHS added the CSS a few days ago. — Eru·tuon 23:38, 9 February 2019 (UTC)
  • Because of the extremely poor presentation of the modified templates ({{der3}}, {{der4}} etc.) I have stopped using them completely. I now use {{der-top3}} for new conversions; it's not perfect but the presentation is far better. Progress, huh? DonnanZ (talk) 17:28, 10 February 2019 (UTC)
I agree. I'm also using {{der-top3}} for Han character entries. At least it doesn't use any Lua memory. KevinUp (talk) 08:42, 11 February 2019 (UTC)
I'm kind of disappointed, but can't blame you. The new layout doesn't look very good, and it's annoying to have the toggle button at the bottom, because you can't collapse the list when you're reading through a page. I am open to ideas for improvement. I am not great at graphic design or whatever this is. It would be nice to at least bring it to the level where nobody hates it so much that they can't bear to use it. — Eru·tuon 22:56, 11 February 2019 (UTC)

Usage of kanji in Ryukyuan languages besides Okinawan[edit]

Unfortunately, is once again out of Lua memory, even though it was working yesterday. I would like to know whether the following languages: (1) Miyako, (2) Northern Amami-Oshima, (3) Oki-No-Erabu, (4) Southern Amami-Oshima, (5) Yonaguni, (6) Yoron, are actually written using kanji (historical or modern times) by native speakers.

The sections at appears to have been added by the following two users: Special:Diff/25636005/25750073. Should these languages be lemmatized using kana instead of kanji? KevinUp (talk) 11:46, 5 February 2019 (UTC)

(are they written by native speakers at all, in the first place? 🤔 —Suzukaze-c 07:03, 6 February 2019 (UTC))
I don't think so. The entry for 海豚 even has (7) Kikai and (8) Kunigami, added by User:Nibiko in this 2016 edit. Some of these languages appear to have test wikis at Wikimedia Incubator, but I'm not sure about the script used. KevinUp (talk) 10:24, 6 February 2019 (UTC)
They use kanji to write the lyrics of their traditional songs (examples: [2][3][4][5]). Note that those spellings are not necessarily phonologically strict, and are not linked to the convenience spellings prepared by researchers. Modern Ryukyuan languages don't have any officially defined orthographies.--荒巻モロゾフ (talk) 14:54, 11 February 2019 (UTC)
Since these appear to be used in modern times, I think we can use {{nonstandard spelling of}} or {{nonstandard form of|phonetic spelling in hiragana or katakana}} at the definition lines of kanji entries for Ryukyuan languages that are not Okinawan. Meanwhile, kanji forms copied from online dictionaries that lack attestation will be removed. KevinUp (talk) 07:24, 23 February 2019 (UTC)

2018 ISO code changes[edit]

The changes the ISO made to codes in 2018 were posted. They:

  • split and retired ais Nataoran Amis, merging the Amis part into ami "Amis" and creating a new code szy for Sakizaya (commentary).
    Done. - -sche (discuss) 16:13, 16 March 2019 (UTC)
  • merged asd Asas into snz Sinsauru (Sensauru), and renamed it Kou (alternative spelling: Kow), on solid grounds.
    Done. - -sche (discuss) 18:25, 9 February 2019 (UTC)
  • split dud Hun-Saare into uth ut-Hun and uss us-Saare.
  • retired lba Lui as spurious, citing Wikipedia, which cites that ISO document. But ISO cites other sources too, so it's not just citogenesis.
    Done - -sche (discuss) 07:46, 23 February 2019 (UTC)
  • merged llo Klor / Khlor into ngt Kriang, saying: "In January, 2018 I happened to be sitting next to a man from Sekong province. I asked him about Klor and to my shock and his, he reported that he himself was Klor. He confirmed that it is pronounced [klɔːr] with no aspiration and that the langauge is spoken only in Ko' [kɔʔ] village. He reported that Klor is completely mutually intellibile[sic] with Kriang and that he considers the Klor to be Kriang. We counted to ten together and it was indeed the same as Kriang. This leads me to propose that Khlor [llo] be retired and Klor (note spelling difference) be added as a dialect of Kriang [ngt]."
  • merged myd Maramba into aog Angoram.
    Done - -sche (discuss) 07:44, 23 February 2019 (UTC)
  • retired myi, which we already retired.
  • merged nns Ningye into nbr, which they renamed Numana (from Numana-Nunku-Gbantu-Numbu).
    Done - -sche (discuss) 00:08, 24 February 2019 (UTC)

They also added codes: xsj Subi (a lect previously merged with Shubi; we merged Shubi into Rwanda-Rundi, but Subi is said to not be closely related and only often associated by confusion), lvi Lavi (which we currently encode as mkh-law) (done), lsv Sivia Sign Language, cey Ekai Chin (WP prefers just "Ekai"); the Australian languages wkr Keerray-Woorroong, tjj Tjungundji, and tjp Tjupany, about which see WP; pnd Mpinda, lsn Tibetan Sign Language, and tvx Taivoan (Taivuan). If anyone has a reason we should not follow suit on these code deprecations and creations, please speak up. (They also made a number of name changes we could look into.) - -sche (discuss) 06:43, 6 February 2019 (UTC)

Thanks, @-sche. --{{victar|talk}} 07:03, 6 February 2019 (UTC)

Tocharian B[edit]

The entries in Category:Tocharian B lemmas are all written in the Latin script. Is this correct? SemperBlotto (talk) 07:31, 8 February 2019 (UTC)

They were written in the w:Tocharian alphabet (also see https://www.unicode.org/L2/L2015/15236-tocharian.pdf) and in the w:Manichaean alphabet. —Stephen (Talk) 08:45, 8 February 2019 (UTC)
We cannot write them differently until Unicode encodes the Tocharian script; we have a similar situation with Sogdian and Old Uyghur. Crom daba (talk) 20:07, 8 February 2019 (UTC)
@Crom daba, SemperBlotto: Sogdian was already added to Unicode 11.0, as was Manichaean (back in 7.0), so technically you could be creating Tocharian B entries in Manichaean when attested. But yes, alas, Tocharian has yet to be encoded. --{{victar|talk}} 21:08, 9 February 2019 (UTC)

Use of the term "West Frisian"[edit]

On Wiktionary, the Frisian language as spoken in the Netherlands is always referred to as "West Frisian". This is its usual name outside the Netherlands, contrasting with East Frisian and North Frisian. However, in Dutch "West-Fries" usually refers to a dialect spoken in the province of North Holland. This variety is Dutch, not Frisian, but is called "West-Fries" because it is spoken in the historical region of West-Friesland.
So far, so good. We all know what is meant by it, and we usually don't add words from Dutch dialects. However, there is an actual variety of Frisian that is extinct today but was spoken in (pockets of) West-Friesland until about 1700. Not much has survived of this language, but I would love to add those words that have. But to do so, we would have to settle on names. I can't call these entries "West Frisian", since that name is already in use for the living language that us Dutchpeople call "Westerlauwers Frisian". My proposal would be to adopt Dutch terminology: rename all existing West Frisian lemmas to "Westerlauwers Frisian" and reserve the name "West Frisian" for this language. I admit it would be cumbersome, but at least it would be unambiguous. What do you think? Steinbach (talk) 12:46, 11 February 2019 (UTC)

Another option would be use a geographical description like "Noord-Holland" or the historical term "Noorderkwartier". ←₰-→ Lingo Bingo Dingo (talk) 15:52, 11 February 2019 (UTC)
Do linguists treat this West-Frisia Frisian as a separate language from Westerlauwers Frisian? — Ungoliant (falai) 17:12, 11 February 2019 (UTC)
That's a difficult question. As you know, linguists tend to stay away from the arbitrary distinction between "language" and "dialect". The two varieties were clearly distinct, however. A 17th century Frisian poem could with certainty be identified as being from North Holland, not Friesland, by its text alone. At least one defining feature that sets Westerlauwers Frisian apart from East Frisian, the words sa and ta rather than so and to, did not occur in West Frisia Frisian. Some innovations relative to Old Frisian are shared, some aren't. In combination with the geographical and political separation, a solid case can be made to treat the two varieties as separate languages. Steinbach (talk) 20:24, 11 February 2019 (UTC)
@Steinbach Could you give a pointer to literature about this Frisian lect? The proposal is to move away from the usual terminology in English, so it would be useful to see how others deal with it. ←₰-→ Lingo Bingo Dingo (talk) 08:02, 12 February 2019 (UTC)
Give me some time. I'm not in that stage myself, I've been inspired to this proposal by an article in mainstream press. For the time being, here's a link to the sole surviving longer text in this dialect. It can give you an impression of how it differs from Westerlauwers Frisian. Steinbach (talk) 08:19, 12 February 2019 (UTC)
What I understand from the article is that the language of this sole surviving text of 331 words (160 different words) was known to be a quaint variant of Westlauwers Frisian, and has about a year ago been identified as being specifically a North-Holland variant (not quite surprising, seeing as it is one of the song texts in a collection titled d'Amsteldamsche Minne-zuchjens). Interesting, but hardly a reason to upset the Frisian language classification. And redefining “West Frisian” to mean neither West Frisian Dutch nor the West Frisian language as the term is commonly understood by linguists, but to reserve it for this variant, will be utterly confusing. Just like guv has a label {{lb|en|British}}, we can use some label like {{lb|fy|North Holland variant}} for words found only in this variant.  --Lambiam 13:25, 12 February 2019 (UTC)
This article (in an issue of De Vrije Fries from 1906) discusses possible printing errors in the text – apparently not considering the possibility that the language may be a variant of West Frisian. (BTW, pejeer may be an attempt to render pear.)  --Lambiam 13:42, 12 February 2019 (UTC)
I agree, setting apart a new language code goes too far for this. Any words can be included as obsolete West Frisian. ←₰-→ Lingo Bingo Dingo (talk) 08:09, 13 February 2019 (UTC)
It is definitely more than obsolete West Frisian. It differed greatly from seventeenth-century Westerlauwers Frisian, too. The work of Gysbert Japiks already looks rather similar to modern day Frisian, something that can't be said of this text. Steinbach (talk) 09:00, 13 February 2019 (UTC)
I wouldn't claim it was merely obsolete Westlauwers, just that it should be included under West Frisian and labelled as obsolete, in addition to a geographical tag. So something like {{lb|fy|North Holland|obsolete}} should in my view do the trick. I am also curious about the extent that the similarity of Japicx's Middle Frisian to modern-day West Frisian is due to his orthography influencing later orthography. ←₰-→ Lingo Bingo Dingo (talk) 13:44, 13 February 2019 (UTC)
Technical discussions aside, that's a hilarious poem. Soo molle bolle Femke! — Mnemosientje (t · c) 14:59, 12 February 2019 (UTC)
I suppose that is one way of dating it to the seventeenth century, beside the title page and the spelling. Maybe nice for use on Valentine's Day? ←₰-→ Lingo Bingo Dingo (talk) 08:09, 13 February 2019 (UTC)

English-based creoles of Suriname[edit]

The Surinamese creole languages Sranan Tongo, Aukan and Saramaccan currently do not have any ancestors recognised by Wiktionary's classification. For Sranan and Aukan it is uncontroversial that these are English-based creoles (some consider Saramaccan a Portuguese-based creole instead); they in many ways resemble Guyanese Creole (which also has no ancestor languages in the categorisation) and Jamaican Creole (which is recognised as a descendant of English). Several scholars also posit a common creole ancestor to those varieties. Implementing that latter view might go too far now, but it seems a good idea to at least enable Sranan Tongo and Aukan to have terms as inherited from English. ←₰-→ Lingo Bingo Dingo (talk) 15:47, 11 February 2019 (UTC)

(Note the prior discussion at Talk:dofu.)
Adding English as an ancestor of Sranan at least seems sensible per our earlier discussion, the other languages I can't well judge. But if Aukan is so obviously English-based, then I personally don't see the harm. — Mnemosientje (t · c) 15:55, 11 February 2019 (UTC)
My feeling is that it is misleading to say that a word like wroko was inherited from English. It suggests that some branch of the English language tree evolved in some way so as to morph into Sranan. But these words were incorporated into the creole language as it was crystallizing out of a pidgin that was not a language in the usual sense of that term but an unstable mishmosh varying from plantation to plantation. For English to be an ancestor, there should be intermediate versions of the language that are closer to English than modern Sranan is, while also closer to modern Sranan than English is.  --Lambiam 21:46, 11 February 2019 (UTC)
I too disagree that we should be marking lexifiers as ancestors of creoles; {{der}} is the best template to use. See this discussion, among others. —Μετάknowledgediscuss/deeds 21:59, 11 February 2019 (UTC)

Another matter is whether the Surinamese creoles should be linked in a similar way. Aukan is generally considered to descend from (very) Early Sranan or Proto-Sranan, and the same is often considered for Saramaccan. ←₰-→ Lingo Bingo Dingo (talk) 15:50, 11 February 2019 (UTC)

The book Pidgins and Creoles: An Introduction contains a chapter on Sranan in which the authors write: “As far as the shared histories of [the Atlantic group of English-based creole languages] are concerned, we may point to such aspects as the common supplier of the vast majority of the imported slaves — the Dutch, and the history of colonization, whereby a new colony was founded by groups from one or more existing colonies. Surinam, for instance, was first settled from Barbados, St. Kitts, Nevis and Montserrat. In this way [Sranan] is linked to the other Caribbean English-based creoles. [...] Within this group Sranan belongs to a clearly defined Surinam subgroup. This subgroup can be demonstrated in historical linguistic terms (with languages Sranan, Ndjuka-Aluku-Paramaccan-Kwinti, Saramaccan-Matawai). Outside this subgroup Sranan has a particular relationship with Krio, and other similar languages on the West African coast, as well as with the Maroon Spirit Language of Jamaica (Bilby 1983).” (Ndjuka is another name for Aukan.) Unfortunately, Google does not allow me to view most of the section entitled “History and current status”.  --Lambiam 12:43, 12 February 2019 (UTC)
You can find it on Library Genesis. Not sure if we're allowed to link it here. Some other comments of interest:

So we cannot say that Sranan (the major English-lexifier creole of Surinam; see chapter 18) derives in any gradual fashion from Early Modern English – its most obvious immediate historical precursor. [...] we are dealing with two completely different forms of speech. There is no conceivable way that Early Modern English could have developed into the very different Sranan in the available 70 or so years. [...] So creole languages are different from ordinary languages in that we can say that they came into existence at some point in time. [...] we have to reckon with a break in the natural development of the language [...] The parents of the first speakers of Sranan were not English speakers at all, but speakers of various African languages, and what is more important, they did not grow up in an environment where English was the norm.

From the section you couldn't see on the Google preview, here are some comments of interest:

The origins of Sranan (see also chapters 2 and 10) must be sought in the seventeenth century. Surinam started its post-Amerindian history as an English colony in 1651. The period of English occupation only lasted officially until 1667. English influence can be considered to have become negligible by 1680. So the period in which the direct linguistic influence of English can be assumed to have been operative was less than thirty years.[...] How precisely English functioned in the development of Sranan is highly controversial. In for instance the bioprogram hypothesis of Bickerton (see chapter 11), English lexical items and language universals combined to produce Sranan. In the substrate approach the African language(s) of the early slaves had a decisive influence (chapter 9).

From this, I have to agree with Metaknowledge that perhaps a simple {{der}} may be best, at least in the case of Sranan. — Mnemosientje (t · c) 14:50, 12 February 2019 (UTC)
Thanks. I didn’t know about LG. Not having access to a research library, it looks like a useful addition to my research tools.  --Lambiam 00:00, 13 February 2019 (UTC)
Yes, in that case {{der}} also looks like the best choice for Aukan terms deriving from seventeenth century English. ←₰-→ Lingo Bingo Dingo (talk) 07:57, 13 February 2019 (UTC)
Short note to participants in this discussion (@Lingo Bingo Dingo, Lambiam, Metaknowledge, HansRompel): I have created WT:About Sranan Tongo as a draft (including a note about the use of etymology templates); perhaps some of you might be interested to improve it. I think it might be useful to add a note about Sranan orthography. I have noticed variant spellings being a thing but don't know enough to add a useful description of the issue on the think tank page. — Mnemosientje (t · c) 10:15, 24 February 2019 (UTC)
@Lambiam, Mnemosientje, HansRompel Do you all agree with the use of {{bor}} or {{borrowing}} for terms that were taken from Dutch without an intermediary? ←₰-→ Lingo Bingo Dingo (talk) 14:24, 5 March 2019 (UTC)
Fine with me. But what do we do with terms taken from Portuguese like kaba and pikin? I think these, as well as terms originating from African languages, like bakra and fodu, should use {{der}} just like terms in early Sranan coming from the English lexicon.  --Lambiam 20:30, 5 March 2019 (UTC)
@Lambiam I'd be all right with that change. In many cases the exact trajectory for loans from Portuguese seems hard to establish, apparently some scholars suppose that there was a Portuguese-based pidgin or creole involved at some point. As for African languages, I'm not entirely sure. In some (but perhaps not all) cases the terms must have been directly borrowed into the creoles via native speakers, right? ←₰-→ Lingo Bingo Dingo (talk) 13:21, 6 March 2019 (UTC)
@LBD – There is a striking commonality of words found in Atlantic creoles that originate from Portuguese. It is generally acknowledged that a possible and even plausible explanation is that they had been swirling around in the mishmosh of Atlantic pidgins developed along the Atlantic coast of Africa already generations before the slave trade, and that they travelled with the slave ships to the Americas. For all we know the same may hold for words in Atlantic creoles originating from African languages, which also display such commonality, like Sranan Tongo fodu ~ Haitian Creole vodou and Sranan Tongo bakra ~ Gullah buckra. It is furthermore generally assumed that a new creole language is forged by the first generation born into a society that has no shared language and communicates through a pidgin. It is therefore (in my opinion) entirely possible that the words stemming from African languages were copied from the pidgin of the older generations, but not necessarily via a native speaker of the original language. If we were to confer language status to the Atlantic pidgin mishmosh (something we definitely shouldn’t do) we could use {{inh}} and {{bor}} in describing a two-stage scenario: term in [creole language] inherited from Atlantic mishmosh, borrowed into Atlantic mishmosh from [African language].  --Lambiam 19:21, 8 March 2019 (UTC)

Blocker role[edit]

What are people's thoughts on creating a blocker role so that non-sysops can issue short-term blocks to be reviewed later by an admin? --{{victar|talk}} 21:59, 12 February 2019 (UTC)

What would be the point of "reviewing" them if they were short? Most blocks are short anyway. What action would a "reviewer" take? What happens if they aren't reviewed? Why would an admin want to review blocks that they could have done themselves? Symbol oppose vote.svg Oppose. DTLHS (talk) 22:30, 12 February 2019 (UTC)
@DTLHS: The point would be to have more users to catch vandals in the act. If we don't need more people to do that, why are we even having admin votes for that role? You could think of them more like blocker-bots, and the second they're doing a poor job of it, you decommission them and take away the role. --{{victar|talk}} 02:16, 13 February 2019 (UTC)
I have no problem with giving more people vandalism fighting abilities. My main issue is with the "reviewing" that probably wouldn't happen. DTLHS (talk) 02:44, 13 February 2019 (UTC)
That's fair. At the very least, the blocker role users could issue their block and request a perma block when needed. --{{victar|talk}} 02:55, 13 February 2019 (UTC)
Also oppose; if there was actually a time that no admin was active, there are emergency options (stewards). In my mind this is the function of admins which is most "powerful", so anyone I would want having this ability I would be happy to have as an admin. - TheDaveRoss 00:12, 13 February 2019 (UTC)
@TheDaveRoss: You yourself were questioning the quality of admins we have these days. In this way, we can have people stopping vandals in their tracks, while still holding admins to a higher standard. --{{victar|talk}} 02:16, 13 February 2019 (UTC)
I am saying I would hold the blockers to the same standard that I hold admins. - TheDaveRoss 02:53, 13 February 2019 (UTC)
Which, IMO, ensures sub-quality admins and not enough vandal blockers. --{{victar|talk}} 03:02, 13 February 2019 (UTC)
Do you feel that vandals frequently go unblocked for long periods of time? In my experience blocks happen within minutes if not seconds of vandalism taking place. If there are times that vandals are able to persist for longer periods I would be interested to hear about that, since it would be happening while I am unaware. My opinion remains unchanged about the need for a distinct role, I would not vote to approve anyone as a blocker if I would not also vote to approve them as an admin. My bar to become an admin is fairly low, I would guess I have voted yes in well above 90% of admin votes in which I have voted at all. - TheDaveRoss 15:19, 13 February 2019 (UTC)
@TheDaveRoss: I don't really feel any one way about it, but you see the conflict in the statements "judgement [...] has been a problem with existing admins of late" and "my bar to become an admin is fairly low", right, haha? --{{victar|talk}} 04:48, 14 February 2019 (UTC)
@Victar: Just because the bar is low doesn't mean everyone clears it. I think it is fairly easy to be civil, e.g., which is one of my criteria for voting yes on an admin vote, and yet there are some current admins who place very little value (seemingly) on civility. Most admins (and other editors, and proposed admins) easily demonstrate the level of civility that I hope for in an admin. I think I have a similar viewpoint about judgment and the other criteria which I value, most people easily surpass my expectations, some few fall short. I don't see a conflict with having a low bar and not always determining that everyone clears it. - TheDaveRoss 13:38, 14 February 2019 (UTC)
My feeling is that the need for this arises from not having admins in a certain time zone. "No admins are awake, but 4chan is attacking us, and creating a zillion stupid pages! Luckily, a blocker is here!" This raises the issues that (i) it really just means you don't have enough admins, or not a wide enough geographical spread of admins, and (ii) even if you had a special "blocker" role it would be susceptible to the exact same issue that maybe all blockers are asleep too. Equinox 02:51, 13 February 2019 (UTC)
Anyway oppose because it's easy to be whitelisted (by creating 100 entries in some under-loved language) but a lot harder to get admin status, and the ability to stop people from editing is a very significant and powerful one. (Mostly unrelated thought: what if admin responsibilities included dealing with x-percent of untouched anon edits in your language? Sometimes I find stuff I did two months ago not logged in that still hasn't been reviewed.) Equinox 05:53, 13 February 2019 (UTC)

"Eskimos have 50 words for snow"[edit]

https://popula.com/2019/02/11/white-words/ Justin (koavf)TCM 07:39, 13 February 2019 (UTC)

David Robson (January 14, 2013), “There really are 50 Eskimo words for ‘snow’”, in The Washington Post[6], The Washington Post. The article originally appeared in The New Scientist of 18 December 2012 under the title “Are there really 50 Eskimo words for snow?”. Instead of 50 you also find other numbers like 40, 52 and even 100, so “Eskimos have X words for snow” is a snowclone.  --Lambiam 09:25, 13 February 2019 (UTC)
Eskimos have 50 snowclones. —Justin (koavf)TCM 01:47, 14 February 2019 (UTC)

Ideophones as ur-language[edit]

https://aeon.co/essays/in-the-beginning-was-the-word-and-the-word-was-embodied Justin (koavf)TCM 07:57, 13 February 2019 (UTC)

On crafting scientific language in Zulu[edit]

https://www.theopennotebook.com/2019/02/12/decolonizing-science-writing-in-south-africa/ Justin (koavf)TCM 01:48, 14 February 2019 (UTC)

Text's here; both sources aren't durably archived, thus no sources for WT (would be nicer if the article appeared in print). --Brown*Toad (talk) 07:32, 17 February 2019 (UTC)

Layout of "of" qualifiers[edit]

I see "of" qualifiers written in two different ways, as these definitions, respectively from wet and fast, illustrate:

  1. Of weather or a time period: rainy.
  2. (of photographic film) More sensitive to light than average.

Is the second style preferred? Should the first style generally be converted to the second style when encountered? Mihia (talk) 20:36, 14 February 2019 (UTC)

I think the second is much more common. It also makes some sense to mark such qualification for searches. {{lb}} with of within the label serves as a fairly natural marker. DCDuring (talk) 21:54, 14 February 2019 (UTC)

Proto-Bantu Verbs[edit]

Currently, all Proto-Bantu verb entries have the default suffix -a. However, I think it would be better if this suffix were removed from the lemma forms of PB verbs, as it's not part of the verb root, and not all Bantu languages make use of this suffix. Smashhoof2 (talk) 06:24, 15 February 2019 (UTC)

We've taken the route of trying to reconstruct what PB actually looked like (so putting the final vowel on verbs, putting noun class prefixes on nouns), which is contrary to the BLR style, which just shows lexical roots. I don't know what's better, but reconstructing words rather than roots is more in keeping with Wiktionary being a dictionary and attempting to treat languages similarly when possible. —Μετάknowledgediscuss/deeds 19:23, 16 February 2019 (UTC)
That's fair. Smashhoof (talk) 21:32, 16 February 2019 (UTC)

Stub entries and minimum required content[edit]

My talk page contains a post to the effect that there exist some additional requirements for minimum content of entries that I am unaware of. Such requirements can be created if desired, so let's have an amicable conversation about it.

My understanding of minimum content of an entry is as follows. The entry needs:

  • 1) Language header
  • 2) Part of speech header
  • 3) Somewhat controversially, a definition, translation or, for non-lemma entries, the required content for a definition line. I say controversially since some people thought that it would be a good idea to create many definitionless entries, but there was no consensus either way, from what I remember. Furthermore, a dump analysis can show that nearly all English Wiktionary lemma entries have a definition line with a definition or a translation.

The above seems consistent with WT:EL#A very simple example except that the example speaks of references, which are demonstrably lacking in an overwhelming majority of en wikt entries.

I am not aware of any further requirements on minimum entry content. In particular, as far as I know, there is no requirement on provision of pronunciation and inflection. During my time of contribution of Czech entries to the English Wiktionary, I mostly avoided entering pronunciation and inflection, focusing rather on semantics.

What do you think? Should there be increased requirements on minimum content beyond the three items above? Should such requirements be specified on a per-language basis? If so, should the decision be delegated to a small group of editors of a particular language, say 3 editors if there are no more? Thus, should the English Wiktionary be split into small oligarchies rather than there being One English Wiktionary?

--Dan Polansky (talk) 19:20, 16 February 2019 (UTC)

We don't need a legalistic framework or "small oligarchies". Dan, nobody I can think of wants to institute strict rules about what entries need to have at minimum. We were just asking you to put in a slight amount of effort, like putting in the gender of a noun when the very dictionary you're referencing gives the gender, or even just using a template like {{be-noun}} rather than {{head|be|noun}}. That's it. —Μετάknowledgediscuss/deeds 19:21, 16 February 2019 (UTC)
On my talk page, it says Russian entries need to 1) include the accent, [...], 3) include the declension or conjugation, and 4) include the pronunciation. I ask the editors if they would be so kind as to indicate whether they want to establish minimum content above my three listed items. --Dan Polansky (talk) 19:23, 16 February 2019 (UTC)
One of WF's tricks is to create rfdef entries with nothing but a lazy quotation from the sports news, sometimes SoP. But the worst ones I can remember were the dozens of [name_of_country] Sign Language entries with no definition and often not meeting CFI. I don't think it's been a big enough problem to need policy yet (in English anyway). Equinox 19:31, 16 February 2019 (UTC)
By the way, I was one of those vehemently opposing volume creation of definitionless entries. Semantics is the life and soul of a dictionary, by my lights. --Dan Polansky (talk) 19:39, 16 February 2019 (UTC)
I think this whole thing has gotten seriously out of hand. Meta maybe came across as a bit officious, but it was a reasonable request- as a request. Dan interpreted it as more of an order, and got defensive- after which it escalated. There are legitimate issues about burdening editors in specific languages with fixing up terms that they wouldn't have created themselves and hijacking their priorities- but that's a matter of courtesy, and far too complex to reduce to rules. We've all created entries that needed work by others, and the dictionary would be a fraction of what it is now without that. We need empathy and consideration, not arguments and battles- it's too easy to drive away good editors over such things. Chuck Entz (talk) 22:13, 16 February 2019 (UTC)
You see, in User_talk:Dan Polansky/2018#κλινικός, I received the following order from Metaknowledge: "do not create entries in languages you do not know and have not studied". I think interpreting communications from the same contributor on the same subject as orders in disguise is pretty reasonable. But this thread is about policy, not about me in particular, and is merely triggered by certain posts on my talk page. The key question is, shall small subcommunities be able to increase the requirements for minimum entries per language, and therefore, should the English Wiktionary be understood as a collection of oligarchies, small ruling groups? --Dan Polansky (talk) 08:04, 17 February 2019 (UTC)
The key is cooperation. You shouldn't refuse to do it and say the entries are fine as they are if they are not fine for editors of that language. You can simply add {{rfinfl}} and {{attention}}. The request for a higher standard of entries based on existing entries is legitimate, even if it gets harder to meet when the minimum level of quality of entries is already high. You can make simpler stubs for languages with little content, but you can still mark them with {{attention}} so that other editors can at least find entries that require attention. As for Russian entries, they take more effort, knowledge and time, but it's not as if Russian inflections and genders are unavailable. Russian is not a poorly documented language. But since it can also be error-prone, editors with less knowledge of a language shouldn't be completely discouraged from editing; they are asked to mark their entries as incomplete. Everybody does it. I did too for languages I wasn't confident in and when I knew what was required, for example, languages with complex scripts. It's strange that you vehemently opposed definitionless entries with {{rfdef}} for otherwise great entries for high frequency words. It often takes much less effort to add a definition than to reformat headers and add inflections. --Anatoli T. (обсудить/вклад) 09:07, 17 February 2019 (UTC)
The subject of this thread is minimum content, not marking. To address your subject (out of scope of this thread) of marking entries with {{attention}}: no such marking is required since if there is consensus that entries with {{head|ru|noun}} need to be in a convenient category, {{head}} can be instructed to place such entries into a maintenance category automatically. Czech entries without inflection are not marked with {{attention}} and as far as I know, such marking is not a common practice for most languages, and I can make a dump analysis to check the actual facts; "everybody does it" is easily verified to be false. Here again, the general question that I saw no clear answer to so far is, should small groups make up their own rules for other editors to follow? --Dan Polansky (talk) 09:19, 17 February 2019 (UTC)
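The idea of collecting entries that use the generic {{head|ru|noun}} form mechanically, rather than by hand-added {{attention}} tags, can be illustrated with a minimal sketch. It is not how {{head}} or its module works; the input format (a mapping of page title to wikitext) and the function names are assumptions made for illustration only.

```python
import re

# Matches the generic headword template, e.g. {{head|ru|noun}} or {{head|ru|noun|g=f}}.
# Group 1 is the language code, group 2 the part of speech.
GENERIC_HEAD = re.compile(r"\{\{head\|([^|{}]+)\|([^|{}]+)")

def needs_maintenance(wikitext, lang):
    """Return True if the wikitext uses the generic {{head}} template for `lang`."""
    return any(m.group(1).strip() == lang for m in GENERIC_HEAD.finditer(wikitext))

def maintenance_list(entries, lang):
    """List titles whose wikitext uses generic {{head}} for `lang`.

    `entries` is a hypothetical mapping of page title -> wikitext.
    """
    return sorted(title for title, text in entries.items()
                  if needs_maintenance(text, lang))
```

A list produced this way would play the role of the maintenance category discussed above, without asking each contributor to tag entries manually.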

You said “I want my undivided attention to be channeled toward making sure that the semantic information I am entering is correct” but I deny that using the appropriate templates excludes it. The templates all have the same names and you can even take care of it after you have added the glosses. And of course for Russian the stress is one of the main reasons why one consults the dictionary, so unless one has no information about the pattern because it is a kind of archaic word included nowhere, one can already give the complete information in the headword and in the table, the latter being important because Benwing’s bot creates the non-lemma forms and users like to look into the tables.
About adding pronunciations: For English it is not easily predictable, so editors don’t add it because they don’t know it (English is the only example for “irregular” in orthographic depth). In most other languages the pronunciation sections indeed have the character of clutter that we only add because we have unlimited room, but the stress mark in the headword or declension table is what you need to know the pronunciation already, except in a case like со́лнце (sólnce), which you could guess wrongly even if you know the stress is on the beginning but have not heard the word. For languages like Arabic and Aramaic, where multiple pronunciations can be on the same page, I am for not adding IPA pronunciations because it only makes the layout complicated without adding information (because, as I said, the full vocalization or transcription gives all the information already), and indeed I create the pages faster and with a better overview if I do not add the pronunciation, so I think they distract me and the reader. For Russian, perhaps the bot can add IPA pronunciations since the со́лнце (sólnce) cases should all be included already.
Just ask yourself what the reader would like to know from Wiktionary: it is the stress pattern and the gender for the languages that have them, and the meaning. Even then, if you have the pattern, you most likely already have the gender in Slavic languages, and it is only one letter, if only you know it, so the demands are really low. The existence of links to other dictionaries is a bad argument for omitting stresses and patterns, since copying over the stresses and patterns is what you should do, and for the languages in question many web searches can confirm them. “Accuracy combined with verification” does not stop you from telling people what you already know. Also add surface derivations if you have reasonable ideas of them; otherwise others have to add them.
BTW {{be-noun}} is an incomplete wrapper of {{head}}; sometimes when I used it I had to fall back on {{head}}, because it does not support |m= / |f= (Wiktionary:Grease pit/2018/October § Missed masculine and feminine counterpart parameters in some headword templates). Fay Freak (talk) 13:48, 17 February 2019 (UTC)

Dan, my main concern is that you work *with* the main contributors in a given language. Overall, I completely agree with what Atitarev said. This is not a matter of enforcing rules but of (a) keeping up the overall quality of Wiktionary by attempting to follow the example of existing entries, and of (b) maintaining harmonious relationships with others. In this case, if you had tried to figure out the prevailing structure and templates of a Russian entry, and found it too complex, and instead inserted {{attention}} or {{rfinfl}} or a similar request template, I'd have no problem with this. But you seem to have made no such attempt, and in general appear to show little interest in working with others or maintaining consistency. If everyone did this, the whole project would descend into chaos. Benwing2 (talk) 19:59, 17 February 2019 (UTC)
Is it your position that a Russian noun entry must contain pronunciation and inflection as a minimum, or is it not your position? I am puzzled. --Dan Polansky (talk) 20:26, 17 February 2019 (UTC)
As for what I am doing, which is out of scope of this thread per its title, I am interested in using the generic tool for setting up an entry, which is {{head}}, since I am basically a little like a slow-moving Tbot working with a plethora of languages, using general human intelligence to verify semantics in applicable sources. Since I work in so many languages, I am not interested in learning any template peculiarities that various language groups may have set up. I need the minimum entries as places to attach verification artifacts and further reading goodies, which happen to be the same thing. That much must be pretty clear to anyone who saw my recent batch of contributions. I am not acting out of malice or disregard for the wishes of particular groups, but my enterprise can only work economically if I can work with generics, or non-demanding templates such as {{be-noun}}, which I am now starting to use. I am absolutely not interested in pronunciation or inflection. I am no worse than Tbot, and in fact, I am better in multiple ways: Tbot checked in other Wiktionaries whereas I am checking in external sources even for entries that not a single other Wiktionary has, and I do human checking of semantics, not just checking for existence. I will run out of gumption pretty soon, I guess, and return to creating Czech entries; my best hope is that other editors will pick up the work, including new editors. --Dan Polansky (talk) 20:48, 17 February 2019 (UTC)
@Dan Polansky: I would discourage you from using {{ru-noun}} and language-specific templates because this can produce incorrect results (a wrongly detected gender, animacy or stress/inflection pattern, since many things are automated) without your knowledge. It also requires a correct word stress. All we ask for is adding maintenance templates, so that appropriate editors can bring the entry to the required standard. When I said "everybody does it", I meant everybody who is asked to do it. E.g. people know that Chinese entries require traditional and simplified forms. What if you don't know? You need to ask people who know. --Anatoli T. (обсудить/вклад) 22:38, 17 February 2019 (UTC)
I would add that Tbot added a maintenance category to every entry it created, so that others would know to go back and check on it- in that respect, these current entries are inferior to Tbot's. Chuck Entz (talk) 01:35, 18 February 2019 (UTC)
@Dan Polansky Thank you for adding the {{attention}} template to трактористка. That allowed me to find it and fix it up. Benwing2 (talk) 16:30, 18 February 2019 (UTC)
I can keep adding {{attention}} to Russian entries, but a proper (economical, systematic) way to address this need would be to teach {{head}} to put entries using {{head|ru|noun}} into a maintenance category. Otherwise, you will need to keep asking newcomers to do something that machines can do. As for Tbot, it was only a bot doing no real verification, so it had to mark the entries with a reader-visible template, indicating a significant risk of inaccuracy; I am doing human verification and take credit for accuracy.
On another note, I came up with a motto: Make yolk and hub and skip all fluff. I maintain that pronunciation and inflection are fluff, things insubstantial, an order of magnitude less important than definitions and translations. Let us do the following thought experiment: you want to learn Serbo-Croatian and you are offered three dictionaries; which one do you buy?
  • Dictionary 1 has 100 000 lemma entries with pronunciation and inflection and no semantic information.
  • Dictionary 2 has 5 000 lemma entries with definitions or translations, and pronunciation and inflection.
  • Dictionary 3 has 20 000 lemma entries with definitions or translations but no pronunciation and inflection.
Which one do you buy to learn Serbo-Croatian? The thought experiment explains why I oppose creation of definitionless lemma entries in volumes; I admit that creating definitionless lemma entries in small volumes (up to 1000?) can be useful in that people can fill in the definitions/translations in reasonably short time. --Dan Polansky (talk) 07:35, 23 February 2019 (UTC)

Module:la-headword

There is an (AFAICT) undiscussed removal of valuable information going on, leading to incomplete and incomprehensible head lines. --Hamator (talk) 11:47, 17 February 2019 (UTC)

Classical Malay?

I changed the meaning of the worklang= param in {{quote-book}} etc. Formerly it took either a single language code or an arbitrary string like "French and Latin" or "Classical Malay". I changed it so it takes one or more comma-separated language codes, but doesn't allow arbitrary text. I fixed up all the resulting errors except for two, which are in -kah and -kan, which have quotations in Classical Malay, for which we don't have any language code. Could someone add this? I'm not sure if it should be an etymology-only language or a proper language in its own respect. (And what about Old Malay?) Benwing2 (talk) 19:15, 17 February 2019 (UTC)
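The new behaviour described above (one or more comma-separated language codes, no arbitrary text) can be sketched roughly as follows. This is an illustration only, not the actual template or module code; the function name and the small set of known codes are assumptions, and the real list of codes lives in Wiktionary's language data.

```python
# Hypothetical subset of recognized language codes, for illustration only.
KNOWN_CODES = {"en", "fr", "la", "ms", "sco"}

def parse_worklang(value):
    """Split a worklang value into language codes, rejecting arbitrary text."""
    codes = [part.strip() for part in value.split(",")]
    unknown = [code for code in codes if code not in KNOWN_CODES]
    if unknown:
        # Free text such as "French and Latin" or "Classical Malay" ends up here.
        raise ValueError("unknown language code(s): " + ", ".join(unknown))
    return codes
```

Under this sketch, "fr, la" is accepted, while "Classical Malay" is rejected until a code for it exists, which is the situation in -kah and -kan.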

Thank you for bringing this up. A few months ago I brought up a similar suggestion at Wiktionary:Beer parlour/2018/September#Suggested outcome. Currently, Classical Malay (14th to 18th century) and Old Malay (7th to 14th century) do not have proper language codes defined for them. However, because there is a lack of effort to digitize texts from Classical Malay (written in Jawi script) and Old Malay (written in Pallava script or Rencong script) in their original scripts, I think we can wait for ISO 639 to define proper language codes for these languages.
Currently, only two Classical Malay works are available on Wikisource: Hikayat Hang Tuah and Hikayat Bayan Budiman. Modern transcriptions of Classical Malay works are often written in the Latin script, so it is slightly problematic to figure out its original orthography in the absence of an original manuscript.
By the way, Classical Malay and Old Malay are the missing links between the Proto-Malayic language and the modern Malay language. KevinUp (talk) 22:36, 18 February 2019 (UTC)
I have removed the |worklang parameter in -kah and -kan because the texts have been translated and modified to suit readers proficient in modern Malay, rather than transcribed word-for-word based on the original manuscript. KevinUp (talk) 22:36, 18 February 2019 (UTC)

Should I suppress the "(please add an English translation of this quote)" message for Scots?

Many of the Scots quotations given are so close to English that they are readily understandable without any "translation". Example:

"Och, it's the lassies will be the pleased ones, coiling the blankets round them; it's Auld Kate that kens," and then she gave a screitchy hooch and began to sing in her cracked thin voice-- 'The man's no' born and he never will be, The man's no born that will daunton me.'

Not surprisingly no translation is given, but if this is tagged with |lang=sco, you'll see "(please add an English translation of this quote)". Given the predominance of this situation, should I special-case Scots to remove this message? Benwing2 (talk) 02:51, 19 February 2019 (UTC)

No. If Scots isn't to be translated it shouldn't be a separate language. DTLHS (talk) 02:52, 19 February 2019 (UTC)
Agreed. I'm not 100% on what this means. Per Wiktionary:About Scots, we consider it a separate language instead of a dialect of English. If we consider it English, it's a different story. —Justin (koavf)TCM 03:59, 19 February 2019 (UTC)

Constrduction namespace

Has it been suggested that constructed languages, like Esperanto, be moved to a "Construction" namespace, ex. Construction:Esperanto/eburo? --{{victar|talk}} 11:15, 19 February 2019 (UTC)

Seems reasonable. We already have a "Reconstruction" namespace. SemperBlotto (talk) 11:24, 19 February 2019 (UTC)
@Victar Is the title of this section meant to be something else? - TheDaveRoss 13:27, 19 February 2019 (UTC)
This is a great idea. Fay Freak (talk) 14:07, 19 February 2019 (UTC)
Ooh... it's a game! Reconstruction>Construction>Contraction>Distraction>Destruction... it almost worked...
Seriously, though, I'm not impressed by the name: Reconstruction: houses reconstructions, but Construction: would house terms in constructioned languages. Chuck Entz (talk) 14:42, 19 February 2019 (UTC)
And I would have gotten away with it too, if it wasn't for you meddling kids! --{{victar|talk}} 19:30, 19 February 2019 (UTC)
I think what Chuck is saying is he'd prefer "Constructed:" instead of "Construction:". Personally I'm on the fence as to whether this is needed at all, under any name. Benwing2 (talk) 15:39, 19 February 2019 (UTC)
@Chuck Entz, Benwing2: I'm not married to the name; I just thought it in keeping with the "Reconstruction:" namespace. I'm fine with "Constructed:" or simply "Construct:", but it could be "Conlang:" or "Artificial:" for all I care. I just think if we keep reconstructions off the main namespace, why should conlangs be shoehorned into natural (if you will, non-constructed if not) languages? I think it's confusing to the reader, as they might mistake Esperanto, for example, for some inherited descendant of Latin, as we have no indicator that it's a constructed language, like we do for reconstructions. I also find that they clutter up entries and every few months there seems to be some vote on allowing another (forgive the hyperbole, but you get my point). --{{victar|talk}} 19:30, 19 February 2019 (UTC)
@Victar I see your point. I don't find it especially confusing but I imagine it might be different for users who haven't heard of Esperanto, Interlingua, Lojban, etc. (OTOH a page like a already has a huge number of random languages on it, and the average user isn't likely to have heard of Kalasha, Mandinka, Lower Sorbian, or Mezquital Otomi, to name a few on that page, and won't get any more confused by the additional presence of Esperanto, Interlingua, Ido, Novial, etc. on the same page.) Benwing2 (talk) 19:56, 19 February 2019 (UTC)
Exactly. What's stopping layman users from thinking Esperanto and Kalasha are categorical equivalents? --{{victar|talk}} 20:09, 19 February 2019 (UTC)
So instead they're supposed to think Na'vi and Esperanto are equivalents? There are clearly more nuanced distinctions to be drawn than "constructed" vs "not constructed". DTLHS (talk) 21:54, 19 February 2019 (UTC)
Yes, I would say more so than Esperanto and Kalasha. --{{victar|talk}} 22:53, 19 February 2019 (UTC)
  • Oppose. Esperanto has become too big to be cordoned off, and unlike nearly every other constructed language, people are going to be looking for it where they look for other languages. —Μετάknowledgediscuss/deeds 20:24, 19 February 2019 (UTC)
    @Metaknowledge, and what would stop them from finding them in a namespace other than main? --{{victar|talk}} 21:18, 19 February 2019 (UTC)
    Let's see... they won't come up in search, they won't be in translation tables... you'd have to be looking for them to find them, which is good for Novial, but bad for Esperanto. —Μετάknowledgediscuss/deeds 21:24, 19 February 2019 (UTC)
    And by "won't come up in search" you mean in the search dropdown, because reconstructions certainly show up in search results. There may be a technical solution for that, ditto for translation tables, though I'm less familiar with the problem there. --{{victar|talk}} 21:32, 19 February 2019 (UTC)
  • Oppose. I am opposed to deciding we don't want to include languages, and then including them anyway in a roundabout way. Conlangs that aren't in mainspace shouldn't be included anywhere at all, not in any namespace. —Rua (mew) 20:30, 19 February 2019 (UTC)
    @CodeCat, I don't think anyone is suggesting moving poorly attested conlangs like Lojban and Novial to this namespace, as those are being relegated to Appendix:. I'm chiefly referring to Esperanto and Interlingua. --{{victar|talk}} 21:16, 19 February 2019 (UTC)
    @Rua: FYI, the following vote seems relevant to your position: Wiktionary:Votes/2019-01/Moving Novial entries to the Appendix. --Dan Polansky (talk) 14:13, 23 February 2019 (UTC)
  • Oppose. If a constructed language is so unused it should be banished to an appendix or deleted, do that. But e.g. Esperanto is more widely used than at least several hundred of the natural languages we include, and even has some native speakers; I don't see a reason to segregate it into a separate namespace away from e.g. Mbariman-Gudhinma or Berbice Creole Dutch just because we can identify who coined most of Esperanto's words and not the other two languages'. - -sche (discuss) 23:29, 19 February 2019 (UTC)
  • Oppose. I don't understand the motivation for the concept of "categorical equivalents" (I guess German and Dutch are "equivalent"? What does that even mean?) or the need to keep them separate. I don't understand why someone mistaking Esperanto for a Romance language is a danger that we should take seriously. It seems no more plausible or harmful than someone seeing ignominious and concluding, on that evidence alone, that English is a Romance language. It seems extremely condescending to think that someone looking up words in a language would not have such basic knowledge about the language, and I don't see that we're at all responsible for someone using the dictionary that stupidly.__Gamren (talk) 12:49, 23 February 2019 (UTC)
  • Oppose. Esperanto is attested in use; reconstructions such as PIE are not. Mainspace is the natural place for users to look up terms, whatever the language. As for the poor souls who cannot look up entry Esperanto or check Wikipedia's W:Esperanto, they should try harder and look it up. I think PIE could have been in the mainspace, with entries starting with asterisk (*), but that ship has sailed. --Dan Polansky (talk) 14:23, 23 February 2019 (UTC)
  • Oppose. Languages that have actually been used in human communication should be in mainspace. I'd rather move languages like Mbabaram (dog#Mbabaram), which has a tiny recorded vocabulary and no recorded literature, away from mainspace, than move languages with literatures away from mainspace.--Prosfilaes (talk) 08:12, 24 February 2019 (UTC)

Constellation name definitions

Considering that the IAU has recognized 88 constellations, should the constellation names be defined as translingual, and should the English definitions be moved there? -Mike (talk) 23:39, 19 February 2019 (UTC)

Scots again, and Middle English

A number of entries have quotations from Template:RQ:Dictionary of the Scottish Language used to illustrate English terms. An example is forspeak, for which definition #1 says "(transitive, dialectal, Northern England and Scotland) To injure or cause bad luck through immoderate praise or flattery; to affect with the curse of an evil tongue, which brings ill luck upon all objects of its praise." Should we allow this? If so, what language should I use to tag the Scots portions of the quoted text? en (English) or sco (Scots)? If not, what should happen to these quotes? (Move to a Scots L2 section? But what about the "Northern England" label?) Note that on the same page is also a Scots entry for forspeak, defined as "To bewitch or cast a spell over, especially using flattery or undue praise; to seduce." Examples like this make me think that the entire decision to include Scots as a separate language may have been wrong, because (a) most terms (like this one) that exist in Scots and don't exist in Standard English also exist in Northern England dialects; (b) in general there's no way to make a clear distinction between Scots and nearby dialects of English. Note that if Scots had a standard literary form the situation would be different, because then we could define the nucleus of Scots as consisting of that literary form.

A related issue: A number of English entries have illustrative quotations from Middle English. An example is ashame, which has a quote from Wycliffe's Bible dated to 1390, complete with translation: "Ashame thou, Sidon, seith the se, the strengthe of the se, seiende, I trauailide not with child, and bar not, and nurshede not out ȝung childer, ne to ful waxing broȝte forth maidenes." (translated as "Be ashamed, Sidon, says the sea, the strength of the sea, saying, “I did not travail with child [give birth], and did not nurse boys, nor to full waxing bring forth maidens.") Should we allow this? If not, what should happen to these quotes? (Move to a Middle English L2 section?)

Benwing2 (talk) 23:42, 19 February 2019 (UTC)

Regarding Middle English quotations: my inclination is to move them to Middle English sections, but some editors have argued they are tolerable in English sections to demonstrate age/continuity of use. (They don't count towards attesting the term, obviously, but neither do e.g. quotations from websites, which are nonetheless infrequently included alongside ATTEST-satisfying citations if they are particularly good illustrations of a term.) - -sche (discuss) 23:54, 19 February 2019 (UTC)
Regarding Scots quotations: right now, they should be moved to Scots entries. Since both Scots and English are WDLs, merging them might need a vote, or at least strong consensus support. But it would certainly simplify distinguishing Scots from Scottish English at RFV if we, erm, didn't distinguish them. And we already include several rather divergent dialects (e.g. Geordie) under English, so I don't expect the issues of e.g. different inflected forms and the like to be much harder to handle than for those dialects. And other (monolingual) English dictionaries tend to include Scots as English. - -sche (discuss) 01:43, 20 February 2019 (UTC)
I considered it odd that Scots is separated from English on Wiktionary, but I've held back from raising the topic myself (it's potentially an emotive subject!). I'm a Geordie and have lived in Scotland, and I find that most of the Scots and Geordie terms are the same (supporting Benwing2's "(a)"). There's work to do in adding Northern English terms to Wiktionary; it's something I've been avoiding so far. Having had a look through, I'd come to the conclusion that the bulk of the job would be to take Scots entries and duplicate them as English (Geordie) entries. As a simple example, User:Stelio/Tyneside Songs has a bunch of orange links (if you've turned on that gadget) most of which are Scots terms, and the songs themselves could easily be misidentified as Scots to someone unfamiliar with their context. -Stelio (talk) 11:00, 20 February 2019 (UTC)
As someone who has spent a significant time around drunken Geordies, my vote is for the Geordie lect being considered its own language. XD --{{victar|talk}} 17:46, 20 February 2019 (UTC)
Mebbees like, but how man, dinna fash yersel'. That's wark, reet? ;-) -Stelio (talk) 14:25, 21 February 2019 (UTC)
Why aye man, canny wark! --{{victar|talk}} 14:53, 21 February 2019 (UTC)
OK, I drafted Wiktionary:Votes/pl-2019-02/Treat Scots as English. Please improve or postpone as needed. - -sche (discuss) 23:42, 23 February 2019 (UTC)
@-sche Thanks. Maybe you should note in the vote that combining Scots with English on Wiktionary makes no statement as to whether Scots should be considered a separate language, but is being suggested due to the difficulty of drawing a clear line between Scots and English (at least, that is my view ...). Benwing2 (talk) 01:59, 24 February 2019 (UTC)
OK, I took out the part about "treating Scots and English as a single language". But the very act of treating Scots as English, no matter how it's worded, is inherently treating it as not being a separate language... - -sche (discuss) 02:50, 24 February 2019 (UTC)
English and Middle English are different languages, hence Middle English quotations belong into Middle English entries only. To show age/continuity there's the section "Etymology". --Brown*Toad (talk) 18:24, 10 March 2019 (UTC)

Talk to us about talking

Trizek (WMF) 15:01, 21 February 2019 (UTC)

Words of language X used in language Y

some previous discussions: December 2018, September 2017

It is not difficult to find uses of the English word happiness in Dutch texts ([7], [8], [9]). In many cases this is obviously an instance of code switching (the word is italicized or put between quote signs, or occurs in a longer English phrase) but in some there is no obvious giveaway. Yet I expect most Dutch speakers will agree that this is not a Dutch word. On the other side I expect most Dutch speakers will readily agree that tram belongs to the Dutch vocabulary, including those speakers who pronounce the word as /trɛm/. A giveaway may be that the Dutch diminutive trammetje is actively used, while *happinessje does not occur.
What set off this musing was a request for verification for (allegedly) Dutch anti-roll bar. I think this is an English term used in Dutch texts for lack of a native term. Elsewhere there was a reference to Old French castelwriȝte; I am inclined to think this is a Middle English word used in Old French texts for lack of a native term. Is there a way to make this more firm? What criteria can we use to help us decide when uses of a word from language X in language Y stop being instances of expedient code switching and become evidence of incorporation in the lexicon of Y?  --Lambiam 06:20, 23 February 2019 (UTC)

This has come up before (I added two of the most recent discussions, one involving "marketing" as a Greek word, up top). I agree it's a thorny issue. Italics, quotation marks and different script are all good clues to code-switching, although it's worth noting that something can only be code-switching if the word and sense exists in the supposed donor language: French people is frequently italicized and thought of as a loanword or code-switching, but cannot be code-switching/English and must be a French word because English people never means "a celebrity". Since we've had 3+ threads about this recently, I'll start a draft / think-tank page for this, a la Wiktionary:English adjectives. - -sche (discuss) 07:01, 23 February 2019 (UTC)
The draft is now live at Wiktionary:Code-switching. - -sche (discuss) 07:33, 23 February 2019 (UTC)
(Wiktionary:Beer parlour/2019/February#Proposal: Separate namespace for entries in Category:Chinese terms written in foreign scripts isn't entirely irrelevant. —Suzukaze-c 07:25, 23 February 2019 (UTC))
I would like to share this journal article I found which was published in the International Journal of English Linguistics in 2011: Code-Mixing of English in the Entertainment News of Chinese Newspapers in Malaysia. KevinUp (talk) 19:46, 23 February 2019 (UTC)
I sort of agree with you but am going to play devil's advocate for a moment, because I'm not sure we should remove such entries in all cases. To take the statement 'I think this is an English term used in Dutch texts for lack of a native term.' -- well, is this not how many loanwords are borrowed in the first place? Especially for specialized terms with no native Dutch word, a borrowing-entry (where it meets CFI) might be useful: after all, if one hypothetically were to look for the Dutch translation over at anti-roll bar, they would then find it and understand that there is no current native term, but that this originally English term is in fact in use in Dutch by Dutch speakers. — Mnemosientje (t · c) 20:04, 23 February 2019 (UTC)
We can still explain that in the translation table without having to create an entry: "no equivalent in Dutch; the English term is used instead" or something of the sort. The relevant quotes can be put in Citations:anti-roll bar. Per utramque cavernam 20:14, 23 February 2019 (UTC)
I'd call it a proto-loanword. The distinction between a proto-loanword and a real loanword is not clear though, one flows into the other. chauffeur is very clearly French to a Dutch speaker, yet it's used widely in Dutch despite the existence of bestuurder. Perhaps a solution would be to use the "hot word" approach here: if the word continues in use for X years, then assume it has entered the lexicon of the borrowing language. —Rua (mew) 17:10, 24 February 2019 (UTC)
Yes, I agree with the first part of your message. I'm less sure about the idea of using the "hot word" approach; would being used for X years be a sufficient proof that a word has entered the lexicon? Per utramque cavernam 17:20, 24 February 2019 (UTC)
It's a subjective question, so what matters is whether it's proof enough for Wiktionary. —Rua (mew) 17:27, 24 February 2019 (UTC)
The point of a hot-word approach would be that there would be a review some number of years after the date of first attestation. I still wonder what makes something qualify for the proto-borrowing status. Three cites with no typographic marking of the term, but a substantial (75?, more?) majority of appearances with typographic marking? Would a proto-borrowing become an ordinary borrowing the first year there were three or more attestations without typographic marking and fewer attestations with such marking? Is there a simpler approach that has some face validity? DCDuring (talk) 20:12, 24 February 2019 (UTC)

Categories which don't exist, but have been added by users anyway

There are a large number of non-standard categories which have been chosen by users which appear in Category:Categories with invalid label. The question is, of course, whether these should be created, or whether any of them can be reassigned to existing categories. Some are unusual to say the least, for example Category:ceb:Dried fish. DonnanZ (talk) 19:32, 23 February 2019 (UTC)

I just made Category:en:Bricks, which should exist. --Wonderfool early February 2019 (talk) 19:35, 23 February 2019 (UTC)
There has been a user who has contributed a very large number of entries for Cebuano names of organisms. I'm not surprised that there are more ceb categories than one might naively expect. If dried fish is important in the diet of the Philippines, why not have the category? DCDuring (talk) 23:08, 23 February 2019 (UTC)
There are 11 entries in Category:en:Seasonings. There are 10 in Category:ceb:Dried fish. DCDuring (talk) 23:12, 23 February 2019 (UTC)
I could be wrong, but half of those look like SOP compounds, with bulad and buwad referring generically to dried fish, and the second words referring to the type of fish that's dried. Chuck Entz (talk) 00:19, 24 February 2019 (UTC)
I want to note that "nonstandard" categories only get sorted into Category:Categories with invalid label if they try to use {{autocat}}. This means that 1) categories which are valid for one language but maybe not useful to others can exist, they just need to be put into parent categories manually and without using {{autocat}}, and 2) nonstandard categories that don't use {{autocat}} are out there. (We could potentially find them by searching a database dump (or possibly using the built-in site search) for pages in the category namespace that don't contain {{autocat}}.) "Dried fish" might be useful to a number of Asian / Pacific languages, I don't know; if it is, it could be added to the official category tree so that {{autocat}} would "play nice" with it. - -sche (discuss) 23:24, 23 February 2019 (UTC)
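The dump search suggested above could be sketched roughly as follows. This is only an illustration in Python: the function name, the (title, wikitext) input shape, the sample pages and the treatment of {{auto cat}} as an alias of {{autocat}} are all assumptions, not part of any existing Wiktionary tool.

```python
import re

# Matches {{autocat}} or {{auto cat}}, with or without parameters.
# Treating "auto cat" as an alias is an assumption made for this sketch.
AUTOCAT = re.compile(r"\{\{\s*auto ?cat\s*(?:\||\}\})", re.IGNORECASE)

def categories_without_autocat(pages):
    """Return titles of Category: pages whose wikitext lacks {{autocat}}.

    `pages` is any iterable of (title, wikitext) pairs, for example the
    output of whatever dump parser is at hand (hypothetical input).
    """
    return [
        title
        for title, text in pages
        if title.startswith("Category:") and not AUTOCAT.search(text)
    ]

# Made-up sample pages, only to show the shape of the input.
sample = [
    ("Category:en:Bricks", "{{autocat}}"),
    ("Category:ceb:Dried fish", "[[Category:ceb:Fish]]"),
    ("bulad", "==Cebuano=="),
]
print(categories_without_autocat(sample))
```

The result would still need manual review, since a category without {{autocat}} may be deliberately hand-categorized, as noted above.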
Also, it is hardly a surprise that a new contributor would not master our baroque category implementation.
It is largely due to the efforts of Chuck Entz that so few (1,167) items remain in Special:WantedCategories. If only there were similarly diligent efforts to clear Special:WantedTemplates and Special:WantedPages, either by removing the links or by adding the "wanted" templates and pages. DCDuring (talk) 23:32, 23 February 2019 (UTC)
It's been quite a while since I patrolled Special:WantedCategories. Now User:DTLHS does most of the heavy lifting with his bot and {{autocat}} is very easy to use, so I haven't felt the need. I spend most of my time approaching categories from other angles. Chuck Entz (talk) 00:09, 24 February 2019 (UTC)
23. Category:en:Towns in Alberta, Canada should be removed from that (see below), but is incorporated in an entry (Edson) somehow. DonnanZ (talk) 09:30, 24 February 2019 (UTC)
That was happening somewhere in the innards of Module:place. This edit fixes it. Have to do the same edit for all other Canadian provinces and territories though; not very efficient. — Eru·tuon 09:57, 24 February 2019 (UTC)
Ah, thanks. DonnanZ (talk) 10:32, 24 February 2019 (UTC)
@Erutuon It was agreed upon a while ago that category names for subdivisions would always include the country name. So the name with "Alberta, Canada" is actually the right one going forward. —Rua (mew) 17:02, 24 February 2019 (UTC)
Well, we know what happened to US states, and who was responsible. There is only one Alberta, New South Wales, Florida etc., so that's not the way to go, IMO. DonnanZ (talk) 17:13, 24 February 2019 (UTC)
I did create a couple of standard categories: Category:Towns in Alberta and Category:Villages in Finland. These are still on the invalid label list, but will hopefully disappear soon. DonnanZ (talk) 00:41, 24 February 2019 (UTC)
Dried fish is important for Russians too, in case you try to conceive usage fields. Every grocery store targeted at the Russia-related population in Germany has a section filled with various kinds of dried fish, which however are conceived as supplementary foodstuff and not as staples. They are employed like roasted sunflower seeds, somewhat cliché. Fay Freak (talk) 00:45, 24 February 2019 (UTC)
To note, if a topic cannot contain enough entries in a sufficient number of languages to justify adding category data but the grouping in one language is recommendable, the solution is to create a thesaurus entry. Fay Freak (talk) 00:54, 24 February 2019 (UTC)
  • That's another thing that I have noticed before; looking at Category:sms:Lakes in Finland, Category:Lakes doesn't have any subcategories for "Lakes in ...". I don't see why it shouldn't; although I don't envisage it should contain every lake in Finland or Canada for example. Currently the category lists terms relating to lakes, e.g. in Category:en:Lakes, as well as actual lakes all over the world. DonnanZ (talk) 16:20, 24 February 2019 (UTC)
    • It is a set category though, so it should not contain terms related to lakes. —Rua (mew) 17:04, 24 February 2019 (UTC)
I don't see why not, and this is where "Lakes in ..." subcategories would come in handy. If there is no objection to that, I think I can set them up. DonnanZ (talk) 17:25, 24 February 2019 (UTC)
I do not support converting Category:Lakes into a topical category. —Rua (mew) 17:30, 24 February 2019 (UTC)
That's odd, because you created this. DonnanZ (talk) 17:36, 24 February 2019 (UTC)
I don't see anything odd about it, it's like Category:Countries and Category:Cities. —Rua (mew) 17:49, 24 February 2019 (UTC)
The odd thing about it is that it appears in Category:Categories with invalid label because no module has been set up, so do we fix that or not? DonnanZ (talk) 18:00, 24 February 2019 (UTC)
If you don't want to play ball a move to Category:sms:Lakes would be an alternative. A module can easily be created for that already exists. DonnanZ (talk) 21:32, 24 February 2019 (UTC)

Venn diagrams for semantic nuances

I think it would improve the entries to add visual examples such as Venn diagrams, for example for the semantic relation in some fields between the terms assimilation, inclusion, exclusion, segregation and integration. --Backinstadiums (talk) 15:48, 27 February 2019 (UTC)

I think these visualizations can easily be misunderstood. “Exclusion” might illustrate expulsion. “Assimilation” might illustrate expulsion followed by cleansing. “Integration” could serve as an illustration of the Nazi concept of Fremdkörper.  --Lambiam 23:21, 27 February 2019 (UTC)

Anglo-Norman

We must add the Anglo-Norman tag xno to the tag repository. Aearthrise (talk) 22:19, 27 February 2019 (UTC)

We consider Anglo-Norman a dialect of Old French. Please enter terms under the L2 ==Old French== and label them {{lb|fro|Anglo-Norman}} to categorize them in CAT:Anglo-Norman Old French. —Mahāgaja · talk 22:53, 27 February 2019 (UTC)
I come across this in etymology, where it is necessary to use {{der|en|xno|-}} {{m|fro|[term]}} (or similar). DonnanZ (talk) 00:33, 28 February 2019 (UTC)
@Donnanz: Actually you can use {{der|en|xno|[term]}}; the template will display "Anglo-Norman" but link to "Old French". —Mahāgaja · talk 11:15, 28 February 2019 (UTC)
@Mahagaja: I wasn't aware of that, I will try that next time. Thanks. DonnanZ (talk) 11:23, 28 February 2019 (UTC)
@Donnanz: It works for all the etymology-only languages. —Mahāgaja · talk 11:28, 28 February 2019 (UTC)
OK, I knew you could do that with Late Latin, Medieval Latin etc. DonnanZ (talk) 11:33, 28 February 2019 (UTC)

March 2019

Mozilla releases 1,400 hours of voice recordings

https://venturebeat.com/2019/02/28/mozilla-updates-common-voice-dataset-with-1400-hours-of-speech-across-19-languages/ —Justin (koavf)TCM 02:44, 1 March 2019 (UTC)

It's whole sentences, rather than isolated words, so I think it's not especially useful for us at this point. —Μετάknowledgediscuss/deeds 04:35, 1 March 2019 (UTC)
Perhaps we could use them as usexes that have audio, if the license is compatible. —Suzukaze-c 03:05, 4 March 2019 (UTC)
@Suzukaze-c: It's CC-0. —Justin (koavf)TCM 03:27, 4 March 2019 (UTC)
Just had a look at the Italian portion of the dataset and it's definitely good usex material, lots of natural-sounding language. The recording quality varies, some have audible background noise. – Jberkel 22:36, 4 March 2019 (UTC)

Limit the table of contents to language names

In the table of contents I don't think there's a lot of use in listing subsections beyond the different language entries. In the vast majority of entries you can see all the different subsections within the space of a screen anyway. Personally, I have never ever tried to go to a specific subsection of an entry through the table of contents, whereas I have spent a surprising amount of time scrolling through long and messy tables of contents trying to find the language I'm looking for. The exception would be articles not in the main namespace. ─ ReconditeRodent « talk · contribs » 00:11, 3 March 2019 (UTC)

I tend to agree, but is there an easy way to do that? DTLHS (talk) 00:12, 3 March 2019 (UTC)
I find it useful to be able to click to the etymology when there are a number of different ones. When there are 5 or 6 different homographs, I find it easier to navigate using the ToC. Andrew Sheedy (talk) 00:37, 3 March 2019 (UTC)
Fair enough, though I'm assuming this is for when you're already familiar with an entry(?) (since "Etymology #" isn't very descriptive otherwise.) ─ ReconditeRodent « talk · contribs » 01:31, 3 March 2019 (UTC)
The following CSS should hide all but the top-level headings in the ToC in the main namespace: .ns-0 .toclevel-1 ul { display: none; }. Add to your common.css page, or try it out by entering mw.util.addCSS('.ns-0 .toclevel-1 ul { display: none; }') in your browser's JavaScript console. — Eru·tuon 00:47, 3 March 2019 (UTC)
I'm with Andrew. Has RR considered using the right hand side placement of the table of contents, achieved by a gadget? I also wonder whether a gadget could accomplish selective repression of the offending parts of the ToC. DCDuring (talk) 00:49, 3 March 2019 (UTC)
@Erutuon I take it that one could specify .toclevel-[2,3,4,etc] with the corresponding reduced display. DCDuring (talk) 00:52, 3 March 2019 (UTC)
@DCDuring: Yep, that works too if you want to show more header levels. I imagine getting it to look consistent (for instance, to always show part-of-speech headers even when they are at different header levels) would require JavaScript, though. — Eru·tuon 01:00, 3 March 2019 (UTC)
Hey, wow, that's cool! Thanks!
Well I guess I'm happy then, though instinctively I still feel this would be a better default. Would it be impertinent to suggest a vote/!vote? ─ ReconditeRodent « talk · contribs » 01:31, 3 March 2019 (UTC)
@ReconditeRodent: It's a good idea to make sure that the vote has some chance of passing first. For my part, I am not in favor. — Eru·tuon 01:41, 3 March 2019 (UTC)
I like to see how many etymologies there are. Equinox 01:43, 3 March 2019 (UTC)
Through the table of contents the editor can see if he has sorted the headings wrongly or used unbalanced equal signs and the like, which former cannot be easily seen since level 4 and level 5 are of the same size. But I do not even peruse this advantage since I use tabbed browsing. Fay Freak (talk) 12:27, 3 March 2019 (UTC)
I prefer seeing the other headers, although I don't know if that means it should be the default for everyone as opposed to just something users like me opt-out of changing. In any event it might be useful to make the code for hiding non-language headers a gadget users could find in their Gadgets tab. - -sche (discuss) 17:45, 3 March 2019 (UTC)

This thread inspired me to make a super compact TOC CSS work again, and here it is:

/* Use simple horizontal TOC */
/* Appearance: Language names are laid out as a horizontal list and are the only items
   shown in the TOC; borders are only horizontal ones; the result is very compact
   and minimalistic. */
.ns-0 div#toc ul ul { display: none; } /* Reduce the depth of shown headings in TOC */
div#toc span.tocnumber { display: none; } /* Hide numbers in TOC */
.ns-0 .toclevel-1 ul { display: none; }
.ns-0 div#toc li { display: inline; }
.ns-0 div#toc li + li:before { content: ' · '; }
.ns-0 div.toctitle { display: none; }
.ns-0 div#toc { border-color: #DDDDFF; border-right: none; border-left: none; background-color: white; padding-top: 0px; }

In kilo-, it produces approximately this:

English · Czech · Danish · Dutch · Finnish · German · Hungarian · Italian · Latvian · Norwegian Bokmål · Norwegian Nynorsk · Polish · Portuguese · Romanian · Slovak · Slovene · Spanish · Swedish · Turkish

--Dan Polansky (talk) 18:48, 3 March 2019 (UTC)

Wow! That's lovely. It's amazing what can be done with CSS. I would use it if I didn't sometimes want to find subheaders from the ToC. — Eru·tuon 20:18, 3 March 2019 (UTC)
I might try to figure out how to make a compact layout that uses two levels of headings, but I am no CSS guru; the core ideas of the posted code were provided by someone else on en wikt. Incidentally, Wiktionary:Votes/2012-10/Enabling Tabbed Languages passed, and the super compact TOC is no worse than tabbed languages as for availability of subheaders. I used to use the compact TOC CSS before the tabbed language vote, and I am using it right now. --Dan Polansky (talk) 20:41, 3 March 2019 (UTC)
As an aside, thank you for that mw.util.addCSS hint; it is very nice for fine-tuning CSS. --Dan Polansky (talk) 20:47, 3 March 2019 (UTC)

For what it's worth, my personal CSS gives me this sort of experience. It's not nearly as wonderfully minimalist as Dan's CSS; it keeps all headings. —Suzukaze-c 03:02, 4 March 2019 (UTC)

@Suzukaze-c: I like that because I can still see all the headers, but it requires less scrolling. I've enabled my own slightly modified version (not yet saved on-wiki). — Eru·tuon 03:23, 5 March 2019 (UTC)
How about this? It is quite hacky though. It hides all subsections but still keeps the numbered etymology sections visible as numbers after the language. — surjection?〉 15:56, 11 March 2019 (UTC)

[edit]

From Okinawan onwards I keep getting the error message: “Lua error: not enough memory”. ---> Tooironic (talk) 10:02, 3 March 2019 (UTC)

I noticed the same thing with "me". Happens for every template. ─ ReconditeRodent « talk · contribs » 13:02, 3 March 2019 (UTC)
Some of the modules used (via templates) on the page use a lot of memory, more memory than pages are allotted. This has also hit e.g. water and man in the past (and discussions can be found in the archives of this page and the Grease Pit) and led to translations being moved to subpages. Transliteration (whether generating it or just checking a manually input one) seems to be among the things which are "expensive". Ultimately, we're going to have to do fewer "expensive" things with Lua, or at least (as we did with {{t-simple}}) have a set of much simpler or even Lua-less templates for use on large pages like this; for example, pages like this could use simpler headword templates that would just have the romaji input manually as a parameter and not invoke Lua to generate or check it. - -sche (discuss) 16:47, 3 March 2019 (UTC)
(edit conflict) See CAT:E. I've cleared all the module errors except for this one, so, for the moment, it doubles as a list of pages with this problem. This is a recurring problem with large entries: each template that calls a module uses memory for that module, and no entry is allowed to use more than 50 MB of module memory. The location where it starts getting the error isn't all that significant, since it results from the system's order of executing the modules. Generally it's not any specific item, but the total number of them that causes the problem.
The only solution is to reduce the total module memory used by all the templates in the entry. This is not easy, but here are a few tips (I'm sure @Erutuon can expand on/correct this):
  1. The easiest step is adding the entry to the opt-out list at {{redlink category}}. This template is called by every linking template such as {{l}}, {{m}}, {{t}}, and {{t+}}, so its module use is multiplied over the entire number of linking templates in the entry. That's already been done for the 6 entries in question.
  2. Get rid of any unnecessary module-using content, such as duplication.
  3. Replace linking templates with ones that use less memory. Linking templates do a lot of work behind the scenes to check things and get the information needed to produce the correct link and display the text properly. The {{t-simple}} template does only the bare minimum of this for a translation template, and may be substituted for the regular ones where the loss of functionality isn't a problem.
  4. Replace linking templates with hard-coded wikitext: plain wikitext doesn't use module memory. For links to English entries, there's no need to look up the language information for displaying the text, since English is the default language here: {{l|en|word}} is functionally equivalent to [[word#English|word]].
  5. Move the largest blocks of linking templates such as translation tables or derived-term/compound lists to a subpage or appendix and provide a link to it in the entry. Moving quotes to the citation tab is another variant. Some of the most intractable memory-hogs have required this.
Chuck Entz (talk) 17:03, 3 March 2019 (UTC)
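Tip 4 above could be applied in bulk with something like the following Python sketch. It is only an illustration: the function name and the sample line are made up, and it deliberately touches only the simplest form, {{l|en|word}} with no extra parameters, leaving everything else for manual review.

```python
import re

# Only the simplest form is handled: {{l|en|word}} with no extra parameters.
# Anything with a display form, gloss or other language code is left alone.
SIMPLE_EN_LINK = re.compile(r"\{\{l\|en\|([^|{}\[\]]+)\}\}")

def hardcode_english_links(wikitext):
    """Replace {{l|en|word}} with the equivalent plain wikilink."""
    return SIMPLE_EN_LINK.sub(r"[[\1#English|\1]]", wikitext)

# Made-up sample line from a list of related terms.
before = "* {{l|en|beer}}, {{l|en|parlour}}, {{l|nl|bier}}"
print(hardcode_english_links(before))
```

Non-English links are skipped on purpose, because for those the template also handles script and language tagging that a bare wikilink would lose.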
All good tips. Another way to reduce memory usage, which I just did for do, is to replace a bunch of individual linking templates ({{l}}) with a column template (in this case, {{rel2}}). Each template invokes a module, and each invocation uses a certain amount of "startup" memory, so fewer module-based templates tend to use less memory, when they are doing roughly the same job. — Eru·tuon 20:13, 3 March 2019 (UTC)
Thanks for the suggestions, but three CJKV characters, , , have been placed in Category:Pages with module errors for a very long time. Sometimes the memory works fine, but after a few days (even though no one has edited the entry), the page runs out of memory again. And after a few days it works again. This has been going on for a long time. Any idea what is the main cause of this issue? KevinUp (talk) 13:31, 4 March 2019 (UTC)
You can preview each section and look at the parser profiling data at the bottom of the page to see how much memory is used by that section (some memory use is shared between sections, so the numbers may add up to more than 50MB). The basic issue is that those entries have lots and lots of templates which call lots and lots of modules. When you consider all the things that these modules do, it's not surprising that they're pushing the limits of what the system will allow. Chuck Entz (talk) 14:48, 4 March 2019 (UTC)
  • This seems to be a significant issue that has a real impact on users' experience of Wiktionary, at least those who want to look up simple words with multiple definitions in multiple languages. I would recommend we use language-specific soft-redirects (i.e. sub-pages) for words like , , etc. Surely this would free up enough Lua memory so that all the information can be displayed for users, even if a second click is required. As someone with no IT expertise, I can only hope one of our talented editors can facilitate such a solution. ---> Tooironic (talk) 05:32, 9 March 2019 (UTC)
Update: Thanks to User:Erutuon, Lua memory in is now reduced to 44.98 MB by substituting {{ja-r}} with {{ja-r/multi}} and {{ja-r/args}} as well as {{zh-der}} with {{zh-der/fast}}. KevinUp (talk) 09:19, 11 March 2019 (UTC)

Pseudo-X-isms by language

There's CAT:Pseudo-anglicisms by language (which incidentally should perhaps use a capital A for consistency with other such terms). There are pseudo-Latinisms like noli illegitimi carborundum, which could go in a subcat of CAT:Pseudo-Latinisms by language. There are pseudo-Gallicisms like quoi ci quoi ça and double entendre. There's also CAT:Pseudo-Italianisms by language (and some English pseudo-Italianisms discussed here). There must be others. I think these should be grouped into a category for "Pseudo-X-isms by language", similar to "Borrowed terms by language". What should they be called? "Pseudo-borrowings", "pseudo-loans"? - -sche (discuss) 17:20, 3 March 2019 (UTC)

I thought of "pseudo-foreignisms" at some point. Per utramque cavernam 17:38, 3 March 2019 (UTC)
Now that I've paged through Google Books results, I think "pseudo-loans" is most common, followed by "pseudo-borrowings", followed by "pseudo-foreignisms". - -sche (discuss) 19:12, 3 March 2019 (UTC)
I set up Category:Pseudo-loans by language. At the moment both "Portuguese pseudo-loans‎" and "Pseudo-anglicisms by language‎" -type categories are subcategories of it; somebody may want to change that or privilege the latter to be at the start of the list (with *) or the like. - -sche (discuss) 22:46, 7 March 2019 (UTC)

Stress of compound words

Why isn't the stress(es) of compound words and the like reflected in their entries? E.g. ,bee's knees vs 'bee sting --Backinstadiums (talk) 19:24, 4 March 2019 (UTC)

@Backinstadiums: What do you mean by "reflected"? Stress is usually given in the Pronunciation section. You can include the stress by adding pronunciation transcriptions to bee's knees and bee sting. — Eru·tuon 22:47, 4 March 2019 (UTC)
@Erutuon: I meant addition. Is there any compound of the like that shows it? They're lexicalized sometimes and I am not a native speaker --Backinstadiums (talk) 01:50, 5 March 2019 (UTC)
@Backinstadiums: You can find examples of English words that are categorized as compounds and have a stress mark in an IPA template by searching : incategory:"English compound words" hastemplate:"IPA" insource:/\{\{IPA\|[^|}]+ˈ/. Some are very old compounds that are not felt as compounds anymore (like island), but others are probably more "fresh" compounds. — Eru·tuon 01:57, 5 March 2019 (UTC)
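To see what the insource: pattern above actually picks up, here is a small Python check. It is only a sketch: Python's re is not the engine the site search uses, although this particular pattern should behave the same way in both, and the sample template calls are illustrative rather than copied from entries.

```python
import re

# Same pattern as in the insource: search above: "{{IPA|", then one or more
# characters that are neither "|" nor "}", then a primary stress mark.
STRESSED_IPA = re.compile(r"\{\{IPA\|[^|}]+ˈ")

def has_primary_stress(wikitext):
    """True if the first parameter of an {{IPA}} call contains ˈ after at least one character."""
    return STRESSED_IPA.search(wikitext) is not None

samples = [
    "{{IPA|/ˈaɪ.lənd/|lang=en}}",  # stress mark in the first parameter
    "{{IPA|/biː/|lang=en}}",       # no stress mark at all
    "{{IPA|en|/ˈaɪ.lənd/}}",       # stress mark only after a second "|"
]
for s in samples:
    print(has_primary_stress(s), s)
```

Because [^|}]+ cannot cross a pipe, the pattern only inspects the first parameter, so a call whose transcription sits in a later parameter is not found.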

@Erutuon: According to the Longman Pronunciation Dictionary, "Usually, compound words / phrases have early/late stress, respectively. Yet, among grammatical compounds pronounced with late stress are those where the first element names the material or ingredient (except for the terms cake, juice, water, so ˈorange juice), so a ˌpork ˈpie, a ˌrubber ˈduck, or a ˌpaper ˈbag (bag made of paper) but ˈpaper bag (bag for newspapers)." --Backinstadiums (talk) 17:20, 6 March 2019 (UTC)

Other Germanic languages have the same stress distinction. The Dutch equivalents of those phrases all have stress on the same part as in English, but in each case the distinction is also visible in spelling because late stress has a space while early stress does not: ˈsinaasappelsap, ˌrubberen ˈeend (but ˈrubbereend has early stress), paˌpieren ˈzak (but paˈpierzak has early stress). Late stress is associated with adjective-noun phrases like wooden box, which suggests that "rubber duck" and relatives are in fact also syntactically an adjective-noun phrase and not compounds. —Rua (mew) 17:15, 9 March 2019 (UTC)

What is the policy on suprasegmental prosody? --Backinstadiums (talk) 17:08, 9 March 2019 (UTC)

affricates: / d͡ʒ͜ɹ , ʃ͡ɹ /

Pondering some pronunciations of words such as imagery /ˈɪmɪ.d͡ʒ͜ɹɪ/ or dangerous, I infer that the IPA should recognize sequences such as / d͡ʒ͜ɹ / (and even / ʃ͡ɹ / in shrub) as affricates, just as / t̠ɹ̠̊˔ / currently is. What are the guidelines on this issue in Wiktionary? https://www.youtube.com/watch?v=mH5FbbusdkI --Backinstadiums (talk) 01:52, 5 March 2019 (UTC)

If the IPA doesn't recognize it, why should we blaze the trail? What's the distinction between /ʃ͡ɹ/ and /ʃɹ/? /ʃ͡ɹ/ is certainly going to confuse people, and I don't see a value add.--Prosfilaes (talk) 05:22, 5 March 2019 (UTC)
[d͡ʒ͜ɹ] and [ʃ͡ɹ] (they should be in square brackets because they are not phonemes) don't really look like affricates. Affricates are basically stops with a fricative release. They don't qualify as affricates: [d͡ʒ͜ɹ] has an extra approximant at the end (d, ʒ, ɹ) and [ʃ͡ɹ] has a fricative and an approximant, not a stop and fricative. About guidelines, guidelines for what? — Eru·tuon 06:10, 5 March 2019 (UTC)

Read-only mode for up to 15 minutes on 19 March 15:00 UTC

Hi everyone, a short notice. On 19 March 15:00 UTC your wiki will briefly be in read-only mode. That means that you’ll be able to read it, but not edit. This is because of network maintenance. It will last up to 15 minutes, but probably shorter. You can read more on Phabricator (phab:T217441, phab:T187960), or write on my talk page if you’ve got any questions. /Johan (WMF) (talk) 14:52, 5 March 2019 (UTC)

Unprotection of user scripts

User:Yair rand/newentrywiz.js is currently admin-protected. It's unnecessary (at least now, maybe not back when it was protected) because only admins and interface admins can edit user JavaScript pages. Could it be unprotected so that lowly interface administrators like me can edit it? — Eru·tuon 01:35, 6 March 2019 (UTC)

Done. It should be moved to MW namespace. Dixtosa (talk) 19:03, 6 March 2019 (UTC)

Order of etymologies

Does Wiktionary have a policy for the order etymologies should go in? I noticed that on fly, the first etymology listed is an obscure (relative to the other ones) dialectal word meaning "wing." It seems to me that when some etymologies are significantly more notable than others, they should go first. Is there a policy I'm unaware of that makes the current order correct, or should it be changed? Nloveladyallen (talk) 00:20, 7 March 2019 (UTC)

The Japanese entry ない has the same problem. Etymology 1 is an unproductive suffix only found in a small number of words. --Dine2016 (talk) 05:00, 7 March 2019 (UTC)
I am not aware of a policy, but I think you can be bold and re-order the sections to what you deem the most logical. If someone else disagrees they can bring it up for discussion. - TheDaveRoss 13:25, 7 March 2019 (UTC)
When I see obscure/dialect things as the first ety for an everyday word, I swap them around. Equinox 16:45, 7 March 2019 (UTC)
I agree, though I also tend to move up older etymologies, provided they still have at least one definition in common, widespread use. DCDuring (talk) 17:58, 7 March 2019 (UTC)
As to why it's like that on some entries: some people, at least historically, preferred to put the oldest / first-attested etymologies/words first. (And some users, at least historically, straight-up put uncommon or obsolete or dialectal etymologies/words first even when more common ones are equally old...) Please do re-order them. - -sche (discuss) 22:52, 7 March 2019 (UTC)
  • It's a matter of the editor's personal preference. AFAIK the order of etymologies (and definitions) has not been enshrined in Wiktionary policy. IMO we should put the most common usages first, and that's the way I have been editing. ---> Tooironic (talk) 05:27, 9 March 2019 (UTC)

[Japanese] Should historical Kanji readings always be noted whenever applicable?

For example, the historical inscription of 川 in Kunyomi is かは (which has since been reformed to かわ), so should every instance of 川 in a word being read as かわ have かは as a historical hiragana? I noticed that most entries do not bother adding it but a few of them do. --Four-fifths (talk) 04:29, 9 March 2019 (UTC)

You could start adding them, of course. It's just missing information. —Suzukaze-c 04:38, 9 March 2019 (UTC)
I would prefer for historical kanji readings/spellings to be added only if the historical reading is attestable in historical literature. KevinUp (talk) 08:19, 9 March 2019 (UTC)
A number of historical texts do not spell according to 歴史的仮名遣い. Here is an example where 全(まと)う, reduced from 全(まっと)う, is spelled またふ despite the correct historical spelling being またう. --Dine2016 (talk) 10:59, 9 March 2019 (UTC)

Language family trees in category pages

Hi, JohnC5 and I would like to add language family trees (generated by Module:family tree) to language categories, probably at the bottom of the text, directly above the lists of subcategories and pages in the category. This has been a plan for a while, but thanks to some HTML and CSS work by Suzukaze-c (and some work by me), the tree is finally in a presentable state.

As an example, the following tree would be added to Category:Proto-Germanic language. It shows the descendants of Proto-Germanic, based on the language data that is used in our entries. Click "Expand" to see the tree.

Some aspects of the tree are confusing. Etymology languages (such as American English) are shown as children of the languages, language families, or language variants that they belong to (in this case, English). This does not mean that they are descendants (like English is a descendant of Middle English); we simply don't have a better way to display them in the tree.

Currently, language families have a tree emoji after them (🌳) and etymology languages have a speech bubble (💭). This could be changed.

Some aspects of the style of the tree are not set in stone. One disagreement is the position of the tree icon: on the left side or the right side of the language family name. Currently it's on the right so that all the language and language family names line up. If you have an opinion on this either way, please let us know.

Is there any opposition to this idea? Also, any ideas for improvements? — Eru·tuon 08:14, 9 March 2019 (UTC)

Since nobody objected, the trees have been added to language categories. The icons have changed, though. Suggestions still welcome. — Eru·tuon 05:19, 17 March 2019 (UTC)

Encoding of apostrophe-like palatalisation marks in various languages

There are various languages written in the Latin alphabet that use a mark resembling an apostrophe or prime to indicate palatalization. The exact Unicode code point to use is often not specified in the language, or used haphazardly without regard to the function that Unicode designates for the character. As a result, there are many variations in use, often within the same language as well. The difficulty of producing the correct mark often leads language users to use the simple ASCII apostrophe ('), which is not well suited for that purpose. More generally, Unicode characters that are designated "punctuation" are often used as well, even though the palatalization mark is not a punctuation character and is sometimes considered a proper letter of the alphabet in question.

As far as the orthography of Skolt Sami is concerned, however, the codepoint to use is actually standardised: ʹ (U+02B9 MODIFIER LETTER PRIME). This character is intended in Unicode for use in linguistics to represent palatalisation, and we use it in our transliterations of Russian as well. More importantly, because it's considered a letter and not punctuation by Unicode, applications will not use it to separate words and will select it along with the rest of a word when you doubleclick on it. Therefore, this seems like the character we should use, and I hereby propose we make this the standard for all such cases across languages. This would affect various Finnic languages (Veps, Võro and Votic), but I'm sure there are others that I'm not familiar with. Spellings with alternative palatalization signs can become redirects to the spellings using the proposed symbol. —Rua (mew) 15:06, 10 March 2019 (UTC)
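The letter-versus-punctuation point can be checked with Python's unicodedata module. The sketch below is an illustration only: the list of look-alike characters to fold into U+02B9 is an assumption rather than any agreed standard, the function name is made up, and the test word is the Russian transliteration matʹ.

```python
import re
import unicodedata

MODIFIER_PRIME = "\u02b9"  # MODIFIER LETTER PRIME, the proposed standard mark

# Look-alikes to fold into U+02B9. This list is an assumption for the sketch.
LOOKALIKES = {
    "'": MODIFIER_PRIME,       # U+0027 APOSTROPHE
    "\u2019": MODIFIER_PRIME,  # RIGHT SINGLE QUOTATION MARK
    "\u2032": MODIFIER_PRIME,  # PRIME
}

def normalize_palatalization(word):
    """Replace apostrophe-like marks in a single word with U+02B9."""
    return word.translate(str.maketrans(LOOKALIKES))

# General category: "Lm" is a modifier letter, "P*" is punctuation.
for ch in ("'", "\u2019", MODIFIER_PRIME):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))

# \w+ stops at punctuation but keeps a modifier letter inside the word.
print(re.findall(r"\w+", "mat'"))
print(re.findall(r"\w+", normalize_palatalization("mat'")))
```

This is meant for individual page titles or headwords. Running it over ordinary text would also convert genuine apostrophes, so it is no substitute for checking each language's orthography.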

Seems right. Fay Freak (talk) 16:24, 10 March 2019 (UTC)
For people who can read French, you may be interested in Wiktionnaire:Apostrophes. We try to record all the apostrophe-like marks that should be used for each language. Pamputt (talk) 09:51, 11 March 2019 (UTC)
Please don't change the transliteration for Russian or other languages, which transliterate "ь" as "ʹ" (not a plain apostrophe), e.g. мать (matʹ, mother): "matʹ"! (Asking just in case). In Ukrainian and Belarusian a plain apostrophe is also a standard letter (different from "ь", which is also used), and Uzbek seems to use "ʻ".
Czech and Slovak use a symbol that is merged with the letter it palatalises, e.g. ť as in mať (mother). --Anatoli T. (обсудить/вклад) 11:55, 11 March 2019 (UTC)
I think you misunderstood. —Rua (mew) 19:31, 11 March 2019 (UTC)
Yes, I did. --Anatoli T. (обсудить/вклад) 23:20, 11 March 2019 (UTC)
Seems OK, yes. Per utramque cavernam 19:37, 11 March 2019 (UTC)

: When Japanese Kyujitai and Traditional Chinese shapes for the same codepoint differ.

I've noticed that the Japanese Kyujitai form and the Traditional Chinese form while sharing a Unicode codepoint, differ in that the Japanese form has an extra stroke joining the two stacked rectangles whereas the Traditional Chinese form does not. (Do they officially differ by stroke count?)

I'm assuming this is systematic across the majority of Japanese vs Traditional Chinese fonts.

I know we have some mechanisms for documenting when Simplified and Traditional Chinese forms have differing appearances but a shared codepoint? Do we do the same with Kyujitai vs Traditional? If so should we add that to this entry? If not should we start doing so? And what to do if such variation is not systematic but varies from font to font? — hippietrail (talk) 06:32, 11 March 2019 (UTC)

@Hippietrail: perhaps like 蝉#Translingual, where the usage notes describe the difference, or 浅#Translingual, where the IDS describes the difference? @KevinUp. Suzukaze-c 07:35, 11 March 2019 (UTC)
No, both Traditional Chinese (Taiwan/Hong Kong standard) and Japanese kyūjitai of have the same stroke number and glyph appearance. The difference occurs due to Xin Zixing (新字形) in mainland China, which substitutes all characters containing with which is one stroke less. KevinUp (talk) 08:40, 11 March 2019 (UTC)
The revised form of (containing rather than ) can be found in books published in mainland China, such as calligraphic books and modern dictionaries such as Xiandai Hanyu Cidian. The different glyph forms have been noted in this edit. KevinUp (talk) 08:40, 11 March 2019 (UTC)

Very interesting - thanks everyone! Should we do more to document these differences? Especially in the translingual section that shows the forms? This is a kind of variant form after all, even though sharing a codepoint. Do we have a category for characters affected by Xin Zixing? Are there cases where the Xin Zixing does get its own Unicode codepoint due to differences being regarded more significant?

  • The differences can be noted using IDS at the translingual section. I sometimes add detailed verbose descriptions under the "alternative forms" header (See for example). We don't have a category for characters that are affected by Xin Zixing.
  • We could create something such as Category:Traditional Chinese characters with Xin Zixing form, but characters have to be added on a case-by-case basis, because many of our IDS are incorrect (please check the official Unicode chart at https://unicode.org/charts/PDF/U4E00.pdf etc.) and some obsolete characters are encoded only in Taiwan and not mainland China.
  • Yes, there are many cases where Xin Zixing forms get their own Unicode codepoints. These are well documented at the Chinese Wikipedia article for 新字形. KevinUp (talk) 11:45, 11 March 2019 (UTC)

Also this means my photo is a bit of a quirk. I'm in Taiwan and I've taken photos of three different Japanese style "open for business" signs and all three actually use this Xin Zixing form rather than the Kyujitai or Traditional form. If anyone knowledgeable would like to alter the captions and descriptions here and/or over on Commons that would be great. I might upload the pictures of the other two signs too. — hippietrail (talk) 09:45, 11 March 2019 (UTC)

It turns out my photos are one of each form! Two are already at 営業中. I'll upload the third and I might make cropped versions of each too... — hippietrail (talk) 09:57, 11 March 2019 (UTC)
The characters in "commons:File:Japanese "open" sign in traditional characters.jpg" are written in nonstandard form:
  • The character (with instead of the orthodox ) is recorded as A02453-025 in 教育部異體字字典 (Dictionary of Chinese Character Variants).
  • If you look closely at the second character the bottom component of is written ⿳䒑一木 rather than its correct form ⿱䒑未.
  • I've modified the description of the file at Wikimedia Commons.
(By the way, this discussion belongs in the Tea Room.) KevinUp (talk) 10:42, 11 March 2019 (UTC)
Thanks once more! Sorry I've been away from beer parlours and tea rooms for so long I've forgotten. Please feel free to move the discussion in case I do it the wrong way and make a mess. I've just uploaded the third variant too:
營業中
hippietrail (talk) 11:17, 11 March 2019 (UTC)

Wiktionary:Votes/2019-03/Defining a supermajority for passing votes

I've drafted this vote to define the supermajority we use, as well as what "fail" and "no consensus" mean. Please give me feedback, particularly regarding the higher standard for modifications to WT:CFI and WT:EL. —Μετάknowledgediscuss/deeds 00:29, 12 March 2019 (UTC)

Reminder to contribute to the discussion at Wiktionary talk:Votes/2019-03/Defining a supermajority for passing votes, particularly on issues like whether only admins should close votes, and the higher standard for CFI and EL mentioned above. —Μετάknowledgediscuss/deeds 14:44, 15 March 2019 (UTC)

Standardizing some template shortcuts

Can we pick a standard for the shortcuts of alternative forms templates? Currently we have:

I am constantly forgetting which ones use a hyphen, a space, neither, or some combination thereof. Personally I would prefer a space with no hyphen for all of them: alt sp, alt form, and alt caps (without of) seem like the clearest option to me. Ultimateria (talk) 17:44, 12 March 2019 (UTC)

I take it back, I would keep of to be consistent with {{form of}}, {{synonym of}}, etc. Ultimateria (talk) 19:43, 12 March 2019 (UTC)

Page-deleter role?

It's been suggested several times in the past few months that we break up admin responsibilities into smaller roles. Personally, I think one such role could be that of "page-deleter".

As I've written here, I see the blocking tool as "the most powerful tool, and the one requiring the most discernment"; this means that someone trusted with it can easily be trusted with all the rest and be made an admin (I'm not the only one thinking as much). The reverse is not necessarily true: one could be trusted to do a good job as a page-deleter, but not as a blocker. That's why I think having the possibility of granting page-deleting rights as separate from adminship could be useful.

A user entrusted with that role would be able to delete entries that failed RFD, vandalistic entries, spam entries or spam user pages, empty categories tagged for deletion, wrong bot entries, unwanted redirects, etc.

If this is accepted, another question would arise: on what basis should it be granted: a vote? A whitelisting-like nomination?

What do you think? Per utramque cavernam 18:14, 12 March 2019 (UTC)

While I feel slightly less strongly about this idea than the blocker version, it still feels like a solution in search of a problem, and I would still probably prefer that we just have a single role for blocking/deleting. Deletion is also a multifaceted function. Which of the following functions would be included: page delete, page undelete, revision delete, revision undelete, view deleted/hidden revisions, delete log entries, delete tags, mass delete? If it were to be a new role, I would suggest it be voted on. - TheDaveRoss 12:47, 13 March 2019 (UTC)

WMF proposes rebranding Wiktionary as a "Wikipedia project"

WMF conducted a study discovering that "Wikipedia" is the most recognized name and project, while "Wikimedia" is less recognized. It proposes rebranding Wiktionary as a "Wikipedia project". For public feedback, you should go to meta:Talk:Communications/Wikimedia brands/2030 research and planning/community review; for private feedback, email brandproject(a)wikimedia.org. --George Ho (talk) 21:12, 12 March 2019 (UTC)

My feedback is that Wikipedia is awful and I'd hate to be affiliated with it. DTLHS (talk) 21:16, 12 March 2019 (UTC)
Nothing like a needless "rebrand" to suck up volunteer donations :( Equinox 21:54, 12 March 2019 (UTC)

You may now become 'Wiktionary — A Wikipedia project'

According to this discussion at Meta, the Wikimedia Foundation is considering rebranding. For you, this means that Wiktionary, rather than being a Wikimedia project, would become a Wikipedia project.

The proposed changes also include

  • Providing clearer connections to the sister projects from Wikipedia to drive increased awareness, usage and contributions to all movement projects.

While raising such awareness is in my opinion a good thing, do you think classifying you as a 'Wikipedia' project would cause confusion? Do you think newcomers would have a high risk of erroneously applying some of Wikipedia's principles and policies here which do not apply? If so, what confusion? Could you please detail this? I have raised a query about that HERE in general, but I am looking for specific feedback.

Please translate this message to other languages. --Gryllida 23:05, 12 March 2019 (UTC)

@Gryllida: This is a terrible idea. We frequently have newcomers, both with and without actual experience editing Wikipedia, attempting to apply English Wikipedia policies like notability (which has a local, but very different lexicographical equivalent) and 3RR (which does not exist here). We try to patiently point them toward noticing the name of the website they are currently editing, and to acknowledge that they are separate projects. I can only imagine how much more confusion there will be if this were to go through. —Μετάknowledgediscuss/deeds 00:30, 13 March 2019 (UTC)
Thank you for these clarifications. Three questions:
  • Apart from notability and 3RR, is there anything else that is different?
  • Would you be willing to give examples of these confused newcomers and the communication with them?
  • I've found Wiktionary:Wiktionary_for_Wikipedians. It talks about the differences. Is it up to date? Is there any other relevant documentation that you would share in response to this question? Gryllida 01:09, 13 March 2019 (UTC)
    I find it confusing that in this discussion Wikipedia appears to stand for the English Wikipedia, and Wiktionary for the English Wiktionary. For each language, these have their own policies and customs. As to the respective English-language projects, it is easier to list the commonalities: (0) Like all Wikimedia projects, both use MediaWiki software; (1) Anyone can edit Wiktionary (but, unlike on Wikipedia, also anonymous IPs can create pages); (2) Users who are apparently not there to contribute to the project will soon find themselves blocked. (3) Only administrators, who get that role only after having been approved by the user community, can block users and delete pages. That’s about it.  --Lambiam 10:15, 13 March 2019 (UTC)
Currently we can speak of "Wikipedia policies" and "Wiktionary policies" (or "...votes", "editors", etc.). How are we supposed to distinguish these things, in speech and writing, after the word "Wikipedia" subsumes Wiktionary? Equinox 00:49, 13 March 2019 (UTC)
If the rebranding is approved, the name 'Wiktionary' will remain. As I understand, it will become named 'a Wikipedia project' (the new branding) instead of 'a Wikimedia project' (the current branding), that is all.
While at the moment we see the 'a Wikimedia project' only at certain pages (the main page; {{sisterprojects}}; documentation; these places are pretty hard to discover), if the rebranding is approved, the belonging of the project to the family of Wikimedia (to-be Wikipedia) projects may be featured more prominently. Gryllida 01:05, 13 March 2019 (UTC)
Today I can say "he edits Wikipedia but not Wiktionary". How would I say that afterwards? Equinox 01:07, 13 March 2019 (UTC)
The same phrasing and names would apply. Their first name (Wiktionary, Wikipedia, etc) would remain the same and their last names ('a Wikimedia project', which nobody sees now, but after the rebranding they may become 'a Wikipedia project' and become more prominently shown to readers) would change, so to speak. Gryllida 01:12, 13 March 2019 (UTC)
What would you think of renaming 'Wikimedia' to 'Wikimania'? To name it 'a Wikimania project'? Perhaps Wikimedia Foundation likes this brand, and it does not cause as much confusion as 'a Wikipedia project'. It is probably not too bad that there is a conference with this name, it is about the same movement anyway. Gryllida 01:16, 13 March 2019 (UTC)
  • The problem is that most of the world is confused about everything in WikiWorld except Wikipedia. So for outward-facing presentation purposes we probably benefit from a more explicit connection with WP. This seems to me to be a lot like what I have to do when I explain this project which has consumed much of my time for more than a decade. I have to say "Wiktionary is like Wikipedia, except it's a dictionary. It's supported by the same foundation that supports Wikipedia." Two sentences; two mentions of Wikipedia. To me this re-branding is almost a non-event. It seems like a simple recognition of where we stand in the eyes of the world. DCDuring (talk) 02:29, 13 March 2019 (UTC)
    Thank you DCDuring. Since your position is that the Wikipedia brand would not cause harm, what do you think about the Wikimania brand? Do you think naming Wiktionary 'a Wikimania project' would do any harm? Do you think this change would be as good as the 'a Wikipedia project' name? Gryllida 03:29, 13 March 2019 (UTC)
I agree that this isn't really a thing. Really what is happening is that Wikimedia is rebranding itself as Wikipedia. We wouldn't need to change anything around here. - TheDaveRoss 03:06, 13 March 2019 (UTC)
I'd also welcome the opportunity to know your opinion about the 'Wikimania' brand as well. It is not confirmed by Wikimedia at this stage but knowing your views about it would be nice. Gryllida 03:30, 13 March 2019 (UTC)
I thought you were joking. The Wiki movement has worked quite hard to be taken seriously and has finally achieved the objective for many audiences. 'Wikimania' would undermine all that progress IMO. It seems to convey the image of the lunatics running the asylum. DCDuring (talk) 03:47, 13 March 2019 (UTC)
Yeah, I find "Wikimania" a bit harder to take seriously. Equinox 03:54, 13 March 2019 (UTC)
I am not an expert but my gut feeling is that the way things are now is fine. There is the saying: if it's not broken, don't fix it. As a frequent editor, if this change were made, it would not affect me too much. Geographyinitiative (talk) 05:20, 13 March 2019 (UTC)
It aint broke - so don't try to fix it. SemperBlotto (talk) 06:55, 13 March 2019 (UTC)
While I agree with some sentiments that this could marginalize smaller projects, the decision to rebrand makes sense. As a word, Wikimedia is just too close to Wikipedia and gets easily confused, both in reading and speaking. The choice of Wikimedia as an umbrella term was unfortunate in the first place. – Jberkel 11:59, 13 March 2019 (UTC)
I think the branding is broken. The proposed change seems reasonable.
I still get confused navigating among Wikimedia Foundation, MediaWiki, and Meta-Wiki. I hope that my confusion is not an indicator of the confusion of others. DCDuring (talk) 12:26, 13 March 2019 (UTC)
The MediaWiki vs Wikimedia naming is unfortunate. Back in 2003 the naming committees really got stuck on a theme. I don't think Wikimania is a better brand or name than any of the alternatives, I don't think it has any cachet outside of a subset of the Wikimedia community and I don't think it is strictly worse at indicating what the thing is that it is naming. Wikicon would be a better name for Wikimania to begin with, at least that follows the form of the thousands of other conventions. - TheDaveRoss 12:39, 13 March 2019 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

Assuming that the purpose of having a unified brand is facilitating publicity for all projects, a major consideration is how evocative and easy-to-remember the brand name is. While currently Wikipedia is the best-known name associated with Wikimedia, with the right approach any well-chosen name can quickly become widely recognized; it is just a matter of generating publicity. I agree that Wikimedia was an unfortunate choice: not appropriately evocative (“media” is not a unifying focus), and easily confused with Wikipedia or MetaWiki. Replacing it by Wikipedia will raise the confusion to an unmanageable level. Wikimania may seem cool but has bad connotations that are just too strong and is irresistibly inviting of the derived term Wikimaniac, which is fine for internal use, but we would not be able to keep its use contained. Why does the WMF not open up a contest for a unified brand name in the style of WikiXXX for some suitable term replacing XXX, with (after a preliminary selection producing a shortlist) the user community selecting the winner? My submission: Wikiworld. That certainly covers everything and has a nice alliteration. (I know there used to be a WikiWorld, but that has now been defunct for over 10 years.)  --Lambiam 14:52, 13 March 2019 (UTC)

I'm skeptical that any change will improve whatever perceived problem there might be. If Dine Brands Global Inc. changed their name, would people eat at Applebee's or IHOP more often? I doubt anyone would notice. -Mike (talk) 16:51, 13 March 2019 (UTC)
I'm boycotting Google because they aren't Google on the stock exchange anymore. DCDuring (talk) 17:05, 13 March 2019 (UTC)
There is a proposal at meta to have a brainstorming for different names. (Thanks Lambiam) The names proposed so far are 'wikipedia', 'wikiworld', 'wikimania', 'wikiweb'. Please share your proposals either here or there, at your convenience. Gryllida 00:43, 14 March 2019 (UTC)

Not only will it cause confusion because of an old sense competing with a new sense, if you rebrand Wikimedia to Wikipedia or any other project’s name, but it will also be factually untrue if you call Wiktionary “a Wikipedia project”. Wiktionary isn’t a Wikipedia project, won’t become one, shouldn’t become one, even if you do hold the Wikipedia brand in higher esteem, which I do not, thinking that the abyss would stare back; the confusion and separation issue is enough of a reason. If you do a rebranding, do it only if that is worth it, and don’t mingle projects insofar as they are intentionally separate.
Currently your issues are that Wikimedia is not distinctive enough, being only different in one grapheme or phoneme, though this issue is minor and can be ignored, as it has been until this proposal, and that on the other hand the merits of Wiktionary, a project whose higher quality owes much to its working separately (the same goes for other projects like Wikispecies), are not highlighted enough. If you show an attachment of Wiktionary to Wikipedia you will pull it down and achieve the opposite of what you want to achieve. The message must be and stay: Wiktionary will give you an experience that is well above that on Wikipedia. Wikipedia has lost its chances to be taken seriously, I am sorry to blackpill you, though the usefulness of Wikipedia is of course not debated by anyone, and Wiktionary is currently above it, as is Wikispecies, but people do not know the difference, only know Wikipedia. It is important to make known to those who have, rightly, lost hope in Wikipedia, that Wiktionary 1. is made by other editors, 2. who work pursuant to dissimilar principles and workflows, even if they also edit Wikipedia, and 3. describes a wholly unlike subject matter; hence the resulting project should not be put on one level with Wikipedia. Fay Freak (talk) 04:27, 15 March 2019 (UTC)

Who is the “you” implied in “your issues”, used above? Are you addressing the Wikimedia Foundation? I don’t expect them to be monitoring the discussion on this page.  --Lambiam 09:16, 15 March 2019 (UTC)
Yes, Wikimedia’s, and also like one’s, the editors who try not to confuse when explaining; though I am not sure if they don’t even monitor this where they have posted, this being a Wikimedia project. Well I could repost it under meta:Talk:Communications/Wikimedia brands/2030 research and planning/community review#Wikipedia I guess since I now do not discern a different place for it; it would be a comparatively long answer there though. Fay Freak (talk) 14:38, 15 March 2019 (UTC)

Deleting Template:redlink_category and Module:redlink_category

FYI, I started a discussion about deleting this feature on the talk page of the template. Noting here in case not everyone notices the discussion there. - TheDaveRoss 15:55, 14 March 2019 (UTC)

I hope this doesn't need a vote. It seems to me to merit a BP discussion, especially because the idea behind it is potentially of wider application. DCDuring (talk) 16:04, 14 March 2019 (UTC)
Although this is a really awful hacksaw-and-bailing-wire-and-duct-tape way of doing this (run a module from every linking template on every page that has linking templates, every time any such page loads, with an expensive parser function run every time if the template is linking to one of the target languages- really? To populate todo lists?), there are people who find the information it generates very useful, and no one seems to want to spend the time and effort to generate it by other, more sensible methods. Chuck Entz (talk) 03:19, 15 March 2019 (UTC)
It's not too much effort to implement something like this by analyzing the dumps. It would work for all languages, and could take other ways of linking into account (plain wikilinks), and perhaps even indicate orangelinks. – Jberkel 10:30, 15 March 2019 (UTC)
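As a rough idea of what such dump analysis might look like, here is a minimal Python sketch. It rests on several assumptions: the pages are already available as a mapping from title to wikitext (reading the actual XML dump is left out), only `{{l|…}}` and `{{m|…}}` templates with the language code as first parameter and the term as second are recognized, and a link counts as red when no page with that exact title exists. The function name `find_redlinks` and the sample data are hypothetical, and orangelinks (page exists but lacks the language section) are not handled.

```python
import re
from collections import defaultdict

# Matches {{l|nl|woord}} or {{m|de|Wort|...}}: language code, then target term.
TEMPLATE_LINK = re.compile(r"\{\{(?:l|m)\|([^|{}]+)\|([^|{}]+)")
# Matches [[target]], [[target|label]] and [[target#Section|label]].
PLAIN_LINK = re.compile(r"\[\[([^\]|#]+)")


def find_redlinks(pages):
    """Return {language: set of link targets that have no page}.

    pages maps page title -> wikitext. Plain wikilinks carry no language
    code, so they are collected under the key "(plain link)".
    """
    existing = set(pages)
    missing = defaultdict(set)
    for text in pages.values():
        for lang, target in TEMPLATE_LINK.findall(text):
            if target.strip() not in existing:
                missing[lang.strip()].add(target.strip())
        for target in PLAIN_LINK.findall(text):
            if target.strip() not in existing:
                missing["(plain link)"].add(target.strip())
    return dict(missing)


if __name__ == "__main__":
    sample = {
        "huis": "{{l|nl|boom}} and {{m|de|Haus}} and [[huis]] and [[kat|cat]]",
        "boom": "some text",
    }
    print(find_redlinks(sample))
```

Because this runs offline over a dump, nothing has to be evaluated on page load, and the same pass covers every language and both kinds of links.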

category: silent t

Please could somebody create a category for words with a silent t: moisten, often, thistle etc. --Backinstadiums (talk) 14:45, 15 March 2019 (UTC)

Pronunciation trivia of this sort seems more suited for an appendix really. --Tropylium (talk) 09:47, 18 March 2019 (UTC)

Translations in languages you don't know

In User talk:Panglossa#Translations in languages you don't know, we read "Please avoid adding these. It is very easy to make mistakes, and even if you get the content right, you may end up adding it in the wrong way, as you did at walk, thus requiring someone else to clean up after you."

The original poster to the user talk page added e.g. Czech pivní sýr, and admitted they do not know Czech.

As far as I know, multiple established editors add translations in languages they do not know. A very recent example is diff, where at least Estonian and Greek do not match any Babel box.

Do we want users to receive such messages on their talk pages? Do we want to introduce a policy or recommendation to the effect of that message on the user talk page? --Dan Polansky (talk) 09:59, 17 March 2019 (UTC)

The way the message is phrased makes it sound like the boilerplate language of Wikipedia warning templates (“When moving pages, please remember to fix any double redirects”). Warning users when they make a kind of mistake that they are likely to repeat is by itself a good thing. It would have been better (I think) if the mistake had been specified more, and I think I might have phrased the warning like “please be extremely careful when...”. Perhaps a readable essay for new editors with positive advice on how one can contribute to Wiktionary (focus on languages you are familiar with) works better than introducing guidelines on what to avoid.  --Lambiam 12:08, 17 March 2019 (UTC)
Elsewhere, I made the following proposal:
Editors can contribute new entries even for languages that they do not know and have not studied. However, in such case, they are strongly encouraged to work very carefully with sources, and get acquainted with the lemmatization practice of the English Wiktionary for the language. For instance, for Latin, some dictionaries use e.g. stare as the lemma while Wiktionary uses sto as the lemma.
Whether that should have the status of policy, guideline or advice is a little less important, I think. --Dan Polansky (talk) 12:17, 17 March 2019 (UTC)
The status is less important than how easy a read it is.  --Lambiam 13:56, 18 March 2019 (UTC)
I agree that there shouldn't be an absolute prohibition from editing or adding translations in languages one doesn't know. I think in such cases one should be very careful, but certainly there are cases when being pretty much sub-A1 level in a given language doesn't preclude one from being able to consistently add correct and useful content in that language. There is in my experience a gradation in the degree to which one can be unfamiliar with languages not listed on one's Babel. For example: I am absolutely lost when confronted with Chinese or Nahuatl texts, but if you give me a Romanian word I am confident that I could with some effort find out whether it is in use, whether it is SOP or what its lemma form is. Perhaps we could for convenience create a Wiktionary namespace page or a new section on a relevant extant page with advice and warnings regarding possible pitfalls when editing/translating in languages one doesn't know (with a shortcut à la WT:ATTEST or WT:EL like idk, WT:UNFAMILIAR or w/e), but it'd be undesirable imo to prohibit such editing entirely (not least because proficiency is self-reported anyway, making such a rule difficult to enforce). — Mnemosientje (t · c) 13:08, 17 March 2019 (UTC)
Again, Dan, this isn't a matter of policy. I don't leave these messages for everyone doing it, and I wasn't going to for Panglossa until they made a mistake that I had to clean up. At that point, their contributions became a slight waste of another editor's time, and I therefore wanted them to stop doing that. It's that simple. —Μετάknowledgediscuss/deeds 15:09, 17 March 2019 (UTC)
I add such entries from time to time, and while I usually make sure the entry is correct, either by checking a dictionary or by asking a native speaker, I understand I can make mistakes, especially regarding the form of the entry. I welcome Μετάknowledge's warning about the correct procedure, but I also understand this is a collective project: we contribute what we can and more knowledgeable peers will correct it if necessary. I will certainly be more careful from now on, but whenever I find something worth including, I will do so. Panglossa (talk) 15:19, 17 March 2019 (UTC)
@Panglossa: Thank you. If you're willing to put in the care to check both correctness and that the lemma/spelling/etc. meets with Wiktionary's standards, then I am perfectly satisfied. —Μετάknowledgediscuss/deeds 17:08, 17 March 2019 (UTC)
What about adding these under "Translation to be checked"? Panglossa (talk) 15:22, 17 March 2019 (UTC)
That's not really the purpose of the "Translations to be checked" sections, which are for translations where it's not known which of the translation sections a translation belongs to. Instead, you should use the template {{t-check}} where you would use {{t}}. This automatically tags it for checking by someone who knows the language, and also displays a message saying that it needs to be checked. This will alleviate much of the problem, though it still requires someone spending time later to clean up.
The biggest potential problem is that someone may add a translation that's wrong and that goes unnoticed. We don't have a really convenient way of finding all the translations in a given language, so it could be a long, long time before it's fixed. Translations are very hard to patrol, since they involve language-specific knowledge that no one person has for every language, and there's no way to check where the contributors got them.
If I see someone add or change translations in a large number of unrelated languages, that immediately raises my suspicions. Yesterday, a Canadian IP completely reworked the translation tables at middle in a single edit, with changes in multiple languages that I don't know. Fortunately, one involved changing an uppercase German noun to lowercase, which no one who knows anything about German would ever do, so I reverted all their edits and blocked them. They could have been mostly right, but the difficulty of sorting through all of their changes in all of those languages made throwing all of it out the only practical option once I knew they were seriously wrong on one aspect. Chuck Entz (talk) 15:59, 17 March 2019 (UTC)

Pronunciation respelling for English

I propose adding the corresponding entries for the graphemes of some pronunciation respellings for English, especially the one used by Wikipedia, that is, WIK-ih-PEE-dee-ə. --Backinstadiums (talk) 14:51, 17 March 2019 (UTC)

Oppose. These aren't words, nor do they have any meaning in a language. —Rua (mew) 15:12, 17 March 2019 (UTC)
If you mean adding WIK-ih-PEE-dee-ə, then no. If you mean adding to e.g. ee that it's used to represent /i/, then maaaybe, but pronunciation respelling schemes are possibly too varied for us to want to try to include them all; they are as Rua says not words. (And in many works that use them, they're explained in appendices already.) - -sche (discuss) 19:43, 19 March 2019 (UTC)

Attestations of native toponyms mentioned in Latin texts

Many old toponyms of Europe are found only in the form of mentions within Latin texts. Because the text itself is Latin, it seems that our CFI would treat these words as Latin. However, they are generally not Latin grammatically (i.e. they lack Latin endings), and are by and large written down by native speakers of the area in question, not native speakers of Latin. Thus, it can be argued that this is simply code-switching, inserting for example an Old Dutch name in its native form into an otherwise Latin text. If they are considered an attestation of the native language, we can include them in etymologies of modern place names, which is great. It wouldn't make sense to say that a modern Dutch place name is descended from a Latin name merely because the Old Dutch name was quoted in a Latin text. There really isn't anything Latin about these other than the language of the text they happen to appear in.

My question is whether these toponyms count as attestations for the local language, rather than Latin. I'm not sure if CFI says anything about this either way, but it certainly seems like it would be desirable to be able to include these. —Rua (mew) 15:12, 17 March 2019 (UTC)

If an indeclinable Latin form and an Old Dutch form (code-switched into Latin) are indistinguishable by form, then according to some, one and the same occurrence attests this form for both languages, since people seemingly still fail to see criteria for the lexicographical quality of an occurrence with regard to code-switching.
I’d argue for a “favour for the smaller language.” If you say it is Latin, one expects somewhat clearer evidence showing that these are names used in Latin; otherwise one could add place names without end because they appeared somewhere in Latin, which would be insipid. Whereas if you see such a thing for Old Dutch, one naturally can’t expect to pull out more.
In my view toponyms and personal names should not even get their own language sections. They should be under L2 headers called “Name” or similar, other spellings being soft redirects like سميث‎ being “Arabic spelling of Smith” for instance; also using own linking templates perchance. Things like Timișoara and its argument “is this Spanish?” will get on everybody’s wick at some point. Why do we need hundred entries for Srebrenica only because history books about the Yugoslav Wars have been written in hundred languages? Why is Karadžić according to Wikipedia an English name spoken /ˈkærədʒɪtʃ/? I don’t believe in the “pronunciation information” arguments. If a Turk lives in Germany his name will stay bare Turkish for seven generations and beyond. Eindeutschung according to peculiarities of law won’t help. Kowalski is still not a German name. And yeah, all the entries in Category:English surnames from German are German lexemes used in English discourses, if not the German spellings of Slavic names etc. Kaufman is German and not English. People just don’t realize that they don’t talk English any more when they use these names. No, this is not code-switching. Names work differently. In other words languages are sets that do not contain proper nouns, since, rightly observed, these stay if you switch the language. Fay Freak (talk) 17:30, 17 March 2019 (UTC)
Why don't English speakers need to know how Karadžić is pronounced by English speakers? Why is Kaufman German? As our entry points out, and w:Kaufman (surname) shows, it's not a name used in Germany. According to Kaufmann, the basic form is attested back to Old High German, so if it's not English, it's not German either.
Proper nouns do not necessarily stay if you switch language, as the translation table for Rome makes clear. Even a new city like Las Vegas has six Latin-script Wikipedias that chose a name for their article other than "Las Vegas" (or "Las Vegas, Nevada"), with Navajo ranking in as the most unusual with Naʼazhǫǫsh Hatsoh. Tokyo is named Tokyo, Tokio, Tòquio, Tokyô, Tōkyō, Tóquiu, Tókio and Tang-kiaⁿ-to͘. To go at it from another direction, Perth may be spelled the English way in many languages, but it is not pronounced the English way in most of them, the dental fricative being rare among the world's languages. It's a complex mess, and your rant bluntly ignores all the hard details.--Prosfilaes (talk) 09:22, 18 March 2019 (UTC)
“Perth may be spelled the English way in many languages, but it is not pronounced the English way in most of them, the dental fricative being rare among the world's languages” – does not seem so. This name does not appear in discourse in German and if a German tries to pronounce it he tries a dental fricative. There is currently no place in Australia or Scotland having a lexicalized German pronunciation. How would a Russian pronounce it? It would likely also be with a dental fricative, if the speaker knows about its existence in other languages.
Las Vegas being primarily inhabited by English speakers, it would of course be notable in its section. Or in general, if we have “Name” sections, we can put the English pronunciations first; it would also include a German pronunciation, which also is lexicalized, /las ˈveːɡas/. But I would be wary of conjecturing any, as you do for “Perth”.
The spelling or pronunciation, or inflection, apparently does not say anything about nativization; it is not constitutive and can not be taken as an indication for a name being included in a language, not related to what lexicalization means. “Kaufman” is a German name, only and even though used in Germany in a different spelling. But the pronunciation information would not get lost, as I said. It is important to see that one can’t just talk about “names used in Germany”, “a name used in the United States” and the like. “Names” aren’t “used” this way. They aren’t used because they belong to a language but because they belong to a specific entity referred to; with rare exceptions. Only one in England for German, a few more in Italy (Rom, Mailand, Venedig, Florenz, Turin, Padua, Genua, Neapel and then it ends for any speaker, if I haven’t missed one; other places are perceived and spoken as if bare Italian, ignoring those in the now or once German-settled areas), and for Poland in former German-settled places both compete. Is Szczecin German because it is sometimes used in this form and not Stettin in German newspapers and the like? No, this is a wrong question, not even Stettin is German in this sense: “Rome” being different by language, or even names being calqued, does not say anything, since names are changed even within one language: Са́нкт-Петербу́рг (Sánkt-Peterbúrg), Ленингра́д (Leningrád). See, place names and people can just be “renamed” pursuant to the law; this also shows how names work differently. The question “is this of the language X” does not arise in such a form in nature for names; you ask it only because on Wiktionary you group everything under a language. By grouping names independently you avoid such questions, which are wrong.
I also want to emphasize that place names and personal names slant the statistics in the categories “X terms borrowed from Y”. One could go around in Germany and quote the local Russian journals for any commune in Germany, and we would get 11,000 “terms” borrowed from German into Russian this way (the approximate number of communes in Germany). No, this is underplayed, since the towns also have districts, so the number is actually higher, even if we count stretches of land of which seemingly no Russian has ever heard. Fay Freak (talk) 14:26, 18 March 2019 (UTC)
A German who knows no English who tries a dental fricative does not in fact produce a dental fricative. Vowels will consistently get mangled, as you point out with Las Vegas. Names get mangled, both spelling and pronunciation, into various languages, particularly when the place or person doesn't speak the original language. Every major city of Europe has one or more cities named after it in the US, and all of those cities have their pronunciations anglicized. One of my friends grew up near Venice, Missouri, and it took her years to realize that the city was named after a place in Italy, as the cities, even in English, were not pronounced anywhere near the same.
Again, why is Kaufman German? If the Anglicization doesn't make it English, then the modernization doesn't make it other than Old High German.
Compare w:en:List_of_sovereign_states_and_dependent_territories_in_Europe, w:de:Liste_der_Länder_Europas, w:az:Avropa_ölkələrinin_siyahısı and w:lv:Eiropas_valstu_un_atkarīgo_teritoriju_uzskaitījums. Comparing the first and second lists makes it clear that English and German disagree on the names of about half of the nations of Europe. An examination of the third and fourth lists shows that Azerbaijani and Latvian make a habit of changing the spelling of names.
Spellings are changed all the time by law, and language regulators like the Académie française change the words for things. Places have different spellings and names depending on the language, and even pronunciations for the same name: /ˈkaʊ̯fˌman/ versus /kaʊfmæn/ for the name you brought up.
I understand that most place names just get adopted as is, with no real nativized pronunciation. But I don't think we can deal with that without recognizing that place names can be as intertangled with their language as any other noun.--Prosfilaes (talk) 01:43, 19 March 2019 (UTC)
Avoiding for a moment the question of what language to consider them, I'd note that they can still be mentioned in etymologies even if they're considered Latin, without saying that e.g. the Dutch name is derived from Latin. Lüneburg#German uses the wording "First mentioned in 956, in Latin, as Luniburc"; a similar approach would be to say something like "from Old High German *Foo, attested in Latin in 632 as Fou" (or Old Dutch, or whatever). - -sche (discuss) 19:50, 19 March 2019 (UTC)
I suppose so, but the Lüneburg example is exactly what I'm referring to in this question. To me, it seems weird to treat Luniburc as Latin, it doesn't look at all like Latin to me and has no Latin grammatical endings. —Rua (mew) 19:55, 19 March 2019 (UTC)

Old Gutnish

I don't know if this has been discussed before, but I am wondering what people think of adding Old Gutnish as an etymology-only language, with its parent language being either Old Norse or Gutnish. It shows up in descendants sections of Old Norse entries, and in Gutnish etymologies. However, as I understand, it is a dialect of Old Norse as are Old East Norse and Old West Norse, which do not have their own codes, so I'm on the fence. Julia 04:44, 18 March 2019 (UTC)

Old West Norse and Old East Norse are already mentioned in entries using {{label}} and have categories, so I think it wouldn't hurt to add etymology language codes for them. Jonteemil suggested adding them last year, but nothing came of it. I don't think their absence is a good reason not to add Old Gutnish. — Eru·tuon 05:59, 18 March 2019 (UTC)

Words whose letters are in alphabetical order

Do we have a category for words (such as "biopsy, almost, chintz") whose letters are in alphabetical order? SemperBlotto (talk) 13:22, 19 March 2019 (UTC)

I'd expect such a category to be in Category:English terms by orthographic property, but there's only Category:English words that use all vowels in alphabetical order. — Eru·tuon 17:53, 19 March 2019 (UTC)
I don't know if there is a name for these, I call them "alphagram words". Here is a list of a bunch of terms we already have which qualify. - TheDaveRoss 19:16, 19 March 2019 (UTC)
Created a list, though not restricted to English entries if that was what you were thinking of, from the latest dump. — Eru·tuon 19:25, 19 March 2019 (UTC)
Your list is a lot more permissive than mine. You have a slutty list. - TheDaveRoss 19:29, 19 March 2019 (UTC)
Interesting, your list is more permissive in another way, because it allows letters to be repeated. — Eru·tuon 19:53, 19 March 2019 (UTC)
Not hard to do this in Module:en-headword. Would Category:English words with letters in alphabetical order be an okay category name? I suppose Category:English words whose letters are in alphabetical order is clearer. The function here is the one I used for the list above. Might want to exclude words with uppercase letters or with at least two consecutive uppercase letters (acronym-like). — Eru·tuon 19:41, 19 March 2019 (UTC)
Nice lists. What's the longest alphagram word, by the way? Interesting to know according to both ASSes (Alphagram Sluttiness Systems) - the DASI (TheDaveRoss Alphagram Sluttiness Index) and the EASI (Erutuon Alphagram Sluttiness Index). --I learned some phrases (talk) 13:33, 20 March 2019 (UTC)
For DASI, aegilops (which is what Wikipedia lists as the longest) and affinors are the only 8-letter options. Aegilops has the advantage of not having any repeated letters, so it exists in EASI as well. It is also not plural, so it just feels good as the winner. If capital letters are allowed you can add DDMMYYYY to the 8-letter list, but that isn't a word. - TheDaveRoss 14:15, 20 March 2019 (UTC)
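The two criteria compared in this thread (letters in alphabetical order with repeated letters allowed, versus with no repeated letters) can be sketched in Python. This is only a minimal illustration, not the Lua function mentioned above for Module:en-headword. The name `is_alphagram` and the `allow_repeats` parameter are hypothetical, and accepting only lowercase ASCII letters is an assumption that follows the suggestion to exclude words with uppercase letters.

```python
def is_alphagram(word: str, allow_repeats: bool = True) -> bool:
    """Return True if the letters of `word` are in alphabetical order.

    Assumption: only non-empty words made of lowercase ASCII letters
    qualify, so acronym-like strings such as "DDMMYYYY" are rejected.
    """
    if not word or not all("a" <= ch <= "z" for ch in word):
        return False
    # Compare each letter with the one that follows it.
    pairs = list(zip(word, word[1:]))
    if allow_repeats:
        # Permissive reading: a letter may repeat (e.g. "ff" in "affinors").
        return all(a <= b for a, b in pairs)
    # Strict reading: every letter must come strictly after the previous one.
    return all(a < b for a, b in pairs)


if __name__ == "__main__":
    for w in ["biopsy", "almost", "chintz", "aegilops", "affinors"]:
        print(w, is_alphagram(w), is_alphagram(w, allow_repeats=False))
```

Under this sketch, a word with a doubled letter such as "affinors" passes only the permissive check, while "aegilops" passes both.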