Wiktionary:Beer parlour/2019/July

C'mon guys, it took us 12 days to revert a page move from español to Español. That's really lame. --I learned some phrases (talk) 21:23, 3 July 2019 (UTC)[reply]

I agree. Idea: most pages should be move-protected (admins only), seeing as they should never be moved. Any "basic" word in our best-covered languages (various online wordlists can supply these) or page with multiple languages on it ought to be protected this way. —Μετάknowledgediscuss/deeds 00:54, 4 July 2019 (UTC)[reply]
You may recall that I recommended this a while back, but no one with a bot thought it was worth the trouble. Every now and then I remember to do this when I visit an eligible page, so there are a few hundred done, at least. I added "Well-attested spelling, should not be moved" to the protection-reason menu, so it's really quite easy for any admin to do this while they're doing other stuff on a page- if everybody does a little bit, we can get a lot done.
My philosophy regarding this kind of thing is that you don't need to make everything vandal-proof (if that's possible): the idea is to do lots of little things to make vandalism more of a chore and less rewarding. Extreme countermeasures just increase the emotional rewards to getting around them- subtle and boring is the way to go. Chuck Entz (talk) 04:00, 4 July 2019 (UTC)[reply]
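For anyone with a bot who wants to try the batch approach, here is a minimal sketch of the request a bot might send for each page. It only builds the parameters for the MediaWiki `action=protect` API call and does not contact any wiki. The function name and the idea of passing in an already-obtained CSRF token are assumptions for illustration; the reason text is the one from the protection-reason menu mentioned above.

```python
def build_move_protect_request(title: str, csrf_token: str) -> dict:
    """Build POST parameters for a MediaWiki action=protect call.

    Hypothetical helper: it only assembles the payload; logging in,
    fetching the CSRF token, and sending the POST are left to the bot.
    """
    return {
        "action": "protect",
        "title": title,
        # Restrict page moves to admins. Per the API documentation, actions
        # not listed here have their restrictions removed, so a real bot
        # would need to carry over any existing edit protection.
        "protections": "move=sysop",
        "expiry": "infinite",
        "reason": "Well-attested spelling, should not be moved",
        "token": csrf_token,
        "format": "json",
    }


# Example: payloads for a small, hand-picked wordlist.
requests_to_send = [
    build_move_protect_request(word, "TOKEN-PLACEHOLDER")
    for word in ["español", "water"]
]
```

The wordlist itself (which "basic" words qualify) is the part that still needs a human decision.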
This was not a move out of vandalism but out of ignorance. Looking forward to little things we can do to make ignorance more of a chore and less rewarding :).  --Lambiam 18:05, 4 July 2019 (UTC)[reply]
It seems like some Wikipedias have a feature where edits must be verified before being shown. Could we enable that here? —Suzukaze-c 18:08, 4 July 2019 (UTC)[reply]
My experience with editing on such a Wikipedia was that my edits were reflexively reverted. Can we do some cost–benefit analysis? How serious is the problem of ill-considered edits? The level of actual vandalism seems to be low, compared to the English Wikipedia. The MediaWiki software allows any wiki – if its users want it – to turn on various protective features. There is the feature of semi-protection, which can be applied on a page-by-page basis to high-risk pages – mainly intended for temporary use. Then there is Flagged Revisions requiring edits by unconfirmed editors to be reviewed, with a more selective variant called Pending Changes used on the English Wikipedia. While these will reduce the volume of bad edits, they may also have a chilling effect on productive new editors. --Lambiam 23:24, 4 July 2019 (UTC)
We simply don't have enough resources to do it right, and having the feature installed makes it look like we're endorsing any edit that we allow to display. Wikipedias' content is mostly in one language, while ours is in hundreds and potentially thousands, and our patrollers are qualified in only dozens. Most edits would have to be either passed with no scrutiny of content or left in limbo for long periods of time. Chuck Entz (talk) 02:39, 5 July 2019 (UTC)[reply]

Changing "dtp" language name to "Kadazandusun"

The ISO code "dtp" now refers to Kadazan Dusun as of 2016 according to Ethnologue. It reflects the widely used standard name for the language that has been official since 1995. The official spelling for the language is Kadazandusun actually. I would like to request a change from "Central Dusun" to "Kadazandusun" in Wiktionary. --Tofeiku (talk) 13:40, 5 July 2019 (UTC)[reply]

Ngrams suggests you are correct that "Kadazandusun" (and to a lesser extent "Kadazan Dusun") is more common than "Central Dusun" or the other names Wikipedia mentions (Bunduliwan, Boros Dusun); a Google Scholar search is even more lopsidedly in favour of Kadazandusun (with "Kadazan-Dusun" also quite common). The various references on the language which Glottolog lists seem to use either "Kadazan-Dusun" or just "Kadazan". I support a rename but it will require updating quite a few pages, which I don't think I have time to do right now. - -sche (discuss) 02:53, 11 July 2019 (UTC)[reply]
The native spelling of the language is Kadazandusun, but the choice here depends on which spelling is common in English. Also, Kadazan only refers to Coastal Kadazan and Dusun refers to the Dusun dialect only and not Kadazandusun. The Wikipedia article should be moved too. --Tofeiku (talk) 07:56, 15 July 2019 (UTC)

Serbo-Croatian

Yugoslavia ceased to exist and, as a consequence, Serbo-Croatian also became defunct. Still, if you type bs (Bosnian), hr (Croatian) or sr (Serbian), the end result is automatically Serbo-Croatian. How could we correct this error? Rajkiandris (talk) 04:49, 6 July 2019 (UTC)

This is not some new revelation: it's been discussed and debated here several times. Serbo-Croatian as an official language may be defunct, but the fact remains that the standard forms of Bosnian, Croatian and Serbian are all derived from the same dialect and are all mutually intelligible and similar to the point that it's more practical to treat them as one language, and Serbo-Croatian is the best name for such a language. Chuck Entz (talk) 06:15, 6 July 2019 (UTC)[reply]
Further reading: Tomasz Kamusella, The Politics of Language and Nationalism in Modern Central Europe.  --Lambiam 13:44, 6 July 2019 (UTC)[reply]
@Chuck Entz I have been thinking that it may make sense to treat Kajkavian and Chakavian as a separate language. As can be seen at w:South Slavic languages#Comparison, Kajkavian is extremely similar to Slovene, while Chakavian is in between it and Shtokavian. It makes little sense to call all of these varieties "Serbo-Croatian", Kajkavian especially, while distinguishing Slovene. It would be valuable from an accentological point of view to include Chakavian, which is quite archaic in this respect. —Rua (mew) 11:40, 7 July 2019 (UTC)[reply]
Bad idea. Often one cannot even distinguish Macedonian and Bulgarian, like in the quote чу́тура (čútura), which fits neither of the two modern languages. The editor Константинъ Миладиновъ is according to the Bulgarian Wikipedia a Bulgarian and according to the Macedonian Wikipedia Macedonian. Also I have heard Kajkavian. It is like Berlinern. Probably extremely similar to Slovene like some accents of Russian are extremely similar to Ukrainian. Accentology is a weak argument, it bears little semantical load, and I do not see how splitting languages could help pursuing accentology. Fay Freak (talk) 13:23, 7 July 2019 (UTC)
I don’t like what @Rajkiandris proposes. It is an error to type “bs”, “hr”, “sr”. One does not type “at” either to find German, so those people who type “bs”, “hr”, “sr” are in error. Everyone knows that this is the same language. It happens often in Balkanized parts of the world that one language is known under multiple names. Confusion arises for remote parts of Africa but for Serbo-Croatian it is patent that it is the same language. Political unity is irrelevant, there is heavy exchange, and if I meet someone in Germany I can easily speak Yugoslav without knowing whether it would be Bosnian or Serbian or Croatian: people call themselves Yugoslav here. Have they not got used to Yugoslavia having broken up? No, they are conscious that a differentiation is over-differentiation. And probably someone speaking Kajkavian would act the same. One can come into the situation of speaking Serbo-Croatian without knowing whether it is Bosnian or Serbian or Croatian, and in most texts it is not distinct which it is: I find texts with words, I see they are Serbo-Croatian, I add words as Serbo-Croatian, I cannot see that it is Croatian or Serbian or Bosnian despite understanding everything, why would I need to undergo the hardship of examining whether a text in Serbo-Croatian is Bosnian or Croatian or Serbian? That’s why it is treated as one language. One should see easily by the text that something is in a language. If you need someone, the place of publication, or diacritics (“an accentological point of view”) to tell you that it is, it isn’t a separate language. Fay Freak (talk) 13:23, 7 July 2019 (UTC)
Your initial premise is a non-sequitur; Serbo-Croatian was standardized before Yugoslavia existed, and (at least as far as its Shtokavian dialects go) existed as a distinct abstand language long before its standardization. The death of a political entity doesn’t magically make a language extinct. — Vorziblix (talk · contribs) 21:16, 8 July 2019 (UTC)[reply]

Unified Japanese: a new proposal

Unified Japanese, the format that treats Classical Japanese and Modern Japanese under a single ==Japanese== header, has been proposed before, but past proposals were unsuccessful because they failed to distinguish between regular phonological developments (such as 会ふ → 会う) and morphological changes (such as 変ふ → 変える). I propose that we apply Unified Japanese only to the former case: if the 文語形 and the 口語形 differ only by 仮名遣い, we unify them under the modern spelling:

会う

Pronunciation
Verb

会う (intransitive)

  1. to meet; to encounter
Conjugation
Conjugation of 会う (五段活用) in Modern Japanese
Conjugation of 会ふ (四段活用) in Classical Japanese
Conjugation of 安布 (四段活用) in Old Japanese

On the other hand, if the difference between the 文語形 and the 口語形 is a morphological one, we give them separate pronunciation sections (while still merging definitions). Note that just as the 口語形 can be spelled in historical orthography, the 文語形 can also be spelled in modern orthography and pronounced in Modern Japanese, so we still benefit from the entry layout of Unified Japanese:

変える

Pronunciation
Verb

変える (transitive)

  1. : to change; to alter; …
  2. , , , : to exchange; to replace; …
Conjugation
Conjugation of 変える (下一段活用) in Modern Japanese
Conjugation of 変ふ (下二段活用) in Classical Japanese
Conjugation of ??? (下二段活用) in Old Japanese
変ふ

Pronunciation
Verb

変ふ

  1. Premodern shūshikei of 変える (kaeru).

What do you think about such a proposal?

By the way, in the examples given above, the kana and rōmaji are moved from the POS headers to the pronunciation section. I think this is a step long overdue: it is more logical (think of kanji entries with multiple etymology sections), and it greatly simplifies the entry layout of entries (especially kango which tend to have multiple POS).

(Notifying Eirikr, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4, Huhu9001, 荒巻モロゾフ, Mellohi!): --Dine2016 (talk) 05:44, 7 July 2019 (UTC)[reply]

I think the example "変ふ" is not an entry but rather a soft redirect, like doth for do. -- Huhu9001 (talk) 06:13, 7 July 2019 (UTC)[reply]
Ah yes, you're perfectly right. I wanted to express that "変ふ" should have its own pronunciation section, but I didn't make myself clear. --Dine2016 (talk) 06:26, 7 July 2019 (UTC)[reply]

@Dine2016: Could you succinctly state the disadvantages, if any, of this new proposal which I do support? --Backinstadiums (talk) 09:29, 7 July 2019 (UTC)[reply]

@Backinstadiums: (1) All stages of the language are treated under the modern spelling, which is anachronistic. For example, Old Japanese apu, Classical Japanese afu, and Modern Japanese au are treated under the modern spelling 会う, but the spelling 会う did not exist during the time of Old Japanese, and before the sound changes concerning ɸ during the time of Early Middle Japanese. On the other hand, Unified Chinese works because Traditional Chinese is applicable to the modern dialects, Middle Chinese, and to an extent Old Chinese. (2) Students of Classical Japanese may benefit more from a Classical Japanese dictionary covering only Old Japanese, Early Middle Japanese and later elements incorporated into this classical written language, instead of a historical dictionary that covers all stages under the modern form in modern spelling. Compare the situation in Japan: Even if there is the popular 広辞苑, there are still many specialized 古語辞典. (3) In the first situation (e.g. 会う), the modern form and the classical form are unified both under the modern spelling; in the second situation (e.g. 変える), the modern form is lemmatized in modern spelling and the classical form (変ふ) is lemmatized in historical spelling. And if you conjugate both forms to, for example, the ren'yōkei, giving the modern form (変え) and the classical form (変へ), they would again be unified under the modern spelling. This is a great inconsistency (although there will be soft-redirects when unified). This problem can be solved if we follow Japanese monolingual dictionaries and lemmatize wago under the modern kana spelling, even for classical forms:
い・ず いづ [1] 【出づ】 [1]
かど・う かどふ 【〈勾引〉ふ・拐ふ】 [2]
か・う かふ 【替ふ・換ふ・代ふ・変ふ】 [3]
Then everything will be in modern spelling, which is fairly close to modern pronunciation. --Dine2016 (talk) 10:09, 7 July 2019 (UTC)[reply]
I have a few concerns about this.
  • Pronunciations and conjugations
Thanks to the 1603 Vocabvlario da Lingoa de Iapam or Nippo Jisho, we have the Japanese of the time transcribed into the Portuguese spellings of the time, giving us a rough approximation of the sound values. These were sometimes substantially different from modern conventions. Consider the modern verb 買う (kau, to buy). The Nippo Jisho entry is here, right-hand-column, second entry down.
Form | Modern | 1603
終止形 / Terminal | /kau/ | /kɔː/
連用形 / Continuative, Stem | /kai/ | /kai/
過去形 / Past Tense | /katːa/ | /kɔːta/
Or consider the modern verb 替える (kaeru, to exchange, to replace), seen here in the Nippo Jisho, right-hand column, second entry down.
Form | Modern | 1603
終止形 / Terminal | /kaeru/ | /kajuru/
連用形 / Continuative, Stem | /kae/ | /kaje/
過去形 / Past Tense | /kaeta/ | /kajeta/
Note here that the Terminal form (the so-called "dictionary" or lemma form) differs in 1603 from both the modern かえる (/kaeru/) and the ancient / pre-Ashikaga or Muromachi period かふ (ancient reading */kapu/, pre-1600s reading /kafu/, pre-modern reading /kɔː/, modern reading /kau/).
  • How far back to go
I think it's a mistake to include Old Japanese, for a few reasons. Linking through to the OJP entry is not a problem, but including OJP conjugations in the modern JA entry is too much detail -- we have the OJP language code, and we're already starting to build out our OJP content, so there's no good reason not to put the details in an OJP entry.
I also think it's a mistake to use man'yōgana spellings for OJP lemmata, such as the 安布 example above to spell canonical 会ふ (to meet, to encounter, ancient reading */apu/, pre-1600s reading /afu/, pre-modern reading /ɔː/, modern reading /au/). Man'yōgana spellings were wildly variable, sometimes changing even within a single poem. Also, so far as I know, there isn't any consensus view of what the "most common" man'yōgana spelling would be for a given word. Native Japanese sources generally list OJP terms under the modernized kanji and/or kana spellings. I think we should follow suit.
If we are to include Classical Japanese in our modern Japanese entries, we must explain somewhere prominently and clearly that this is Classical Japanese as found in XXX usage (replacing XXX with whatever time period we decide to target). For instance, if we include Classical Japanese as used today, that differs from the Classical Japanese recorded in the 1603 Nippo Jisho. That difference is (so far as I've studied to date) mainly in pronunciation, but it's an important distinction and we would need to point that out.
Nota bene: I'm not opposed to some key parts of this proposal, particularly 1) unifying pre-modern and modern terms as much as possible, and 2) using kana as the lemma spellings for wago (native-Japanese terms), given the structural constraints of the MediaWiki platform that make it impossible to replicate the functionality of native-Japanese electronic dictionaries (where a single entry may have multiple indexed spellings, any of which will get the user the desired entry). My points above are to argue that, should we unify, we need to be clear about scope (how much to unify, how far back to go), and about how we present the information to users (differences in pronunciation, conjugation, etc.). ‑‑ Eiríkr Útlendi │Tala við mig 17:41, 9 July 2019 (UTC)[reply]
Thanks for your replies.
Pronunciations and conjugations: Yes, you're right. The pronunciation section of 買う should be like this:
I have removed the ambiguous "Classical Japanese" and added specific stages like "Early Middle Japanese" (800-1200) and "Late Middle Japanese" (1200-1600). Similarly, the conjugation section should contain four tables, the table for Modern Japanese listing the terminal form as kau and the past form as katta, and the table for Late Middle Japanese listing the terminal form as /kɔː/ and the past form as /kɔːta/.
Strictly speaking, "Classical Japanese" refers to the classical written language and does not correspond to any particular stage. So Classical Japanese was used up until World War II, while Early Middle Japanese was used between 800 and 1200. And as you noted before, there is a modern pronunciation of Classical Japanese where 買ふ is pronounced kō, different from Early Middle Japanese where 買ふ was pronounced kafu, and different from Modern Japanese (the spoken language) where it's pronounced kau.
Similarly, the conjugation section of 替える should link to Early Middle Japanese 替ふ and Late Middle Japanese 替ゆ, the former being “Old Japanese and Early Middle Japanese shūshikei of 替える (kaeru).”, and the latter being “Late Middle Japanese shūshikei of 替える (kaeru).” Cf. the 替ゆ entry in 精選版 日本国語大辞典.
How far back to go: I don't think the presence of the OJP code is a problem. Even with Unified Chinese (zh), we still have code for the sublanguages like Middle Chinese (ltc), Old Chinese (och) and Mandarin (cmn), Cantonese (yue), etc. which can be used in templates like {{bor}}. So codes are no problem. As for where to build content, searching insource:/\|m_kana=/ reveals that we have Man'yōshū quotations under the ==Japanese== header of , , , etc. Given that we have Old Japanese content under both ==Japanese== and ==Old Japanese==, I suggest that we move the latter to the former, in order to show the historical continuity of senses and conjugations, and in line with large kokugo dictionaries like the KDJ.
Using kana as the lemma spellings for wago: Yes. If we go for Unified Japanese, then it's better to use the kana spelling instead of kanji-kana majiribun as the lemma spelling of wago. Because kanji and okurigana usage may change over time, the most common spelling today may not be the most common spelling used over history, but kana is consistent throughout. As for using modern kana orthography (e.g. lemmatizing Modern Japanese 替える at かえる, Late Middle Japanese 替ゆ at かゆ, but Early Middle Japanese and Old Japanese 替ふ at かう), that's for consistency (for example, the etymological relationship between 買う and 替ふ is clear if both are lemmatized at かう). --Dine2016 (talk) 07:30, 10 July 2019 (UTC)[reply]
I fear that I do not know enough about historical stages of Japanese to have a strong opinion on the matter at the moment. —Suzukaze-c 09:47, 10 July 2019 (UTC)[reply]

@Dine2016: "And as you noted before, there is a modern pronunciation of Classical Japanese where 買ふ is pronounced kō, different from Early Middle Japanese where 買ふ was pronounced kafu, and different from Modern Japanese (the spoken language) where it's pronounced kau"

What is that modern pronunciation called? Neoclassical? --Backinstadiums (talk) 13:24, 10 July 2019 (UTC)[reply]

I don't think it has a name, though it is taught in many Classical Japanese textbooks published in Japan. --Dine2016 (talk) 05:05, 11 July 2019 (UTC)[reply]

@Dine2016: According to Prof. Victor Mair,

From my colleague Linda Chance, who is a specialist on Classical Chinese, the technical term for this is ハ行転呼音・はぎょうてんこおん.

It refers to the fact that from sometime in the Heian period the "ha" line changed to the same pronunciation as the "wa" line, but the "ha" line spellings continued in use. (Interesting examples--if you write these in modern Japanese with 'u' for 'fu,' 惟うに is still pronounced omō ni, but 失う becomes "ushinau" (except in some dialects.) This "modern pronunciation" is potentially centuries old. We read classical texts this way because we can't retrieve that original early Heian pronunciation. --Backinstadiums (talk) 14:19, 12 July 2019 (UTC)[reply]

@Backinstadiums: Thank you for your research, but ハ行転呼音 only accounts for the change of "ふ → う". For example, 今日 is originally pronounced けふ (as shown by the historical spelling) so by ハ行転呼音 it becomes けう, but it's now pronounced きょう, which means that there is another sound change which changed けう into きょう. In fact, if you compare historical spelling and modern spellings, you'll find that after ハ行転呼音 changes ふ into う, this う fused with the preceding vowel to make a long sound:
Historical spelling | After ハ行転呼音 | Modern spelling
あう, あふ | あう | おう
いう, いふ | いう | ゆう
えう, えふ | えう | よう
おう (ou), おふ (ofu) | おう (ou) | おう (ō)
I suspect the sound change that turns the second column into the third column is called “/Vu/ monophthongization”.
For Modern Japanese verbs ending with the vowel combination あう or おう, the う is treated as a separate element from the verb stem. For example, 思う is pronounced omo-u instead of omō, and 会ふ stopped at あう instead of evolving into おう. The "Neoclassical" pronunciation may be just a hypercorrection, an over-application of “/Vu/ monophthongization” to classical verbs ending with (あ)ふ or (お)ふ. Another possibility is that the "Neoclassical" pronunciation is a descendant of Late Middle Japanese. As Eirikr noted above, 買う was pronounced in 1603 as /kɔː/, so in Late Middle Japanese the verb-final う was not treated as a separate element from the verb stem. The Neoclassical pronunciation may simply have followed that. --Dine2016 (talk) 16:02, 12 July 2019 (UTC)
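The two-step development in the table above can be sketched in code. This is only an illustration, assuming the four vowel rows listed there: step 1 applies ハ行転呼音 (non-initial ふ becomes う) and step 2 applies the vowel fusion. The function names are made up for the example, and consonant-initial syllables such as けふ → きょう are deliberately not handled.

```python
# Step 2 mapping, taken from the second and third columns of the table.
VU_FUSION = {
    "あう": "おう",
    "いう": "ゆう",
    "えう": "よう",
    "おう": "おう",  # spelled the same, but read as long ō
}


def ha_row_shift(historical: str) -> str:
    """Step 1: ハ行転呼音, limited to the ふ → う case; word-initial ふ is kept."""
    if not historical:
        return historical
    return historical[0] + historical[1:].replace("ふ", "う")


def vu_fusion(kana: str) -> str:
    """Step 2: fuse vowel + う for the four bare-vowel rows of the table."""
    return VU_FUSION.get(kana, kana)


def modern_spelling(historical: str) -> str:
    """Apply both steps in order to a historical kana spelling."""
    return vu_fusion(ha_row_shift(historical))


for spelling in ["あふ", "いふ", "えふ", "おふ"]:
    print(spelling, "->", ha_row_shift(spelling), "->", modern_spelling(spelling))
```

Verbs like 会ふ are exactly where this sketch over-applies step 2, which is the point of the paragraph above: the modern verb stays at あう.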
@Dine2016: According to David Lurie:
The technical term for these changes is tenko-on 転呼音, but they are not applied consistently in words where the first mora ends in 'a.' I don't know if there is a specific term for those exceptions. --Backinstadiums (talk) 23:07, 15 July 2019 (UTC)[reply]

@Wyang What do you think of the proposal above? I really hope someone with a bot can carry out changes to the Japanese entry layout, for example moving the reading to the pronunciation section. --Dine2016 (talk) 06:23, 14 July 2019 (UTC)[reply]

As before, I don't think wago should be lemmatised on kanji-containing forms for modern Japanese. But I'm not informed enough about Old/Classical/Middle Japanese to know whether this proposal is the most appropriate solution. I really encourage you to create a bot and test it out; it is not complicated. Wyang (talk) 08:43, 14 July 2019 (UTC)[reply]

Pali transliteration of Nikkahita and velar nasal

This relates to the transliteration of non-Roman text to the Roman script.

The issue is that the choice in writing between a niggahita and the velar nasal ('nga') is not always the same as it is when writing unlocalised Pali in the Roman script. I would like confirmation that I am applying the correct principle.

My principle is that where the writing system uses distinct symbols for the two, the transliteration should reflect which character was used.

The Burmese and Tai Tham scripts have a special form of nga that sits above the normal layer of base characters. It is called 'kinzi' for the Burmese script, and 'mai kang lai' for the Tai Tham script. I am not distinguishing between them on one hand and ordinary nga on the other in transliteration. A complication is that some writing styles use mai kang lai where the usual Roman spelling would write a niggahita (ṃ) before non-plosives. -- RichardW57 (talk) 00:36, 9 July 2019 (UTC)[reply]

Standard German IPA

I'm finding that there's a lot of variation in the IPA transcriptions for German words, and some standard should be settled on so that it's consistent across entries. This mainly concerns syllable-final r's: should they be transcribed as /ɐ/~/ɐ̯/ or as /ʁ/? Both can be seen. I think that either the most widespread vocalic pronunciation /ɐ/ should be used in all these cases, or the more phonemic /ʁ/ should be used as a more unifying transcription, allowing it to represent both those dialects which do not reduce it to a vowel in this position and those that do, with the latter just applying the allophonic pronunciation (which could be included alongside in square brackets if desired).

My feeling is maybe to go with the latter, but I think it should be discussed. It is sort of like how, for French entries, one pronunciation is usually listed unless there is an unpredictable regional pronunciation in Canada or Belgium or Louisiana etc.; the point is that the other accents' pronunciations are predictable given the base transcription.

I also sometimes see /ʀ/ being used both syllable-initially and -finally; I would say that only /ʁ/ should be used, though, as the trill is a regional variant.

Finally, one last option, I suppose, would be to treat German more like English and list two or more pronunciations qualified by region (northern and southern? northern and Austro-Bavarian?). I feel like this could be over-the-top though. Please let me know your thoughts and let's decide on some sort of standard so there can be consistency! 2WR1 (talk) 06:20, 9 July 2019 (UTC)

A small prior discussion of this is at Wiktionary talk:About German/Archive_1#R, where one standard was proposed. Probably 'non-standard' transcriptions will continue to be entered, no matter what we choose, by people who either learned different standards or are basing their additions on (/copying from) works using different standards (de.Wikt vs the Duden, etc). - -sche (discuss) 02:59, 11 July 2019 (UTC)[reply]
I have a feeling that some of the inconsistency is due to a policy change at de.wikt which wasn't applied here (/ʁ/ instead of /ʀ/), so a lot of the /ʀ/ we have in our IPA were probably copied from before the policy change. We could just run a bot to apply the same changes here. Jberkel 21:41, 11 July 2019 (UTC)[reply]
@-sche @Jberkel There have been standards established for things like English and French (i.e. /aɪ/ instead of /aj/ etc., /ɹ/ instead of /r/) and those are followed pretty well. I think something should just be decided on so there is a standard, and if people don't always follow it precisely, it can be fixed up easily. Maybe a module like the fr-pron one should be made; in that case a standard would really be needed. I think there are arguments in different directions, but it should be discussed and decided; it feels a bit sloppy to have it be inconsistent. It would be a good idea, as a start at least, to remove all instances of /ʀ/ though. 2WR1 (talk) 02:02, 14 July 2019 (UTC)
@-sche @Jberkel @2WR1 I don't like having /ɐ̯/ in the phonemic representation for final-R (e.g. mehr is given as /meːɐ̯/). Whatever is used as initial-R should be the same as final-R. But the initial-Rs are also inconsistent across entries: currently Reh has /ʀeː/, Rache has /ˈraxə/ and Rahm has /ʁaːm/. /mof.va.nes/ (talk) 18:15, 18 July 2019 (UTC)[reply]
@Mofvanes Exactly my point with the variability with the initial r. I think it's something that should be standardised and discussed, but unfortunately it seems that not many others are too interested in this, haha. In regards to the final r, I see what you're saying, but as the actual pronunciation in many dialects is to vocalise the final r, I think that should be represented to avoid confusion. Maybe for this a /.../ transcription followed by a [...] transcription with the vowel would be best. But I don't think anything can be established/implemented if no more people weigh in... Thanks for your response though! 2WR1 (talk) 01:25, 1 August 2019 (UTC)
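As a rough sketch of what a bot pass for the initial-R inconsistency could do, the snippet below rewrites the rhotic symbols in a transcription string to a single symbol. It assumes, purely for illustration, that /ʁ/ is the symbol eventually agreed on; no such decision has been made in this thread. It works on bare transcription strings only and does not touch wikitext or templates.

```python
# Assumed target: /ʁ/. Both the uvular trill ʀ and plain r are mapped to it.
R_VARIANTS = str.maketrans({"ʀ": "ʁ", "r": "ʁ"})


def normalize_r(transcription: str) -> str:
    """Replace ʀ and r with ʁ; vocalised ɐ and ɐ̯ are left untouched."""
    return transcription.translate(R_VARIANTS)


# The three entries cited above end up with the same initial consonant.
for ipa in ["/ʀeː/", "/ˈraxə/", "/ʁaːm/"]:
    print(ipa, "->", normalize_r(ipa))
```

Whether final-R should stay as /ɐ̯/ or also become /ʁ/ is a separate question that this sketch deliberately leaves alone.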

Visibly Untrue Pali Etymologies

A lot of Pali etymologies contain "From {{inh|pi|sa|...}}". The problem with this text is that the template (plus modules) does not expand "sa" to "Proto-Indian Indo-Aryan", but expands this to a hyperlink to "Sanskrit", which defines "Sanskrit" with the normal meanings of the term. These words are not inherited from Sanskrit in the normal sense of the word 'Sanskrit'.

How should we fix it so that what is presented to the reader is not a lie? I'm charitably assuming that anything parsing the links should understand that 'sa' as the source parameter of {{inh}} refers to a reconstructed language.

The best I've come up with is to use "Cognate with {{inh|pi|sa|...}}" instead. -- RichardW57 (talk) 07:24, 9 July 2019 (UTC)[reply]

Can you type a qualifier in front? From (reconstructed / Proto / ?) {{inh|pi|sa|...}}. Or should we create a new etymology-only code for this? DTLHS (talk) 16:11, 9 July 2019 (UTC)[reply]
Perhaps that is the answer, e.g. "From Pre-Sanskrit याति (yāti), from Proto-Indo-European *yeh₂-" currently at yāti#Pali. Another possibility would be to prefix something like "From recent ancestor of ". Presumably there was a good reason for not wanting to have an explicit intermediate stage between Sanskrit and Proto-Indo-Aryan. As Wiktionary makes it difficult for European dilettanti to look up Sanskrit words, I have been wondering if I should add a template for this case in which the editor only has to enter the IAST transliteration. There's also the case where the common ancestor is clearly different to Sanskrit.
For old enough words, should we also trace the ancestry back to PIE in the Pali entry, or expect the interested user to click on to the Sanskrit entry? — This unsigned comment was added by RichardW57 (talkcontribs) at 03:24, 9 July 2019.
It’s a longstanding convention, which implicates the grammarians’ conversion schemas (which themselves have somewhat stylized Pali and the other “high” Prakrits beyond the actual MIA vernaculars) and generally makes life simpler, except when forms that cannot be synchronically derived pop up, which is not uncommon. Sanskrit indiscriminately collapsing thorn clusters into >ks is one of the most irritating. Hölderlin2019 (talk) 18:22, 14 July 2019 (UTC)[reply]
You can find many, many discussions on this, like this one here. Right now, we treat Sanskrit as a dialect continuum. --{{victar|talk}} 18:18, 16 July 2019 (UTC)

Russian consonant voicing assimilation

Many Russian entries, especially those of words with long consonant clusters, don't seem to have the consonants assimilated to their actual pronunciation. For example, currently the pronunciation for исправлять is [ɪsprɐˈvlʲætʲ] instead of [ɪzbrɐˈvlʲætʲ]. Note that at the same time the vowels are phonetic not phonemic. Adding to the confusion, voicing assimilation sometimes is reflected in the IPA, for example совсем [sɐfˈsʲem], and нож [noʂ].

The pronunciations given are all correct, as spoken. Better go listen to more Russian. Fay Freak (talk) 00:20, 11 July 2019 (UTC)[reply]

Priority of terms already used in definitions

For example, at second hand is used in the definition of apud, or on-topic in germane's, therefore adding such terms is especially pressing --Backinstadiums (talk) 17:46, 11 July 2019 (UTC)[reply]

If they are entryworthy.
  1. “on-topic”, in OneLook Dictionary Search, suggests that most dictionaries find that it has no meaning apart from on + topic, using a standard technique (hyphen insertion) to prevent alternative readings (i.e., a prepositional phrase).
  2. Looking up second hand should fully address any dictionary user's uncertainty about the meaning of at second hand. Unfortunately we don't seem to have an entry as good as MWOnline (4 definitions) for it.
DCDuring (talk) 19:23, 11 July 2019 (UTC)[reply]
@DCDuring: thanks for replying. In any case, they're to be dealt with before the rest as they're already being used in entries, even more so if just to complicate matters --Backinstadiums (talk) 19:57, 11 July 2019 (UTC)[reply]
I think you did not understand the reply. Consider the phrase “at room temperature”, used in the definition of water. We have no entry for this phrase, for a very good reason: we do have entries for at and for room temperature. For the rest it is a matter of X + Y = Z.  --Lambiam 10:58, 12 July 2019 (UTC)[reply]
There is another way to deal with them, which is to either amend where the red link is pointing (point all of "at second hand" to "second hand") or modify the bracketing ("[[at second hand]]" to "at [[second hand]]"). Cleaning up a red link is not always creating an entry, if the entry should never exist then the link should point to an entry or entries which should exist or do exist. - TheDaveRoss 12:29, 12 July 2019 (UTC)[reply]
@Lambiam, TheDaveRoss: Correct me if I am wrong, but the preposition at governs the noun secondhand, whose entry does not show any nominal meaning. Regarding second hand, the only noun meaning reads: On a clock or watch, the hand or pointer that... which is not the meaning intended for at second hand. Then how is it that X + Y = Z? --Backinstadiums (talk) 13:28, 12 July 2019 (UTC)[reply]
Correct. To repeat myself: "Unfortunately we don't seem to have an entry as good as MWOnline (4 definitions) for [ second hand ]." DCDuring (talk) 15:01, 12 July 2019 (UTC)[reply]
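
The re-bracketing suggested earlier in this thread (turning "[[at second hand]]" into "at [[second hand]]") is mechanical enough to sketch in code. The Python below is only an illustration under stated assumptions: the function name rebracket_link and its parameters are hypothetical and do not belong to any existing bot, and piped links such as "[[at second hand|...]]" are deliberately left untouched.

```python
import re


def rebracket_link(wikitext: str, phrase: str, head: str) -> str:
    """Rewrite [[phrase]] so that only its final part, `head`, stays linked.

    Hypothetical helper: `phrase` is the red-linked expression and `head`
    is the entry that should exist (it must be a proper suffix of `phrase`).
    """
    if not head or phrase == head or not phrase.endswith(head):
        raise ValueError("head must be a non-empty proper suffix of phrase")
    prefix = phrase[: -len(head)]          # e.g. "at " for "at second hand"
    # Match only the plain link form, not piped links like [[x|y]].
    pattern = re.compile(r"\[\[" + re.escape(phrase) + r"\]\]")
    replacement = prefix + "[[" + head + "]]"
    # A function is used as the replacement so no backslash escapes are interpreted.
    return pattern.sub(lambda _match: replacement, wikitext)


sample = "learned [[at second hand]], not directly"
print(rebracket_link(sample, "at second hand", "second hand"))
```

Whether a given red link should be re-bracketed or given its own entry remains an editorial decision; the sketch only covers the mechanical part.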

Use of the heading "Synonyms" on the pages of unbound morphemes

By happenstance, I have recently noted that the heading "Synonyms" is often included on the pages of unbound suffixes, and want to take this opportunity to highlight what I believe to be the impropriety thereof. The term "synonym" only applies to lexemes; only a lexeme may be synonymous with another lexeme. I understand an unbound suffix, however, to be a morpheme, rather than a lexeme, and a morpheme cannot represent a synonym. I am of the thought, then, that the term "analogue" is a better one for describing two morphemes such as two suffixes, which are near in meaning or effect, and that the heading "Analogues" is preferable to "Synonyms" on the pages of unbound morphemes. I would like to begin a discussion here, to test whether there can be any consensus regarding the use of "Analogues" as opposed to "Synonyms" on such pages as a matter of policy. I am not a Wiktionarian (yet), meaning that I have no Wiktionary account. My name is Michael, and I look forward to reading the thoughts of all you Wiktionarians about this. — This unsigned comment was added by 68.162.223.164 (talk) at 18:46, 11 July 2019 (UTC).[reply]

We have the header "Coordinate terms", if that's what you mean. DTLHS (talk) 18:51, 11 July 2019 (UTC)[reply]
  • I suspect that the strict formal limitation of synonym to apply only to fully independent lexemes is not well known by most English-language readers, our target audience. It's certainly not a restriction of usage that I'm acquainted with, as an educated native speaker of English. Conversely, I think that most English-language readers are familiar with the sense of synonym as in "these things have roughly the same meaning". I also think that few readers will understand what is meant by an "Analogues" header.
As such, I cannot support the suggested change: it will likely confuse users. ‑‑ Eiríkr Útlendi │Tala við mig 19:53, 11 July 2019 (UTC)[reply]
Many affixes have meaning and are lexemes. As I understand it, inflectional affixes are not lexemes. IOW, I don't think morphemes and lexemes are disjoint categories, as the complaint above seems to imply. DCDuring (talk) 20:43, 11 July 2019 (UTC)[reply]
-er is synonymous with more, yet one is an inflectional affix and the other is an independent word. —Rua (mew) 21:29, 11 July 2019 (UTC)[reply]
To confirm, what I hear you @68.162.223.164 saying is that synonym can only apply to independent words, whereas @DCDuring, Rua, you seem to be saying that synonym can and should apply to affixes as well as independent words. Is this a correct restatement? ‑‑ Eiríkr Útlendi │Tala við mig 21:42, 11 July 2019 (UTC)[reply]
Yes. I should have said I agree with your user-oriented arguments, which are more important than the definitional matters, about which I could be wrong, eg, about -er. I don't view comparative, superlative, diminutive, and natural gender affixes as on all fours with case, grammatical gender, number, tense, mood, and aspect inflectional affixes, but I am not speaking from a position of multilingual learning. DCDuring (talk) 01:21, 12 July 2019 (UTC)[reply]
The meaning of lexeme may seem clear when considering a single language, but viewed across the spectrum of human language it becomes fuzzy. In Turkish, basically the same entity can be a stand-alone word one instant but turn effortlessly into a suffix the next one (e.g., ile-le). To me, a “name” is any term of some language to which we can assign some meaning, and a synonym is then, to me, any other name in that language with the same or very similar meaning. From this point of view that name will smell as sweet, whether it is a single word, a phrase, or a morpheme.  --Lambiam 10:48, 12 July 2019 (UTC)[reply]
DCDuring is correct in noting that morphemes and lexemes are not disjoint as categories. Morphemes, however, fall into two general types: those which may stand alone and so are called "roots", and those which depend upon combination with another morpheme to express an idea, then called "affixes". "Roots" may indeed join with lexemes in being referred to as synonyms, but I think that affixes may not, since they serve only a grammatical function. Of course, the suffixes which are the instant topic of conversation represent the second type. Even so, the argument that general user accessibility is more important than a strict adherence to definition, especially within a generalized resource such as Wiktionary, has great validity, and so probably ends the debate. I was simply unsure of whether the use of the header term had been considered in such instances, and was not at all thinking like a lexicographer. Thanks, guys.

Wikidata feedback

Dear Wiktionary community, you are the only community I am aware of that has asked for Wikidata access to be enabled on its Wiktionary. I am currently preparing a presentation that I will give at the next Wikiconvention francophone, which will be held in Brussels at the beginning of September. I want to talk about the relationship between Wiktionary and Wikidata. I have the feeling that it started badly, at least from the French point of view.

So I would like to hear how you feel about Wikidata in general, and more specifically about your experience with it. How have you used Wikidata on the English Wiktionary so far, and how would you like to use it here in the future, if at all? If some of you contribute lexicographic data to Wikidata, I would also be interested in your feedback.

Thanks in advance. I am eager to read your replies :D Pamputt (talk) 22:07, 13 July 2019 (UTC)[reply]

My only concrete experience was that it enabled a large number of images to be added to taxonomic name entries, without captions; sometimes in simple error; and almost always not in accord with the notion of trying to select images of type species for entries for genera. The Wikidata project box is displayed much too prominently in view of the limited value of Wikidata to ordinary dictionary users. DCDuring (talk) 03:06, 14 July 2019 (UTC)[reply]
@DCDuring thank you for your comment. Could you give a link to a page where this Wikidata project box is used? Pamputt (talk) 07:19, 14 July 2019 (UTC)[reply]
See Special:WhatLinksHere/Template:wikidata. DCDuring (talk) 12:48, 14 July 2019 (UTC)[reply]
Also Special:WhatLinksHere/Template:Wikidata entity link.
I suppose that if the reward from following the links were sufficient, we would redesign the templates. I would prefer {{Wikidata entity link}} for my purposes. DCDuring (talk) 12:55, 14 July 2019 (UTC)[reply]
One of the more frequent uses of Wikidata in entries: The data tables for most languages (7764 out of 8069 as I write this) and language families include the Wikidata item. This in turn is used by the Language:getWikipediaArticle() function in Module:languages, the EtymologyLanguage:getWikipediaArticle() function in Module:etymology languages, and the Family:getWikipediaArticle() function in Module:families to retrieve the name of the Wikipedia article, so that etymology templates such as {{derived}} can display linked language or language family names. — Eru·tuon 05:15, 15 July 2019 (UTC)[reply]

Not tested here, but a possibility is to obtain taxonomic hypernyms, see w:ca:User:Vriullop/proves/Balaena mysticetus. --Vriullop (talk) 09:06, 17 July 2019 (UTC)[reply]

How does Wikidata handle conflicts between sources, eg, Paleontology Database vs. NCBI or APG vs Ruggiero et al? DCDuring (talk) 15:58, 17 July 2019 (UTC)[reply]

Hāʾ-like letters in the Kurdish languages

In the Kurdish languages on Wiktionary (ckb [Central Kurdish], kmr [Northern Kurdish], ku [Kurdish], lki [Laki], sdh [Southern Kurdish]), the letters that resemble Arabic ه (hāʾ) need some standardization, because they are not used according to the recommendations in the Wikipedia section on the Sorani alphabet. See a census of the frequency of the letters in various link templates below.

The Wikipedia section indicates that the letter ە (U+06D5 ARABIC LETTER AE) is preferred as a vowel and ھ (U+06BE ARABIC LETTER HEH DOACHASHMEE) as a consonant. These two are visually distinct in all positions, whereas ه (U+0647 ARABIC LETTER HEH) resembles ە (U+06D5 ARABIC LETTER AE) in isolated or final position. This practice seems to be followed on the Central Kurdish Wikipedia.

On Wiktionary, ه (U+0647 ARABIC LETTER HEH) is often used for both the consonant h and the vowel e. As a vowel, it is usually followed by U+200C (ZERO WIDTH NON-JOINER, ZWNJ), which forces the joining behavior of the correct vowel character ە (U+06D5 ARABIC LETTER AE), which is right-joining (that is, it joins to the preceding letter if possible, but not to the following letter). The transliteration modules assume that all cases not followed by ZWNJ are the consonant.

Column key: U+0647 = ه (ARABIC LETTER HEH); U+0647 + U+200C = ه followed by ZERO WIDTH NON-JOINER; U+06BE = ھ (ARABIC LETTER HEH DOACHASHMEE); U+06D5 = ە (ARABIC LETTER AE).

Language                  U+0647   U+0647 + U+200C   U+06BE   U+06D5
ckb (Central Kurdish)     344      283               76       1002
kmr (Northern Kurdish)    0        0                 0        0
ku (Kurdish)              2397     2100              2        117
lki (Laki)                3        0                 10       76
sdh (Southern Kurdish)    42       35                25       201

On Wiktionary, the three Kurdish transliteration modules Module:ckb-translit, Module:lki-translit, and Module:sdh-translit include ە (U+06D5 ARABIC LETTER AE) as a vowel in their tables, but ه (U+0647 ARABIC LETTER HEH) as a consonant. They also treat the sequence of ه (U+0647 ARABIC LETTER HEH) and U+200C (ZERO WIDTH NON-JOINER, ZWNJ) as equivalent to ە (U+06D5 ARABIC LETTER AE).

I propose replacing the sequence of ه (U+0647 ARABIC LETTER HEH) and U+200C (ZERO WIDTH NON-JOINER, ZWNJ) with ە (U+06D5 ARABIC LETTER AE), and ه (U+0647 ARABIC LETTER HEH) on its own with ھ (U+06BE ARABIC LETTER HEH DOACHASHMEE) in all Kurdish text on Wiktionary. The first change could be done immediately, but the latter would require modifying the transliteration modules to recognize ھ (U+06BE ARABIC LETTER HEH DOACHASHMEE). It would be easiest to make these changes by bot, because there are a lot of entries to edit. (User:Erutuon/bad ZWNJ contains some of them.)
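
The two proposed replacements can be sketched as a small text transformation. This is a minimal illustration, not the code of any actual bot: the function name normalize_kurdish_heh and the constant names are hypothetical. The order matters, because the HEH + ZWNJ sequence has to be rewritten before any remaining lone HEH is treated as the consonant. Like the transliteration modules, the sketch assumes that every HEH not followed by ZWNJ is the consonant.

```python
# Code points named in the proposal.
HEH = "\u0647"               # ARABIC LETTER HEH
ZWNJ = "\u200c"              # ZERO WIDTH NON-JOINER
HEH_DOACHASHMEE = "\u06be"   # ARABIC LETTER HEH DOACHASHMEE (consonant h)
AE = "\u06d5"                # ARABIC LETTER AE (vowel e)


def normalize_kurdish_heh(text: str) -> str:
    """Apply both proposed replacements to a string of Kurdish text."""
    # Step 1: HEH followed by ZWNJ stands for the vowel, so it becomes AE.
    text = text.replace(HEH + ZWNJ, AE)
    # Step 2: any HEH still left is assumed to be the consonant.
    return text.replace(HEH, HEH_DOACHASHMEE)


if __name__ == "__main__":
    sample = "\u0628" + HEH + ZWNJ + HEH + "\u0627"
    converted = normalize_kurdish_heh(sample)
    # Print code points rather than glyphs so the result is unambiguous.
    print(" ".join(f"U+{ord(ch):04X}" for ch in converted))
```

Running the function a second time changes nothing, since neither replacement produces a new HEH.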

What do other editors think about the proposed edits, particularly those who work on Kurdish? — Eru·tuon 21:30, 15 July 2019 (UTC)[reply]

You write “This practice seems to be followed on the Southern Kurdish Wikipedia”, linking “ckb”, while the Southern Kurdish Wikipedia is seemingly in the incubator. Fay Freak (talk) 21:59, 15 July 2019 (UTC)[reply]
@Fay Freak: Whoops, corrected. Thanks for pointing that out. — Eru·tuon 22:12, 15 July 2019 (UTC)[reply]
@Calak Fay Freak (talk) 21:59, 15 July 2019 (UTC)[reply]
Nice job Eru! Please do a few test edits to check that everything is OK. Thanks.--Calak (talk) 20:12, 3 December 2019 (UTC)[reply]
@Calak: This seems like a task that will be annoying if I don't do it automatically, so I've started a vote to allow me to use a bot (User:ToilBot). I'll work on making some test edits, hopefully soon. — Eru·tuon 19:46, 7 December 2019 (UTC)[reply]
A sample of changes that the bot would make can now be viewed at User:ToilBot/edit logs/2019-12/Kurdish hāʾ. — Eru·tuon 22:36, 29 December 2019 (UTC)[reply]

Further etymologies of borrowed terms

@Julia has repeatedly removed most of the etymology section from Riosi, saying "the long ety is redundant esp because it's a borrowed term". Is it an actual policy not to include the ultimate origin of borrowed terms? Does this only apply to non-English languages, since most English entries do have this? --Lvovmauro (talk) 05:57, 16 July 2019 (UTC)[reply]

The long etymology should be there for etymological categorizations. Theoretically each entry is independent from other language entries of the same spelling, and nothing is “redundant”. — TAKASUGI Shinji (talk) 10:55, 16 July 2019 (UTC)[reply]
I agree with keeping the long etymology. I wonder, will we be able to use cross-page fetching in the future to avoid so much typing, inconsistencies, and gaps? It would be nice in the case of, for example, the etymology of an English term borrowed from French, derived from Middle French then Old French then Latin then Greek. Ultimateria (talk) 22:44, 16 July 2019 (UTC)[reply]
Yes and no. There might be an interest in including the full etymology, but the only real reason I see is the categorization “terms derived from X”, since one can just click once to see more. But a common-sense approach is not to let define your whole formatting by that. The categorizations will not lead me into adding a lot of material that I otherwise deem to belong to another entry. It’s not too thrilling to derive an Arabic term from a Proto-Indo-European root, and the statistics are skewed anyway. Everything is a click away, and it’s normal to refer people to other loci instead of repeating oneself, in anything written, whether book or website. One does it differently if the entry in question has not been created yet; I sometimes keep a lot of information that belongs more closely to one language under another language, because I create entries only in the latter and do not work in the former: the material is to be moved when someone creates the entry. And there is even less reason to add a random selection of cognates or “somehow related terms” if the ancestors are already created. Note that etymology chains are not everywhere easy and secure, though that seems to be the case in Riosi: there are often assumptions in etymologies that can later change. What if the ancestral word for X is reconstructed a wee bit differently after some time, or what if one adds additional data? Update the many terms that derive some word from Spanish? Or what if something is just mistyped? We then have the typo copied X times, but the person who corrects it might not realize that the content has been copied to X other places. Also, every template invocation takes RAM and makes page fetching slower.
Do not derive rules from how you observe the entries to be. The English entries are more wrong than the other entries because they were created earlier, and, for the reason I just stated (foreign-language work, not to speak of reconstruction work, to a great part started only after English was complete), they contain a lot of material that has since got a better place. Think rather about the durability and maintainability of the work. Stating a full chain is not really a requirement for anything, because the statement of where the Spanish derives from is already there; the chain is there cross-page. Fay Freak (talk) 00:19, 17 July 2019 (UTC)[reply]
+1 to "a common-sense approach is not to let define your whole formatting by that [categorization]". It's a real shortcoming of our current system that people feel compelled to engage in such massive duplication, with all its potential for things falling out of sync and becoming contradictory, just for categorization - the tail wagging the dog, to use a phrase DCDuring sometimes uses. - -sche (discuss) 22:30, 17 July 2019 (UTC)[reply]
I've always been in support of removing redundant deep etymologies and moving to a better technical solution, like {{Module:term etymology}}. --{{victar|talk}} 23:00, 17 July 2019 (UTC)[reply]

Korean classifiers / counters / measure words in noun entries?

I'm back in Korea after seven years so I'm brushing up a bit on my Korean, and using Wiktionary of course.

I've noticed that we don't seem to list the classifier(s) for nouns in Korean entries like we do for other East- and Southeast Asian languages.

Do we lack a mechanism? If we have it, can you show me some noun entries which use it? If we do lack it can we add it?

Also I'm interested in a way to request the classifiers for particular nouns. I'm assuming I can use the {{attention}} template if need be. — hippietrail (talk) 13:23, 16 July 2019 (UTC)[reply]

So I've found that Module:ko-headword supports a "counter" parameter which adds a counter to a noun headword.
There aren't many Korean counters that have entries though and the couple I found seem to use different formatting. One uses {{ko-noun}} and another uses {{head}}. The module doesn't support a "counter" POS. I can't find any per-counter categories like there are for other languages. — hippietrail (talk) 12:09, 17 July 2019 (UTC)[reply]
[Template preview: Wikidata logo; “Wikidata has structured data related to: Wikidata”]

Greetings. I'm interested to know what the community thinks of this template, particularly the usage of this template for Han character entries under the "Further reading" header. KevinUp (talk) 13:30, 17 July 2019 (UTC)[reply]

It looks like an advertisement for a barcoded product of some kind. Something more like {{pedialite}} should be made available. DCDuring (talk) 16:22, 17 July 2019 (UTC)[reply]
(Q3595028) is just for inline(?) use or embedding in, eg, tables, not for display under "References" or "Further reading", as {{comcatlite}}, {{specieslite}}, and {{pedia}} are. DCDuring (talk) 16:29, 17 July 2019 (UTC)[reply]
Something analogous to {{projectlink|quote}} and {{projectlink|source}}. {{projectlink|data}}?  --Lambiam 21:28, 17 July 2019 (UTC)[reply]
Update: I have created {{wikidatalite}} as a replacement for {{wikidata}}. KevinUp (talk) 15:22, 18 October 2019 (UTC)[reply]
Well, I like it but that should probably be obvious. I think that if we have more than a couple of sister project links, we should probably have a local equivalent of w:en:Template:Sisterlinks. —Justin (koavf)TCM 19:42, 18 October 2019 (UTC)[reply]

Accent in initialisms

I've just come across SUV and noticed no accent(s) is indicated. --Backinstadiums (talk) 15:45, 17 July 2019 (UTC)[reply]

I’ve added them and also removed the long sign on the /u/, as I think this vowel is not pronounced extra long in normal speech.  --Lambiam 19:32, 17 July 2019 (UTC)[reply]
@Lambiam: The length mark doesn't mean the vowel is especially long; it's just part of the conventional symbol for that phoneme in Received Pronunciation (see Appendix:English pronunciation). — Eru·tuon 22:42, 17 July 2019 (UTC)[reply]

Non-existent translation hubs

In Mesoamerican cultures, the comal is traditionally elevated above the fire by three stones, and each Mesoamerican language has a word for these stones. The problem is, there is no English term for it. (Our entry for the Spanish term tenamaste glosses it as "hearthstone" which is not correct.) If this was the Spanish wiktionary, we could just put these translations under tenamaste, but because this is the English wiktionary we need to find a common English expression for it, and if there isn't one, then.. what? We just can't share this information?

--Lvovmauro (talk) 09:11, 19 July 2019 (UTC)[reply]

Guess people make up an English term then. Like yes-no. (A bad title, no page views. It was in a template before placed at each language like at هَل (hal)). Fay Freak (talk) 10:21, 19 July 2019 (UTC)[reply]
Comal riser?  --Lambiam 12:38, 19 July 2019 (UTC)[reply]
Not bad. We also need something for سُلَاف (sulāf), where I collected some terms because I found no English – for wine jargon we miss terms. I searched and searched and found all kinds of words from Germany eastwards but no English one, unless you consider Ausbruch or aszú an English word, which is a notion odd to me; such terms seemingly only refer to the wines of the specific regions anyhow, not to the genus. Greek-English and Latin-English dictionaries only explaining the terms but not giving equivalents is a strong indication that there is no English term at least for England’s greatest times. Fay Freak (talk) 17:40, 19 July 2019 (UTC)[reply]
Ausbruch and aszú are botrytized wines. Does that also hold for سُلَاف?  --Lambiam 10:18, 20 July 2019 (UTC)[reply]
The term Ausbruch for wine is older than botrytization of wines, as also most of the other German terms I gave, you cannot even attest the terms “noble rot” and “Edelfäule” before 1830. In EU times some people managed to narrow down the meaning by legal means, so it has on Wikipedia a crippled definition useless for dictionary purposes about “a certain wine from Austria according to the Austrian wine statute”. You see the quote I gave is … pre-Islamic. Fay Freak (talk) 20:56, 20 July 2019 (UTC)[reply]
Well, "Ausbruch" has the advantage of being used in English of wines from several countries (Austria, Germany and Hungary at the least, but I suspect there are American producers), whereas e.g. sulaf is not used in English at all of anything and is barely even mentioned as an Arabic word... even if Ausbruch#English is not a perfect fit, it might be the best available English entry, unless e.g. the Latin word is also attested in English. Of course, if it's not a good fit, something like "(sweet) wine from unpressed grapes" seems like a fairly concise, good name for a translation hub. - -sche (discuss) 03:29, 26 July 2019 (UTC)[reply]
I'd say the word is "tenamaste". It's marginal in English; google books:tenamaste stones shows a few hits, but people are going to argue about italics. But I'm pretty sure with enough searches you could get the clear 3 cites needed for English.--Prosfilaes (talk) 05:21, 20 July 2019 (UTC)[reply]
Yeah, I was also going to say, my first reaction would be to try to cite tenamaste or a major indigenous language's term in English; I'll try that in a second. Btw, if the definition of Spanish tenamaste needs to be fixed, let's remember to do that, too. I have seen a couple foreign-language entries collect other languages' words in their "see also" sections for lack of somewhere else, but this does seem a bit substandard; at this point, with translation hubs on firmer footing around here, we should probably try to find such entries and see about making translation-hubs. - -sche (discuss) 00:49, 26 July 2019 (UTC)[reply]
So far I've found two citations that don't italicize the word and one which probably doesn't (because it seems to have been written on a typewriter...) but where the snippet Google shows doesn't include enough of the page to reach the word. I've created an English entry and added Lvovmauro's translations. The alcoholic terms mentioned by Fay still need to be sorted out. - -sche (discuss) 01:22, 26 July 2019 (UTC)[reply]
Btw, a language mentioned here apparently uses yoxec, but I can't see enough of the snippet to tell which language. - -sche (discuss) 01:12, 26 July 2019 (UTC)[reply]
Jacaltec. The full text is here: [4] --Lvovmauro (talk) 04:09, 26 July 2019 (UTC)[reply]
Another, possibly less circuitous solution would be to allow translation sections in other languages for terms with no English equivalents. We could just choose a reasonable language-term as the host page and either transclude that page's translation section or point to it so that we avoid syncing issues and folks can still find the information. We are allowed to break our own rules if they prevent us from presenting useful information. - TheDaveRoss 12:45, 26 July 2019 (UTC)[reply]
I don't disagree with your last point, but now that WT:THUBs are on firmer footing, I think we should be able to just use them as the preferred approach. Short English "names" for both things mentioned above proved easy enough to come up with. - -sche (discuss) 17:40, 26 July 2019 (UTC)[reply]
It does confuse things when we have English sections on pages for things which are not English words. Also very few people will ever land on the page unless they search one of the translation words, so we are just adding an extra step without adding any real benefit to the user. - TheDaveRoss 18:09, 26 July 2019 (UTC)[reply]
Anishinaabe: This page also needs to be addressed. DTLHS (talk) 04:40, 27 July 2019 (UTC)[reply]
Pretty sure it's attestable in English, so the translations should be moved to the English entry. This is the name that was used for them (in English) when I was in school. Andrew Sheedy (talk) 05:05, 27 July 2019 (UTC)[reply]
Yeah. Given that someone added a translation into Ojibwe ... to the ==Ojibwe== entry's ====Translations==== ... I'm going to guess whoever added the table didn't even notice that it wasn't an ==English== section. - -sche (discuss) 22:23, 27 July 2019 (UTC)[reply]

Oriya/Odia

The new official name of the Oriya language is Odia. Is there a way we can change the language name in the Translations section? — This unsigned comment was added by Rajkiandris (talkcontribs) at 05:01, 20 July 2019 (UTC).[reply]

Using this search: "Oriya insource:/Oriya\:\ /" I identified only 857 entries to be changed. At a rate of one entry every 15 seconds, it would take only a bit more than three and a half hours to change the name manually. DCDuring (talk) 03:41, 21 July 2019 (UTC)[reply]
We're a descriptive dictionary: actual usage carries much more weight than any official pronouncement. There are a number of languages where the name we use doesn't match the official one. Changing the name in the translation sections is only part of the picture. It would have to be changed in the language data modules, the categories, and the language headers in the entries. It can be done, but it has to be decided for the whole site by the community. Chuck Entz (talk) 06:27, 21 July 2019 (UTC)[reply]
I have also thought about this for some time. If we really want to change Oriya to Odia, a bot could finish it in no time. I used to do some languages before. --Octahedron80 (talk) 09:35, 25 July 2019 (UTC)[reply]
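
For a sense of scale, the figures quoted in this thread and the translation-line part of the rename can be sketched as follows. This is a rough illustration only: the names manual_hours and rename_translation_lines are hypothetical, the pattern assumes translation lines of the form "* Oriya: ..." (narrower than the "Oriya: " search quoted above), and it covers none of the data modules, categories, or language headers that would also have to change.

```python
import re

ENTRIES = 857            # entries found by the search quoted in this thread
SECONDS_PER_ENTRY = 15   # manual editing rate quoted in this thread


def manual_hours(entries: int, seconds_per_entry: int) -> float:
    """Total manual editing time in hours."""
    return entries * seconds_per_entry / 3600


# Assumed shape of a translation-table line: bullet(s), optional colon, language name.
TRANSLATION_LINE = re.compile(r"^(\*+:?[ \t]*)Oriya(?=:)", re.MULTILINE)


def rename_translation_lines(wikitext: str) -> str:
    """Rename the language label at the start of translation lines only."""
    return TRANSLATION_LINE.sub(r"\g<1>Odia", wikitext)


if __name__ == "__main__":
    print(round(manual_hours(ENTRIES, SECONDS_PER_ENTRY), 2))
    print(rename_translation_lines("* Oriya: {{t|or|x}}"))
```

At the quoted rate the manual route works out to roughly 3.6 hours; either way, the edit itself is the easy part compared with reaching a site-wide decision.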

Are supersenses normal senses?

Let's call it a "supersense" when there is a top-level sense that contains other senses beneath it. For example, space has a supersense that says "(heading) A bounded or specific extent, physical or otherwise" (lol it actually says heading... that's an unrelated problem), and then there are several individual senses underneath, which are subsenses belonging to that general sense.

Are supersenses normal senses? Can they, should they, have synonyms, usage examples, citations, etc.? Equinox 03:08, 21 July 2019 (UTC)[reply]

In short, not always. Some of our "supersenses", eg, at least some of those marked "heading" are not definitions; they are labels to group related definitions. Looking at our competitors, MWOnline often has definitions labeled a, b, etc under a number, but there is no separate definition that corresponds to the number. Unfortunately, we don't seem to be able to do that with subsenses using "#" and "##". I assume that we need to use "#" to mark definition lines to do things like analyze our defining vocabulary. But seriously, there may be some dump-processing code that needs to distinguish definitions from other content, so we are stuck with "#" and "##" and the consequent behavior. DCDuring (talk) 03:32, 21 July 2019 (UTC)[reply]
There's nothing in WT:ELE that says definition lines and only definition lines have to start with "#". Extracting definitions from entries automatically is super LOL given that you would think it would be our most important content. DTLHS (talk) 15:58, 22 July 2019 (UTC)[reply]
How much other principal namespace content appears on lines that begin with "#"? I wonder how those who copy Wiktionary content extract our definitions. DCDuring (talk) 18:14, 22 July 2019 (UTC)[reply]
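
As a rough picture of what dump-processing code of the kind mentioned above might do, here is a minimal sketch that pulls out lines beginning with "#" or "##" while skipping example lines ("#:") and quotation lines ("#*"). The function name definition_lines is hypothetical, and the sketch simply assumes that every remaining "#" line is a definition, which, as noted in this thread, nothing in WT:ELE guarantees.

```python
import re

# One or more "#" at line start, not followed by ":" or "*" (or a further "#").
DEFINITION_LINE = re.compile(r"^(#+)(?![#:*])\s*(.*)$")


def definition_lines(wikitext: str) -> list[tuple[int, str]]:
    """Return (depth, text) pairs: depth 1 for "#" lines, 2 for "##" lines, and so on."""
    found = []
    for line in wikitext.splitlines():
        match = DEFINITION_LINE.match(line)
        if match:
            found.append((len(match.group(1)), match.group(2)))
    return found


if __name__ == "__main__":
    sample = (
        "# {{lb|en|heading}} A bounded or specific extent.\n"
        "## A subsense.\n"
        "#: An example line.\n"
        "#* A quotation line.\n"
    )
    for depth, text in definition_lines(sample):
        print(depth, text)
```

A "supersense" shows up here as a depth-1 line followed by depth-2 lines; the sketch cannot tell whether that depth-1 line is a real definition or merely a grouping label.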

Bizarre pharmacy board warning

Bing flags English Wiktionary pages as being dangerous based on https://safe.pharmacy/not-recommended-sites/ (look under "e" for "en.wiktionary.org"). I have no idea how we/you ended up on there or what the community wants to do about it. 100.40.131.182 10:49, 21 July 2019 (UTC)[reply]

How did you find this? Did you get a warning going to a specific page? DCDuring (talk) 11:09, 21 July 2019 (UTC)[reply]
Since they seem to have an appeal process, I doubt there is much the community can do; the WMF is the only party able to properly "represent" the site. — surjection?12:48, 21 July 2019 (UTC)[reply]
I sent a brief e-mail. I wonder whether we had some spammy external links to sites that offer drugs that safe.pharmacy disapproves of. DCDuring (talk) 14:55, 21 July 2019 (UTC)[reply]
This list was probably not generated by a human. DTLHS (talk) 16:10, 21 July 2019 (UTC)[reply]
All this from a website whose name (病 (bìng)) means "disease" in Chinese... ;p Chuck Entz (talk) 16:27, 21 July 2019 (UTC)[reply]
Looking up sesquipedalian using Bing I get this for Wiktionary: “Warning. The National Association of Boards of Pharmacy (NABP) includes this site on its Not Recommended list. We recommend you learn more and verify your pharmacy before making online health purchases. The FDA has more information at BeSafeRx — Know Your Online Pharmacy.” All I can say is that I wouldn’t recommend anyone to buy their sesquipedalian from Wiktionary either.  --Lambiam 19:39, 21 July 2019 (UTC)[reply]
After all, we offer "elixirs", "nostrums", "panaceas", "patent medicines", "quackery" and "snake oil". My guess is that they search websites not in their whitelist for keywords associated with pharmacy websites. "all words in all languages" of course also means "all keywords in all languages". Wikipedia would be an obvious choice for their whitelist, but they probably have never heard of us. Skimming through their list, I see a few other ringers: a university in Oklahoma, a preschool in Virginia, and a website with Linux software. Their website's user interface design doesn't exactly inspire confidence in their tech smarts... Chuck Entz (talk) 03:57, 22 July 2019 (UTC)[reply]
Does this need to be kicked upstairs to WMF or one of the other pages? Purplebackpack89 15:16, 22 July 2019 (UTC)[reply]
And to Microsoft, too. DCDuring (talk) 15:56, 22 July 2019 (UTC)[reply]
If someone knows a journalist who might be interested in publishing bizarre items... Negative publicity tends to work faster than complaining.  --Lambiam 20:35, 22 July 2019 (UTC)[reply]
  • This time mere notification worked:

"Good Afternoon Mr. During,

Your email has been forwarded to us for response. We have removed en.wiktionary.org from our list of Not Recommended websites and have notified Bing of the removal.

Thank you for your inquiry.

Internet Drug Outlet Identification program staff
National Association of Boards of Pharmacy (NABP)
1600 Feehanville Drive
Mount Prospect, IL 60056"

It has been removed from that list, but Microsoft hasn't updated Bing yet. I searched for absquatulate and the warning was still there. I like User:Lambiam's idea of a reporter pillorying them for their incompetent quality control. You'd think pharmacists would appreciate the need for that. 100.40.132.13 02:23, 25 July 2019 (UTC)[reply]

A Very Clever Butterfly

Clouded Yellow

While trying to troubleshoot some unpleasantness with date formatting in quote modules, I followed one of the links, and found this absolute gem of a dangling participle:

  • "We have taken another Colias Edusa and it was also captured with a straw hat while out partridge shooting."

For background, it helps to know that Colias edusa (now Colias croceus) is a pretty little butterfly known in England as the clouded yellow. It's migratory, so it would be a prized find for the sort of amateur insect-collectors that wrote to this publication back in 1949.

Many insects are known for blending in to their surroundings, but this is the first time I've heard of one adopting the local attire and the local pastime in order to avoid notice... Chuck Entz (talk) 08:07, 25 July 2019 (UTC)[reply]

And not just one! There was apparently another one that also allowed itself to be captured thusly.  --Lambiam 21:16, 25 July 2019 (UTC)[reply]
😂 - -sche (discuss) 17:45, 26 July 2019 (UTC)[reply]
Strange, I would have thought this gave them more chance of being Eton... Equinox 17:49, 26 July 2019 (UTC)[reply]

African country names

Many African nations have been known by the same name under two or more different regimes (for example, in the pattern precolonial, colonial, communist, and ex-communist). Currently many entries end in “official name: Republic of [City/People/Place]istan.” But this inaccurately restricts the definiendum to a contemporary (2019) constitutional regime. I see three options:

  1. remove the “official name” part of the definition
  2. add senses that refer to specific historical and contemporary regimes
  3. add subsenses that refer to these under the main sense “country in [a part of] Africa”

I chose option (3) in edits to Algeria, Angola, Botswana, and Benin; I'm happy to change these to accord with consensus. —Piparsveinn (talk) 07:09, 31 July 2019 (UTC)[reply]