Wiktionary:Beer parlour/2023/September: difference between revisions

→‎Do we need an inflection line when we have a conjugation box?: -- Thank you so much, Soap, for your critique.
:I think we should keep the inflection line both for consistency's sake (people will be looking there) and because it's much more convenient. The ''run'' page is a particularly long one, with the conjugation box only at the very end, which on smaller devices might be ten screens from the top of the page. But even if we were to move the conjugation box up top, I still think the inflections should stay in the header, because it's where people are more likely to look for them based on the patterns set by other entries. [[user:Soap|—]]<span style="background-color: #a6ffe0; padding: 3px; border-radius: 6px 6px 6px 6px;"><b>[[user talk:Soap|Soap]]</b></span>[[Special:Contributions/Soap|—]] 18:02, 4 September 2023 (UTC)
::I was thinking of having the inflection line read "see conjugation box" with a link to it, for convenience's sake. [[User:CitationsFreak|CitationsFreak]] ([[User talk:CitationsFreak|talk]]) 18:09, 4 September 2023 (UTC)
::: {{ping|Soap}} Made a little mockup of how I think it should look at [[User:CitationsFreak/conjugate]]. Lemme know what you think.

Revision as of 18:32, 4 September 2023


Nuqtaless forms in Hindi

Nuqtaless terms like अर्ज are treated only as alternative spellings here. These words without nuqta do not exist merely because of poor typesetting; they are also pronounced without the nuqta sounds: 'arz' is also pronounced 'arj'. So in those entries the native pronunciation should be given preference, and in the declension sections the transliteration should reflect the non-nuqta variant, or perhaps both variants. कालमैत्री (talk) 02:38, 1 September 2023 (UTC)[reply]

No. The transliterations should be distinguished by the spelling. Where it may make sense to automatically include and prioritise the nuktaless forms is the pronunciation sections. --RichardW57m (talk) 10:44, 1 September 2023 (UTC)[reply]
This is what I said: अर्ज should be transliterated as arj, which it isn't, just as in other nuqtaless entries. कालमैत्री (talk) 11:35, 1 September 2023 (UTC)[reply]
@RichardW57m कालमैत्री (talk) 11:35, 1 September 2023 (UTC)[reply]
@कालमैत्री I am inclined to agree with Richard here if I understand what you say correctly. I think the way it's currently done is correct; dictionaries should show the forms with nuqta except in the pronunciation sections (where the pronunciation as 'arj' is already given as an alternative). The only case I think it makes sense not to have the nuqtaless form be a soft redirect is if it's taken on meanings other than the nuqta-full form. Benwing2 (talk) 20:19, 2 September 2023 (UTC)[reply]
@Benwing2 I agree with you, but there are already many entries without nuqta. So should they not show the transliteration of the non-nuqta form? Perhaps there is a misunderstanding: I am talking about including it in the non-nuqta entries, not the nuqta entries; the former currently show the transliteration of the nuqta form. कालमैत्री (talk) 02:29, 3 September 2023 (UTC)[reply]
@कालमैत्री Since it appears from the pronunciation that the nuqtaless forms are mere spelling variants of the forms with nuqta, I don't agree that the translit should be based on the nuqtaless form. Unless the pronunciation is consistently different between nuqta-full and nuqtaless forms, the translits should be the same. This is analogous to how we handle Russian written forms with е in place of ё. Benwing2 (talk) 02:43, 3 September 2023 (UTC)[reply]
@Benwing2 They are not mere spelling variants but pronunciation variants too, as regional Hindi speakers use the pronunciation of the nuqtaless variant. So both transliterations can be used in the non-nuqta entry. Or is this unnecessary? कालमैत्री (talk) 02:52, 3 September 2023 (UTC)[reply]
@कालमैत्री I don't think it's necessary to include both, as the nuqtaless pronunciation is optional. Benwing2 (talk) 02:56, 3 September 2023 (UTC)[reply]
@Benwing2 Well, the nuqta pronunciation is similarly optional in those entries. The अंग्रेज entry uses pronunciation audio without nuqta sounds. कालमैत्री (talk) 03:11, 3 September 2023 (UTC)[reply]
@कालमैत्री, @Benwing2, @RichardW57m: The situation with nuqta and nuqtaless forms is indeed very similar to Russian ё (jo) / е (je) words. Regardless of the pronunciation, the spelling with е (je) is more common in regular running Russian texts for native speakers.
  1. ё is standard, е is non-standard or just a relaxed spelling of ё: свёкла (svjókla, beetroot) and свекла́ (sveklá)
  2. е is standard, афе́ра (aféra, shady deal) and афёра (afjóra)
What can be done for Hindi is to provide alternative entry lines where both the transliteration and the spelling are nuqtaless. Please take a look at this revision with my new changes of अर्ज (arj) with nuqtaless and alt. form handling. Also drawing attention of @AryamanA. Anatoli T. (обсудить/вклад) 06:28, 3 September 2023 (UTC)[reply]
@Atitarev, Benwing2, कालमैत्री: IMO this format is a bit cluttered. I would prefer just giving nuqtaless form as the definition, both pronunciations in IPA, but only the nuqtaless transliteration in the headword. This is just a special case of alt form so having both alt form and nuqtaless form as defns is redundant. —AryamanA (मुझसे बात करेंयोगदान) 21:03, 3 September 2023 (UTC)[reply]
@AryamanA, @Benwing2, @कालमैत्री: Thanks for your response, Aryaman. I can revert my edit later but I've got some obvious questions:
In the case of अर्ज़ (arz) vs अर्ज (arj), the nuqtaless form is not only an alternative spelling but a spelling that matches the pronunciation. Is that always the case? And are there words or specific nuqta letters where this is not true? For example, is फिल्म (philm) ever pronounced as /pʰɪlm/, not /fɪlm/ as opposed to फ़िल्म (film)?
Since I don't know enough Hindi to judge, I'll use another analogy in Russian "ё" vs "е" spellings.
Unlike свёкла/свекла, афера/афёра where one pronunciation is proscribed but is acceptable, in case of самолёт (samoljót), it's ALWAYS pronounced as if it's spelled so [səmɐˈlʲɵt], even if it's spelled самолет (samolet) (not to confuse with spellings and pronunciations in other languages, such as Bulgarian).
So, in the case of свёкла/свекла, афера/афёра - two definition lines with two distinct pronunciations are appropriate.
In case of самолёт/самолет, only a soft-redirect is used.
Hope it's not confusing, please advise your thoughts. Anatoli T. (обсудить/вклад) 01:52, 4 September 2023 (UTC)[reply]
@Atitarev Yes, film is pronounced as philm in villages and also by those who speak a different dialect (however, adding a separate entry for the other might be worthless). As for whether it should have the nuqta or the nuqtaless transliteration, I don't know. कालमैत्री (talk) 04:20, 4 September 2023 (UTC)[reply]
@कालमैत्री @Atitarev @AryamanA Correct me if I'm wrong but I don't think फिल्म vs. फ़िल्म ever really represent distinct pronunciations. As the last comment says, the word film (spelled either way) can be pronounced philm in villages and some dialects. So it is correct to indicate one as an alt form of the other. Benwing2 (talk) 04:50, 4 September 2023 (UTC)[reply]
  1. @Benwing2. The question was whether nuqtaless फिल्म (philm) should be {{hi-noun|g=f|tr=film}} or just {{hi-noun|g=f}} (automatically transliterated as "philm"), or should have two definition lines, which AryamanA opposes. AryamanA's simpler suggestion, to have it both ways in the pronunciation section with no manual translit in the headword, will work for me as well. I've made अर्ज (arj) simpler in this revision.
Anatoli T. (обсудить/вклад) 05:04, 4 September 2023 (UTC)[reply]
@Atitarev I see, yes I agree with not having two POS headers or definition lines. I would probably rather include the manual translit since the pronunciation is not determined by whether there's a nuqta or not, and having a difference in translit could wrongly lead someone to believe this. Benwing2 (talk) 05:22, 4 September 2023 (UTC)[reply]
@Benwing2: But having the same transliteration for two different spellings could lead people to believe there was a speck of dust on the screen. The punctilious would use the different spellings to indicate whether /f/ was permitted or not. --RichardW57m (talk) 10:39, 4 September 2023 (UTC)[reply]
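For anyone following the spelling side of this thread, the relationship between a nuqta spelling and its nuqtaless counterpart can be sketched in a few lines of Python. This is only an illustration, not how any Wiktionary module works: the function name `strip_nuqta` is hypothetical, and it assumes the only difference between the two spellings is the nuqta sign (U+093C). It says nothing about which pronunciation or transliteration an entry should show.

```python
import unicodedata

NUQTA = "\u093c"  # DEVANAGARI SIGN NUKTA


def strip_nuqta(text: str) -> str:
    """Return the nuqtaless spelling of a Devanagari string (hypothetical helper)."""
    # NFD splits precomposed letters such as U+095B into base letter + nuqta,
    # so both encodings of a nuqta letter are handled the same way.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every nuqta, then recompose whatever can be recomposed.
    return unicodedata.normalize("NFC", decomposed.replace(NUQTA, ""))


# arz (with nuqta) -> arj (without), written with escapes to avoid encoding ambiguity
arz = "\u0905\u0930\u094d\u091c\u093c"
arj = "\u0905\u0930\u094d\u091c"
print(strip_nuqta(arz) == arj)
```

Note that this would also alter letters such as U+0931, which are encoded as a base letter plus nuqta but are distinct letters in other languages, so the sketch is only meaningful for Hindi-style nuqta pairs.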

Automatic transliteration of katakana and hiragana

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Chuterix, Mcph2): Is there any reason we don't have automatic transliteration of katakana and hiragana? It seems silly that we have to add manual transliterations to things like {{l|ja|アメリカ}} and {{l|ja|すし}}. —Mahāgaja · talk 07:44, 1 September 2023 (UTC)[reply]

Probably because of potential word boundaries / spacing. —Fish bowl (talk) 07:45, 1 September 2023 (UTC)[reply]
Is that more of an issue for katakana/hiragana than it is for hangeul, which does have automatic transliteration? —Mahāgaja · talk 07:48, 1 September 2023 (UTC)[reply]
Hangeul has spacing and word boundaries are clearer. AG202 (talk) 05:29, 3 September 2023 (UTC)[reply]
Instead of {{l|ja|アメリカ}}, I just use {{ja-r|アメリカ}}, which gives アメリカ (amerika). Mcph2 (talk) 07:48, 1 September 2023 (UTC)[reply]
OK, but in translation tables we (have to?) use {{t}}, which also doesn't support automatic transliteration. —Mahāgaja · talk 07:50, 1 September 2023 (UTC)[reply]
Oh, @Theknightwho has been working on automatic Japanese transliteration these days, he might have a solution. Mcph2 (talk) 07:59, 1 September 2023 (UTC)[reply]
Spacing can be manually added, e.g. {{ja-r|あい うえお}}: あいうえお (ai ueo). Mcph2 (talk) 08:03, 1 September 2023 (UTC)[reply]
Well, exactly. Are there any issues to having automatic transliteration in {{t}} (for example) that {{ja-r}} hasn't already solved? And it could work for the hiragana transliteration of kanji terms in {{t}} as well. At the moment, we have to write {{t+|ja|子猫|tr=こねこ, koneko}} at kitten#Translations, but surely it should be doable to just write {{t+|ja|子猫|tr=こねこ}} and have a module generate the romaji koneko automatically. —Mahāgaja · talk 08:13, 1 September 2023 (UTC)[reply]
@Mahāgaja -- Please don't use both kana and romaji in translation tables. The layout is already very tight, kana is unusable and potentially confusing to much of our readership, and kana text adds nothing useful anyway that we can't get from romanization.
If we don't have automatic kana → romaji conversion, please use {{t+|ja|子猫|tr=koneko}} instead.
If we do have automatic kana → romaji conversion, @User:Theknightwho, for translation tables especially, please don't use ruby -- again, the layout of translation tables is very tight and ruby text above kanji pushes things around in unhappy ways, much of our readership cannot read kana and would find this confusing, there are other usability problems (such as cut-and-paste issues discussed elsewhere), and ruby doesn't add any useful information anyway that cannot be gleaned from the romanization. ‑‑ Eiríkr Útlendi │Tala við mig 17:52, 1 September 2023 (UTC)[reply]
@Eirikr There's no ruby there at the moment, and your concerns are exactly why I think we need to discuss things first before implementing big changes like that. I don't think it's insurmountable, but I haven't had the time to look into it yet. That being said, I'm not sure I completely agree with you re the value of rubytext, but that's a separate issue that we've already talked about before. Theknightwho (talk) 17:59, 1 September 2023 (UTC)[reply]
@Eirikr: I almost never add Japanese translations myself, but the facts on the ground are that almost all Japanese lines in translation boxes that involve kanji have both hiragana and romaji in the transliteration field. —Mahāgaja · talk 20:36, 1 September 2023 (UTC)[reply]
I haven't made any comprehensive effort to check EN entries for JA translations. Those that I've encountered have been scattershot, with kana present more frequently in what appeared to be older edits.
At any rate, I am strongly opposed to including kana in the parens in translation tables -- these are not useful for most readers, and the romanization suffices. I am baffled that people add the kana; it seems editors get lost in the "cool" factor of another script, and don't consider usability / usefulness. By way of counterexample, we don't include bopomofo for Chinese, for instance. ‑‑ Eiríkr Útlendi │Tala við mig 21:31, 1 September 2023 (UTC)[reply]
They're pretty useful to me... AG202 (talk) 05:31, 3 September 2023 (UTC)[reply]
@Mahagaja I would wait till User:Theknightwho comes back on line, he is in the middle of implementing this. I think it works already if you explicitly specify the script as Hrkt. Benwing2 (talk) 08:49, 1 September 2023 (UTC)[reply]
@Fish bowl @Mahagaja @Mcph2 @Benwing2 It is actually already enabled if you manually specify the script as Hira, Kana or Hrkt: {{l|ja|^アメリカ|sc=Kana}} gives アメリカ (Amerika). However, this is a stopgap measure and I would prefer if we don't use it generally in entries, as adding script codes to everything would clutter up entries; it's only there so that non-Lua templates can use {{xlit}}. For now, it's best to stick with {{ja-r}} - not least because spaces aren't supported yet.
The reason for this is because I recently split Module:ja-translit in two: the old kana_to_romaji function has been replaced by Module:Hrkt-translit (Hrkt being the ISO code for all kana combined). I then moved the old module to Module:Jpan-translit, which works by scraping pages for readings in a similar fashion to the way Chinese transliteration works. The reading it generates is then given to Module:Hrkt-translit. Jpan-translit is not currently enabled, because it's pending further discussion about how we handle terms with multiple readings.
The reason for this new system is because the two modules work in very different ways, and it means we can avoid wasting resources if we know for certain that a given term is going to be in kana. There's also the fact that some languages (e.g. Ainu) don't use kanji at all, and so it makes sense to have kana transliteration be handled in a standalone way.
Just as a word of caution: don't confuse Kana (the code) with Kana (the script name). Unfortunately, the ISO picked Kana as the script code for katakana, and Hrkt for what they call "Japanese syllabaries" (i.e. hiragana + katakana, with hentaigana grouped under hiragana). I've given Hrkt the name "Kana" because it's the most accurate name for what it actually refers to, and I don't . It won't make any difference 99% of the time, but it's good to be aware just in case. Theknightwho (talk) 09:01, 1 September 2023 (UTC)[reply]
@Theknightwho Thanks for the summary. Can you answer the question of when we can expect {{l|ja|^アメリカ}} to work right without explicitly specifying the script code, and what needs to be done and what issues resolved in order for this to happen? Can't you just either rely on the autodetection of the script or make the translit module check the contents of the text being transliterated, so that if it sees it's all Kana (Hiragana or Katakana), it goes ahead and transliterates, and otherwise fails? Benwing2 (talk) 09:08, 1 September 2023 (UTC)[reply]
And is it possible to add romaji automatically to the hiragana transliteration in cases like {{t+|ja|子猫|tr=こねこ}} that I mentioned above? BTW, I had never heard the term hentaigana before, and I have to say it doesn't mean what I was expecting it to mean! Mahāgaja · talk 09:11, 1 September 2023 (UTC)[reply]
@Benwing2 My intention was that it'd be as soon as Module:Jpan-translit is enabled. At the moment, anything entered as hiragana, katakana (or a mix of the two) will always be detected as Jpan. We could use Module:Hrkt-translit for Jpan as a stopgap, and any incomplete transliterations should return nothing. Alternatively, we could make a specific code override a general code if there's a tiebreak, which would have the same result. That may be preferable, as it means script codes will be more accurate in general.
@Mahāgaja Not yet - the transliteration module can't override manual transliterations. We should be able to integrate the features of {{ja-r}} into the general link modules pretty soon, though, and at that point we should be able to update everything via bot (like we did with Mandarin). There'll need to be a few minor changes to make the syntax compatible, though, which is the main barrier at the moment. That will need to wait until I've finished my major rewrite of Module:languages and Module:links, and I don't want to add any new features to the current versions because they're already too complicated/messy as it is. That should hopefully be done by the end of the month, if not sooner, and at that point I can start working on this. No promises, though. Theknightwho (talk) 09:27, 1 September 2023 (UTC)[reply]
@Theknightwho I thought about this a bit. Changing the script detection to return Hrkt or something else other than Jpan is likely to break people's .CSS files that customize based on the Jpan script code. I would use Module:Hrkt-translit as Module:Jpan-translit and have it fail for now if it encounters Kanji. That puts a placeholder for when you resolve the issue of how to handle cases with more than one pronunciation. Benwing2 (talk) 20:15, 2 September 2023 (UTC)[reply]
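The stopgap described above (transliterate when the text is entirely kana, return nothing otherwise) can be sketched in Python. This is not the code of Module:Hrkt-translit: the table is a deliberately incomplete, hypothetical one (no digraphs such as きゃ, no sokuon, no long-vowel mark, no spacing or capitalisation), and `transliterate_kana` is an invented name.

```python
# Hypothetical, incomplete Hepburn-style table built from the basic kana rows.
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "s": "さしすせそ", "t": "たちつてと",
    "n": "なにぬねの", "h": "はひふへほ", "m": "まみむめも", "r": "らりるれろ",
    "g": "がぎぐげご", "z": "ざじずぜぞ", "d": "だぢづでど", "b": "ばびぶべぼ",
    "p": "ぱぴぷぺぽ",
}
TABLE = {
    kana: consonant + vowel
    for consonant, row in ROWS.items()
    for kana, vowel in zip(row, "aiueo")
}
# Irregular readings and the kana that do not fit the consonant + vowel grid.
TABLE.update({
    "し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu", "じ": "ji", "ぢ": "ji",
    "づ": "zu", "や": "ya", "ゆ": "yu", "よ": "yo", "わ": "wa", "を": "o", "ん": "n",
})


def to_hiragana(ch: str) -> str:
    """Map a katakana character onto its hiragana counterpart, if it has one."""
    code = ord(ch)
    if 0x30A1 <= code <= 0x30F6:  # katakana block parallel to hiragana
        return chr(code - 0x60)
    return ch


def transliterate_kana(text: str):
    """Return romaji for all-kana text, or None if anything is unsupported (e.g. kanji)."""
    out = []
    for ch in text:
        roman = TABLE.get(to_hiragana(ch))
        if roman is None:
            return None  # fail rather than emit a partial transliteration
        out.append(roman)
    return "".join(out)


print(transliterate_kana("アメリカ"))
print(transliterate_kana("子猫"))
```

Failing outright on unknown characters, rather than passing them through, mirrors the point that an incomplete transliteration should return nothing.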
Transliterating Japanese kana (both hiragana and katakana) is long overdue. It is in fact, even simpler than Korean hangeul but the following considerations should always be made:
  1. Spacing, capitalisations and irregular reading for particles は (wa) (spelled as "ha") and へ (e) (spelled as "he"): 東京(とうきょう)は日本(にほん)の首都(しゅと)です (Tōkyō wa Nihon no shuto desu.), どこへ行(い)くの (doko e iku no?). Notice spacing in kana spellings, ^ and separation of particles.
  2. Morpheme boundaries and diphthong readings: 昨日(きのう) (kinō) vs 争(あらそ)う (arasou) and 新潟(にいがた) (Nīgata) vs 新(あたら)しい (atarashii). Notice the use of "." in kana. Please compare "ō" vs "ou" and "ī" vs "ii"; the difference in pronunciation/transliteration mostly depends on morpheme boundaries.
Anatoli T. (обсудить/вклад) 06:40, 3 September 2023 (UTC)[reply]
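The morpheme-boundary point in item 2 can be illustrated with a small Python sketch. It assumes, purely for illustration, that a "." marks a morpheme boundary and has been carried over into a raw letter-by-letter romanisation; the function name and the vowel table are hypothetical and are not taken from {{ja-r}} or any module.

```python
# Hypothetical table: vowel pairs that merge into a macron vowel inside one morpheme.
LONG_VOWELS = {
    "aa": "\u0101",  # a with macron
    "ii": "\u012b",  # i with macron
    "uu": "\u016b",  # u with macron
    "ee": "\u0113",  # e with macron
    "oo": "\u014d",  # o with macron
    "ou": "\u014d",  # o with macron
}


def merge_long_vowels(raw: str) -> str:
    """Merge vowel pairs within each morpheme; '.' marks a boundary and blocks merging."""
    merged = []
    for part in raw.split("."):
        for pair, long_vowel in LONG_VOWELS.items():
            part = part.replace(pair, long_vowel)
        merged.append(part)
    # The boundary marker itself is dropped from the output.
    return "".join(merged)


for raw in ("kinou", "araso.u", "niigata", "atarashi.i"):
    print(raw, "->", merge_long_vowels(raw))
```

The two vowels on either side of a boundary never sit in the same piece after the split, which is why araso.u and atarashi.i keep "ou" and "ii" while kinou and niigata get macrons.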
There are many languages (Yiddish is a notable example) where automatic transliteration has to be overridden for some words, so that shouldn't be a problem. We can use |tr=wa with templates like {{l}}, {{m}} and {{t}}, and use |subst=は//わ with {{ux}} and the "cite-" and "quote-" families of templates. —Mahāgaja · talk 19:25, 3 September 2023 (UTC)[reply]

Splitting Quechua

Honestly, handling an entire family of mutually unintelligible languages, which have had their own ISO codes for a while now, as four languages (based on the country and historical period they are/were spoken in) doesn't seem like a good idea in general. We'll need most of the codes mentioned here, but probably with slightly different names. If nobody has any fundamental issues with the split itself, I could start drawing up a list of codes and (proposed) names.

Related to this, I also believe we should prohibit the creation of lemmata of Standard Kichwa, as this case is almost identical to Standard Moroccan Amazigh: there are no speakers, and it is an artificially created mix of Ecuadorian Quechua varieties in use that only succeeds in making speakers unconfident in their own language use. Thadh (talk) 10:38, 1 September 2023 (UTC)[reply]

But are there not readers and writers? --RichardW57m (talk) 10:50, 1 September 2023 (UTC)[reply]
But so are there of Klingon and Na'vi. That doesn't make it a language worthy of inclusion in the mainspace. Thadh (talk) 13:08, 1 September 2023 (UTC)[reply]
@Thadh Are there any native speakers of Standard Kichwa per se, or are they all native speakers of one of the languages it aims to standardise? Theknightwho (talk) 16:31, 2 September 2023 (UTC)[reply]
@Theknightwho: They are all speakers of the distinct dialects, and according to the literature I've read, the speakers suffer quite a lot from the prescriptive nature of the standard (i.e. think their language 'isn't correct'). Thadh (talk) 01:15, 3 September 2023 (UTC)[reply]
Bokmål and MSA have no native speakers either; the Klingon and Na'vi comparison is fatuous. Do you have evidence of the stated effects of Standard(ized) Kichwa? In the meantime, I stand with @AG202's stance. ~ Blansheflur 。・:*:・゚❀,。 21:24, 3 September 2023 (UTC)[reply]
See the introduction of Aschmann's A reference grammar of Ecuadorian Quichua. I'll cite a couple of passages:
"“Unified Quichua” is a special form of Quichua which has been devised in recent decades, a certain amount of literature has been produced in it (including a Bible translation called Pachacamacpac Quillcashca Shimi), and educational programs have been carried out in it. […] Unified Quichua was in its origin an artificial language, a mixture of features from various Quichua languages, with all the Spanish borrowings replaced with old (obsolete) Quichua words which the people do not know or whose meanings have changed. (Many of these obsolete words are still contemporary in other regions, some being used in other Ecuadorian Quichua languages, others being Peruvian Quechua words, and others being coined based on existing Quichua forms.) […] One unfortunate effect of Unified Quichua has been to make those who speak Quichua as their native language feel like they do not speak it well, because they don’t speak it like the academicians say they should! In reality, the native Quichua speakers represent the continuous, native tradition of the language. Another negative result has been that the Quichua young people, who are in some cases being taught the Unified Quichua in school, feel like their grandparents speak the language incorrectly, whereas in reality their grandparents are the ones who speak the language best!"
Aschmann also references the paper by Grzech et al., which write the following in their conclusion:
"[…] At the same time, the linguistic features of Unified Kichwa fail to adequately represent the language which these speakers – acutely aware of linguistic micro-variation and reliant on it for constructing social belonging – perceive as their own. […] The standard currently in place is divisive and remains largely unused, mostly due to the purist ideology from which it is derived."
There are more comments of this sort, but I believe this is more than enough to conclude that Unified Kichwa is pretty similar to Standard Moroccan Amazigh and also not something we'd want in our mainspace. Thadh (talk) 22:18, 3 September 2023 (UTC)[reply]
If there's been an agreement not to include SAM lemmas (as it sounds), then I stand with you, as those two cases are most comparable. So yes, I support everything you've put forth. ~ Blansheflur 。・:*:・゚❀,。 22:41, 3 September 2023 (UTC)[reply]
Support splitting Quechua. Vininn126 (talk) 18:00, 1 September 2023 (UTC)[reply]
Support - no reason why Quechua should be handled as a single language. Theknightwho (talk) 16:31, 2 September 2023 (UTC)[reply]
For context for others, see: the prior discussion on Kichwa. I support splitting Quechua, but I don't think I'd support prohibiting the creation of Standard Kichwa. Even if it's not necessarily spoken, it's still written and read, and it seems like a similar situation to Modern Standard Arabic or any other standard variety created specifically to try to "unite" other lects, for better or for worse. The fact that it makes speakers unconfident in their own usage is unfortunate, but that shouldn't stop us from including the entries if they are cited in usage (similar to how it's been made clear that we include derogatory terms). At best, we could add some kind of label or usage note to disambiguate "standard" terms. AG202 (talk) 05:43, 3 September 2023 (UTC)[reply]
I am convinced by User:AG202's argument that we should not prohibit adding Unified Quichua/Kichwa lemmas. Instead I think we should have a label "Unified Kichwa" or similar to identify them. It reminds me a bit of Rumantsch Grischun and Standard Basque, each of which is somewhat controversial and for which similar complaints have been made to the complaints being made here about Unified Kichwa, yet we don't prohibit them. In general we are a descriptive dictionary, and prohibiting a language because some people don't like it seems very prescriptivist. Benwing2 (talk) 04:38, 4 September 2023 (UTC)[reply]
I guess that is fine. I also found Category:Moroccan Amazigh language, which makes me think I have misunderstood something in previous discussions? The naming made it difficult to find, and it links to an empty Wikipedia page. I am still not sure if the language can be considered a natural language or even in the same way that MSA or modern Hebrew is, but I guess it's fine to keep it, provided we label it "Unified Kichwa" and keep an eye on new editors adding terms in the other Kichwa varieties. Thadh (talk) 13:02, 4 September 2023 (UTC)[reply]
Support ~ Blansheflur 。・:*:・゚❀,。 22:42, 3 September 2023 (UTC)[reply]

I've just got around to publishing the draft for Wiktionary:About Icelandic, which had been in the request pile. It'd be good to get the feedback of any regular contributors to Icelandic, so if there's anything missing or misrepresented feel free to add it or let me know - I haven't as yet contributed much to the language on here and don't have much familiarity with the language-specific editing norms or templates. In particular, I couldn't find anything at all in the discussion pages about the cut off date we use for when Old Norse ends and modern Icelandic begins or whether in practice it's not much of an issue. Helrasincke (talk) 17:28, 1 September 2023 (UTC)[reply]

@Helrasincke Hi, I'm not an Icelandic editor, but about the content of that page: there's this here sentence, "Following is a simplified entry for the German word orðabók (“dictionary”). It shows the fundamental elements of an Icelandic entry:", but should it say "Icelandic" instead of "German"? Did you perhaps adapt this from About German? Anyway, nice work! Kiril kovachev (talkcontribs) 00:38, 2 September 2023 (UTC)[reply]
@Kiril kovachev Whoops, well spotted! Helrasincke (talk) 15:54, 3 September 2023 (UTC)[reply]
@Helrasincke No problem! And also, I apologize for splitting hairs, but you may also want to check the "Spelling" section:
there looks to be a sentence starting, "Letters such as"... that goes unfinished. I guess there was meant to be a passage about symbols that changed in their usage in some way after that. Otherwise, looks good :) Kiril kovachev (talkcontribs) 18:43, 3 September 2023 (UTC)[reply]

Dingal language add request

Should the language be added? कालमैत्री (talk) 20:18, 1 September 2023 (UTC)[reply]

Or should it be treated under the Rajasthani language already on Wiktionary? कालमैत्री (talk) 20:27, 1 September 2023 (UTC)[reply]
I know nothing about Dingal, but Wikipedia suggests it's ancestral to both Rajasthani and Gujarati, which as far as I'm concerned is reason enough for it to be a separate language with a code of its own. —Mahāgaja · talk 20:45, 1 September 2023 (UTC)[reply]
How do I add the language to Wiktionary? कालमैत्री (talk) 13:06, 2 September 2023 (UTC)[reply]
The Wikipedia article shows signs of having been puffed up by editors who may or may not know what they're talking about, an issue many articles on Indian topics suffer from, so I would feel better if I could find information about the language in other sources. So far I haven't been able to find much. Glottolog doesn't seem to have it. Searching Google Books turns up little. There is a mention in an essay in Language Versus Dialect: Linguistic and Literary Essays on Hindi, Tamil, and Sarnami (ed. by Mariola Offredi, 1990), page 68, which says "The Caran were numerous above all in Marvar, whose regional language (Marvari) was later known as Dingal.6 The Dingal language entered the court thanks to the Caran and became the standard literary language in the vast Marvar region, ..." where the footnote 6 (on page 88) is "There is much discussion about the meaning of the word 'dingal'. This has been used since the nineteenth century, with reference to the literature in western Rājasthānī, known also as Marubhāṣā and Mārvārī. For the various interpretations, see MOTILAL MENARIYA 1949, 15-24. It is, however, futile to go into this discussion, since scholars have not yet come to any definite conclusions." Rajendra Kumar Dave, Society and Culture of Marwar (1992), page 103, says "Dingal—The literary form of Marwari was called Dingal. The word 'Dingal' was used for the first time by Kushallabh in his work Pingal Siromani composed in V.S. 1607-18. The word has been defined in various ways by scholars. Tessitori calls it a language of rustics and a language without grammar." (That seems harsh, considering other works call it the language of poetry.) I can't find anything on how intelligible or not it is with modern Marwari or other forms of Rajasthani. - -sche (discuss) 18:34, 3 September 2023 (UTC)[reply]
Effect of Prakrit on Dingal literature and Dingal literature, both in Hindi, might be of importance. Other such works exist, but all in Hindi. There are also Dingal words in a Hindi dictionary here. कालमैत्री (talk) 11:02, 4 September 2023 (UTC)[reply]
McGregor says "an archaising form of early Mārvāṛī language, as used in Rājasthānī bardic poetry". कालमैत्री (talk) 11:06, 4 September 2023 (UTC)[reply]

Inclusion policy regarding given names?

I noticed we have no concrete policies regarding names (not individuals but given, middle and surnames), such as how many people must bear a name in order for it to qualify for inclusion and such. I am asking because I discovered we have very few Afrikaans-language names and so I wanted to add some. However, some of the names I had in mind—those I know from my family tree—appear to be fairly rare, several yielding less than 10,000 or in extreme cases less than 1,000 or even 100 results on FamilySearch's vital records (including duplicate records for the same person). I am still a fairly new editor here and so I obviously do not wish to accidentally create numerous unnecessary or non-notable entries as it could be annoying to clean up. I recently created Sarel which has 2,555 results on said genealogical website for South African (1600–present) records, but others like the surname Heystek / Heijstek give me less than 900 search results and I wonder if there should be a limit to what can be added. I notice we have hundreds of entries like Odajyan that just say "According to data collected by Forebears in 2014, Odajyan is the 1988048th most common surname in the United States, belonging to 1 individuals", but IMO this is not good practice. I would love to hear your opinions. Kindest regards, LunaEatsTuna (talk) 22:39, 1 September 2023 (UTC)[reply]

@LunaEatsTuna This is a good discussion we need to have. Really rare names shouldn't be present, e.g. Odajyan should maybe be given as an Armenian last name but that's all. Otherwise we'll get a Cambrian explosion of useless entries. Benwing2 (talk) 22:49, 1 September 2023 (UTC)[reply]
My thoughts exactly. Wiktionary is not a database of surnames, after all. LunaEatsTuna (talk) 23:15, 1 September 2023 (UTC)
I see it has the boilerplate statistic line with "Odajyan is the 1988048th most common surname in the United States, belonging to 1 individuals" (sic). The ordinal seems practically meaningless if only one person has the surname. —Al-Muqanna المقنع (talk) 23:35, 1 September 2023 (UTC)[reply]
I also find that it puts an utterly disproportionate focus on a statistic that’s essentially random noise at that point, too. I really don’t know why we need it. Theknightwho (talk) 21:35, 2 September 2023 (UTC)[reply]
This has been discussed before; even the person who added them doesn't care for them much, but their reasoning can be summarized as "it's better than nothing". Chuck Entz (talk) 22:42, 2 September 2023 (UTC)
We've had a couple of RFVs for surnames lately, including Klingon (which surprised me by passing), Nazndah (which failed), and Mozela (which didn't really complete). The standard for both given names and surnames that we followed in those RFVs is to treat them like ordinary words, meaning that they need three cites. In this case, a document with a list of names can be a cite, but it must be in the language we're looking for.
If Odajyan fails RFV, could we continue to list it as a transcription of the Armenian, or do we only write Armenian words in the Armenian alphabet? Thanks, Soap 22:50, 1 September 2023 (UTC)[reply]
About this name specifically ... I see now that it's a spelling variant of Odadjian, which would be trivially easy to cite, as it is the surname of a musician. The -ian spelling of Armenian names in general is the traditional one, at least in the United States, having been overtaken in recent years by -yan, perhaps because it's more true to the Armenian pronunciation. For this name, and perhaps others, we have the -yan spelling as the standard and -ian as a variant. If we can cite Odadjian, does that mean Odajyan also passes? I'm guessing not, because even though the Armenian name is the same, we're in some sense creating a new name by Romanizing it. Soap 23:01, 1 September 2023 (UTC)
I think surnames being treated as regular words is a good idea. Also, would I be correct in assuming that vital records like birth certificates would not count as ordinary citations towards the inclusion of names? I do not recall reading whether such citations are even allowed on Wikt, and I would agree if they do not count for inclusion, but I am just asking for clarification. (If they were allowed, this would essentially make the de facto policy for inclusion that at minimum three people bear a name.) LunaEatsTuna (talk) 23:06, 1 September 2023 (UTC)
As far as Armenian surnames are concerned, this is what is going on. According to Armenian law, all passports record the owner's first name and surname in Armenian and in an English transcription. There is a strict scheme for automatically replacing each Armenian letter with an English letter or digraph, without regard to the actual pronunciation of the Armenian surname or the resultant English transcription. Օդաջյան (Ōdaǰyan) becomes Odajyan, Քարտաշյան (Kʻartašyan) becomes Kartashyan, Սարգսյան (Sargsyan) becomes Sargsyan, Պետրոսյան (Petrosyan) becomes Petrosyan, no variants are possible. If a person from the current iteration of Armenia (AD 1991–) emigrates to England or an English colony or becomes famous in English-language media, he will be recorded under the legally transcribed English name. This is what Forebears did for the one recent US citizen Odajyan.
The situation with the older diaspora is different. They are not bound or influenced by the Republic's transcription rules. They usually adapt their Armenian name to the local language according to their taste and with more regard to pronunciation and euphony. I can sympathize as foreigners distort my Petrosyan to things like /petroʃan/, /petroʒan/, /petrozian/. It is better to adapt it as Petrossian in English and French lands to approximate the correct /petrosjan/. Odadjian and Odajian are adapted versions of Odajyan. Many adaptation variants are possible, look at the forms of Hakobyan. Sometimes adaptation goes so far that I can't even figure out the native form: compare Bilzerian.
Since the local passport transcription system is predictable and fixed, I had chosen its form as the main one (Odajyan) and listed the old diasporan adaptations as variant forms (Odadjian, Odajian). Admittedly, the old diasporan spellings are easier to attest in English because their bearers had a better chance of being recorded in English.
Because Wiktionary's policy on names is undeveloped, I do not create foreign entries for Armenian surnames anymore. Instead, I list the passport transcription and all diasporan spellings I can find in the Descendants section of the Armenian entry as in Համամչյան (Hamamčʻyan). Vahag (talk) 10:33, 3 September 2023 (UTC)[reply]
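
The fixed letter-by-letter passport transcription described above can be illustrated with a minimal Python sketch. The table below is an assumption reconstructed only from the four example surnames in this thread; it is not the official scheme, it covers only the letters that occur in those examples, and the function name `transcribe` is hypothetical. Letters outside the table raise an error rather than being guessed.

```python
# Letter pairs attested in the four examples in this discussion only.
# This is NOT the full official table; it is an illustrative assumption.
_ATTESTED = {
    "Օ": "o", "դ": "d", "ա": "a", "ջ": "j", "յ": "y", "ն": "n",
    "Ք": "k", "ր": "r", "տ": "t", "շ": "sh",
    "Ս": "s", "գ": "g",
    "Պ": "p", "ե": "e", "ո": "o",
}
# Key the table by lowercase letters so that case is handled in one place.
TABLE = {letter.lower(): latin for letter, latin in _ATTESTED.items()}


def transcribe(surname):
    """Replace each Armenian letter by its fixed Latin letter or digraph."""
    pieces = []
    for ch in surname.lower():
        if ch not in TABLE:
            # Refuse to guess letters that the examples do not attest.
            raise KeyError(f"no attested transcription for {ch!r}")
        pieces.append(TABLE[ch])
    result = "".join(pieces)
    # Restore the initial capital if the input had one.
    if surname[:1].isupper():
        result = result[:1].upper() + result[1:]
    return result


for name in ["Օդաջյան", "Քարտաշյան", "Սարգսյան", "Պետրոսյան"]:
    print(name, "->", transcribe(name))
```

Because the mapping ignores pronunciation and context, each Armenian spelling yields exactly one Latin spelling, which is why no variants arise under this scheme, unlike the diasporan adaptations.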

Ban cross-family comparisons from EDAL

Self-explanatory. Any cross-family comparison sourceable only to EDAL or affiliated sources should be banned or at the very least worded in such a way that makes it clear that the "Altaic" family is a pseudolinguistic fringe theory. — SURJECTION / T / C / L / 15:21, 2 September 2023 (UTC)

Support. We need to remove macro-level Altaic comparisons. I wouldn't be surprised if some lower-level connections are established but that's far outside the standard linguistic view of things and we don't need to be hosting fringe. Fringe is cringe bro. Vininn126 (talk) 15:25, 2 September 2023 (UTC)[reply]
I wouldn't be opposed to this, either. It often feels like many of our active Proto-Turkic editors are sneakily adding in Altaic comparisons and references (including lists of comparanda of supposed regular sound correspondences!) whenever they think they can get away with it. — SURJECTION / T / C / L / 15:45, 2 September 2023 (UTC)[reply]
Let me clarify what I mean: perhaps in the future smaller connections between languages in this area will become more established within the linguistic mainstream, but until that time we shouldn't host it. Vininn126 (talk) 15:49, 2 September 2023 (UTC)[reply]
Support Over at the Proto-Turkic page we've already been slowly phasing out Altaic reconstructions and comparisons are made with Mongolic if they cannot be explained through conventional borrowing. Yorınçga573 (talk) 15:41, 2 September 2023 (UTC)[reply]
Support. AG202 (talk) 05:46, 3 September 2023 (UTC)
Support BurakD53 (talk) 13:01, 3 September 2023 (UTC)

Bulgarian name dictionary reference template

Hello to all Bulgarian editors, I don't know if this resource has been used before, but today I found a dictionary that documents Bulgarian personal names, which can be viewed online, just like the etymological dictionary. I've written a reference template at {{R:bg:LIFUB}}; the syntax is {{R:bg:LIFUB|page_number|entry_name}}, where the entry name can be omitted if it's the same as the page title, which is the case nine times out of ten. @SimonWikt @Chernorizets @Bezimenen. This is a good help in adding accents to names with unclear stress, as well as expanding small entries: check out Апостолов, for example. Hope this helps! Kiril kovachev (talkcontribs) 18:17, 2 September 2023 (UTC)

In general, you don't need a BP thread for this. Perhaps alerting other editors on a talk page for the template or something similar will suffice. Vininn126 (talk) 18:20, 2 September 2023 (UTC)[reply]
@Kiril kovachev: This is great. I think we need pages documenting these resources, although Category:Bulgarian reference templates is a good start. As for where to post this, maybe WT:About Bulgarian and pinging the relevant users? (Although I wouldn't have seen this as you didn't ping me.) Benwing2 (talk) 20:11, 2 September 2023 (UTC)[reply]
@Vininn126 Yes, that's fair enough; my apologies. (I partly posted it here because I don't know for sure who else might be a Bulgarian editor, or be interested in the template anyway.) @Benwing2 Sorry for not @ing you, I wasn't sure whether you still edit Bulgarian these days and I didn't want to spam you with it in that case. So, fortunate that you check this place often enough to see. :)
I may well update WT:About Bulgarian with some information about our templates for this.
Thanks, Kiril kovachev (talkcontribs) 20:23, 2 September 2023 (UTC)[reply]
Don't apologize! I'm just informing. Vininn126 (talk) 20:24, 2 September 2023 (UTC)
Thanks for the heads-up! Kiril kovachev (talkcontribs) 22:29, 2 September 2023 (UTC)
@Kiril kovachev very cool! Chernorizets (talk) 21:41, 2 September 2023 (UTC)

User:Dragonoid76 requested equivalents of {{PIE root see}} for Proto-Indo-Iranian, Proto-Indo-Aryan and Sanskrit. I realize that there isn't a proper template currently for this. It should be {{rootsee}} but (a) that template doesn't quite do it, (b) it is a total mess. I am going to redo {{rootsee}} to work similarly to {{root}}:

|1=
Destination language of category Category:Destination terms derived from the Source root *root-. If left out or set to the value +, you get the umbrella category Category:Terms derived from the Source root *root-.
|2=
Source language of category Category:Destination terms derived from the Source root *root-. If left out or set to the value +, or equal to the destination language, you instead get Category:Destination terms belonging to the root *root-. However, if both source and destination language are left out or set to +, and the current page is in the Reconstruction namespace, the source language is inferred from the pagename and you get Category:Terms derived from the Source root *root- (otherwise you get an error). If the destination language is a family code and not a valid language code, the family code is converted to the corresponding proto-language. This means you can write ine for Proto-Indo-European, iir for Proto-Indo-Iranian, inc for Proto-Indo-Aryan, etc.
|3=
Root. If left out or set to the value +, it is taken from the subpage name (i.e. after a slash in the case of Reconstruction namespace items). If the source language is reconstruction-only, you can leave out the initial *. In addition, a hyphen may be added according to the following algorithm:
  1. If there is a space or hyphen in the root already, no hyphen is added.
  2. If the root is in a non-Latin script, no hyphen is added.
  3. Otherwise if the source language is Navajo, a hyphen is added onto the beginning, otherwise onto the end.
|id=
Sense ID of the root; needed especially for Navajo.

This means, for example, that you can write {{rootsee}} by itself on a reconstructed root page and get Category:Terms derived from the Source root *root- automatically. This should make {{PIE root see}} totally unnecessary. Current uses of {{rootsee}} that default to PIE will have to be changed to add ine as the second argument, so that e.g. {{rootsee|en|*gʷem}}, which currently gets you Category:English terms derived from the Proto-Indo-European root *gʷem-, will change to {{rootsee|en|ine|*gʷem}}. Benwing2 (talk) 23:16, 2 September 2023 (UTC)[reply]
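
The parameter rules above can be sketched in Python as follows. This is only an illustration of the described behaviour, not the actual Module:User:Benwing2/rootsee code: the function names, the `page_source` argument, the lookup tables, and the Latin-script heuristic are all assumptions. The family-code conversion is applied to both language parameters, since the description mentions the destination while the example uses ine in the source position.

```python
import unicodedata

# Assumed lookup tables, for illustration only; the real module has its own data.
LANG_NAMES = {
    "en": "English", "sa": "Sanskrit", "nv": "Navajo",
    "ine-pro": "Proto-Indo-European",
    "iir-pro": "Proto-Indo-Iranian",
    "inc-pro": "Proto-Indo-Aryan",
}
FAMILY_TO_PROTO = {"ine": "ine-pro", "iir": "iir-pro", "inc": "inc-pro"}
RECONSTRUCTION_ONLY = {"ine-pro", "iir-pro", "inc-pro"}


def is_latin(text):
    """Heuristic: every letter is a Latin or modifier letter (e.g. the ʷ in gʷem)."""
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if not name.startswith(("LATIN", "MODIFIER LETTER")):
                return False
    return True


def normalize_root(root, source):
    """Add the optional leading * and the hyphen as described for |3=."""
    if source in RECONSTRUCTION_ONLY and not root.startswith("*"):
        root = "*" + root
    if " " in root or "-" in root:   # rule 1: already has a space or hyphen
        return root
    if not is_latin(root):           # rule 2: non-Latin script
        return root
    if source == "nv":               # rule 3: Navajo hyphenates at the start
        return "-" + root
    return root + "-"


def _clean(code):
    """Treat a missing value or '+' as 'left out'; map family codes to proto-languages."""
    if code in (None, "", "+"):
        return None
    return FAMILY_TO_PROTO.get(code, code)


def rootsee_category(dest=None, source=None, root="", page_source=None):
    """page_source: language inferred from a Reconstruction pagename, else None."""
    dest, source = _clean(dest), _clean(source)
    if dest is None and source is None:
        if page_source is None:
            raise ValueError("cannot infer the source language outside the Reconstruction namespace")
        source = page_source
    if source is None or source == dest:
        # Same-language case: "terms belonging to the root".
        return f"Category:{LANG_NAMES[dest]} terms belonging to the root {normalize_root(root, dest)}"
    prefix = f"{LANG_NAMES[dest]} terms" if dest else "Terms"
    return f"Category:{prefix} derived from the {LANG_NAMES[source]} root {normalize_root(root, source)}"


print(rootsee_category("en", "ine", "*gʷem"))
print(rootsee_category(root="gʷem", page_source="ine-pro"))
```

Under these assumptions, the first call mirrors the {{rootsee|en|ine|*gʷem}} example, and the second mirrors a bare {{rootsee}} placed on a reconstructed root page.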

I have written the module underlying this, see Module:User:Benwing2/rootsee and User:Benwing2/test-rootsee, as well as the bot script to convert existing uses of {{rootsee}} and {{PIE root see}}. If no one objects, I will do the conversion in the next couple of days. Benwing2 (talk) 02:45, 3 September 2023 (UTC)[reply]
Thanks, this looks good. It was always a bit odd seeing a template with a generic name like rootsee being specifically bound to descendants of PIE. Soap 15:38, 4 September 2023 (UTC)

lemmas

Hi, how can I find which languages in Wiktionary have the most lemmas? 31.7.113.40 16:57, 3 September 2023 (UTC)

There is a sortable list at Wiktionary:Statistics. Einstein2 (talk) 23:22, 3 September 2023 (UTC)
Remember the phrase 'lies, damned lies and statistics'. In languages whose users command deep, morphologically marked derivations, synchronically derived terms may be marked as lemmas. --RichardW57m (talk) 08:26, 4 September 2023 (UTC)[reply]

Listing taxonomical names in Derived terms sections

I'd like to find out if there is a policy about how much detail should go into listing taxonomical names in Derived terms sections. See rigó. Currently the Hungarian name is followed by the English translation and the taxonomical name. I wonder if this is all considered useful by other editors. Should I just list the Hungarian names? Panda10 (talk) 17:38, 3 September 2023 (UTC)[reply]

I don't think there is a policy. I think they are useful to indicate which taxon is indicated by the vernacular name and whether multiple taxa are indicated. Taking English as an example, many really common one-word English vernacular names cover multiple species and multiple higher-level taxa, sometimes even kingdoms. It doesn't do a user much good to have to go on a merry chase through other references to disambiguate the term. OTOH, it can be time-consuming for a contributor to do so. If we save two users from such a merry chase at the cost to one of us doing it once, there is a net social gain. DCDuring (talk) 19:03, 3 September 2023 (UTC)

May I place a wikilink within a usex or quote when I believe it is helpful to the reader?

On the house entry, we have an extremely large collapsible listing all of the derived terms in alphabetical order. There are two phrases, on the house and the house always wins, that are specifically bound to sense 6, subsense 2, and make no sense in any other context. If there existed an inline version of the derived terms template, I would want to use that underneath s6:2 so that readers would know that it is specifically tied to this narrow definition. But so far as I know there is no such template, and if there were one, we would probably discourage widespread use, since it would take up space and merely duplicate terms that already appear below. We have collocations, but my understanding is that they are typically used for phrases which do not have their own entries and therefore cannot be links either. So perhaps the best way to help the reader is to use wikilinks within the use-examples we currently give. This is currently forbidden by our Manual of Style page, which specifically says that use-examples must

not contain wikilinks (the words should be easy enough to understand without additional lookup).

However, as I read it, this is intended to discourage introducing difficult, unrelated words into use-examples which would need wikilinks in order to be understood. That is, if my word were house, I would not do well to add a use-example such as

Next to the galamander was a small grey house.

where the unrelated word galamander both distracts the reader and tells them nothing about houses. By contrast, linking to the expressions on the house and the house always wins underneath the one specific sense of house that they are bound to seems like the best solution for this rare situation.

Ideally, if we can agree this exception to the policy is valid, I would like to see a small change to the Manual of Style to reflect this, rather than just tolerating a few exceptions here and there, so that this won't be a source of conflict in the future.

Best regards, Soap 12:54, 4 September 2023 (UTC)[reply]

Good luck with the wording. DCDuring (talk) 16:47, 4 September 2023 (UTC)

Do we need an inflection line when we have a conjugation box?

In a Beer Parlour discussion, it was pointed out that English doesn't use conjugation boxes that much, instead using the inflection line. However, we do use conjugation boxes for certain verbs, mostly those with archaic endings, like run. In these cases, do we really need the inflection line? The conjugation box provides much more information in these cases, including everything the inflection line shows. Why repeat ourselves? CitationsFreak (talk) 17:34, 4 September 2023 (UTC)

I think we should keep the inflection line both for consistency's sake (people will be looking there) and because it's much more convenient. The run page is a particularly long one, with the conjugation box only at the very end, which on smaller devices might be ten screens from the top of the page. But even if we were to move the conjugation box up top, I still think the inflections should stay in the header, because it's where people are more likely to look for them based on the patterns set by other entries. Soap 18:02, 4 September 2023 (UTC)
I was thinking of having the inflection line read "see conjugation box" with a link to it, for convenience's sake. CitationsFreak (talk) 18:09, 4 September 2023 (UTC)
@Soap Made a little mockup of how I think it should look at User:CitationsFreak/conjugate. Lemme know what you think.