Wiktionary:Beer parlour/2024/May


Arabic and Hebrew transliteration

Wiktionary currently transliterates the glottal stop in both Arabic and Hebrew as ʔ and the voiced pharyngeal fricative in both languages as ʕ. Would it be possible to correct these to respectively transliterate the glottal stop as ʾ and the voiced pharyngeal fricative as ʿ so they would be in line with Wiktionary's transliteration of other Semitic languages, which all use ʾ and ʿ?

Wiktionary also currently transliterates the Arabic voiceless velar fricative as ḵ. However, an alternate transliteration as ḫ is also used for this sound. Since ḫ is used for the transliteration of the voiceless velar fricative for most Semitic languages except for Hebrew and Aramaic, I would like to request that Wiktionary's transliteration of the Arabic voiceless velar fricative be changed from ḵ to ḫ as well. Antiquistik (talk) 13:08, 1 May 2024 (UTC)

@Antiquistik: No, we switched the other day. As for ḵ to ḫ, I don’t know; perhaps it’s better if you want to make an etymological statement that ḵ is fricativized k, which we keep in begadkefat-affected languages, while ḫ is organic. Fay Freak (talk) 18:08, 1 May 2024 (UTC)
@Fay Freak In this case, I will add my opposition to the discussion regarding that change.
Concerning ḵ to ḫ, should I make another request, or should I add it to this one itself? Antiquistik (talk) 18:55, 1 May 2024 (UTC)
@Antiquistik IMO the opposite change should happen and other Semitic languages should use ʔ and ʕ. The problem with the forward and backward quotes is that they're too small and too easily confused in many fonts. I also think ḵ is better than ḫ; ḫ is easily confused with the pharyngeal fricative. Benwing2 (talk) 23:49, 1 May 2024 (UTC)
Personally, I agree with Benwing, although I am sympathetic to the idea that we should use whatever is most widely used, and I am also sensitive to the issue of words being findable by people who search for them using other transliteration systems. I would like us to implement having the templates/modules produce (but then potentially set to be invisible / display:none by default) other common transliterations so the entries can be found if people use our site search or Google and search for ʾiʿlān etc, as discussed in the 2022 discussion, unless that would cause problems. Then we could probably also set different CSS classes for the different transliterations so people could select whether they see ʾiʿlān or ʔiʕlān, similar to the way people can choose to see or not see {{,}} (and we could debate which one would be most helpful to have on by default for the average lay reader). - -sche (discuss) 02:33, 2 May 2024 (UTC)
@-sche I think this is a good idea. AFAICT it would require some changes to Module:languages (which handles transliteration) so that a given transliteration method can return multiple transliterations rather than just one, each transliteration associated with properties such as CSS class, with one of them identified as "canonical" (meaning it is displayed while the others aren't). The only tricky thing here is manual transliterations; ideally, there would be a method to convert a manual transliteration in the canonical system into each of the other systems, so that users have to specify only one transliteration rather than multiple. In the examples here, that conversion isn't hard, but sometimes it may not be possible (e.g. the current Hebrew transliterations are based on modern Hebrew pronunciation, which has several mergers compared with Biblical Hebrew, so we couldn't convert modern to Biblical Hebrew transliterations). Benwing2 (talk) 02:45, 2 May 2024 (UTC)
@Benwing2: I believe that some of the existing manual transliteration entries may need to be reviewed in order to see whether their use was actually justified in the first place. Some of them are there only to work around various technical issues, which have ceased to exist. For example, this manually added transliteration for a Belarusian quotation became unnecessary after this fix. And I definitely support the idea of having multiple transliteration schemas, because this would allow introducing Belarusian Łacinka in addition to the current WT:BE TR scholarly transliteration. As @-sche mentioned, the primary motivation is that words should preferably be searchable via Google or via the search box from the Wiktionary front page. Belarusian entries currently solve the searchability problem via manually added "Alternative forms" sections with red links, but this isn't ideal. So the proposed improvement has uses even beyond Arabic and Hebrew. --Ssvb (talk) 16:41, 2 May 2024 (UTC)
Yes, I'm also in favour of having multiple transliteration schemes for this reason. Theknightwho (talk) 11:44, 8 May 2024 (UTC)
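The multi-scheme idea discussed above could be sketched roughly as follows. This is only an illustration in Python, not the actual Module:languages code (which is written in Lua). The class names `tr-ipa` and `tr-half-ring`, the function names, and the choice of the ʔ/ʕ/ḵ system as canonical are all assumptions made for the example; only the three sign correspondences named in this thread are mapped.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical table: IPA-style signs -> half-ring style signs.
# Code points are written as escapes so the table is unambiguous.
IPA_TO_HALF_RING = str.maketrans({
    "\u0294": "\u02be",  # ʔ -> ʾ (glottal stop)
    "\u0295": "\u02bf",  # ʕ -> ʿ (voiced pharyngeal fricative)
    "\u1e35": "\u1e2b",  # ḵ -> ḫ (voiceless velar fricative)
})


@dataclass(frozen=True)
class Transliteration:
    text: str
    css_class: str   # hypothetical class name used to show or hide a scheme
    canonical: bool  # True for the scheme displayed by default


def all_transliterations(canonical_text: str) -> List[Transliteration]:
    """Derive every supported scheme from one canonical transliteration."""
    return [
        Transliteration(canonical_text, "tr-ipa", True),
        Transliteration(
            canonical_text.translate(IPA_TO_HALF_RING), "tr-half-ring", False
        ),
    ]


def render(canonical_text: str) -> str:
    """Emit one span per scheme; non-canonical spans are hidden by default."""
    spans = []
    for tr in all_transliterations(canonical_text):
        style = "" if tr.canonical else ' style="display:none"'
        spans.append(f'<span class="{tr.css_class}"{style}>{tr.text}</span>')
    return "".join(spans)
```

Because the hidden span is still present in the page text, a search for the half-ring spelling could in principle find the entry. A one-way character table like this only works when the conversion is lossless, which, as noted above, is not always the case (for example, modern to Biblical Hebrew).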
@-sche This is a good proposal.
@Benwing2 I understand that ʔ and ʕ are more visible than the small half-rings, but I question how useful using them would be for the average reader since they are barely used in current transliteration schemes. If it hinders readers' ability to find these entries, we should avoid using them. Additionally, when is ḫ confused with the pharyngeal fricative? Antiquistik (talk) 05:42, 2 May 2024 (UTC)
@Antiquistik I'm not sure what you mean by "barely used in current transliteration schemes". Are you referring to transliteration schemes outside of Wiktionary? If so, why do you think the average reader will be familiar with them, but won't be familiar with IPA? As for using ḫ, my point is that this is easily confused with ḥ (the transliteration for the pharyngeal fricative), and having all three of h ḫ ḥ is going to make for endless confusion. Benwing2 (talk) 05:47, 2 May 2024 (UTC)
@Benwing2 While I don't think that the average reader will be more familiar with the IPA signs, I doubt that they will be searching Arabic terms with signs from the current standard transliteration schemes substituted by IPA signs that are rarely used for Arabic transliteration.
And, as pointed out by @Ssvb, the entries need to be searchable. Using the more widely employed transliteration is the better option for this.
As for the transliteration of /x/, I strongly disagree with your position. The transliterations for other Afroasiatic languages like Old South Arabian, Ugaritic and Ancient Egyptian use both ḫ and ḥ without any problem, and I don't see why the organic /x/ in Arabic should be represented through a character used for sounds affected by begadkefat. Antiquistik (talk) 11:19, 3 May 2024 (UTC)
@Antiquistik: Your premise of the signs being but used in IPA transcriptions before having been adopted by Wiktionary is wrong. We realized that there are lots of linguistic books, more or less traditionally Semitist, with them as their editorial choice for transcription. I have doomsurfed the philologies enough in the last 1½ decades to know that this is by far not so uncommon as to be stunting someone’s dictionary use. I also want to raise your attention towards pertinent languages without native writing system that can only be entered in an academic transcription, the Modern South Arabian languages, which have suffered some variations in transcription styles over the decades and native countries of researchers but I think are amenable as written down at أَيْدَع (ʔaydaʕ), whereas with all their diacritics the rings would strain the readers’ tempers. Fay Freak (talk) 11:37, 3 May 2024 (UTC)
@Fay Freak How prevalent are Arabic transliterations using the IPA signs compared to the half-rings? Antiquistik (talk) 13:06, 3 May 2024 (UTC)
@Antiquistik: No one, or at least not me, can do stats on such thing. There is also a qualitative difference in the kinds of resources that use them. In purely Semitist sources due to tradition the rings hold their ground. I have clicked around in my Semitics folder for you. I wanted to say that Leonid Kogan uses MODIFIER LETTER GLOTTAL STOP ˀ a lot, which is a bit more conspicuous and between the two extremes, but the second work by him I opened ({{R:tig:Kogan:2011}} after {{R:sem-pro:GC}}), goes the whole hog and uses ʔ for Arabic and the other Semitic languages. {{R:sqt:CSOL}} and {{R:sem-pro:SED}} use ˀ, anything published in the Journal of Semitic Studies such as doi: 10.1093/jss/fgt038 the rings, we may see it as a publisher decision, in more relaxed journal pieces he seems to prefer the IPA letters? In the old and long series Perspectives on Arabic linguistics you got the IPA letters all around. There is a lot of socialization behind letter choices, you just need to get used to them, but not lose aesthetic sense. University docents may teach something specific but there is a point where one shan’t believe other people. Younglings learn and adults function by imitation but science by organized skepticism, a dilemma.
The complicated part: I can hold you a lecture how it has to do with spatial-temporal memory, again the first chapter of the handbook of memory, ASD and the law I mentioned. Everything normal in the head, you guys tribally react to relations previously experienced with and from other people, in spite of the meatspace effecting the worst selection bias, contrary to universalism of science. You underestimate the psychological background behind all this. I did hardly positively respond to what teachers required or expected from me in terms of organizing a treatise, by some internal logics which aren’t strictly rationally evident, writing points of a paper in this and that order and not missing out a super-influential fashionable nonsense in the field I mean, which is detrimental to exams, and self-portrayal in job applications, however exquisitely able to judge the merits of the matter in isolation, and I am now very aware how strong feelings about signs come about, without sustaining them myself. We don’t just count voices together to let the loudest party win, this is not how creating good stuff works, only a working hypothesis. Fay Freak (talk) 14:09, 3 May 2024 (UTC)

Descendant tree design

Here's my idea for a horizontal tree style that could be generated by {{etymon}}. I've switched up the colour scheme, since this is a descendants tree rather than an etymology tree. We can also include question marks or labels just as in the etymology tree. Let me know what you think! @Vininn126, Equinox, Sławobóg, -sche, 0DF Ioaxxere (talk) 21:24, 1 May 2024 (UTC)

How would you represent borrowings and morphological reshaping in this format? Also I think I prefer Design 2, because in Design 1 the single right-branching node might be interpreted as somehow different from the below-branching nodes (and in addition, in Design 1 someone might e.g. interpret the juncture where Proto-Italic branches off as its own node, a daughter of PIE rather than just an artifact of the design). However, even better than either IMO would be one where the parent is centered vertically among all of its children rather than being at the top. Benwing2 (talk) 02:55, 2 May 2024 (UTC)
@Benwing2: Probably with the same label system that {{etymon}} already uses. I like your idea for centering the node, although for trees with a huge number of lines it might lead to the ultimate ancestor being far down the page. Possibly the ultimate ancestor could be given some kind of special status where it always goes at the top left of the page. Ioaxxere (talk) 05:31, 2 May 2024 (UTC)
I think Design 2 is also my preference, at least on desktop. Vininn126 (talk) 13:25, 3 May 2024 (UTC)
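The "parent centered vertically among its children" layout suggested above can be sketched as a small recursive placement routine. This is only an illustration in Python, not how {{etymon}} works; the function name, the `(label, children)` tuple format, and the sample tree (a simplified subset of the mockup, with assumed nesting) are all hypothetical.

```python
def layout(tree):
    """Return {label: (column, row)} for a horizontal tree.

    Leaves take consecutive integer rows from top to bottom; each parent
    is placed midway between its first and last child.
    """
    positions = {}
    next_leaf_row = 0

    def place(node, column):
        nonlocal next_leaf_row
        label, children = node
        if children:
            child_rows = [place(child, column + 1) for child in children]
            row = (child_rows[0] + child_rows[-1]) / 2
        else:
            row = float(next_leaf_row)
            next_leaf_row += 1
        positions[label] = (column, row)
        return row

    place(tree, 0)
    return positions


# Simplified sample tree; the nesting shown here is an assumption.
SAMPLE = ("Proto-Indo-European", [
    ("Proto-Germanic", [("English father", []), ("Scots faither", [])]),
    ("Proto-Italic", [("French père", [])]),
    ("Proto-Celtic", [("Manx ayr", [])]),
])

if __name__ == "__main__":
    for label, (column, row) in sorted(layout(SAMPLE).items(), key=lambda kv: kv[1]):
        print(f"col {column}, row {row}: {label}")
```

With this placement the root row grows with the number of leaves, which is the concern raised above: in a very tall tree the ultimate ancestor ends up far down the page unless it is pinned to the top as a special case.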

[Mockup figures: Design 1, Design 2, Design 3, Design 4, Design 5. All five designs show the same nodes and differ only in visual presentation. The nodes, in the order they appear:]

Proto-Indo-European *ph₂tḗr
Proto-Germanic *fadēr
Proto-West Germanic *fader
Old English fæder
Middle English fader
English father
Scots faither
English faeder
Proto-Italic *patēr
Latin pater
Old French pere
Middle French pere
French père
English père
English pater
Tok Pisin pater
Proto-Celtic *ɸatīr
Old Irish athair
Manx ayr
English ayr

At the risk of stating the obvious, only a small fraction of the descendants are being shown here. Is this focussed on English? Nicodene (talk) 21:49, 1 May 2024 (UTC)
@Nicodene: This is just a mockup. I created all the HTML by hand, but the full (automatically-generated) tree will have all the descendants. Ioaxxere (talk) 22:11, 1 May 2024 (UTC)
How would they all fit? Some of the ‘nodes’ have dozens of direct descendants. Nicodene (talk) 22:16, 1 May 2024 (UTC)
@Nicodene: The tree would be extremely tall in that case. Either way, it would still be significantly more readable than something like what we currently have at Reconstruction:Proto-Sino-Tibetan/s-la#Descendants. Ioaxxere (talk) 22:19, 1 May 2024 (UTC)
I have to agree with Nicodene. With etymology trees and the vertical format, it makes more sense to me because the tree will be much more compressed, but for descendants, I can't really see it working as well. It'll get really unwieldy, and fast. The list you've pointed to isn't good either, but I don't like replacing one problem with another one. Looking at the link you've sent, how would this interact with etymology-only languages or the situation with Chinese? AG202 (talk) 03:06, 2 May 2024 (UTC)
Etymology-only languages shouldn't be too difficult to handle in general. For Chinese, I feel like including dozens of dialectal pronunciations in Reconstruction:Proto-Sino-Tibetan/s-la is excessive and we should reduce that to only those forms which were borrowed into other languages. It's also possible that descendants trees will end up having less automation than etymology trees in general. Ioaxxere (talk) 05:31, 2 May 2024 (UTC)
One thing that needs to be addressed is alternative forms. In Middle English, there are loads of them for everything. They can't always be ignored, because there are enough cases like catch and chase from Old French: chacier, chacer; cachier, flour and flower from Middle English: flour, fflour, fflowr, fleur, flor, floure, flower, flowr, flowre, flowyr, flur or even morrow and morn from Middle English: morwe, morewe, morowe, morow, morrou, morue, morw, morȝe, morewen, morowen, morȝen, morwen, morwyn, morwhen, morwoun, morun, moron, moryn, morn; morgen, marhen, mareȝen, morghen, moruwe, where different alternative forms have different descendants. Chuck Entz (talk) 18:31, 3 May 2024 (UTC)
Love it. After a quick glance at the HTML, is the only difference alignment? I think that since this could appear early on in a number of entries that have right-floating tables of contents, I think left-alignment makes the most sense to avoid some of the inevitable bunching. —Justin (koavf)TCM 22:14, 1 May 2024 (UTC)Reply
@Koavf: No, the difference is whether there are connectors on the bottom of the boxes. I have no idea why the alignment is different, actually... Ioaxxere (talk) 22:16, 1 May 2024 (UTC)
Ah, I see that now. —Justin (koavf)TCM 22:17, 1 May 2024 (UTC)Reply

get rid of noun and adjective plural form categories once and for all

There appears to be consensus established here, here and here, as well as in this diff, to not categorize noun and adjective non-lemma forms in separate 'noun plural forms' and 'adjective plural forms' categories. Yet when I made such a change for newly added Chadian Arabic terms, my favorite editor User:Fenakhay went on a revert spree. By longstanding consensus, we do not in general categorize non-lemma forms as e.g. Category:Russian noun prepositional case forms etc., so I don't see why an exception needs to be made for noun plural forms. However, I'd like to get clear consensus here to remove all such categories and delete the entries from Module:category tree/poscatboiler/data/non-lemma forms that allow such categories to be recognized. We have already done this for some languages; for example, there is intentionally no Category:English noun plural forms, and that page is protected against re-creation by bots or non-admins.

The alternative is to outline a clear rationale for why we need such categories and a rule for which situations they are allowed and which situations they aren't allowed. Either way, the current haphazard situation, where some languages have such categories and some don't, and the categories are incomplete, is unmaintainable.

Benwing2 (talk) 23:45, 1 May 2024 (UTC)

And a stronger consensus at Wiktionary:Requests for deletion/Others#Category:Adjective plural forms by language. It seems that Fenakhay is the only editor who supports the retention of these categories. Consensus is against them. This, that and the other (talk) 02:55, 2 May 2024 (UTC)
I support getting rid - trivial category intersections like this are a waste of time. Theknightwho (talk) 03:24, 2 May 2024 (UTC)
I don't see any rationale for this kind of category either and so am in favour of deleting them. Nicodene (talk) 14:13, 2 May 2024 (UTC)
I agree as well. Ioaxxere (talk) 17:57, 2 May 2024 (UTC)
Support deleting these. Ultimateria (talk) 17:21, 6 May 2024 (UTC)
If we have this kind of thing, it should be with a clear rationale for when/where and why (as Benwing says) and it should be added automatically, probably by whatever headword- or definition-line templates we're using to declare something as a noun plural form, paucal form, etc in the first place; I say this because as far as I saw in the prior RFDs, the categories were populated haphazardly and manually with handfuls of entries, which is not useful. The usefulness of categorizing non-lemma forms by their specific non-lemma-ness seems small (though not nonexistent) to me; I suppose if I wanted to know what kinds of endings Foobarian noun plural forms had, a category would be useful, but the array of endings which Foobarian noun plural forms have could alternatively be mentioned on the About Foobar page, or on the Foobarian equivalent of Appendix:English grammar. Can anyone articulate something these categories would be useful for? (Absent that, I have no objection to deleting them, and indeed voted to do so in some of the prior RFDs.) - -sche (discuss) 19:21, 2 May 2024 (UTC)
Personally, I find these categories very useful from a navigational standpoint, so I'd like to see them kept. That said, they should be added automatically as part of templates like {{infl of}} and {{plural of}}, not added manually by users. Binarystep (talk) 11:26, 5 May 2024 (UTC)
@Binarystep Do you realize this is simply an intersection category? In general we don't usually include intersection categories because you can search for any combination using the Search feature. In this case, e.g. to do the equivalent of CAT:Chadian Arabic noun plural forms, you can search for the combination of category CAT:Chadian Arabic noun forms and template Template:plural of. Adding them automatically using templates like {{infl of}} and {{plural of}} has already been tried, but it turns out to be difficult from a programmatic standpoint in some cases and a maintenance headache, which is the reason I want them removed. Benwing2 (talk) 20:02, 5 May 2024 (UTC)
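The category-plus-template search described above can be written as a single search string using the `incategory:` and `hastemplate:` search keywords. A minimal Python sketch that builds such a string and a matching search URL follows; the helper names are hypothetical, and the URL form is an assumption based on the usual MediaWiki `index.php?search=` pattern.

```python
from urllib.parse import urlencode


def intersection_query(category: str, template: str) -> str:
    """Search string for pages in `category` that transclude `template`."""
    return f'incategory:"{category}" hastemplate:"{template}"'


def search_url(query: str) -> str:
    """Wrap a search string in a site search URL (assumed URL pattern)."""
    return "https://en.wiktionary.org/w/index.php?" + urlencode({"search": query})


if __name__ == "__main__":
    q = intersection_query("Chadian Arabic noun forms", "plural of")
    print(q)
    print(search_url(q))
```

The search runs on demand, so nothing has to be kept in sync by templates or bots, which is the maintenance argument made above.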

template similar to Template:alt or Template:desc for Derived terms, Related terms, etc.?

Hi. User:Fay Freak and I have been having a discussion about using {{alt}} or {{desc}}, or creating a similar template, for Derived terms and the like. This came up because Fay Freak has been using {{desc|nolb=1}} in Derived terms sections. (Note: |nolb=1 disables the language name at the beginning. FF proposes renaming |nolb= to |nolang= to avoid confusion with |lb= for labels and because what's being suppressed is a language name, not a label.) Both {{alt}} and {{desc}} let you specify a series of terms along with per-term properties plus overall labels for the whole set of terms, although the syntax of the two templates is different and {{desc}} has some extra features specific to descendants. Note that we also have {{syn}}, {{ant}}, etc. for inline synonyms/antonyms/etc., which likewise have support for specifying a series of terms with both per-term properties and overall labels. The current syntax for Derived terms, Related terms and such involves manually listing each term with {{l}} and using {{q}} to add qualifiers as needed, but compared with {{alt}} and {{desc}} this is both more cumbersome and less standardized, meaning that different people format things differently. I think we ought to have some way for Derived terms sections and the like of specifying a list of terms plus labels, similar to {{alt}} and {{desc}}. The question is, should we just reuse e.g. {{alt}} for this purpose, or create another template? (If the latter, I'd maybe call it {{terms}}.) Potentially we could rename {{alt}} to {{terms}} or something similarly generic and keep {{alt}} as an alias, since there isn't really anything about {{alt}} that is specific to Alternative forms.

I'm omitting mention of {{col3}} and the like; while these are useful especially for long lists of similar terms, they don't provide the ability to specify a set of labels at the end of the list of terms, as {{alt}} and {{desc}} do.

Benwing2 (talk) 05:22, 2 May 2024 (UTC)

That'd be quite nice. All I have to add is that it'd help to have the option to split derived terms into columns or put them in collapsible boxes, as people have been doing with a variety of other templates (cf. cado). Nicodene (talk) 14:01, 2 May 2024 (UTC)
I think we'd be able to scrape this to be honest. All it'd need is an etymology section for most terms... Vininn126 (talk) 16:20, 5 May 2024 (UTC)
@Vininn126 I don't quite understand what you mean, can you clarify? Benwing2 (talk) 20:03, 5 May 2024 (UTC)
Sorry, misinterpreted. Not sure I have a strong opinion. Vininn126 (talk) 07:17, 6 May 2024 (UTC)

Plurals on head lines and declension tables

Is there any point in having both plurals on the head line and a declension table showing the plural for a noun lemma? I would be inclined to omit the plural(s) when there is a declension table. --RichardW57m (talk) 16:36, 2 May 2024 (UTC)

@RichardW57m it would perhaps help to specify which language you're thinking of and give an example. This, that and the other (talk) 03:07, 6 May 2024 (UTC)
The specific language where this has come up is Lithuanian, avìdė, which currently only displays the plural through the declension table. A similar specific is with the Lithuanian adjective headword template, where until recently many ordinals' neuter form was wrong and contradicted the following declension table. --RichardW57m (talk) 11:27, 7 May 2024 (UTC)
IMO it depends on how regular the inflections in question are. If they serve as something like principal parts, I think it's useful to put them on the headword line as well as in the declension table, because then someone with some familiarity with the language will know how to inflect the term without needing to look through the whole declension table to figure out what the most important parts are. This is similar to how we list the past historic and past participle for Italian verbs. OTOH if they are largely predictable, putting them in the headword line is less useful. Benwing2 (talk) 23:27, 8 May 2024 (UTC)

A way to more easily connect with readers: a follow-up

Following Wiktionary:Beer_parlour/2024/March#A_way_to_more_easily_connect_with_readers, I wrote to WMF in an attempt to figure out how to best resolve this issue. @Johan Jönsson replied and has given us an option, I think. He suggests we create a new mailing list for admins and for us to put enwiktionary in the name somehow. What do people think of this solution? Vininn126 (talk) 16:03, 3 May 2024 (UTC)

Support Ioaxxere (talk) 16:56, 3 May 2024 (UTC)
Support This, that and the other (talk) 08:30, 5 May 2024 (UTC)
Support Binarystep (talk) 12:45, 5 May 2024 (UTC)
Support Thadh (talk) 11:09, 6 May 2024 (UTC)

Volga Türki language

Greetings, I'd like to propose giving Volga Türki an L2.

It is a significant member of the Middle Turkic literary languages, and is as important as Ottoman Turkish, Chagatai and Karakhanid, all of which already have their own Wiktionary categories: Category:Ottoman Turkish language, Category:Chagatai language, Category:Karakhanid language. Volga Türki is considered a descendant of Karakhanid, together with Chagatai, however they all are roughly contemporary.

It was in wide use in the Volga-Ural region from the 15th century (or from the 12th century, if the poem Qissa-i Yosof by Qul Ghali is included) until the adoption of Cyrillic and Latin scripts for the Tatar and Bashkir languages under Soviet rule. Even though the written languages for Tatar and Bashkir started to diverge slightly from Volga Türki before Soviet rule, in the late 18th and early 19th centuries, it remained a common standard for international affairs, especially between other Turkic groups.

Its addition would not only help with etymological sections, but also help connect the cognates with other Turkic languages, similarly to other Middle Turkic literary languages' sections.

As for Unicode characters, numerals and readings, I already have prepared all of this, and will work on adding them as soon as the category is created. The sources of lemmas are going to be taken from books, dictionaries and other written resources from that time period. I will try to list a source for each lemma whenever possible.

The only issue, however, is that the language does not have its own ISO 639-2 code yet. I propose one of the following codes to be used for the language: iut (for İdil-Ural Turkic); tui (Turkic of İdil-Ural). I deprecate codes like vut (Volga-Ural Turkic) and ott (Old Tatar) firstly due to the name Volga not being used by the locals, especially during the era of Volga Türki, and secondly due to the name Volga/İdil/İdel Türki being neutral, and Old Tatar primarily referring to the diverged variant of Volga Türki that was used specifically for Tatar. Bababashqort (talk) 16:06, 3 May 2024 (UTC)

What is the Volga Turki corpus and how accessible is it? Qissa-i Yosof poem by Qul Ghali should definitely not be included, as it is covered by Khorezmian Turkic [1]. Allahverdi Verdizade (talk) 20:40, 3 May 2024 (UTC)
Support BurakD53 (talk) 17:57, 4 May 2024 (UTC)
Its corpus mostly isn't digitised, but practically all Bashkir and Tatar literature from at least the 16th century until the late 19th century is written in Volga Türki. The books, manuscripts and magazines are still preserved in a lot of libraries in Tatarstan and Bashkortostan. As for Qissa-i Yusuf, that is somewhat debatable, but given the timeframe it probably suits Khorezmian, as one of the ancestors of Volga Türki. Bababashqort (talk) 07:24, 5 May 2024 (UTC)
@Bababashqort: for the last issue, we generally make up our own codes using the code for the group it belongs to (probably "trk") followed by a hyphen ("-") followed by some sequence of letters that's not already in use by us. That way there's no chance of our code conflicting with an ISO code. Since this is strictly for internal use and our modules and CSS/JS code convert everything for browsers, we don't have to use existing ISO codes. Chuck Entz (talk) 18:24, 4 May 2024 (UTC)
Yes, I've been told that wiki uses a placeholder, but didn't exactly know how it worked. Thank you for explaining!
In this case I'd suggest trk-iut Bababashqort (talk) 07:25, 5 May 2024 (UTC)
@Bababashqort We try to use the first three letters of the lect in the second part of names like this. What do you think of trk-idi or trk-vol? Benwing2 (talk) 08:04, 5 May 2024 (UTC)
trk-idi includes only the Volga part, as does trk-vol. The name itself, however, is taken from the most widespread naming of the language, which unfortunately is shortened to Volga Türki, omitting Ural. And speaking of İdil, it is actually spelled as İdel in Tatar itself, İdil is just more Common Turkic. Therefore the only solution seems to be trk-iut, it's not that hard to deduce I think. Bababashqort (talk) 11:54, 5 May 2024 (UTC)
@Allahverdi Verdizade suggested making a Turki category instead, which I'd very much prefer. It would remove the need to add more distinct subvariants of it, such as North Caucasian Turki, Nogay Turki and others. This would also allow using the derivation template for all languages that used it: Crimean Tatar, Kumyk, Nogay, Bashkir and others. Bababashqort (talk) 13:21, 5 May 2024 (UTC)
@Bababashqort Sure, that works. What language is this a category of? Benwing2 (talk) 19:56, 5 May 2024 (UTC)
I think he meant he wants Türki as a language code, not specifically Volga Türki Bortkastningskonto (talk) 07:01, 6 May 2024 (UTC)
@Bortkastningskonto @Bababashqort OK, I need more information then. Is "Türki" supposed to be an L2 language? This is an awfully generic name for a language, and I would likely oppose this name for this reason. And I will repeat my assertion that the code for Volga Türki should be 'trk-vol' in keeping with the name. The code should reflect the first three letters of the lect name barring extraordinary circumstances (usually due to ambiguity when there are multiple lects sharing the first three letters, which is not an issue here). @Allahverdi Verdizade can you weigh in here? I am not qualified enough to tell whether this should be an L2 language, an etym-only language or just a label of some other language (the last two being rather similar). Benwing2 (talk) 07:11, 6 May 2024 (UTC)
I didn't actually suggest making Türki a L2, rather I wondered whether it wouldn't be better to do so depending on how different Volga Türki is from, say, North Caucasian Türki. I can't answer that question myself, and I think, in general, very few people can give a well-informed opinion on that. Reading this book on North Caucasian Turki (in Russian) might help a little. Considering that Bababashqort is likely only going to work with sources written in the Volga variety, maybe it is safest to create a Volga Turki L2, in which case you would circumvent the problem with "awfully generic name". Documents in North Caucasian Turki are terribly inaccessible (not digitized or normalized), so I don't think anyone is going to work with them.
In any case, there is also the problem of classifying "literary languages" and fitting them into genealogical tree schemes. It is often said that this or that language "is mostly X, but also incorporates elements of Y", at the same time as it "continues the literary tradition of Z". I can't exactly tell you what it means that "Volga Turki continues the tradition of Khorezmian Turki", which in turn "continues the tradition of Karakhanid", as it oftentimes is put in Russian books on the matter. Too much arbitrariness for my taste. So my opinion is that these "literary languages" maybe should not have ancestors and descendants. Allahverdi Verdizade (talk) 17:35, 7 May 2024 (UTC)
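The code-naming convention described earlier in this thread (family code, a hyphen, then the first three letters of the lect name, provided the result is not already in use) can be sketched as below. This is an illustration only; the function name and the `taken` parameter are hypothetical, and stripping diacritics before taking the first three letters is an assumption rather than a stated rule.

```python
import unicodedata


def propose_code(family: str, lect: str, taken=frozenset()) -> str:
    """Propose a code of the form <family>-<first three letters of lect>."""
    # Decompose accented letters, then keep only plain ASCII letters.
    decomposed = unicodedata.normalize("NFKD", lect)
    letters = "".join(ch for ch in decomposed if ch.isascii() and ch.isalpha())
    code = f"{family}-{letters[:3].lower()}"
    if code in taken:
        # The thread notes that ambiguity is the usual reason to deviate.
        raise ValueError(f"{code} is already in use; choose other letters")
    return code


if __name__ == "__main__":
    print(propose_code("trk", "Volga Türki"))
    print(propose_code("trk", "İdil"))
```

A code such as trk-iut, built from the initials of a longer name, falls outside this rule and would have to be chosen by hand.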

Request for a new language

Yet again, I request for Old Lombard to be listed separately, as for now Old Lombard is listed as a dialect and not a language. That Northern Irish Historian (talk) 17:30, 4 May 2024 (UTC)Reply

I notice that Old Italian is currently an etym-only variant of Italian. Why can't Old Lombard be the same? How different are Lombard and Old Lombard? Benwing2 (talk) 18:59, 4 May 2024 (UTC)Reply
Old Lombard:
  • Faremo preg a Deo a Questi cominzament
  • et a la soa mather ke preg l’omnipotent.
  • Ke n’des a dir et a far tute l so placiment
  • Ço ked is la scritura si se conven a dir
  • De la pasin de Christ a ki ne plas hodir
  • La qual per nu katif je plase sostegnir
  • Bene questi paroli de panzer e da stremir
  • Qui longa fis e dis del pasio del fy de la rayna.
  • La qual si m’dia gratia et a mi sia vesina
  • Ke parlo dritament de la pasion divina
  • St’apreso si me scampo da la infernal pena.
Modern Lombard:
  • Ambiaróm con ‘na preghiéra a Dio
  • e a sò madèr che la préghes l’Onipotent
  • Che nómes a dì e a fa töt de so gradimènt
  • E per bontà sò el vègnes a compimènt
  • Chèl che la dis la Scritüra isé come l’è giöst a dìl
  • De la pasiù de Cristo a chi che öl sintìl
  • Pasiù che per notèr pecadùr la sèrf a soportà
  • Con rasegnasiù chèste parole de pianzer e dè dulùr
  • Ché se parla e se dìs del fiöl de la regina
  • Che la me dàghes gràsia e la me stàghes vizìna
  • ‘Ntat che parle drit de la pasiù divina
  • Semài che scamparó de la pena infernal.
That Northern Irish Historian (talk) 22:35, 8 May 2024 (UTC)Reply
@That Northern Irish Historian That's not what I was looking for; you have pasted in two different translations which naturally will be different. If you try to match up the corresponding words, they are IMO marginally different enough to maybe be considered different L2's (although they differ less e.g. than the current Occitan dialects). I notice however that there are 0 lemmas currently listed as Old Lombard; are you actually planning on adding some? Benwing2 (talk) 23:16, 8 May 2024 (UTC)Reply
Yes, but see zinqui, Jesu, and other pages. It is not working. That Northern Irish Historian (talk) 23:24, 8 May 2024 (UTC)Reply

That's how we enter these words. If you have any objections, please write here. BurakD53 (talk) 14:29, 5 May 2024 (UTC)Reply

lol. Yes, I have objections. Allahverdi Verdizade (talk) 16:11, 5 May 2024 (UTC)Reply
As I said before, I want the {{trk-ogz-pro}} code to be removed and replaced with {{trk-ogz}}. Since we have already reconstructed them all under the {{trk-pro}} pages, Proto-Oghuz is quite unnecessary. If anyone still wants to reconstruct Proto-Oghuz, you can reconstruct it using the * sign on the Oghuz page. (Which is quite unnecessary) Likewise, {{trk-klj}} can also refer to the Arghu language, but the data in this language consists of a few words. {{trk-ogz}} is the direct ancestor of all Oghuz languages, in short, it is the same as Proto-Oghuz {{trk-ogz-pro}}. However, we cannot enter these Oghuz or Proto Oghuz words recorded in the Diwan into the site as entries. It requires reconstruction in order to be entered to us. However, these Proto Oghuz words, also Proto Khalaj words, are not a reconstruction. I think that both of them should be entered as input on the site, the biggest reason is that these languages cannot be assumed to be dialects of other languages. But since the Arghu language consists of only a few words, it can be entered under the name Proto. Oghuz language is mentioned many times in the Diwan and even information about its grammar is given. A few Proto Khalaj, i.e. Arghu, words may be added as exceptions. But since this is the case for Oghuz, there is no need to create a language code called Proto-Oghuz. This is my opinion. I firmly reject the addition of these Oghuz words to Old Anatolian Turkish. Not every word mentioned in the Diwan has been witnessed in Old Anatolian Turkish, and the place where Kashgarî shows the Oghuzs on the map in the period he mentions is not Iran, but Central Asia. Also the words here are more archaic than the form in which they are found in Old Anatolian Turkish. BurakD53 (talk) 18:22, 5 May 2024 (UTC)Reply

Lemma categories

Discussion moved from WT:Beer parlour/2024/April#Lemma categories.

I've been cleaning up Special:UncategorizedPages, and I've run across a number where @Nicodene has disabled categorization for alternative forms. My understanding is that all mainspace entries should be in either Category:[Language] lemmas or Category:[Language] non-lemma forms. While an alternative form is supposed to be a stub that links to the main form, as far as the categories are concerned, it's a lemma. It's certainly not a non-lemma form, because it has its own non-lemma forms. Leaving it out of both categories raises the question of why we have the entry at all, if we feel we need to hide it: if we don't link to it in the main entry, there's no way to navigate to it.

This has come up before over the years, and we've more than once decided to do it this way. As far as I can tell, Nicodene is the only editor who's doing otherwise. Has anything changed? Chuck Entz (talk) 03:13, 6 May 2024 (UTC)Reply

Why should Category:Franco-Provençal lemmas be clogged with twelve different renditions of ôtro, seventeen of ôtrament, and ten of solament? Why should Category:Old French lemmas (not to mention Category:Old French adverbs) be clogged with two hundred seventy one renditions of iluec? The whole point of a lemma is to provide a citation form to cover the variants. That is how altforms and altspellings are handled by the vast majority of dictionaries. Nicodene (talk) 03:23, 6 May 2024 (UTC)Reply
I'm of two minds here. Yes, we generally include alternative spellings and forms as lemmas; otherwise, for example, we'd end up including only one of oxidi{s,z}e as a lemma, and the other would go nowhere. At the same time, however, including 171 alt variants of iluec seems like serious overkill. Maybe we need a separate policy for non-standardized languages vs. standardized ones. Benwing2 (talk) 07:16, 6 May 2024 (UTC)Reply
At a minimum, every entry should be in some category. As far as how that's been accomplished up to now, my understanding matches Chuck's, that every entry is supposed to be categorized as either a lemma or a nonlemma (or both) and that alternatively-spelled nouns are still nouns (and lemmas, from the category / grammatical perspective). We could change that, e.g. add a parameter which, instead of turning categorization off, moves the entries from "Category:Foobarian nouns" to at least a POS-agnostic catchall "Category:Foobarian alternative forms and spellings", or something more specific like "Category:Foobarian alternative forms and spellings of nouns", "Category:Foobarian alternative forms and spellings of lemmas", but I do think we should continue to regard a completely uncategorized entry—an entry that cannot be accessed from any part of our category tree—as a problem.
There was support for not putting just any alternative spelling into topical categories in this 2022 discussion, but that didn't leave the entries categoryless.
FWIW, the issue of terms having tons of spellings isn't strictly limited to overall-nonstandardized languages, e.g. English has lots of spellings of kinnikinnick, Muhammad, voivode... but I think Benwing's suggestion of handling this on a per-language basis (and just accepting that the English categories will have a few cases like Muhammad where there are a bunch of spellings) is probably more workable than e.g. trying to decide (in a way that can be maintained over time with any consistency) on a per-spelling basis what counts, in a mostly-standardized (but standards-body-less "ungoverned") language like English, as a "standard" spelling. (E.g., several of the alternative spellings of Muhammad are used mainly in scholarly works, so dismissing them as nonstandard seems hard; and in the other direction, for a largely dialectal word, determining why any one spelling should be considered more standard than another seems hard.) - -sche (discuss) 13:54, 6 May 2024 (UTC)Reply
We could change that, e.g. add a parameter which, instead of turning categorization off, moves the entries from "Category:Foobarian nouns" to at least a POS-agnostic catchall "Category:Foobarian alternative forms and spellings"
I would be quite happy to use that if it were available as an option.
My main concern is keeping the categories clear and usable. When I look up 'Foobarian feminine nouns', for instance, I'd rather not have to wade through 5–10 (+) duplicates for every distinct noun. That is a serious headache with languages like Franco-Provençal or Romansch. Nicodene (talk) 07:54, 7 May 2024 (UTC)Reply
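The option discussed above, where a parameter moves alternative forms into a catch-all category instead of switching categorization off, could look roughly like this. It is a sketch under stated assumptions: `categories_for` and its `is_alt_form` flag are hypothetical and do not correspond to any existing template parameter; the category names are the ones proposed in this thread.

```python
def categories_for(language: str, pos_plural: str, is_alt_form: bool) -> list[str]:
    """Return the categories an entry would be placed in.

    Assumption: alternative forms go to one POS-agnostic catch-all category,
    so they never end up completely uncategorized.
    """
    if is_alt_form:
        return [f"Category:{language} alternative forms and spellings"]
    return [f"Category:{language} lemmas", f"Category:{language} {pos_plural}"]


# A main entry keeps its lemma and part-of-speech categories.
print(categories_for("Foobarian", "nouns", is_alt_form=False))
# An alternative form stays reachable from the category tree without
# adding duplicates to "Foobarian nouns".
print(categories_for("Foobarian", "nouns", is_alt_form=True))
```

Either branch returns a non-empty list, which is the property asked for above: no entry is left outside the category tree.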

Edit with "username removed"

This edit has the user name removed. How can one see (if not who the user is), which user removed it and why? [2] Equinox 09:33, 6 May 2024 (UTC)Reply

I removed it because it was an accidental IP/logged-out edit by an editor (the same as did a similar change to unrapable). — SURJECTION / T / C / L / 10:17, 6 May 2024 (UTC)Reply
I'm officially saying: don't do that. You can revert, delete, but do not wipe content unless it's real serious stuff like child porn. Thank you. Equinox 23:01, 7 May 2024 (UTC)Reply
Re how to see which admin performed the revdel: it's technically in the "View logs for this page" link on the edit history page, [3]. If there were a lot of revdels and they did not follow so closely after the time the edits themselves were made, e.g. if I now went to the page and hid a revision from two months ago, and then Surjection hid a revision from one month ago as well as your edit just now, it might be hard for non-admins [who don't have "diff" links] to discern from that log who hid which thing... I guess in that case they'd just have to say "hey, who revdel'd X" and admins could check. - -sche (discuss) 14:28, 6 May 2024 (UTC)Reply
@Surjection: I want you to understand how it looked to me: I saw that someone had made an edit, they had no name, I couldn't see them, or talk to them, or discuss, it was like a GHOST DID IT. And I couldn't see who removed their name either. If you ever spent time on WP:OFFICE then ...well. Equinox 22:50, 7 May 2024 (UTC)Reply
I would, personally, be happy to see text like "edit made by a user whose name is hidden by this admin: Surjection". What I think is wrong and bad and goes against our free openness is just that MYSTERY NO-NAME. Equinox 22:51, 7 May 2024 (UTC)Reply
Side point: I know Chuck Entz (for example) likes to "clean the graffiti wall" so that vandals can't see their names. But I don't like that. The wiki should be a public space and we should only hide the history in real serious situations like "doxxing" (real name-addresses) or... am I wrong? @-sche @Chuck Entz @Surjection (and even worse, are there Wikipedia rules we are supposed to obey as children.) Equinox 22:55, 7 May 2024 (UTC)Reply
AFAIK it's global WMF policy to suppress this kind of thing (the IP addresses of users who've accidentally edited logged out), and indeed to suppress it way harder than a mere revision-deletion like Surjection did: "oversighters" have (or had?) database access to delete the information so hard that not even admins can see it. (But it also takes time to contact them, so it's fine for admins to revdel it in the meantime, like this.) This is precisely because of doxxing concerns, because many IP addresses identify the person's real address. (Other IP addresses, of course, merely send you to that one farm in Kansas.) If you ever see an edit where you think the content of the edit is wrong, just undo the edit... as you saw in this case, the username being suppressed doesn't prevent you from undoing the edit. - -sche (discuss) 01:39, 8 May 2024 (UTC)Reply
Would there be an issue if contributors were to hide their IP address with their screen names after, say, a week? CitationsFreak (talk) 03:46, 8 May 2024 (UTC)Reply
I should clarify that AFAIK such hiding only happens when someone requests it—usually the person who made the edit, though plausibly someone else who simply noticed what was going on. Last I heard, WMF folks were trying to roll out something that automatically obfuscates all IP addresses by making them show up in edit histories as e.g. incrementing numbers that change periodically or on request (so anytime someone thinks their current [non-]IP is getting too much attention from admins, they can hit "refresh" and start doing vandalism under a new identity, just like logged-in users can by creating multiple accounts), which will probably remove the need to do this in the future, if it gets implemented. - -sche (discuss) 05:12, 8 May 2024 (UTC)Reply
Would there be an issue if contributors could request that their IP address be hidden by their screen name? CitationsFreak (talk) 05:42, 8 May 2024 (UTC)Reply

The issue of Old Kashubian (Old Pomeranian?)

I came to a recent realization about the {{R:zlw-opl:SPJSP|Old Polish dictionary}}: it contains texts from Pomerania with Pomeranian features, as it was made during a time when Kashubian was considered a dialect of Polish. However, typologically, this is very, very wrong. Pomeranian is considered North Lechitic, and anything "Polish" (Masovian, Upper Polish, Lower Polish, and Silesian) is considered East Lechitic; therefore anything Old Kashubian should not be considered Old Polish. I propose a split; I intend to add the location of creation for any Old Polish documents anyway for a future dialectal project (for Old Polish this means somehow categorizing location of attestation by dialect) and separating any texts from Pomerania as "Old Pomeranian" with a code zlw-opm, or perhaps "Old Kashubian" zlw-ocb, with Kashubian and Slovincian as the children. These codes seem clunky to me and I am open to others. I have also corroborated this by emailing the editors of the Old Polish dictionary, who have told me that it indeed is "Old Kashubian", which they accept in their framework of Old Polish. Gorazd also holds the same view. @Thadh @Sławobóg @Rakso43243 @Benwing2 @Mahagaja @Silmethule. Vininn126 (talk) 10:50, 6 May 2024 (UTC)Reply

An alternative: if we accept Kashubian and Slovincian as the descendants of Old Pomeranian, we could set them both to be descendants of Old Polish. The argument for this is that one could accept "Old Kashubian" as a constituent of Old Polish: not a dialect, but a constituent. This is what the editor of the Old Polish dictionary told me, quote: "Nie napisałam, że to dialekt. Napisałam, że to element składowy języka staropolskiego. To duża różnica. Język starokaszubski to element składowy języka staropolskiego." ("I did not write that it is a dialect. I wrote that it is a constituent element of the Old Polish language. That is a big difference. The Old Kashubian language is a constituent element of the Old Polish language.") Another alternative is to ignore this, which seems wrong to me as well. Vininn126 (talk) 11:42, 6 May 2024 (UTC)Reply
Another solution: give Old Kashubian an etycode and make it an alt of Old Polish and if a term is attested in Pomerania, we could set the Kashubian and Slovincian reflexes as inherited from that? Otherwise directly from Proto-Slavic. Vininn126 (talk) 14:36, 6 May 2024 (UTC)Reply
@Vininn126 I think this last solution is maybe the best. This is similar to what is done with Old Northern French, which is considered an etym-only variety of Old French even though Old French as normally construed refers to the Old French of the Paris area whereas Old Northern French refers to the Old French of Normandy, and neither is an ancestor or descendant of the other. The two differ significantly in phonology, e.g. Old French chacier /tʃatsiɛr/ -> English "chase" vs. Old Northern French cachier /katʃiɛr/ -> English "catch". Anglo-Norman and modern Norman are both descendants of Old Northern French (although we currently list Norman as a descendant of Middle French, which is wrong) and modern French is a descendant of Old French per se. Benwing2 (talk) 18:38, 6 May 2024 (UTC)Reply
I know @Silmethule also mentioned a similar situation with Ancient and Mycenaean Greek and also Old Norse and Swedish/Icelandic. See also my question on WT:About Old Polish. Related to that, I'm unsure how to handle labels for all of this. I think we'd want to list Kashubian/Slovincian in the Old Polish entries if and only if a text from Pomerania has an attestation. And any Kashubian/Slovincian words should still have "inherited from Old Kashubian/Pomeranian". Vininn126 (talk) 18:49, 6 May 2024 (UTC)Reply
@Nicodene As our resident Romance expert, do you agree with changing the ancestor of Norman to be Old Northern French instead of Middle French? This will cause the 5 terms in CAT:Norman terms inherited from Middle French to throw errors, I think. Can you fix up those 5 terms? Also I notice there are 30 terms in CAT:Norman terms inherited from Medieval Latin, which seems impossible and probably needs to be cleaned up. Benwing2 (talk) 19:54, 6 May 2024 (UTC)Reply
I've just cleared out the categories in question. Agreed on removing Middle French as an ancestor of Norman. As for its further ancestor, I would leave it as just Old French, which includes ONF as-is. I think the latter are best treated as one overall language.
I've been meaning to eliminate '[Romance] terms inherited from Medieval Latin' in general, reassigning them to '...inherited from Early Medieval Latin' or '...borrowed from [later] Medieval Latin'. That will take some time. When it's done, perhaps we can make {{inh|romance language|ML.|...}} throw an error message and a brief comment. Nicodene (talk) 00:50, 7 May 2024 (UTC)Reply
@Nicodene Thanks! I think the basic advantage of setting the ancestor of Norman to be Old Northern French is it more clearly shows the ancestry (when you go CAT:Norman language and look at the Ancestors panel) than just setting it to Old French. Since Old Northern French is an etym-only variant of Old French, I don't think it will make any difference in terms of what Norman terms are allowed to inherit from. What do you think? Benwing2 (talk) 01:44, 7 May 2024 (UTC)Reply
Oh, so setting it to ONF won't disallow inheritance from Old French. In that case it sounds fine to me. Nicodene (talk) 01:50, 7 May 2024 (UTC)Reply
Yeah that's right. Benwing2 (talk) 01:56, 7 May 2024 (UTC)Reply
I wouldn't like this. This is almost akin to handling Old East Slavic as an Old Church Slavonic variety. Pomeranian and Polish are two distinct branches, and the fact that an earlier stage was highly influenced in their literary variety by the other doesn't make them one and the same. Thadh (talk) 20:43, 6 May 2024 (UTC)Reply
There's actually a similar issue with texts from Pomerania from {{R:pl:SXVI}} and {{R:pl:SXVII}} but I think we can safely nest these under modern Kashubian with a label, as I have done with Middle Polish. Vininn126 (talk) 19:43, 6 May 2024 (UTC)Reply

Old Polish regional categorization

As a sort of continuation of Wiktionary:Beer_parlour/2024/May#The issue of Old Kashubian (Old Pomeranian?) and Wiktionary talk:About Old Polish#Regional Old Polish, I'm trying to figure out the best way to handle regional information for Old Polish. I have a document explaining the origin of most texts in Old Polish, so it should be easy to figure out which of the 5 lects currently considered Old Polish (Masovian, Greater Polish, Lesser Polish, Silesian, and Pomeranian/Kashubian) a given text belongs to. I think it would be useful for readers to know in which region a definition/term has been attested, as Old Polish wasn't a single entity and ultimately is the source of those modern dialects today, so we can see more clearly regional features and the like. My concern about using labels is that they would imply that a term might have been limited to a given lect, which we can't know for sure. What do others think? Vininn126 (talk) 19:17, 6 May 2024 (UTC)Reply

One solution could be to use {{lb}} but print the text {{lb|zlw-opl|attested in|Masovia|Lesser Poland}} etc. @Benwing2, would this be technically bad? Vininn126 (talk) 15:56, 8 May 2024 (UTC)Reply
@Vininn126 No, I don't see why that would be an issue. attested in isn't currently a recognized label but could easily be made one, so that it suppresses the following comma. Benwing2 (talk) 23:21, 8 May 2024 (UTC)Reply
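The behaviour described above, a label that suppresses the comma after itself, can be illustrated with a small sketch. This is not the actual logic of the label module; `NO_COMMA_AFTER` and `format_labels` are hypothetical names used only to show the idea.

```python
# Assumption: labels listed here are followed by a space instead of ", ".
NO_COMMA_AFTER = {"attested in"}


def format_labels(labels: list[str]) -> str:
    """Join labels with commas, except after labels that suppress the comma."""
    parts = []
    for index, label in enumerate(labels):
        parts.append(label)
        if index < len(labels) - 1:
            parts.append(" " if label in NO_COMMA_AFTER else ", ")
    return "(" + "".join(parts) + ")"


print(format_labels(["attested in", "Masovia", "Lesser Poland"]))
```

With this rule the three arguments of {{lb|zlw-opl|attested in|Masovia|Lesser Poland}} would read as "(attested in Masovia, Lesser Poland)" rather than "(attested in, Masovia, Lesser Poland)".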

Continental Celtic

We have Continental Celtic as a family, but my understanding is that the consensus among Celticists is that CC isn't a clade but just a term of convenience for Celtic languages other than the Insular Celtic ones. Isn't our custom at Wiktionary to have only actual genetic families, not convenient groupings? —Mahāgaja · talk 11:28, 7 May 2024 (UTC)Reply

@Mahagaja Yeah we should get rid of this. BTW the Wikipedia article on Continental Celtic was in a terrible state due to a bunch of crap added a month ago, which I reverted. Benwing2 (talk) 22:06, 7 May 2024 (UTC)Reply

Ban one-descendant Proto-Italic and Proto-Hellenic redlinks

There are already far too many one-descendant Proto-Italic and Proto-Hellenic entries, and adding one-descendant redlinks to, for example, a descendant tree or an etymology section is only going to encourage more of these entries being created. These redlinks should be banned. -saph 🍏 13:31, 7 May 2024 (UTC)Reply

Right, there should be above-average incentive to create such a page, so unless it is already decided to have one, bots should neutralize these links. Fay Freak (talk) 13:51, 7 May 2024 (UTC)Reply
In practice, what does a 'ban' on making certain kinds of redlinks mean, and what is the alternative it is supposed to incentivize? I guess mentioning the same form but not linking it would be slightly better, as it doesn't encourage creating an entry, but I'm not totally happy with that either in some cases. E.g. if the reconstructed form is itself doubtful, I wouldn't want it to be mentioned anywhere.--Urszag (talk) 15:44, 7 May 2024 (UTC)Reply
For example:
From Proto-Italic *fworom, from Proto-Indo-European *dʰwor-om (enclosure, courtyard, i.e. something enclosed by the door, or the place outside, i.e. through the door), from *dʰwer- (door, gate).
With the Proto-Italic word displaying as just plain text, rather than what we currently have (forum). As for the reconstructed form being doubtful, we should just list the hypothesised PIE form, e.g.:
As opposed to the current etymology given at serius. -saph 🍏 15:58, 7 May 2024 (UTC)Reply
The alternative it is supposed to incentivize is not creating such entries. You would have to have a more serious motive than ticking off a removed red link, since they are not apparent in the first place. Fay Freak (talk) 16:03, 7 May 2024 (UTC)Reply

Add "Muslim", "Hindu" etc. labels?

Proposal to add labels for lemmas used by people of specific faiths (which are not necessarily religious terms; rather, they're only used by certain groups). Case in point: মিঞা (mĩa), which has a Muslim gloss, but the Muslim label is an alias for 'Islam', though it's not an 'Islamic' term, just used by Muslims. Urdu dictionaries, which I concern myself with, have used these labels for centuries without prejudice. I know this would be useful for languages in the Indian subcontinent, as well as European languages (especially English). نعم البدل (talk) 20:55, 7 May 2024 (UTC)Reply

@نعم البدل There are (at least) two possibilities here. One is to disentangle the labels 'Muslim' and 'Islam' in a language-independent fashion, and the other is to do it for specific languages. I suspect the aliasing of 'Muslim' and 'Islam' was done with English entries in mind, where on the surface it makes a certain amount of sense (e.g. we have 'Muslim finance' as an alias of 'Islamic finance' and 'Christian' as an alias of 'Christianity'). A third possibility is to create a separate label, something like 'Muslim usage' or 'Muslim speakers', which makes it clear that the term is used by particular speech communities. Note that the advantage of doing it in a language-specific fashion is we can create associated categories, such as Category:Muslim Bengali, to categorize such terms, which wouldn't make so much sense if done language-independently. Finally, the adjective-noun issue you're bringing up isn't limited to this case; there is for example the issue of 'British India' (English terms formerly used in British India) vs. 'British Indian' (English terms currently used by Brits of Indian background).
BTW if you think the terms should be disentangled language-independently, you can see all current uses of the label 'Muslim' here: Special:WhatLinksHere/Wiktionary:Tracking/labels/label/Muslim (there are only 9 of them). Benwing2 (talk) 21:58, 7 May 2024 (UTC)Reply
@Benwing2: I think the 'Muslim' (etc.) tag should be detached from the 'Islam' label and made into an independent label and placed under the Module:labels/data/topical so that, as you say, it can generate associated categories, something like Category:Bengali Muslim speech (similar to Category:English women's speech terms, a minor difference between 'Muslim Bengali' as the label I'm proposing should be shed of its religious connotations as much as possible).
  • you can see all current uses of the label 'Muslim' here – Thank you for this! As far as I can see, apart from marabout, all of the other terms should be placed under my proposed label, as that's what was probably implied. Note how the 'Muslim' tag in মিঞা (mĩa) was encapsulated with Template:a (added by an IP), not the 'Muslim' label – likely because the 'Muslim' label appends the lemma to Category:Islam which doesn't fit. نعم البدل (talk) 02:21, 8 May 2024 (UTC)Reply
@نعم البدل OK, let's see if there are any objections/comments, and if not I'll make this change in a few days. Benwing2 (talk) 03:04, 8 May 2024 (UTC)Reply
Yeah no worries! نعم البدل (talk) 17:34, 8 May 2024 (UTC)Reply

Englishman picture

So User:Shoshin000 (among other trollish activities) has been insisting on adding a picture of an angry football hooligan as the picture of "Englishman". I reverted it once, he restored. I mention this because I know the modus operandi and soon I'll be accused of being a badmin. Check out the entry and you know the previous picture was nicer. Equinox 22:47, 7 May 2024 (UTC)Reply

I personally think your picture is better (although I wonder, do we need a picture to illustrate this?). Benwing2 (talk) 00:00, 8 May 2024 (UTC)Reply
Honestly, I like Shoshin's pic, as it's more stereotypical.[1] There's nothing inherently Englishman-y about Eq's pic, besides the depicted person being English.
[1] Then again, that's a good argument against the pic. CitationsFreak (talk) 03:25, 8 May 2024 (UTC)Reply
It could be argued that pictures of nationalities, if they exist at all, should show someone of that nationality in characteristic clothing (although that is probably more appropriate for nationalities that actually have characteristic clothing that most people wear on a day-to-day basis). OTOH it's in general very hard to capture a nationality in a single picture (for this reason, Wikipedia usually supplies a whole collection of pictures to illustrate a nationality), and in any case this is more encyclopedic than dictionaric (a real but rare word). Benwing2 (talk) 04:06, 8 May 2024 (UTC)Reply
Yeah, I was thinking that a collage would be best. I'm not sure what a recognizable British outfit would be, and having one person stand in for Britain could imply that British people all are X. Highly unlikely, but possible. CitationsFreak (talk) 04:12, 8 May 2024 (UTC)Reply
I don't think nationalities should have photos at all, but I also disagree that File:ENG-BEL (6).jpg is "a picture of an angry football hooligan". The person in that photo doesn't look angry, nor is he doing anything hooliganish. His Englishness is clearly shown by the St George's Cross painted on his face. He arguably does illustrate [[Englishman]] better than the photo of Greg Rutherford, since Rutherford is representing the entire UK (not just England) in his photo. All that said, however, it is probably better to leave such entries unillustrated to avoid stereotyping. —Mahāgaja · talk 08:07, 8 May 2024 (UTC)Reply
I agree, this is not an entry that requires an image. Vininn126 (talk) 08:09, 8 May 2024 (UTC)Reply
Aren't photos appropriate where there is an attestable, probably dated and often derogatory or demeaning, definition of a stereotype? Eg, Bavarians with lederhosen, Prussians with spiked helmets, Mexicans with sombreros and/or serapes.
There is no such definition here, nor would I expect us to attest any such definition. DCDuring (talk) 17:34, 8 May 2024 (UTC)Reply
I don't really see this picture as a problem, really, even though I wouldn't pick it myself. It'd probably be fine as part of a collage. Theknightwho (talk) 17:51, 8 May 2024 (UTC)Reply

Fixing Telugu rhymes

For years now, User:Rajasekhar1961 has been adding Telugu rhymes written in Telugu script instead of IPA. There is a special hack in Module:rhymes to deal with this, but IMO Telugu should (obviously) use IPA for rhymes, just like all other languages. Does anyone object to this? Can anyone out there read Telugu script well enough to tell me if the rhymes listed under Rhymes:Telugu (e.g. Rhymes:Telugu/రం) are even salvageable, or should just be nuked? I don't know much about Telugu but scripts are generally not 1-to-1 mappable to IPA, so I don't know what it means to have a rhyme listed using Telugu script. Benwing2 (talk) 00:40, 8 May 2024 (UTC)Reply

Strongly agree. Theknightwho (talk) 11:40, 8 May 2024 (UTC)Reply

Kwami is messing with translingual entries, again

Just want to make sure there are some eyes on Kwami, as they've been making mass edits to Translingual entries that seem... worrying. After being reverted by @Theknightwho and @Benwing2 for deleting the translingual section, Kwami has recently begun deleting all the definitions from the translingual section instead.

I reverted all (but one) of the single character edits they've made today. However, they've been editing hundreds of TL entries and I have no idea how many entries are affected, as I've been very busy recently and can't check.

I'm not sure how bad the situation is so I don't want to "call out" Kwami. Just want to make sure people are aware before it becomes out of hand, like the last time this was discussed on here. — Sameer مشارکت‌هابحث﴿ 23:54, 8 May 2024 (UTC)Reply

@Sameerhameedy Thank you. I have blocked him for a month this time; I am getting seriously sick of this. I think he has used up all his lives; next time we should consider a permablock. Benwing2 (talk) 00:38, 9 May 2024 (UTC)Reply
Thank you, I'm also a bit annoyed since Kwami has gotten so many warnings and continues to do the same action. Now, Kwami has indicated that they will actually start a discussion on this issue before acting. There's no way to know if Kwami will actually follow through on that statement, but hopefully they do, so we don't have to do this every month. — Sameer مشارکت‌هابحث﴿ 00:51, 9 May 2024 (UTC)Reply

Do descriptions count as "definitions"?

I'm not being facetious here. This is a serious question for something I haven't understood for a long time.

For instance, in the article á, would "the letter a with an acute accent" be a valid definition? If so, should such descriptions be added to all letters? If not, should they be removed (perhaps placed under a "Description" heading instead)? And if not, and the only material for an article is such a non-definition, should the entry be tagged as needing a definition, or the article tagged for deletion for having no content?

I suspect that if I were to add a definition to cat as "the word spelled C-A-T", I would be blocked for vandalism. I don't see any meaningful difference between that and defining á as "the letter a with an acute accent". I've been told this is a straw-man argument, but I really don't understand what's appropriate in our entries if graphical descriptions are allowed as actual definitions.

The same applies to emojis, of course. Should an emoji of a face with tears be defined as "a face with tears", or should the definition be what it means and what it's used for? kwami (talk) 00:29, 9 May 2024 (UTC)Reply

@Kwamikagami I agree that "the letter a with an acute accent" is not a good definition. If á had a translingual section it should at least explain how the letter is typically used across languages. I assume it usually represents some kind of /a/? Ioaxxere (talk) 01:43, 9 May 2024 (UTC)Reply

English anagrams

English anagrams haven't been updated in a while. Could someone run a bot to update them? Maybe @Kiril kovachev, Benwing2 Ioaxxere (talk) 01:43, 9 May 2024 (UTC)Reply
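For anyone picking this up, the core of an anagram update is grouping entries by their sorted letters. The sketch below is a minimal illustration only: it assumes a plain list of page titles as input and a simple normalization (lowercase, diacritics and non-letters ignored), which may well differ from the rules the existing anagram bot applies. `anagram_key` and `group_anagrams` are hypothetical names.

```python
import unicodedata
from collections import defaultdict


def anagram_key(word: str) -> str:
    """Sorted base letters of the word; case, diacritics and punctuation ignored."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(sorted(ch for ch in decomposed if ch.isalpha()))


def group_anagrams(words: list[str]) -> dict[str, list[str]]:
    """Map each key to its words, keeping only keys shared by two or more words."""
    groups: defaultdict[str, set[str]] = defaultdict(set)
    for word in words:
        groups[anagram_key(word)].add(word)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


print(group_anagrams(["listen", "silent", "enlist", "cat"]))
```

A bot would then compare each group with the anagram section of the corresponding entries and edit only those that are out of date; that part depends on the wiki tooling used and is left out here.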