Wiktionary:Beer parlour/2023/November: difference between revisions

Revision as of 06:58, 6 November 2023

Vector2022 letter to el.wiktionary - Discussion

A letter was sent to us at el.wiktionary saying that by November 11th a new skin, Vector2022 (like this, or this example), will be applied as the default desktop view.
Discussion in English is ongoing here for everyone to join. Regardless of aesthetics, I am worried for the loss of __TOC__ (placed at all our Appendices and such pages) and very sad to have interwiki links, which we click constantly, hidden in a dropdown. We are now trying to substitute these, manually, or with some Templates. If this skin is intended for wiktionaries too, not only wikipedias, could we ask for wiktionary‑specific modifications? We would be interested in your opinion and support. Thank you. ‑‑Sarri.greek ^♫ I 23:53, 1 November 2023 (UTC)[reply]

Parsing policy

Why is Wiktionary:Parsing categorized as a Wiktionary policy? --Lambiam 08:39, 2 November 2023 (UTC)[reply]

You'll have to ask @Koavf, who added the category. P U C – 08:49, 2 November 2023 (UTC)[reply]

Because I couldn't think of anything better. —Justin (koavf)❤T☮C☺M☯ 08:59, 2 November 2023 (UTC)[reply]

Resolved: now categorized as just Wiktionary. --Lambiam 17:58, 2 November 2023 (UTC)[reply]

Splitting Ancient Greek

@Mahagaja, Sarri.greek, Saltmarsh (please ping any other interested users)

Currently, Ancient Greek is handled as one (macro)language. This means that while Attic and Homeric Greek have a very good coverage, other lects like Aeolic and Doric are mostly an afterthought. For instance, the inflection tables note that the "dialectal" inflections are discussed in the appendix.

AFAIK, until Koine Greek there was no standardised Hellenic variety at all. Homeric Greek had a lot of influence on the various lects, but everyone mostly wrote in their own vernacular. As such it makes sense to me to split the Ancient Greek lects into major dialect groups, also considering the fact the various lects differ quite strongly. I imagine two scenarios:

A very rough division (Arcado-Cypriot, Ionic (incl. Attic), Aeolic, West Greek (incl. Doric)).
A more detailed division (Arcadian, Cypriot, Attic, Western Ionic, Eastern Ionic, Thessalian, Boeotian, Lesbian, Doric, Northwestern Greek)

This would help on the following fronts: Most importantly, it would increase the possibilities in covering the various dialects from inscriptions, as well as (Lesbian) Aeolic in Sappho's work or (Eastern) Ionic in Herodotus. It will also make etymological coverage for Tsakonian historically accurate. As a bonus it would also finally give Proto-Hellenic more credibility, and make it much easier to provide descendants in the form of various languages, rather than various dialects of one single language.

I'm eager to hear your thoughts on this. Thadh (talk) 11:57, 2 November 2023 (UTC)[reply]

I don't think splitting grc into multiple languages is necessary to achieve any of those goals. All of your desiderata are achievable with the status quo of having the dialects be etymology-only varieties of Ancient Greek. Splitting grc up would simply unlink the less well covered dialects from the very useful infrastructure (templates and modules) we have in place. If there are ways in which the existing templates and modules are inadequate for the less popular dialects, I think it makes more sense to improve the templates and modules to accommodate them. —Mahāgaja · talk 12:31, 2 November 2023 (UTC)[reply]

The only Ancient Greek dialect which was certainly NOT mutually intelligible with the others is Arcado-Cypriot, which, like Scots compared to English, is very conservative in nature. I think it is the only one which needs to be split.

As for the other four, Attic, Doric, Aeolic and Ionic, they were probably no more different than modern English dialects, with the exception that English orthography is practically the same everywhere. Ελίας (talk) 14:20, 2 November 2023 (UTC)[reply]

Splitting the language up would just make things more complicated: more language codes to keep track of and more knowledge of Ancient Greek dialectology required to do simple things like adding a quote. Chuck Entz (talk) 15:08, 2 November 2023 (UTC)[reply]

Splitting would be a nightmare for Greek borrowings in other languages. We would not know which dialect code to use. Vahag (talk) 19:17, 2 November 2023 (UTC)[reply]

I agree with User:Mahagaja. We already have the infrastructure in place for handling several dialects of Ancient Greek; the focus on Attic and Homeric Greek is simply due to the fact that there are a whole lot more sources for these dialects than for the others. Look at the current situation with Scots, which is almost completely neglected; that's what would happen if we split off various of the Ancient Greek dialects. IMO if there's any split that makes sense, it's splitting the later stages of Greek (e.g. Medieval Greek) into a different L2 language, and I know there has already been a discussion about this initiated by User:Sarri.greek, although it didn't end up anywhere. (Note, I'm not expressing a specific opinion on whether this split is the best thing as I don't know enough about Medieval Greek.) Benwing2 (talk) 20:46, 2 November 2023 (UTC)[reply]

@Benwing2. It did not (Medieval Greek, March2023), and I intend to renew the petition once a year, so that I can resume my work (now mainly on Koine and Med.Greek) at en.wiktionary. I hope that en.wiktionary handles 'languages', period phases as well as dialects (which it calls 'languages' too), according to bibliography, not because of the personal interests of editors. The love of wiktionarians for Homer and dialectal Greek is commended, but may I remind you, that an Athenian of the 5th century would comfortably listen to Doric at theatre plays -amidst the Peloponnesian War- (a label marking dialects, I think, suffices). Speaking of phases, the label Koine (grc-koi), also needs some care, because it covers many centuries. Although en.wikt/arians dislike Koine and Medieval Greek, they did exist, and all bibliography accepts it, the variety of opinions regarding only termini. Thank you Sir, for bringing this issue up, it really was a blow to me, the neglect with which it has been suppressed. ‑‑Sarri.greek ^♫ I 00:16, 3 November 2023 (UTC)[reply]

@Sarri.greek It is unfortunately common for Wiktionary discussions to peter out with no action taken. Feel free to create another Beer Parlour discussion, and make it a simple request to split Medieval Greek (with an appropriately defined time period) from Ancient Greek. The last discussion was long and I am not sure exactly what the objections were. You might want to state the prior objections and give rebuttals, but the fewer words used, the better, otherwise people are likely to not read it. Benwing2 (talk) 00:24, 3 November 2023 (UTC)[reply]

@Benwing2, sorry: Why is a discussion needed for the obvious? Does en.wiktionary need discussions to handle well-referenced linguistic issues, be they kinds of borrowings, languages, dialects, etc.? My first paragraph at Medieval Greek, March2023 is quite short, very clear, mentions the reference-support, and I was amazed that the blah blah had to drag that far. The sysops of a wiktionary or of a wikipedia need to just take a brief look at the bibliography to get the picture; one does not need to be a specialist on the language. If the sysops of enwikt abstain from taking a look on the grounds that they are not specialists, it will never, ever be implemented. If there were non-anonymous, professional consultants for wikiprojects, discussions would deal only with tech matters and the details of implementing things. ‑‑Sarri.greek ^♫ I 00:38, 3 November 2023 (UTC)[reply]

We should choose a Word of the Year

Choosing a Word of the Year seems to be a popular dictionary tradition. The problem is that most of the picks are godawful, either being some neologism that no one has heard of (goblin mode, lol) or having no apparent relationship to what actually happened in the last year. I think the dictionary world needs some people who can take this job seriously.

My proposal is generative, reflecting the sophisticated generative AI models released throughout late 2022 and early 2023, such as: ChatGPT (November 2022), GPT-4 (March 2023), and DALL-E 3 (August 2023). Google Trends data shows a significant increase in searches for "generative" and other AI-related terms throughout 2023 [1].

Do you guys think this would be a good idea? Pinging @Sgconlaw, Lingo Bingo Dingo. Ioaxxere (talk) 20:13, 3 November 2023 (UTC)[reply]

Agreed. It can be based on some actual data like increased percentage of views. Maybe a top 10 or unranked list of five? —Justin (koavf)❤T☮C☺M☯ 20:31, 3 November 2023 (UTC)[reply]

Interesting. I have a few questions:

Who chooses the word? Is there to be a panel, or is the word to be voted on? Or is it to be based on actual data, as @Koavf suggests?
Presumably the word has to have gained currency in the preceding year? Or, to put it another way, should Word of the Year 2023 be featured in 2024?
On what date does the word get featured?

— Sgconlaw (talk) 20:34, 3 November 2023 (UTC)[reply]

@Sgconlaw, Koavf I think the word should be chosen by a WT:VOTE in which anyone can nominate candidates. The winner can be featured on the main page around late December to early January. Ioaxxere (talk) 19:11, 4 November 2023 (UTC)[reply]

Good thinking. —Justin (koavf)❤T☮C☺M☯ 19:12, 4 November 2023 (UTC)[reply]

@Ioaxxere: I suppose the Word of the Year will be featured somewhere on the Home Page? We'll need to think about the layout of the page and where the WOTY will appear—above the WOTD box, or elsewhere? Will it stay up for a whole year? — Sgconlaw (talk) 20:09, 4 November 2023 (UTC)[reply]

No, I meant that it would only be featured around late December to early January. Ioaxxere (talk) 20:31, 4 November 2023 (UTC)[reply]

It seems a fun idea and I encourage you to pursue it, but I am not going to take part in setting it up or choosing words. I do have an alternative proposal for a WotY: jailbreak, which is in my opinion a lexically more interesting word than generative. ~~←₰-→~~ Lingo ^Bingo _Dingo (talk) 20:36, 4 November 2023 (UTC)[reply]

We could ask ChatGPT for suggestions, weren't it for the 2021 cutoff date :) Jberkel 20:56, 4 November 2023 (UTC)[reply]

Ok, I've created Wiktionary:Votes/2023-11/Word of the Year Ioaxxere (talk) 22:50, 5 November 2023 (UTC)[reply]

@Ioaxxere: it probably doesn’t need to be that formal a vote. An ordinary vote here at the Beer Parlour is sufficient. — Sgconlaw (talk) 23:06, 5 November 2023 (UTC)[reply]

I agree that it doesn't have to be that formal, but I think it's fine to keep it as a formal vote. It will give it more prominence, since it will show up on everyone's Watchlists. Andrew Sheedy (talk) 23:13, 5 November 2023 (UTC)[reply]

incipient edit war on Module:ar-headword

Module:ar-headword has long had the ability to mark a personal/non-personal distinction on nouns, since it affects the agreement and pluralization patterns (non-personal nouns take feminine singular agreement in the plural and often use different plural forms). User:Fenakhay removed this functionality without explanation, and when I asked them why, they gave no justification other than "doesn't make sense". I undid this as a contentious change made without consensus, and Fenakhay reverted my undo claiming that the onus is on me to find consensus to undo his change. AFAIK this isn't at all how Wiktionary consensus works; the onus is on the person making the change to seek consensus if the change is controversial. In my view, this information is useful and important, and similar to the animacy marking in Slavic languages (compare also Romanian, which has a class of gender-changing nouns that are marked as "neuter" on the lemma). Fenakhay thinks this info is not useful mainly based on the fact that it's typically not marked in Arabic dictionaries (but from what I've seen, Arabic dictionaries are deficient in many respects compared with the best dictionaries of other major inflected languages, and leave out lots of info useful for non-native speakers). Benwing2 (talk) 01:26, 4 November 2023 (UTC)[reply]

No Arabic dictionary, be it monolingual or bilingual, marks “animacy” on words. The addition is unjustifiable. Non-natives making stuff up and reinventing how Arabic gender is listed because they read it in a grammar book... Typical. — Fenakhay ^{(حيطي · مساهماتي)} 01:31, 4 November 2023 (UTC)[reply]

Furthermore, it is not about “animacy” but being sentient or not. So anything that's not sentient, their adjective is inflected in the feminine singular including animals. For example: تِلْكَ ٱلْكِلَابُ ٱلْحَمْرَاءُ تَنْبَحُ ― tilka l-kilābu l-ḥamrāʔu tanbaḥu ― those red dogs bark, as you can see, the adjective أَحْمَر (ʔaḥmar) is inflected in the feminine singular, same for the determiner تِلْكَ (tilka) and the verb itself.
If a learner wants to know if a word refers to a sentient or non-sentient, they only need to ask themselves if the referred is a person or not. — Fenakhay ^{(حيطي · مساهماتي)} 01:48, 4 November 2023 (UTC)[reply]

"Sentient" is another word for "person". "Animacy" may not be the right word, but there are different levels: e.g. Polish, Ukrainian, etc. have "inanimate/animate/person" (a three-way distinction), as opposed to only "inanimate/animate" in Russian, etc. We can discuss terminology. Anatoli T. ^{(обсудить}/^вклад) 02:13, 4 November 2023 (UTC)[reply]

Notifying Arabic editors: (Notifying Alarichall, Atitarev, Benwing2, Mahmudmasri, Erutuon, عربي-٣١, Fay Freak, Assem Khidhr, Fixmaster, Roger.M.Williams, Zhnka, Sartma): — Fenakhay ^{(حيطي · مساهماتي)} 01:34, 4 November 2023 (UTC)[reply]

Adding grammatical information is a plus, especially if it helps to determine how words are used in a sentence. Native speakers may find it intuitive but I don't know if the person/non-person agreement is never taught at school in Arabic speaking countries.

Let's look at these examples (yes, from a grammar book in English)

Persons:

الْمُعَلِّمُونَ مُجْتَهِدُونَ (persons) ― al-muʕallimūna mujtahidūna ― the teachers (m-p) are diligent, personal pronoun: هُمْ (hum)
الْمُعَلِّمَاتُ مُجْتَهِدَاتٌ (persons) ― al-muʕallimātu mujtahidātun ― the teachers (f-p) are diligent, personal pronoun: هُنَّ (hunna)

Non-persons:

الْأَقْلَامُ جَدِيدَة (non-persons) ― al-ʔaqlāmu jadīda ― the pens (m-p) are new, personal pronoun: هِيَ‎ (hiya)
الطَّاوِلَاتُ كَبِيرَة‎ (non-persons) ― aṭ-ṭāwilātu kabīra ― the tables (f-p) are big, personal pronoun: هِيَ‎ (hiya)

The adjectives for non-persons in the plural are in the feminine forms.

Another example:

السُّودُ (as-sūdu, “the blacks”), الْبِيضُ (al-bīḍu, “the whites”) - these can only refer to humans

I don't quite know what the Arabic gender structure was and is now at Wiktionary but I think we need to distinguish persons from non-persons.

(By the time I've typed my answer, I see new edits appeared) Anatoli T. ^{(обсудить}/^вклад) 02:06, 4 November 2023 (UTC)[reply]

It is a simple equation:

if WORD1 (in the plural) refers to a human being, then the adjectives are inflected according to the gender/number of the word.
if WORD2 (in the plural) refers to a non-human, be it an object, a concept or an animal, then the adjectives are inflected in the feminine singular.

Sorry but this is a grammar rule and doesn't add any information to the word itself. It is not rocket science. — Fenakhay ^{(حيطي · مساهماتي)} 02:12, 4 November 2023 (UTC)[reply]
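
The two-branch rule stated above can be sketched in a few lines of Python. This is only an illustration of the rule as worded in this thread, not code from Module:ar-headword; the `Noun` class, its field names and the function name are hypothetical, and the dual is deliberately left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Noun:
    """Hypothetical minimal description of a noun form (not a Wiktionary data type)."""
    gender: str      # "m" or "f": the gender of the noun itself
    number: str      # "s" (singular) or "p" (plural); the dual is not modelled
    is_person: bool  # True for humans, False for objects, concepts and animals


def adjective_agreement(noun: Noun) -> Tuple[str, str]:
    """Return the (gender, number) that an agreeing adjective takes."""
    # Plural of a non-person: the adjective is feminine singular.
    if noun.number == "p" and not noun.is_person:
        return ("f", "s")
    # Otherwise the adjective follows the noun's own gender and number.
    return (noun.gender, noun.number)


teachers = Noun(gender="m", number="p", is_person=True)   # al-muʕallimūna
pens = Noun(gender="m", number="p", is_person=False)      # al-ʔaqlāmu
dogs = Noun(gender="m", number="p", is_person=False)      # al-kilābu

print(adjective_agreement(teachers))
print(adjective_agreement(pens))
print(adjective_agreement(dogs))
```

The sketch takes `is_person` as an input, which is the point under dispute: whether that flag should be recorded on the headword or left for the reader to infer from the sense.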

@Fenakhay. No rocket science, true, but I find it useful to know what agreement to use depending on the sense. As @Thadh also mentioned below, it depends on the sense. The same applies to Slavic languages for many words (not trying to make Slavic and Semitic similar to each other but I find similarities) Anatoli T. ^{(обсудить}/^вклад) 02:15, 4 November 2023 (UTC)[reply]

We are taught about عَاقِل (ʕāqil, “sentient”) and غَيْر عَاقِل (ḡayr ʕāqil, “non-sentient”) in school. — Fenakhay ^{(حيطي · مساهماتي)} 02:13, 4 November 2023 (UTC)[reply]

@Fenakhay: Thanks. Do you think labelling "sentient/non-sentient" is inappropriate in the Arabic headword (more than one if that's a case for specific words)? "person/non-person" is just another way of expressing the same thing, which is also used in grammar books. Anatoli T. ^{(обсудить}/^вклад) 02:19, 4 November 2023 (UTC)[reply]

I'm told that this personal/non-personal distinction in verbal/adjectival agreement is always evident from the noun's meaning, and not inherent to a lemma by itself; In that case, it seems like something we might put as a note in inflection tables, but I don't think it needs to be added to headwords. As for plurality patterns, that doesn't seem like a strong enough argument by itself. Thadh (talk) 01:42, 4 November 2023 (UTC)[reply]

It can be left out in the plural, marking all, even the animates, as pl, but editors we know not will feel a need to be more explicit, I know the edit patterns of casual site visitors. And mark inanimate plurals as feminines for example. To avoid inconsistencies and have clear models, it is sensible that we have specific gender markers at plural POS. While singular entries typically have enough noise, I wouldn’t want to add on every page “hey, did you know that plural forms of inanimate nouns agree with feminine singular forms in Arabic?” Fay Freak (talk) 02:34, 4 November 2023 (UTC)[reply]

The declension table shows "sound masculine plural"/"sound feminine plural" for sentient (person) nouns. The gender labels can only be applied to sentient (person) nouns to avoid too much "noise".

It can be compared to Czech nouns where only masculine nouns differ by animate/inanimate. The animacy for feminines/neuters is unimportant (no grammatical changes). Anatoli T. ^{(обсудить}/^вклад) 02:42, 4 November 2023 (UTC)[reply]

@Benwing2: You have not answered my question implied on Fenakhay’s talk page whether after the removal the site uses less Lua memory or processing time. Since I am not invested in essentialist dogmatic distinctions, the consideration that there were only about ten pages, out of myriads, an amount of pages that for Wiki pages necessarily constitutes an “error margin” conditioned by the negligence resulting from project participation being voluntary, that were actually using the removed genders, in combination with the computing principle of toning down complexity, favouritises the removal. I am much less concerned with what would “make sense” abstractly than you might expect: though comparative conceptualization is attractive, the implementation’s predictable effect upon and explainability to occasional readers and editors is of concern. If there are some instructions that can be chosen for a template then one needs to understand what one would try to achieve on pages with it, what to signify to readers, otherwise it is not “useful”. If it were “useful info”, man would have marked it, isn’t it? The entries do not appear to be adrift of accurate, exhaustive grammatical information. Didn’t feel a need nor note and suddenly Benwing opened our eyes that without marking the genders by the theoretically envisioned method we were missing out on something all the time? Here I am concerned with not making entries overfraught with information few uninvited in our particular circle would understand—such as claiming other genders than feminine and masculine in the singular. The point to make, which eventual editors also attempt to make at those pages no matter our choice, is most appropriately noted at plural entries: the being a plural of something but agreeing with feminine singular vs. the being masculine plural and the being feminine plural. Fay Freak (talk) 02:18, 4 November 2023 (UTC)[reply]

I support the view that sentient/non-sentient (person/non-person) should be added as an option to the headword/tables or usage notes, even if it hasn't been regularly done. We could say that Slavic word animacy is not important either. Come on, it's common sense, right? (:sarcasm:)

I will comply with whatever is decided, though. It bothers me, also that we are marking non-sentient plural nouns as "plural", which is kind of misleading. To me, it seems we have to distinguish three types of plurals, which govern different adjectives, verbs and pronouns. Anatoli T. ^{(обсудить}/^вклад) 02:29, 4 November 2023 (UTC)[reply]

Honestly I don’t really know where the marking non-sentient plural nouns as “plural” comes from, somewhen I recognized it as the correct thing. (Many old entries mark inanimates falsely as m-p or f-p after this rule.) Fay Freak (talk) 02:37, 4 November 2023 (UTC)[reply]

I would use "m-p", "f-p" and something like "np" for non-sentient nouns. (Was it used before?) Anatoli T. ^{(обсудить}/^вклад) 02:45, 4 November 2023 (UTC)[reply]

@Fay Freak I am trying to understand your comment, but the difference in processing speed and memory between having the extra gender distinctions and not having them is negligible. Benwing2 (talk) 02:56, 4 November 2023 (UTC)[reply]

Actually, @Fay Freak, how should we mark non-sentient plural nouns in your opinion? Should the gender be marked? This has been raised several times. Also, knowing what the original gender (in the singular) of those nouns was seems irrelevant grammatically. In fact, for some words I came across, it may be impossible or difficult to determine if they are feminine singular or (non-sentient) plural. Anatoli T. ^{(обсудить}/^вклад) 02:55, 4 November 2023 (UTC)[reply]

Probably plural-inanimate (with some abbreviation), this would make learners more aware of the agreement, and fewer editors would make the mistake of marking as feminine singular, as—due to the morphologic relations which a language user is aware of, and to avoid the claim of masculine inanimates switching their gender in the plural—I would prefer to say it is not feminine singular; technically it means that the verbs and adjectives used with the inanimate plurals are also not feminine singular but of the same inanimate plural gender having form syncretism with feminine singular but we won’t muddle the tables with this observation. I know those cases where one is unsure whether something is plural of something or just an alternative form and/or by itself a singular, this is no specific problem, since in such cases there are also masculine singular inanimates. Fay Freak (talk) 03:09, 4 November 2023 (UTC)[reply]

Animacy is culture-specific, and the Slavic languages do not allow for personal beliefs. ("Я съел вкусный зайца" would be ungrammatical regardless of whether you think a hare is animate). From what I understand, in Arabic this isn't the case, and the speaker does decide whether to assign animacy (/sentiency) to a noun or not. If I misunderstand, do tell me, because in that case I will change my opinion above. Thadh (talk) 03:32, 4 November 2023 (UTC)[reply]

@Thadh: In Arabic, the usage is also quite grammatical. Please see the simplest Arabic examples I used in my post above. "sound masculine plural" would be inappropriate for e.g. non-sentient nouns (non-humans). The distinction is not between animate/inanimate but between persons and non-persons (animals fall into the same category as things). Anatoli T. ^{(обсудить}/^вклад) 03:38, 4 November 2023 (UTC)[reply]

I don't doubt non-persons cannot be agreed to with personal markers, but is the other way around also true? If a word is not clear to be a person or not (e.g. mythical creatures)? Thadh (talk) 04:06, 4 November 2023 (UTC)[reply]

@Thadh The situation in the Slavic languages is not quite so clear-cut, AFAIK. The Russian terms for things like "bacteria" and "virus" may or may not be animate, depending on the speaker (e.g. scientists tend to view the terms as animate, others mostly not) and Czech is known to have a large number of "facultative animates" (things like mushrooms that may or may not be considered animate, depending on the speaker, and things like salami that are clearly inanimate but nonetheless treated as animate by some speakers). Benwing2 (talk) 04:56, 4 November 2023 (UTC)[reply]

@Thadh: The other way around is also true.

In هٰؤُلَاءِ أَوْلادٌ ― hāʔulāʔi ʔawlādun ― these are boys هٰؤُلَاءِ (hāʔulāʔi, “these”) (m. pl) can only refer to sentient (rational) nouns.
In هٰذِهِ كُتُبٌ ― hāḏihi kutubun ― these are books هٰذِهِ (hāḏihi, “this”) can refer to any feminine singular or non-sentient (irrational) plural nouns.

The plurals of non-sentient nouns are treated as feminine singular. They use the pronoun هِيَ‎ (hiya, “she”), which also means "they" for non-sentient plurals.

Native speakers may shed more light on how mythical creatures are declined but will it make a difference for this discussion? Slavic languages also have corner cases. Anatoli T. ^{(обсудить}/^вклад) 05:34, 4 November 2023 (UTC)[reply]

Just adding a separate perspective as an Anglophone student of Arabic. I don't have great expertise in the language and I use Wiktionary a lot when reading Arabic because it is more informative and more easily navigable than traditional dictionaries. (I mostly edit when I try to look up a word in Wiktionary and realise that a word or a sense is missing.) I really appreciate having as much grammatical information as possible: the transliterations into Latin script, full inflection tables, information about gender, etc. Wiktionary is unlike traditional dictionaries in providing all these. It was especially useful to me at the beginning of my Arabic-studying journey: Wiktionary helped me stick with learning Arabic, and this in turn has encouraged me to keep contributing to Wiktionary. So although I don't have particularly well informed opinions about marking sentience, I would generally encourage including and keeping information that may seem obvious to native-speakers but is not obvious to students—including and perhaps especially total beginners. Alarichall (talk) 08:01, 4 November 2023 (UTC)[reply]

How consensus works

An important point is being missed in this discussion. I'd like to get clarity on this. If a module, template or other practice has been stable for a long time, then any contentious change needs consensus before the change is made, and should be left in the status quo until consensus is achieved to change it. User:Fenakhay seems to disagree with this principle, based on their consistent attempts to force through the change being discussed above, and their serial reversions of my undos. Fenakhay claims as justification for this change that "there was no vote when the functionality was originally added", which seems quite spurious, as there rarely is such a vote. As an example, there was certainly no vote that led to the current state of Latin verbs using "I" forms, but the practice has long been stable, hence I am seeking consensus in the BP to change this. Similarly there was no vote that led to Ancient Greek being treated as a single L2 rather than several dialect-specific L2's, and User:Thadh rightly created a BP discussion instead of unilaterally introducing changes and then demanding that anyone wanting to undo the change needs consensus to do so. Benwing2 (talk) 03:09, 4 November 2023 (UTC)[reply]

I agree with the principle that any such changes (provided there is either an active community for the language, the language has a large amount of readers, or the editor in question isn't an editor of the language), in 'core' matters, including headwords and language treatment, should be discussed first. Thadh (talk) 03:26, 4 November 2023 (UTC)[reply]

He thought it is not contentious. It is inflammatory to claim he seems to disagree with the principle. Since practical use in the future was not demonstrated, on the contrary. The discussion has become theoretical by and large now since no one is realistically hindered in expanding upon our Arabic entries. Accurate though that the reasoning formulated was spurious. Accurate also that, for this but theoretical effect of the particular present state of the module, there is a negligible status quo bias in favour of the previous module’s state, which would of course be changed anyway if you turn out to have the better view, after the discussion which no one has been prevented from kicking off if he care, this is surely a consideration when someone is WT:BOLD. I understand you cherished your own work and intellectual input that went into the module; if you actually planned to use the contested features within the next days it would be a different matter, but this is not the case, hence we are rightly apathetic to whether your version or Fenakhay’s edits stay in the near future: We are making consensus now, and whether or not one of you two gets the provisory last word—you could edit war on it and nothing would change in the world, futile! Again, we try to think what people would realistically use in the entries. Fay Freak (talk) 03:32, 4 November 2023 (UTC)[reply]

Does 'terms borrowed back into LANG' include cases where the borrowing was from an ancestor?

I am cleaning up remaining cases where 'twice-borrowed terms' occurs, since the category has been renamed. There are, for example, 57 cases in CAT:Greek twice-borrowed terms, all of which appear to have the category added manually and where the chain of borrowing was typically Greek <- Ottoman Turkish <- Ancient Greek. Do these count as "borrowed back into Greek" terms? Similarly, there are several French terms borrowed from English which ultimately were borrowed from Old French. Do these count as "borrowed back into French" cases? Yet another example is wasei kango terms (Japanese coinages made from Chinese words) that are borrowed back into Chinese (we have around 100 of them). Most Japanese borrowings of Chinese words occurred during Middle Chinese, yet the {{wasei kango}} template considers them 'borrowed back into Chinese' terms and adds the category manually. If we do consider these to be "borrowed back into" terms, this should be handled automatically, and either way, we should remove the manually added categories (ignoring cases similar to fakaleitī, where the etymology is incomplete so the category wouldn't get added automatically). Benwing2 (talk) 05:57, 4 November 2023 (UTC)[reply]

Hmm... on one hand, if we say these don't count, then it's kind of arbitrary that terms from ancient Hebrew or ancient zh borrowed via another language back into modern Hebrew or Chinese can be categorized, whereas terms from ancient Greek borrowed back into modern Greek can't, just because we previously and unrelatedly decided it was most practical to handle Ancient and modern Greek under separate L2s, but ancient and modern Hebrew and Chinese under (mostly) one L2 apiece. And a term that went from early Middle English to e.g. (middle) French to late Middle English can be categorized, but a term that went from late Middle English to (middle) French to Early Modern English can't be (which is, again, arbitrary). It means decisions about whether it makes sense to handle two different languages under one L2 will start being influenced by whether people want to be able to consider the language(s) to have twice-borrowed terms, which seems undesirable.
On the other hand, if we say these do count, do we have a "cutoff mechanism", so that we're not considering a term that went "PIE → Latin → English" to have been "borrowed back into English"? (That's not a rhetorical question; do we already have some module in which we record that "Old English, Middle English, modern English" count as stages of 'a language' in a way that "Proto-Indo-European, Proto-Germanic, English" don't? It seems plausible we might.) - -sche (discuss) 06:50, 4 November 2023 (UTC)[reply]
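
To make the "cutoff" question concrete, here is a minimal Python sketch of how a borrowed-back check might look if such a record of language stages existed. Everything in it is an assumption for illustration: `STAGE_GROUPS`, `same_language` and `is_borrowed_back` are hypothetical names, not existing Wiktionary modules, and the decision to group Ancient Greek with Greek is exactly the open question in this thread, not a settled policy.

```python
from typing import List, Set

# Hypothetical table: each set lists names treated as stages of "one language".
# Which names belong together is the policy decision under discussion.
STAGE_GROUPS: List[Set[str]] = [
    {"Old English", "Middle English", "English"},
    {"Ancient Greek", "Greek"},
]


def same_language(a: str, b: str) -> bool:
    """True if a and b are the same name or are listed as stages of one language."""
    if a == b:
        return True
    return any(a in group and b in group for group in STAGE_GROUPS)


def is_borrowed_back(chain: List[str]) -> bool:
    """chain runs from the borrowing language back towards the ultimate source,
    e.g. ["Greek", "Ottoman Turkish", "Ancient Greek"]."""
    if len(chain) < 3:
        return False
    target = chain[0]
    for i in range(1, len(chain)):
        # Find the first foreign link, then look for the target (or one of its
        # listed stages) anywhere further back in the chain.
        if not same_language(target, chain[i]):
            return any(same_language(target, older) for older in chain[i + 1:])
    return False


print(is_borrowed_back(["Greek", "Ottoman Turkish", "Ancient Greek"]))
print(is_borrowed_back(["English", "Latin", "Proto-Indo-European"]))
```

With this table the Greek chain counts as borrowed back while "PIE → Latin → English" does not, simply because Proto-Indo-European is not listed as a stage of English. The cutoff lives entirely in the table, so the arbitrariness described above moves into whoever maintains it.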

@-sche That's a very good question that I didn't think of. AFAIK we don't have a built-in way currently of specifying that e.g. Old English is an earlier stage of English from this perspective whereas the ancestor of Old English (Proto-West-Germanic) is not. We do have a distinction between object inheritance (which represents an "is-a" relationship, e.g. US English is a kind of English, Mandarin Chinese is a kind of Chinese) and ancestrality (Middle English is an ancestor of English, and Old Italian is an ancestor of Italian even though it's also an etym-language variant of Italian). However, the ancestrality chain for English goes all the way back to PIE. I do think this can be determined automatically in most cases by looking for shared words at the end of the language name, and this accords with most people's sense of "early stage of a language": English and Old English share a word at the end, and Western Neo-Aramaic and its ancestor Aramaic share a word at the end assuming hyphens separate words, whereas English and Proto-West-Germanic don't. Benwing2 (talk) 07:00, 4 November 2023 (UTC)[reply]
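
The name-based heuristic described above can be sketched as follows. This is a rough Python illustration, not an existing module: the function names are hypothetical, and the comparison is a plain string test on canonical language names with hyphens treated as word separators, as suggested in the comment.

```python
import re


def last_word(language_name: str) -> str:
    """Return the final word of a language name, treating hyphens as separators."""
    words = re.split(r"[\s-]+", language_name.strip())
    return words[-1].lower()


def looks_like_earlier_stage(ancestor: str, descendant: str) -> bool:
    """Heuristic: an ancestor counts as an 'earlier stage' of a language
    if both names end in the same word."""
    return last_word(ancestor) == last_word(descendant)


for ancestor, descendant in [
    ("Old English", "English"),
    ("Aramaic", "Western Neo-Aramaic"),
    ("Proto-West-Germanic", "English"),
    ("Classical Mongolian", "Mongolian"),
    ("Classical Mongolian", "Buryat"),
]:
    print(ancestor, "->", descendant, looks_like_earlier_stage(ancestor, descendant))
```

The check is cheap and needs no extra data, but it depends entirely on naming: it accepts Classical Mongolian as a stage of Mongolian and rejects it for Buryat, which is the objection raised in the reply that follows.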

@Benwing2 I'm not a fan of that approach, because it's still totally arbitrary: Buryat and Mongolian are both descendants of Classical Mongolian, but your approach would only apply between CM and Mongolian, not CM and Buryat. The only reason we consider one to be Mongolian and the other not is for historical and political reasons, and if we renamed Mongolian to Khalkha (which we very plausibly could) then suddenly it would change the status of all these terms. You could make the same argument for all the Langues d'oïl other than French with respect to Old French, for example. One of the strengths of the current set-up is that it gets around the issue of which language is the "true" main descendant, and I'd oppose adding it in. Theknightwho (talk) 07:06, 4 November 2023 (UTC)[reply]

@Theknightwho There's also the practical issue that there's no way of distinguishing "back-borrowings" between A and B and regular borrowings using templates such as {{der}} or {{bor}}. Either we'd need to create an explicit {{bbor}} = "back-borrowing" or similar, or we'd have to make a new version of {{der}} that can have multiple levels of the chain inside its parameters. For example, replacing the following:

From {{inh|en|enm|orenge}}, {{m|enm|orange}}, from {{der|en|fro|pome orenge|t=fruit orange}}, influenced by the place name {{m|en|Orange}} (which is from Gaulish and unrelated to the word for the fruit and color) and by {{der|en|pro|auranja}} and calqued from {{der|en|roa-oit|melarancio}}, {{m|it|melarancia}}, compound of {{m|it|mela|t=apple}} and {{m|it|[[un]]'[[arancia]]|t=an orange}}, from {{der|en|ar|نَارَنْج}}, from Early {{der|en|fa-cls|نارنگ|tr=nārang}}, from {{der|en|sa|नारङ्ग|t=orange tree}},<ref name="OnlineED">{{R:Online Etymology Dictionary|entry=orange}}</ref> from {{der|en|dra-pro|*nār-}} (compare {{cog|ta|நார்த்தங்காய்}}, compound of {{m|ta|நரந்தம்|t=fragrance}} and {{m|ta|காய்|t=fruit}}; also {{cog|te|నారంగము}}, {{cog|ml|നാരങ്ങ}}, {{cog|kn|ನಾರಂಗಿ}}).

We'd have something like this:

From {{der|en|<<inh:enm:orenge>>, {{m|enm|orange}}, from <<ibor:fro:pome orenge<t:fruit orange>>>, influenced by the place name {{m|en|Orange}} (which is from Gaulish and unrelated to the word for the fruit and color) and by <<der:pro:auranja>> and <<ical:roa-oit:melarancio>>, {{m|it|melarancia}}, compound of {{m|it|mela|t=apple}} and {{m|it|[[un]]'[[arancia]]|t=an orange}}, from <<ibor:ar:نَارَنْج>>, from Early <<ibor:fa-cls:نارنگ<tr:nārang>>>, from <<ibor:sa:नारङ्ग<t:orange tree>>>,<ref name="OnlineED">{{R:Online Etymology Dictionary|entry=orange}}</ref> from <<ibor:dra-pro:*nār->> (compare {{cog|ta|நார்த்தங்காய்}}, compound of {{m|ta|நரந்தம்|t=fragrance}} and {{m|ta|காய்|t=fruit}}; also {{cog|te|నారంగము}}, {{cog|ml|നാരങ്ങ}}, {{cog|kn|ನಾರಂಗಿ}})}}.

The basic idea is that you can stuff an entire sentence into the second parameter of {{der}} (or whatever), and inheritance/borrowing/calque/etc. relationships are placed inside of <<...>>, similar to {{place}}. The variants ibor:, iinh:, ical:, etc. stand for "indirect borrowing", "indirect inheritance", etc. and indicate that the term in question is borrowed/inherited from the preceding-specified term; this lets the code have access to the full etymology tree, meaning it can do things like automatically find back-borrowings and other interesting phenomena. Benwing2 (talk) 07:45, 4 November 2023 (UTC)[reply]
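To make the proposed `<<...>>` notation concrete, here is a minimal Python sketch of how such links might be pulled out of an etymology string. It is not how {{der}} or {{place}} is actually implemented (those are Lua modules). The grammar assumed here, `<<relation:lang:term<key:value>>>`, is inferred from the example above, and `parse_links` is a hypothetical helper name.

```python
import re

# Assumed grammar: <<relation:lang:term>> with optional inline <key:value> modifiers.
# The lookahead (?!>) keeps a modifier's closing ">" from being mistaken for
# the start of the link's closing ">>".
LINK_RE = re.compile(r"<<(\w+):([\w-]+):(.*?)>>(?!>)")
MOD_RE = re.compile(r"<(\w+):([^<>]*)>")

# Relations that, per the proposal, attach to the preceding term rather than
# to the entry itself ("etc." in the proposal suggests there may be more).
INDIRECT = {"ibor", "iinh", "ical"}


def parse_links(text: str) -> list[dict]:
    """Extract every <<...>> link as a dict of relation, lang, term and modifiers."""
    links = []
    for relation, lang, body in LINK_RE.findall(text):
        links.append({
            "relation": relation,
            "lang": lang,
            "term": body.split("<", 1)[0],      # text before the first modifier
            "mods": dict(MOD_RE.findall(body)),  # e.g. {"t": "fruit orange"}
            "indirect": relation in INDIRECT,
        })
    return links


if __name__ == "__main__":
    sample = ("From <<inh:enm:orenge>>, from "
              "<<ibor:fro:pome orenge<t:fruit orange>>>")
    for link in parse_links(sample):
        print(link)
```

Once the links are available as an ordered list, code that walks the list could in principle look for a language (or one of its earlier stages) occurring twice in the same chain, which is the back-borrowing case under discussion.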

@Theknightwho If we do include back-borrowings of this sort, I would rephrase it not as "what is the (single) true descendant of a given language" but "how far up the chain do earlier stages go"? That means that e.g. Scots and English (since we treat them as separate L2's) could both have Middle English and Old English as earlier stages, but not Proto-West-Germanic, and similarly the various modern Oïl languages would all have Old French as an earlier stage but not Proto-Gallo-Romance. Benwing2 (talk) 07:55, 4 November 2023 (UTC)[reply]

@Benwing2 I feel like the natural cut-off is to only include attested languages, but that may be too broad. Theknightwho (talk) 08:45, 4 November 2023 (UTC)[reply]

@Theknightwho Does that mean Latin counts as an earlier stage of French? Benwing2 (talk) 08:58, 4 November 2023 (UTC)[reply]

Well I suppose it is, and I suppose it’s somewhat interesting to see borrowings back and forth between language families: compare Old/Middle Chinese terms borrowed into Old/Middle Japanese, where the Japanese descendant has been borrowed into Mandarin. Intuitively, those seem notable to me. Theknightwho (talk) 09:15, 4 November 2023 (UTC)[reply]

Any attempt at automation runs into the problem of determining when a given language started. This can’t reliably be determined by their conventional names, as mentioned in the above discussion. It may be best to simply let the status quo stand, leaving users free to decide this on a case-by-case basis. Nicodene (talk) 11:21, 4 November 2023 (UTC)[reply]

Spitballing: have an extra parameter for each language like isStageOf so e.g. ang would be set to enm, and enm to en and sco. Alternatively, store this in a separate module* only {{bor}} et al. access, so it doesn't inflate the size of the module that {{l}}, {{lb}}, {{head}} et al. access. What to consider a stage of what is subjective in places, but I don't think avoiding automating it avoids the problem, since we still need to know whether it's right if an editor manually categorizes a term, so people don't (intentionally, or even unawarely) edit-war over it.
For my part, I'm not sure I would consider Latin to be just an earlier "stage" of French, because Latin split into so many languages and French is not considered the "Modern Latin" (actual la-Latin is). So it'd be useful for us to decide that, regardless of whether we're categorizing manually or by module.
The question also extends to descendants of French, English, etc: if a term in Middle English was borrowed into (middle) French, then borrowed from modern French by Jamaican Creole, was it "borrowed back into Jamaican Creole"? I'm inclined to say no. OTOH an edge case like "term used in colonial-era English texts from Jamaica, borrowed into another unrelated language there, and then borrowed by Jamaican Creole" is the sort of thing I'd suggest allowing manual categorization of.
*In a separate module, each chain could also be separate, if other people actually do want to categorically allow any English term borrowed into another language and then into Jamaican Creole to count as twice-borrowed, and of course allow an Old English term borrowed into [stages of] French and then back into English to count as twice-borrowed, but don't want to consider an Old English term borrowed into French and then into Jamaican Creole to be twice-borrowed. Just have one chain "ang, enm, en" and another "en, jam", and {{der|jam|ang}} would see that no chain contained both "jam" and "ang" and so not count it as 'borrowed back'. - -sche (discuss) 15:16, 4 November 2023 (UTC)[reply]
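The separate-chains idea can be sketched as follows. This is a rough Python illustration rather than actual module data: `STAGE_CHAINS` and `in_same_chain` are hypothetical names, and the third chain (for Scots) is an assumption added to reflect the suggestion that Middle English could be a stage of both English and Scots.

```python
# Each tuple is one hand-curated chain of "stages of a language" (hypothetical data).
STAGE_CHAINS = [
    ("ang", "enm", "en"),   # Old English, Middle English, English
    ("en", "jam"),          # English, Jamaican Creole
    ("ang", "enm", "sco"),  # assumption: Scots shares the earlier English stages
]


def in_same_chain(lang_a: str, lang_b: str, chains=STAGE_CHAINS) -> bool:
    """True if some single chain contains both language codes."""
    return any(lang_a in chain and lang_b in chain for chain in chains)


if __name__ == "__main__":
    print(in_same_chain("en", "ang"))   # both in the first chain
    print(in_same_chain("jam", "en"))   # both in the second chain
    print(in_same_chain("jam", "ang"))  # no chain holds both codes
```

Keeping the chains separate, rather than deriving them transitively, is what stops an Old English term that reached Jamaican Creole via French from counting as borrowed back.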

I've been doing cleanup of {{bor}} vs. {{der}}, and the same issue comes up there: {{bor}} should only be used for borrowing into the language of the entry, but people tend to see the word "borrowed" in an etymology and use {{bor}}, regardless of the steps in between. This is easy to sort out when an English entry uses {{bor}} for the borrowing of an Ancient Greek word into Latin, but there are lots of cases such as English entries where the borrowing occurred in Middle English or Old English, or Indonesian entries where the borrowing was into Classical Malay. I can see how it could get really sticky in cases like borrowings between Scots and English, since they're both descended from Middle English but English speakers tend to think of English as the "real" continuation of Middle English. Then there are the Norwegian lects and their relationship with Danish.

Another thing I see a lot of is the use of {{inh}} for ancestors of terms that were borrowed from a related language, so someone might use {{inh|nb|gem-pro}} for a term that was borrowed from Middle Low German, but that's a separate issue. Chuck Entz (talk) 16:02, 4 November 2023 (UTC)[reply]
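As a rough illustration of the {{bor}} cleanup described above, one simple heuristic is to flag any {{bor}} that appears after an earlier etymological step, since a borrowing into the entry's own language would normally be the first step. The sketch below is in Python and is not an existing cleanup tool. `suspicious_bor` is a hypothetical name, and it assumes positional parameters in the order template name, entry language, source language.

```python
import re

# Matches the start of {{inh|..}}, {{bor|..}} or {{der|..}} and captures the
# template name, the entry-language code and the source-language code.
TEMPLATE_RE = re.compile(r"\{\{(inh|bor|der)\|([^|{}]+)\|([^|{}]+)")


def suspicious_bor(etymology: str) -> list[str]:
    """Return source-language codes of {{bor}} calls that follow an earlier step.

    Heuristic only: a flagged template is a candidate for {{der}}, not a proven error.
    """
    flagged = []
    seen_step = False
    for name, _entry_lang, source_lang in TEMPLATE_RE.findall(etymology):
        if name == "bor" and seen_step:
            flagged.append(source_lang)
        seen_step = True
    return flagged


if __name__ == "__main__":
    text = "From {{inh|en|enm|orenge}}, from {{bor|en|fro|pome orenge}}"
    print(suspicious_bor(text))
```

A check like this cannot tell whether a lone {{bor}} names the right stage (for example Middle English versus English), so those cases would still need manual review.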

Appendix cruft in Citations

e.g. Citations:spectre. I don't think these citations for fancruft appendices should be in "real" citations space, mixing with the useful stuff that meets WT:CFI. Thoughts? Equinox ◑ 16:28, 5 November 2023 (UTC)[reply]

Yes. get them out of there. — SURJECTION ^{/ T / C / L /} 17:30, 5 November 2023 (UTC)[reply]

Why? —Justin (koavf)❤T☮C☺M☯ 01:00, 6 November 2023 (UTC)[reply]

Because the sense they are for is never going to meet CFI. — SURJECTION ^{/ T / C / L /} 06:58, 6 November 2023 (UTC)[reply]

I think the Citations namespace can be used as a place to show that a term is on its way to meeting CFI. I support having the Mass Effect cites there (although there should be a cite that refers to this sense without a mention of the video game.) CitationsFreak (talk) 19:25, 5 November 2023 (UTC)[reply]

I agree with this stance. The citations in the Citations namespace should either count towards meeting CFI or, for particularly rare terms, help clarify the meaning when context alone is insufficient. The namespace should not be used for senses that are not CFI-compliant to begin with. Andrew Sheedy (talk) 19:48, 5 November 2023 (UTC)[reply]

@Andrew Sheedy, CitationsFreak, Daniel Carrero, Equinox, Surjection See Appendix talk:Mass Effect for context. You guys work out what you want to do; this is totally experimental for me and I don't really care. I would encourage you not to judge the Mass Effect cites by Citations:spectre, but instead by one of the better ones: Citations:Ardat-Yakshi. Bro that Citations page kicks ass, as I believe you will agree. Anyway, it's all theoretically identical to Citations:protocol droid which has been around for decades with no problem. lol lmao &c. --Geographyinitiative (talk) 23:43, 5 November 2023 (UTC)[reply]

You are correct, it is a very good Citations page. Not ready for a main entry, but something worth being there. CitationsFreak (talk) 23:47, 5 November 2023 (UTC)[reply]

Yes, but the entry at protocol droid was deleted. Jberkel 00:01, 6 November 2023 (UTC)[reply]

@Jberkel Please see Appendix:Star Wars, where 'protocol droid' is listed in an in-universe fancruft appendix containment zone with a link to the ancient page Citations:protocol droid. My goal with the recent Citations pages for Mass Effect in-universe words was to do something similar in most respects. --Geographyinitiative (talk) 00:25, 6 November 2023 (UTC)[reply]

@@ Line 143: / Line 143: @@
 :Yes. get them out of there. &mdash; [[User:Surjection|S<small>URJECTION</small>]] <sup>/''[[User talk:Surjection| T ]]''/''[[Special:Contributions/Surjection| C ]]''/''[[Special:Log/Surjection| L ]]''/</sup> 17:30, 5 November 2023 (UTC)
 ::Why? —[[User:Koavf|Justin (<span style="color:grey">ko'''a'''vf</span>)]]<span style="color:red">❤[[User talk:Koavf|T]]☮[[Special:Contributions/Koavf|C]]☺[[Special:Emailuser/Koavf|M]]☯</span> 01:00, 6 November 2023 (UTC)
+:::Because the sense they are for is never going to meet CFI. &mdash; [[User:Surjection|S<small>URJECTION</small>]] <sup>/''[[User talk:Surjection| T ]]''/''[[Special:Contributions/Surjection| C ]]''/''[[Special:Log/Surjection| L ]]''/</sup> 06:58, 6 November 2023 (UTC)
 :I think the Citations namespace can be used a place to show that a term is on its way to meeting CFI. I support having the Mass Effect cites there (although there should be a cite that refers to this sense without a mention of the video game.) [[User:CitationsFreak|CitationsFreak]] ([[User talk:CitationsFreak|talk]]) 19:25, 5 November 2023 (UTC)
 ::I agree with this stance. The citations in the Citations namespace should either count towards meeting CFI or, for particularly rare terms, help clarify the meaning when context alone is insufficient. The namespace should not be used for senses that are not CFI-compliant to begin with. [[User:Andrew Sheedy|Andrew Sheedy]] ([[User talk:Andrew Sheedy|talk]]) 19:48, 5 November 2023 (UTC)
