Wiktionary:Beer parlour



Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.

Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.

Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!


October 2021

Definitions of Letters

As words of a particular language, many letters have definitions such as "the second letter of the Welsh alphabet". (The Welsh entries themselves are not quite so bad, as they also then spell out the letter and give its predecessor and successor.) Such definitions are intrinsically unstable, for letters may be inserted in an alphabet. For example, the letter 'j' has been added to the Welsh alphabet since I was a child, and as a result of different sources we now have the opening definition "the fourteenth letter of the Welsh alphabet" for both J and L! As a result of the deletion of letters, both Ll and N are defined as 'the 14th letter of the Spanish alphabet'. --RichardW57m (talk) 11:11, 1 October 2021 (UTC)
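To make the instability concrete, here is a minimal Python sketch. It assumes the usual Welsh alphabet ordering with digraphs counted as single letters; the list and the helper names are illustrative only and are not taken from any Wiktionary module.

```python
# Assumed Welsh alphabet before 'j' was admitted (digraphs are single letters).
WELSH_WITHOUT_J = [
    "a", "b", "c", "ch", "d", "dd", "e", "f", "ff", "g", "ng",
    "h", "i", "l", "ll", "m", "n", "o", "p", "ph", "r", "rh",
    "s", "t", "th", "u", "w", "y",
]


def with_letter_after(alphabet, new_letter, after):
    """Return a copy of `alphabet` with `new_letter` inserted right after `after`."""
    position = alphabet.index(after) + 1
    return alphabet[:position] + [new_letter] + alphabet[position:]


def ordinal(alphabet, letter):
    """Return the 1-based position of `letter` in `alphabet`."""
    return alphabet.index(letter) + 1


WELSH_WITH_J = with_letter_after(WELSH_WITHOUT_J, "j", after="i")

print(ordinal(WELSH_WITHOUT_J, "l"))  # 14: L in the older ordering
print(ordinal(WELSH_WITH_J, "j"))     # 14: J once it is inserted
print(ordinal(WELSH_WITH_J, "l"))     # 15: L is pushed down by one
```

Any definition that hard-codes the ordinal has to be revised for every letter after the insertion point, which is how two entries can end up claiming the same position.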

I therefore feel that it would be appropriate to change definitions of one-character letters from "the nth letter of the WW alphabet" to "the letter of the WW alphabet used as the header word of this entry", and add "It is the nth letter of the WW alphabet" to the "Trivia" section of the entry. History may cause the trivium section to expand. Multi-character letters would be handled by analogy. As boldly making this change might be considered vandalism, what do people feel about this proposed change? Does it need a vote? --RichardW57m (talk) 11:11, 1 October 2021 (UTC)[reply]

Should we be documenting the use of letters in non-additive numbering systems, such as 'Section 5(c)'? The most significant feature of such systems is that some letters are not used in such lists. I can see an argument that such documentation belongs to a grammar, rather than a lexicon.--RichardW57m (talk) 11:11, 1 October 2021 (UTC)[reply]

I feel like this discussion will be pointless if the vote about letters entries passes. Thadh (talk) 11:23, 1 October 2021 (UTC)
@Thadh: How so? Are you assuming that all the letter entries of a language can be squeezed into a single table? --RichardW57m (talk) 12:30, 1 October 2021 (UTC)
@RichardW57m: Not necessarily in a table, but they probably won't look the same way they do now, so it doesn't make much sense to discuss the way they look in entries before we know where the vote's heading. Thadh (talk) 13:47, 1 October 2021 (UTC)

Rhyming categories for Middle Chinese

I think all the data for Middle Chinese rhymes are already there. Those data were sourced from rhyme dictionaries in the first place. Is there a plan for actually implementing Middle Chinese rhyming categories? This may even be a fairly good case for automation. --Frigoris (talk) 16:53, 1 October 2021 (UTC)

HSK lists of Mandarin words update

Currently, Wiktionary has Appendix:HSK list of Mandarin words accumulating all the vocabulary of the old (pre-2010) HSK test. Recently, the exam was reformed, and the lists of words and characters were published. See this pdf for official specifications. Thus, I propose to update the appendix.

I made drafts of the new HSK word lists:

HSK Beginner (levels 1-3): all three levels

HSK Intermediate (levels 4-6): level 4, level 5, level 6

HSK Advanced (levels 7-9): a-h, j-s, sh-zh

The words are OCRed from the paper, and then converted into traditional characters with some manual corrections. I think some proofreading is still needed.

The following problems arise here:

  1. What should be done with the old appendix?
  2. How should the new appendix be divided? The current version of the HSK has 9 levels grouped in 3 ranks. The high levels (7-9) are not delimited, but they contain roughly as many words as all the preceding levels combined (5636 vs 5456). Note that it's computationally heavy to have a huge number of words in Template:zh-l on a single page.
  3. There is a category tied to the old word lists, see Category:Mandarin by difficulty level. You may want to reorganize it.
  4. Many words in the HSK can be considered SoPs, and some of them were previously deleted on that ground (see the red links on my drafts).
  5. Many words in the HSK have optional erhua. How should they be listed in the new Appendix?
  6. I think everyone would agree on inclusion of traditional forms of the words, but what should be done about the variant pronunciations (Taiwanese or colloquial Mainland) not listed in the official HSK paper? Should they also be included? --YousuhrNaym (talk) 23:51, 3 October 2021 (UTC)[reply]

Let's talk about the Desktop Improvements



Have you noticed that some wikis have a different desktop interface? Are you curious about the next steps? Maybe you have questions or ideas regarding the design or technical matters?

Join an online meeting with the team working on the Desktop Improvements! It will take place on October 12th, 16:00 UTC on Zoom. It will last an hour. Click here to join.


  • Update on the recent developments
  • Sticky header - presentation of the demo version
  • Questions and answers, discussion


The meeting will not be recorded or streamed. Notes will be taken in a Google Docs file. The presentation part (first two points in the agenda) will be given in English.

We can answer questions asked in English, French, Polish, and Spanish. If you would like to ask questions in advance, add them on the talk page or send them to sgrabarczuk@wikimedia.org.

Olga Vasileva (the team manager) will be hosting this meeting.

Invitation link

We hope to see you! SGrabarczuk (WMF) 15:09, 4 October 2021 (UTC)[reply]

Unifying the transliteration of ʾalef and ʿayin in Semitic languages

Dear Wiktionary Semitists, I'd like to bring to your attention the current lack of consistency in how ʾalef and ʿayin are transliterated across Semitic languages. Have a look at the following pages and compare transliterations, for example:

  1. Reconstruction:Proto-Semitic/ʕaśar-#Descendants.
  2. Reconstruction:Proto-Semitic/tišʕ-
  3. Reconstruction:Proto-Semitic/šabʕ-

The inconsistency is both inter- and intra-linguistic. It is quite confusing, and since it's basically just a stylistic question, I'd like to start a discussion on whether we should unify to the more traditional (but not user-friendly, since they're small and difficult to tell apart) /ʾ/ and /ʿ/ or the more modern (and much more user-friendly) /ʔ/ and /ʕ/. Opinions? Thoughts? Let's discuss! -- This unsigned comment was added by Sartma (talk • contribs) at 12:22, 5 October 2021 (UTC).

For Amharic, ʾ and ʿ are the ones in use and since these aren't contrastive, I would like to keep following that practice. I don't have a strong opinion on other Semitic languages though, but /ʔ/ and /ʕ/ do seem more user-friendly in languages where that distinction is relevant. Thadh (talk) 19:30, 5 October 2021 (UTC)[reply]
I'd rather consistency between languages that frequently appear together, like the Ge'ez-script languages or Arabic topolects. I don't see any reason why there should be consistency between all Semitic languages, which only appear next to each other on protolanguage entries. —Μετάknowledgediscuss/deeds 20:32, 5 October 2021 (UTC)[reply]
In my own handwritten notes I find I'm using the IPA symbols as just clearer. We don't have to use pure IPA in transcriptions, but the traditional little curly apostrophes, barely readable in a printed book, become impossible in a computer typeface. The IPA symbols magnify them and make them readable. If you're going to use š rather than sh in transcriptions, you're half way to pure phonetic symbols. The apostrophes are appropriate for semi-technical formats like maps and history books, but for a more linguistic purpose, use clear, readable, unambiguous symbols. --Hiztegilari (talk) 20:58, 5 October 2021 (UTC)[reply]
I support the IPA symbols except for the Gəʿəz-script languages, in which field the half rings seem uncontested and where, as mentioned, the distinction is also less contrastive. For Akkadian I don’t know. Fay Freak (talk) 22:57, 5 October 2021 (UTC)
I have seen that in the Routledge volume The Semitic Languages, most authors use ʔ and ʕ even when they use conventional non-IPA symbols otherwise, e.g. ʔǝgziʔ-ä sämay yä-ṣnǝʕ mängǝśt-ǝyä (Butts, chapter "Gǝʕǝz"). I don't know if this is a general trend, but consistently using ʔ and ʕ in place of ʾ and ʿ is nothing unseen. –Austronesier (talk) 10:33, 6 October 2021 (UTC)[reply]
@Austronesier: True, I remember these fashionable books. They owe it to their character as general overviews, while Wiktionary’s mission is to document the individual languages in detail, as one does when dealing with a narrow selection of languages. I framed the field as Ethiopian studies (Äthiopistik). While this is an Orchideenfach whose present-day students I do not know, I have little doubt that the bulk of the field would be gutted if it saw a deviation from the transcription system which we currently apply automatically and which is of course followed, without any question or glance at an alternative, by the Wikipedia article on Geʽez script (so nobody seeks an article like Romanization of Arabic for Ethiopian Semitic), and Ethiopists would rather refrain from any change to it. Fay Freak (talk) 16:05, 6 October 2021 (UTC)
@Fay Freak: Good point. I can confirm from my very own experience that editors of such overview volumes set standards which contributors wouldn't normally follow in more specialized works: e.g. I was urged to change the name of a language to make it conform with the ISO standard (in that special case a real abomination). Since you say that the Gəʿəz transliteration in that book was adjusted to an in-volume standard that is otherwise uncommon, I agree we shouldn't really follow it. –Austronesier (talk) 16:28, 6 October 2021 (UTC)
I've just picked Al-Jallad as an example: in the Routledge volume, he uses ʔ and ʕ in the Safaitic chapter; but in his Safaitic grammar (Brill), he naturally uses ʾ and ʿ. –Austronesier (talk) 16:46, 6 October 2021 (UTC)[reply]
@Austronesier, Fay Freak, Thadh, Metaknowledge Ok, it looks like the majority is ok with using different signs depending on the language. But what about those languages that don't seem to have a standard at the moment, like Aramaic (in its various varieties), Hebrew, and Arabic and its topolects? To be honest, despite much preferring ʔ and ʕ, I'm more than happy to unify everything to ʾ and ʿ. In the end, there's no real "tradition" that uses ʔ and ʕ; these are just the more "modern" style. To me it's really strange to see Standard Arabic using ʾ/ʿ and other Arabic topolects using ʔ/ʕ, for example. There's no reason why it should be like this. What shall we do? Sartma (talk) 15:16, 7 October 2021 (UTC)
ʔ and ʕ—the easier if standardization is less relevant. I made the exception only for Ethiosemitic—which is separated by a mere, anyway; I think it will vex you not if we have ʾ and ʿ for Ethiosemitic and ʔ and ʕ elsewhere. Fay Freak (talk) 15:31, 7 October 2021 (UTC)[reply]
Arabic needs input from a great deal more people than will see and interact with this; we'd want a dedicated discussion at Wiktionary talk:About Arabic. As for Aramaic, it will never be completely unified, because some of the modern neo-Aramaic varieties have romanisation traditions that emerged independently from scholarly usage, and should be left as they are. For the long-extinct Aramaic varieties, we can do as we like, and though ʾ and ʿ are the closest we have to a standard for them, I would be happy to switch them over to ʔ and ʕ — although that could be putting the cart before the horse, in that most of the entries don't have romanisation at all and the scheme isn't completely settled anywhere. —Μετάknowledgediscuss/deeds 17:37, 7 October 2021 (UTC)[reply]

Request for new language family and proto-language codes: North Halmahera / Proto-North Halmahera

User:Alexlin01 and I (or better, mostly Alexlin01 who has been active as IP in the past) have started to add lemmas from languages of the North Halmahera family, together with etymologies from reconstructed proto-forms. There is an existing corpus of 180 proto-forms available, and we might carefully add more reconstructions based on regular sound correspondences.

The North Halmahera languages are part of the proposed West Papuan macrofamily which has the code [paa-wpa] in WT. While West Papuan is still tentative and only based on resemblance sets, North Halmahera is universally accepted, since it is as self-evident as e.g. the Slavic languages. Therefore, we request a code for North Halmahera and Proto-North Halmahera. North Halmahera would be under [paa-wpa] (West Papuan), and include the following languages:

  • Galela [gbi]
  • Gamkonora [gak]
  • Ibu [ibu]
  • Kao [kax]
  • Laba [lau]
  • Loloda [loa]
  • Modole [mqo]
  • Pagu [pgu]
  • Sahu [saj]
  • Tabaru [tby]
  • Ternate [tft]
  • Tidore [tvo]
  • Tobelo [tlb]
  • Tugutil [tuj]
  • Waioli [wli]
  • West Makian [mqs]

Currently, they are under [paa-wpa] (West Papuan) or the generic [paa] (Papuan). ‑Austronesier (talk) 07:35, 6 October 2021 (UTC)[reply]

Hi! Also, of these, Ibu is already extinct. Alexlin01 (talk) 14:34, 6 October 2021 (UTC)
@Austronesier Created paa-nha and paa-nha-pro. DTLHS (talk) 03:10, 8 October 2021 (UTC)
@DTLHS Great, many thanks! –Austronesier (talk) 08:43, 8 October 2021 (UTC)

Inconsistent treatment of Arabic words in Persianate languages

(Notifying AryamanA, Atitarev, Benwing2, Smettems, Kutchkutch, Bhagadatta, Msasag, Svartava2, Getsnoopy): @Allahverdi Verdizade

There is an inconsistency in the treatment of Arabic words in Persianate languages.

  • In South Asian languages, the proximal donor is given as Persian.
  • In Turkic languages (especially Turkish and Azeri), the proximal donor is given as Arabic.

For example, Hindi किताब (kitāb) is given as coming from Classical Persian کتاب(kitāb), while Azerbaijani kitab or Uzbek kitob is given as ("ultimately") coming from Arabic كِتَاب(kitāb) with no mention of Persian.

Could this be resolved one way or another? I suppose it's a bit iffier for Anatolian Turkish given that the Ottomans had direct contact with Arabic-speaking subject populations, but for Azeri Turkish or the Central Asian languages it should be the same situation as with South Asian languages, i.e. these words entered the language through the means of a Persianate literati class who used both Persian and Arabic, but whose primary language of writing was the former.

My understanding is that there is evidence of Persian mediation for both South Asian and Turkic languages, e.g. Hindi फ़ुर्सत (fursat) meaning "spare time" or Turkish macera meaning "adventure".--Tibidibi (talk) 13:32, 6 October 2021 (UTC)[reply]

Also ping @Vox Sciurorum, @Fay Freak.--Tibidibi (talk) 13:50, 6 October 2021 (UTC)[reply]
I mark Ottoman Turkish and Turkish terms as derived from Arabic unless I have evidence that one was borrowed from Persian. If the word has been in Turkic languages from before the 13th century or so I may assume it was borrowed from Persian. Nineteenth century borrowings I assume were directly from Arabic, if not Ottoman coinages based on Arabic grammar. If there are any phonological or temporal guidelines to use, let me know. Vox Sciurorum (talk) 13:53, 6 October 2021 (UTC)[reply]
@Vox Sciurorum I think there is a stronger justification for having Ottoman terms be derived directly from Arabic because Persian was neither the language of the Ottoman administration nor that of any significant part of the population. For the Turkic languages east of the Ottoman-Safavid border, and for all South Asian languages, the influence of Persian as a prestige language was much more direct.--Tibidibi (talk) 14:10, 6 October 2021 (UTC)[reply]
What Squirrels Voice said.
Also, I see zero value in clogging up the etymology of Arabic derivatives with an extra piece of information, which is hardly provable anyway if it came in through Persian or directly via bookish contexts. Allahverdi Verdizade (talk) 14:15, 6 October 2021 (UTC)[reply]
The Seljuk dynasty that invaded Anatolia after their victory in the Battle of Manzikert was a Persianate society. While Ottoman Turkish was not Persian, the language was replete with loanwords from Persian covering cultural and administrative terminology, while Arabic was the donor for many religious terms. Some of the Persian loanwords the Seljuks brought with them to Anatolia came from Arabic. It is IMO truly impossible to decide whether the proximate source of Ottoman Turkish فلسفه‎ was the (identically spelled) Persian term, or, directly, Arabic فلسفة‎. The choice not to mention Persian as a possible donor is then merely a choice for the sake of convenience, not a matter of principle.  --Lambiam 17:25, 10 October 2021 (UTC)[reply]
The distribution of Persian has a cohesive epicentre while Arabic has been scattered all around the world. Have you heard of Uzbeki Arabic? Now Samarqand clearly was a hotspot of Arabic communication; from there Arabic-speaking tradesmen in low concentration reached Uyghuristan, in the vicinity of which Arabs learned words like خُتُو(ḵutū), on the entry of which I included a quote where Samarqand occurs as a casual station of Arabic rulers; I don’t think one has to imagine the mediation of communication by Persian, contact was generally Arabic language to Turkic language, this regard is most parsimonious. Fitting this picture, Persian words use to reach Mongolian but via Tibetan (!). For Anatolian Turkish it is only most prominent and most obvious, to a Westerner, that contact with Arabic was there, because Arabs were Ottoman subjects (but so they were Kipchak and Turkmen subjects before …). Fay Freak (talk) 15:38, 6 October 2021 (UTC)[reply]
@Fay Freak: Arab speakers in Khorasan are a small minority because the colonists there assimilated quickly. From The Cambridge History of Iran, Volume 4, page 602:
Alongside both the early dialects and dari, which had spread everywhere with a greater or lesser degree of local variation, Arabic had also taken root in Iran. It was of course the everyday language of the Arab immigrants: certain towns such as Dinavar, Zanjan, Nihavand, Kashan, Qum and Nishapur had a considerable Arab population and Arab tribes had also settled in Khurasan. However, these Arab elements were more or less rapidly assimilated: in the middle of the 2nd/8th century the majority of the Arabs in the army of Abu Muslim spoke dari.
In fact, the Islamic conquest led to the expansion of Persian and its replacement of local Eastern Iranian languages like Sogdian.
Since major urban centers such as Bukhara and Samarqand were clearly predominantly Persophone by the period when the region was becoming increasingly linguistically Turkic, I don't see any justification for claiming that most Arabic loans in e.g. Uzbek are directly from the small community of native Arabic speakers instead of reflecting Arabic's position as a prestige language upheld by a primarily Persophone literati elite.
Chagatai, the direct literary ancestor of Uzbek, was marked by extensive Persian influence (to the point that some texts have virtually no Turkic content words) and became a literary language explicitly on the model of Persian in Timurid and Shaybanid courts, both of which retained Persian as the chief bureaucratic language. I understand that Chagatai has little additional Arabic influence beyond what is already systemically found in Persian. Tibidibi (talk) 16:16, 6 October 2021 (UTC)[reply]
The point was that there had been a constant latent presence of Arabic, not only as traces in Persian. Be the communities more or less native or be they acquainted with it due to trade or war or education. Arabic was never eradicated and the influx was continuously renewed. While in India this latent presence lacked, Arabic was really remote and for the educated. Oddly of course Persian scholars wrote Arabic – for Samarqand I think of Najib ad-Din Samarqandi – while Indians wrote Persian, does this tell us something for the question of the thread? So in the former borrowings could be more from Arabic due to some familiarity. Fay Freak (talk) 16:30, 6 October 2021 (UTC)[reply]
If you actually read about Central Asian Arabic, you'll see that they bear signs of having close ties to dialects in Arab countries, which allows us to reconstruct migration events. This is clearly inconsistent with a "constant latent presence" of actual speakers (as opposed to scholars and clerics, who could only influence the language on a literary or religious level). For Indian and Central Asian Turkic languages, there is no reason not to assume a Persian intermediary unless specific evidence is brought to bear for a given word; for Turkish and Azerbaijani, I don't think it's generally knowable. —Μετάknowledgediscuss/deeds 03:30, 8 October 2021 (UTC)[reply]
@Metaknowledge Why do you think it's unknowable for Azerbaijani? I'm not really sure what the major difference would be between Azerbaijani and Chagatai vis-a-vis their relationship to Arabic/Arabs and Persian/Persians. Tibidibi (talk) 14:09, 10 October 2021 (UTC)[reply]
Because the West Oghuz tribes have actually been geographically adjacent to Arabs since around 1000 AD. Allahverdi Verdizade (talk) 10:53, 13 October 2021 (UTC)[reply]

Romanization pages for Mandarin and Cantonese - possible update task for a bot?

Currently, the various romanization pages for Mandarin Pinyin and Cantonese Jyutping are in a poor state. I presume that, owing to the quantity and ancillary nature of such entries, many lack up-to-date content for common characters, and the presentation of the relevant characters is inconsistent. Some examples:

  • For 烹, the pinyin entry pēng shows characters such as 硷 and 軽, which are simplified or variant forms but the linked traditional forms do not show this pronunciation. In the case of 軽, this character is more commonly recognised as Japanese Shinjitai since the regularly observed Chinese forms are 輕 and 轻.
  • paang1 does not show 烹 at all
  • xiǎn shows in list items 5 崄 and 6 嶮 which are the simplified and traditional version of the same character, while lower down item 23 lists 猃, 獫 together.
  • Also in xiǎn, item 17 濁 is shown but the simplified form is not included.

This seems to be a good target for a bot to update the entries if it is able to take all the existing pinyin and Jyutping pronunciations for all characters and to update the entries systematically, while also standardising the presentation of simplified and variant character forms. A good example to reference is shí which has a good number of entries (however I'm not sure if it includes all) and most entries list the traditional and simplified forms together. This entry does however list item 2 as "実, 实, 實, 寔", which is a bizarre ¿alphabetical? order of Shinjitai, simplified, traditional and variant characters. As for item ordering, it might seem like it is ordered by radical and stroke - this might be something that needs consideration for standardisation of the romanisation entries.

Would anybody be able to take on this task?

I can try to build such a bot, but I have not built bots before, and I believe it requires scraping the pronunciations from all the existing entries, which will be an arduous task in itself, even if done with automation.

Zywxn (talk) 17:14, 6 October 2021 (UTC)
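As a rough illustration of the grouping step such a bot would need, here is a minimal Python sketch. The two lookup tables are hypothetical hand-written samples based on the xiǎn examples in the post above; a real bot would have to collect this data from the character entries themselves, and the function name is likewise only an assumption for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, not scraped from Wiktionary.
READINGS = {          # traditional character -> pinyin readings
    "嶮": ["xiǎn"],
    "獫": ["xiǎn"],
}
SIMPLIFIED = {        # traditional character -> simplified form
    "嶮": "崄",
    "獫": "猃",
}


def build_romanization_index(readings, simplified):
    """Group characters by reading, listing each traditional form
    together with its simplified form (if any) as a single item."""
    index = defaultdict(list)
    for trad in sorted(readings):  # code-point order; a stand-in for radical/stroke order
        forms = [trad]
        simp = simplified.get(trad)
        if simp and simp != trad:
            forms.append(simp)
        for reading in readings[trad]:
            index[reading].append(", ".join(forms))
    return dict(index)


for reading, items in build_romanization_index(READINGS, SIMPLIFIED).items():
    for number, item in enumerate(items, start=1):
        print(reading, number, item)
```

Keeping each traditional/simplified pair as one item is what would avoid the split listing of 崄 and 嶮 noted above; the sort key is the part most likely to need replacing if radical-and-stroke ordering is chosen as the standard.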

User TheNicodene - revert war to hide unresolved abuse

The user is trying to obstruct my efforts at bringing to attention and addressing the abuse they've perpetrated against me, by deleting and archiving the discussion at Talk:formaticus. They're trying to hide the abuse and break the existing links in other discussions. The issue is not resolved and cannot be archived until it is. I request this user be blocked if they continue the edit war. Brutal Russian (talk) 05:31, 7 October 2021 (UTC)

I did not 'hide' the discussion; that is a flat-out lie which can be disproved by clicking the link. I placed the discussion in an archive and added a link at the top of the talk page; doing so with discussions over 75000 bytes, in order to free up space for new discussions, is standard Wiki practice. The discussion has not even been replied to for four months now. Nor did archiving it 'break links', which is another flat-out lie. Talk:formaticus functions exactly as it always did.
See here for a write up of only some of the insults this user has thrown at me over several months, for which he has even been temporarily blocked. I have no idea why he is suddenly acting up again after a merciful three-month hiatus. The Nicodene (talk) 05:53, 7 October 2021 (UTC)[reply]

Macedonian: standard, non-standard, misspelling

@Chuck Entz, Erutuon, Metaknowledge Since I am now creating entries for non-lemma forms of verbs, I would like to discuss some issues relating to the treatment of non-standard and misspelled words. We scratched the surface with User:Erutuon in August, but there are quite a lot of problems to be addressed:

Currently, my entries are formatted as follows:

Assigned to: verbs, lemmas (I am omitting less relevant categories)
Assigned to: misspellings, non-lemmas
Assigned to: participles, non-lemmas
Assigned to: participles, misspellings, non-lemmas
  • очерупа - nonstandard word, non-lemma: "verb" in the headword line, {{lb|mk|nonstandard}} in the definition
Assigned to: verbs, non-standard terms, lemmas
Assigned to: participles, non-lemmas

The problems are as follows:

  • It is also possible to treat корегиран as a misspelling of коригиран, i.e. to link two non-lemma forms to each other, rather than defining each as an inflected form of a lemma. I have always tended to opt for the second solution, including with categories other than participles.
  • Putting "misspelling" in the headword line of a misspelled verb lemma prevents it from being assigned to "verbs", but putting "misspelling" in the headword line of a misspelled participle (non-lemma form) of a verb does not prevent it from being assigned to "participles", because the parameter "part" inside {{infl of}} seems to populate that category.
  • "misspelling" does not distinguish between misspelled lemmas and misspelled non-lemmas.
  • Non-lemma forms of non-standard words are not labelled in any way to indicate that they are non-standard, because if I write {{lb|mk|nonstandard}}, they will get categorized as non-standard terms, which is wrong (they are not terms but non-lemmas), whereas if I write {{lb|mk|nonstandard forms}}, that will technically be correct, except that this label is used elsewhere for non-standard forms of standard words (comparable to English "goed", a non-standard preterite of the standard "go").

Further complications:

Participles have their own inflection, e.g. "коригираниот", which is the definite form. I do not want this to link back to the verb коригира; it is more appropriate for it to be defined as {{infl of|mk|коригиран||def|m|s}}. It will then be assigned to participle forms, with the help of the headword line {{head|mk|participle forms}}. However, if the inflected participle is misspelled as "корегираниот", it would be defined as {{infl of|mk|корегиран||def|m|s}} and the headword line would be {{head|mk|misspelling}}. Consequently, there would be nothing to assign "корегираниот" to participle forms. This would be a second inconsistency, in addition to the aforementioned one ("misspelling" suppresses the category "verbs" but not the category "participles"). Martin123xyz (talk) 11:48, 7 October 2021 (UTC)

Ideal solution:

Redefine the category system to have the following:

  • lemmas
  • non-lemma forms
  • misspelled lemmas
  • misspelled non-lemma forms
  • non-lemma forms of misspelled lemmas
  • non-standard lemmas
  • non-standard non-lemma forms
  • non-lemma forms of non-standard lemmas

Each of these would contain subcategories for "noun", "verb", "adjective" instead of "lemma", e.g. "misspelled nouns", "misspelled noun forms", "forms of misspelled nouns", etc. There would be separate headers for each, e.g. {{head|mk|noun}}, {{head|mk|misspelled noun}}, {{head|mk|form of misspelled noun}} (with abbreviations for easier typing).
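To show how the proposed names would fall out mechanically, here is a minimal Python sketch. The function `category_name` and its parameters are hypothetical and exist only for this illustration; nothing here reflects how {{head}} or the category modules currently work.

```python
def category_name(pos, status=None, kind="lemma"):
    """Build a category name under the proposed scheme.

    pos    -- part of speech in the singular, e.g. "noun", "verb"
    status -- None, "misspelled" or "non-standard"
    kind   -- "lemma": the entry is itself a lemma
              "form": the entry is a (possibly marked) non-lemma form
              "form_of_lemma": a regular form of a misspelled/non-standard lemma
    """
    plural = pos + "s"
    prefix = f"{status} " if status else ""
    if kind == "lemma":
        return f"{prefix}{plural}"
    if kind == "form":
        return f"{prefix}{pos} forms"
    if kind == "form_of_lemma":
        if not status:
            raise ValueError("form_of_lemma needs a status such as 'misspelled'")
        return f"forms of {status} {plural}"
    raise ValueError(f"unknown kind: {kind}")


print(category_name("noun", "misspelled"))                   # misspelled nouns
print(category_name("noun", "misspelled", "form"))           # misspelled noun forms
print(category_name("noun", "misspelled", "form_of_lemma"))  # forms of misspelled nouns
```

The three arguments correspond to the three independent distinctions drawn above: part of speech, standard versus misspelled versus non-standard, and whether the marking belongs to the form itself or to its lemma.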

For dealing with non-lemma forms of non-lemma forms, like the declined forms of Macedonian participles, we would need the following:

  • participles < verb forms
  • misspelled participles < misspelled verb forms
  • participles of misspelled verbs < non-lemma forms of misspelled verbs
  • non-standard participles < non-standard verb forms
  • participles of non-standard verbs < non-lemma forms of non-standard verbs
  • participle forms
  • forms of misspelled participles
  • forms of participles of misspelled verbs
  • forms of non-standard participles
  • forms of participles of non-standard verbs

This is in my opinion the maximal categorization that we arrive at when we take into account all the relevant factors that my creating Macedonian entries has brought to the fore so far. Any other system, including the current one, seems to me to be bound to blur at least one of the empirically established distinctions highlighted above.

I am assuming that no one will be happy to implement such a categorization system, but the overview I have provided above should still be helpful for keeping track of what exactly the current system obscures and coming up with improvements addressing individual problems only. Needless to say, the distinctions that I have presented will also apply to many other languages.

Pending improvements, I would like to ask if the way I format the six types of entries listed at the start of this post is appropriate for the time being, or is there something I could do better, or even should, according to Wiktionary policies. Martin123xyz (talk) 11:48, 7 October 2021 (UTC)[reply]

In my opinion, a misspelt noun or verb is still a noun or verb, and should be categorised as such. Converting the header line of a lemma to the header line of a misspelling is Visigothism, even if committed by @Equinox, and in English loses the mentions of inflections that one could otherwise find by searching. {{misspelling of}} provides the appropriate information and categorisation. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]
When adding "misspelling" to the header line in addition to using {{misspelling of}}, I was complying with the instructions provided at Wiktionary:Misspellings. However, your suggestion resolves the two inconsistencies I referred to above. Martin123xyz (talk) 07:03, 8 October 2021 (UTC)[reply]
My thought on reading that is 'Quo Warranto?'. I don't know whether to amend Wiktionary:Misspellings, tag it as unadopted or simply request its deletion. Can anyone justify not treating misspelt English verbs as verbs? One problem is that a manual maintenance action needed for verbs will not happen simply because misspelt verbs are not listed as verbs. --RichardW57 (talk) 08:03, 8 October 2021 (UTC)[reply]
Requesting its deletion without providing new instructions would not be helpful. As long as there are some instructions, at least a certain degree of consistency between different users' contributions is ensured. And if you leave it as it is, more users will find it, assume that it is an official policy which enjoys the consensus of the community, and continue to adhere to it. Either way, the instructions for contributors regarding things like "misspellings" need to be significantly expanded - currently they are simplistic, in addition to being biased in favour of English entries. I am considering writing a user guide for Macedonian contributions, except that so many things are unregulated or poorly regulated on the English Wiktionary as a whole that I would need to make my own arbitrary decisions or keep asking here about every point. Martin123xyz (talk) 10:04, 8 October 2021 (UTC)[reply]
'Term' covers both lemma and non-lemma. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]
Full information about a non-lemma should be given under the lemma; one would not wish to repeat the multiple meanings of a lemma for its inflected forms. Accordingly, it should suffice to record that something is the inflected form of a non-standard term by recording the non-standardhood at the parent term itself. --RichardW57 (talk) 02:52, 8 October 2021 (UTC)[reply]
Thank you for the input. Martin123xyz (talk) 07:03, 8 October 2021 (UTC)[reply]

I have noticed a further problem: not only is "nonstandard form" ambiguous between "inflected form of a nonstandard lemma" and "non-standard form of a standard lemma", it can also be understood as "nonstandard equivalent/variant of a standard lemma" (on the analogy of "alternative form of"). I had used it in this sense at допринесува recently. Regrettably, {{nonstandard form of}} does not address this three-way ambiguity. Martin123xyz (talk) 14:00, 8 October 2021 (UTC)[reply]

I just created a page for витруелен (vitruelen), using {{head|mk|misspelling}} and {{misspelling of|mk|виртуелен}}, and the entry appears in Category:Macedonian non-lemma forms and Category:Macedonian misspellings, which is wrong, because the word is a misspelled lemma, not a non-lemma form. Maybe we need to use {{head|mk|misspelled lemma}} instead, and put those entries in Category:Macedonian misspelled lemmas? Gorec (talk) 14:47, 8 October 2021 (UTC)[reply]
The argument for using misspelling as a part of speech actually argues for splitting the lemma categories into misspelt and 'correctly' spelt lemmas. I'd rather add a parameter to {{mk-noun}} and {{en-verb}} etc. I'm waiting for an old hand to weigh in. --RichardW57 (talk) 16:48, 8 October 2021 (UTC)[reply]


As I've suggested before, we should establish an arbitration committee (much like the one Wikipedia has) to settle entrenched disputes among users. The finer details can be discussed later, but in general, is there any considerable support for this proposal? Imetsia (talk) 19:14, 8 October 2021 (UTC)[reply]

There is from my part! Of course we hope to not have any disputes at all, but as the previous year has shown, they are inevitable in a project of our size. Thadh (talk) 19:36, 8 October 2021 (UTC)[reply]
Just as seatbelts and airbags have led to more automobile accidents, creating an arbitration committee is guaranteed to lead to more intransigence. Participants in such disputes are all fairly confident that they are in the right and that their PoV will be the prevailing one, with only minor concessions to the other side. Also, there will be less avoidance of potentially controversial edits and other changes because one's point of view will be perceived as more likely to prevail. DCDuring (talk) 20:23, 8 October 2021 (UTC)[reply]
I think I should clarify: I don't know how WP's arbitration works, but my idea was similar to what Vox Sciurorum proposes below. I think we ought to have some system where unaffiliated admins can resolve ongoing disputes. Thadh (talk) 10:36, 11 October 2021 (UTC)[reply]
ArbCom over at Wikipedia has not been a roaring success. It is very important that we recognise that the way their judicial system works is not ideal, it is simply how things happened to play out. Their ArbCom has three distinct purposes: policy, block appeals, and conflict resolution. There is no reason that one body should decide on all three, nor is this necessarily a good thing. As it stands, Wiktionary is much more democratic than Wikipedia, and we handle more policy through votes. I think this should remain the case. So the question is then whether block disputes (not just appeals, which are usually spurious, but where admins are actually in disagreement) and conflict resolution could be handled better than they are now, and at what cost. I think we could do better, so this idea has some merit — but we would also create a venue for the bickering that already distracts from the actual work of editing, and this has been a major effect of Wikipedia's ArbCom. —Μετάknowledgediscuss/deeds 20:39, 8 October 2021 (UTC)[reply]
  • Support because this would prevent long endless disputes like the recent one ({{inh+}} & {{bor+}}). Svartava2 (talk) 06:08, 9 October 2021 (UTC)[reply]
I agree with Μετα that WP:ArbCom is not as functional as one might wish, and with DCD that the laudable intention of avoiding arbitrariness in arbitration has led to rule codification paving the road to hell endless wikibickering. We should be careful what we wish for. A dispute over a deep disagreement can be held in an amicable way; what made recent disputes unpleasant were the sometimes implied, often straightforward accusations of bad faith cast at the other side. Perhaps an etiquette committee might do some good.  --Lambiam 16:57, 10 October 2021 (UTC)[reply]
I don't like the idea. I know I'm a bit of a handful but it's not "I don't want to be officially reprimanded" (I don't care if I'm officially reprimanded, that's fine), it's more, as Meta suggests above, I think that creating a special little judicial system-in-system does more to foster bullshit than it does to fix actual project issues. Equinox 17:22, 10 October 2021 (UTC)[reply]
It would be useful to have a way to resolve disputes where neither of two contradictory and strongly-held positions has supermajority support. I doubt a formal arbitration committee is the way. Maybe we can find a less formal way to have senior administrators cut the knot in cases like derivation wording without having every vote appealed to them. Vox Sciurorum (talk) 18:44, 10 October 2021 (UTC)[reply]
Say the proposal is instead to create "Wiktionary:Requests for Arbitration," where users can make their case, and well-established editors can vote in support of one disputant or another. I'd imagine this would be very similar to how we run RFD - no committees, formal procedures, rules of evidence, etc. And by the end of one month, we count the number of votes and act according to what the majority decides. Is this a "less formal way" that you'd support? (Really, this question goes to all users in this discussion who don't like the idea of forming an ArbCom). Imetsia (talk) 23:23, 13 October 2021 (UTC)[reply]
@Vox Sciurorum, Metaknowledge? —⁠This unsigned comment was added by Imetsia (talkcontribs).
The problem is that this doesn't differ much from a simple vote... I really do think we ought to restrict the solving of such disputes to the (uninvolved) administrators. Thadh (talk) 21:43, 15 October 2021 (UTC)[reply]
This solution introduces so many new problems that it more than counterbalances the ones it solves. I think that instead of throwing half-baked ideas at the wall and seeing what sticks, it's worth asking what you really want and how to achieve that. If what you want is to know whether you're allowed to use {{bor+}}, then I would say that you're going about it the wrong way — a Supreme Court shouldn't be making policy. —Μετάknowledgediscuss/deeds 22:14, 15 October 2021 (UTC)[reply]
The + templates situation would have been something an arbitration committee could have helped solve. However, it is a moot case at this point, and I wouldn't use a proposed ArbCom to continue to litigate it. For a more current issue, I'd point to the Brutal Russian versus TheNicodene complaints, even though I have no personal stake in that issue and am very unfamiliar with the fact pattern. Again, a board of well-established users voting in his favor/opposition is one possible avenue to put this issue to rest once and for all. Indeed, I think it is the best way to resolve the two above issues declaratively. Such conflict-resolution is squarely in the province of a judicial branch, whose sole purpose it is to interpret policy and settle disputes among litigants. But ultimately, I also understand the objections (though I still think the benefits outweigh the detriments), and I won't continue to pursue the creation of an arbitration committee in spite of myself. Imetsia (talk) 23:44, 15 October 2021 (UTC)[reply]
  • I share concerns that establishing a bureaucratic structure here with formal committees probably wouldn't help in the way proponents are hoping. I worry about the risk of "borrowing trouble", as a wiser fellow expressed to me a while back. ‑‑ Eiríkr Útlendi │Tala við mig 21:39, 13 October 2021 (UTC)[reply]
With the number of people actively in this community, an arbitration committee would feel like a sitcom or Alice in Wonderland trial, where there's an argument and someone puts on a wig and a fine bit of farce is had that satisfies nothing. The English Wikipedia ArbCom works in part because the Committee is not tangled up in all the issues that reach them; I can't see that happening here. Referring our issues to the English Wikipedia ArbCom might work.--Prosfilaes (talk) 23:40, 13 October 2021 (UTC)[reply]
I am deeply reluctant to refer any EN Wiktionary concerns to the EN Wikipedia ArbCom. Our organizational cultures and norms are very different. We've had various issues arise because Wikipedia editors engage here, based on Wikipedia norms, requiring much cleanup and coordination. I can't imagine that issues referred to the WP ArbCom would be handled with any ease. ‑‑ Eiríkr Útlendi │Tala við mig 02:48, 14 October 2021 (UTC)[reply]
Like Prosfilaes, I don't think we have a big enough active editor base to have an Arbcom. I like the suggestion that if there's an intractable issue where neither position can get supermajority support, or it's unclear what the status quo is (since votes are structured as changes to the status quo) but we have to do something, we should have a majority vote. It isn't without issues, but...it's an idea. I don't know if Wikipedia's Arbcom would be keen to accept cases from us, since they have a workload as it is, and they (or we) also might often feel they lacked the relevant expertise to judge things like disputes over what template wordings are best for a dictionary. For intractable disputes over blocks, we could ask global sysops to weigh in. - -sche (discuss) 01:33, 14 October 2021 (UTC)[reply]
Global sysops are just as bad as outsourcing to Wikipedia. In my experience, they generally neither know nor care about Wiktionary, and would probably be annoyed at the very suggestion of foisting another local task on them. —Μετάknowledgediscuss/deeds 18:00, 14 October 2021 (UTC)[reply]


This page survived RFD, but many users pointed out the need for a cleanup. Modernization/expansion from experienced editors is welcome. (Discussion here, to be archived at Wiktionary talk:Etymology.) Ultimateria (talk) 00:02, 10 October 2021 (UTC)[reply]

Wording of RFD banner

I propose that we change the banner message generated by {{rfd}} as follows:

Current text:

This entry has been nominated for deletion
Please see that page for discussion and justifications. Feel free to edit this entry as normal, though do not remove the {{rfd}} until the debate has finished.

Proposed new text:

This entry has been nominated for deletion
Please see that page for discussion and justifications. While voting is in progress, please do not edit this entry in a way that may alter or make unclear the apparent intention of votes already cast. Do not remove the {{rfd}} template until the debate has finished.

What do you think? Mihia (talk) 21:02, 10 October 2021 (UTC)[reply]

I noticed that someone put a noun sense under the verb sense of push and shove, which seemed like a good idea but made the voting less clear. None Shall Revert (talk) 06:56, 11 October 2021 (UTC)[reply]
Also wiki things are not supposed to be "votes" None Shall Revert (talk) 06:58, 11 October 2021 (UTC)[reply]
It does happen from time to time. I have observed several cases where fundamental changes have been made to the whole basis of an entry while voting is in progress, and moreover people sometimes do not even bother to mention that they have done this at the RFD discussion. So an entry is listed at RFD, people vote "Delete" let's say, and then the entry is completely changed or rewritten, or redirected maybe, with no notice, leaving the status of the pre-existing votes totally unclear. I definitely do not agree that we should simply say "Feel free to edit this entry as normal" on the RFD banner -- it's just a question of exactly what we do say. Rather than my suggestion above, we could say "please mention any substantial changes at the RFD discussion", but this still leaves the problem of what should be done with pre-existing votes that may no longer be applicable. Mihia (talk) 08:12, 11 October 2021 (UTC)[reply]

Alternative suggestion (a bit more permissive):

This entry has been nominated for deletion
Please see that page for discussion and justifications. You may continue to edit this entry while the discussion proceeds, but please mention significant edits at the RFD discussion and ensure that the intention of votes already cast is not made unclear. Do not remove the {{rfd}} template until the debate has finished.

Mihia (talk) 08:22, 13 October 2021 (UTC)[reply]

I like the last one. Ultimateria (talk) 17:16, 13 October 2021 (UTC)[reply]
I like this wording better than the first proposal. - -sche (discuss) 01:35, 14 October 2021 (UTC)[reply]
Likewise, I support this last wording. Imetsia (talk) 17:06, 14 October 2021 (UTC)[reply]
OK, I have implemented the second suggestion. Mihia (talk) 17:07, 14 October 2021 (UTC)[reply]

Proposal for new parameter in linking templates: "alternative script"

I suggest a new parameter for linking templates which will input alternative (non-lemma) script forms within parentheses. This is already partly done for Korean and Vietnamese:

But these language-specific templates are not ideal because they lack most key functions (e.g. part of speech, literal meaning, suppression of transliteration) and cannot be integrated with other templates such as {{alter}}, {{syn}}, {{bor}}, etc.

An "alternative script" parameter would be useful for various languages:

  • In the case of Korean, especially formal or academic language, there is a very large number of Chinese-derived homophones. An example is 연기 (yeon'gi), whose entry currently features nine not uncommon and completely unrelated words:
연기 (演技, yeon'gi, “acting”), 연기 (煙氣, yeon'gi, “smoke”), 연기 (延期, yeon'gi, “postponement”), 연기 (緣起, yeon'gi, “dependent origination”), 연기 (年記, yeon'gi, “date of composition recorded on an artwork”), 연기 (年期, yeon'gi, “certain number of years”), etc.
A fully integrated "alternative script" parameter would allow far easier disambiguation of these. To a lesser extent, this is also true of Vietnamese.
  • Many languages are written in multiple scripts. On Wiktionary, one script is usually chosen as the lemma script, with the result that forms in the other script are neglected. For instance, the majority of Azerbaijani speakers live in Iran and primarily use the Arabic script, which has also been the script for most of Azerbaijani history. But this fact is neglected because all Azerbaijani lemmas are in the Republic's Turkish-based Latin script. The integration of an "alternative script" parameter would allow for a more equitable coverage of such languages in etymology or descendant sections, in translation charts, etc. Example:
current {{m|az|Azərbaycan}} Azərbaycan > new {{m|az|Azərbaycan|altscr=آذربایجان‎}} Azərbaycan (آذربایجان‎‎)
current {{m|ks|کٲشُر}} کٲشُر(kạ̄śur) > new {{m|ks|کٲشُر|altscr=कॉशुर}} کٲشُر‎ (कॉशुर, kạ̄śur)
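
The rendering proposed above can be sketched as follows. This is only an illustrative Python model, not actual template or module code: `format_link` and its arguments are hypothetical names, with `altscr` mirroring the parameter name used in the examples above. It assumes the alternative-script form comes first inside the parentheses, followed by the transliteration and the gloss when they are present.

```python
from typing import Optional


def format_link(term: str,
                altscr: Optional[str] = None,
                tr: Optional[str] = None,
                gloss: Optional[str] = None) -> str:
    """Render a term followed by optional parenthesised annotations.

    Hypothetical model of the proposed output order:
    term (alternative script, transliteration, “gloss”).
    """
    annotations = []
    if altscr:
        annotations.append(altscr)
    if tr:
        annotations.append(tr)
    if gloss:
        annotations.append(f"“{gloss}”")
    if not annotations:
        # No annotations at all: show the bare term.
        return term
    return f"{term} ({', '.join(annotations)})"


# Two of the Korean homophones from the list above, told apart by hanja.
print(format_link("연기", altscr="演技", tr="yeon'gi", gloss="acting"))
print(format_link("연기", altscr="煙氣", tr="yeon'gi", gloss="smoke"))
```

The same call shape would apply to multi-script languages such as Azerbaijani or Kashmiri, with the non-lemma script passed as `altscr`.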

Thoughts?--Tibidibi (talk) 07:11, 11 October 2021 (UTC)[reply]

I've found a similar need in Pali, where there are multiple scripts in use, and I anticipate a similar need for Sanskrit. The solution for Pali is documented by a full set of examples for {{pi-link}}, which generalises {{link}}. One complication there is that some Pali writing systems are ambiguous and that the Roman script is one of the major writing systems, so we end up with transliterations and Roman script equivalent sometimes having to be different. Generally we want to link to the Roman script equivalent, but sometimes it is not easily available, e.g. in inflection tables, which commonly link to the entries in the tables. Sanskrit has a similar but different complication. The Bengali script writing system is ambiguous, and Devanagari is the 'lemma' script. (Don't like the term, as we treat the equivalents in the other scripts as alternative forms, thus also lemmas.) For Pali I've built specialised forms of some linking templates on the standard templates, such as {{pi-alternative form of}} on {{alternative form of}}. I've independently encoded {{pi-nr-inflection of}}, which I ought to convert to build on the standard template using common generalisation logic. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]
Note that my scheme treats the form in the alternative script as the primary input. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]
Korean is an unusual case, where the hidden parameter to the conversion is meaning rather than pronunciation. --RichardW57 (talk) 12:11, 11 October 2021 (UTC)[reply]
@Tibidibi: It's a yes for me. Maybe with the possibility of adding a description before the alternative script, like they do in Serbo-Croatian entries (for example: dom#Noun_28). Sartma (talk) 08:27, 12 October 2021 (UTC)[reply]

Splitting Hebrew roots?

There are a bunch of homonymous Hebrew roots that mean completely different things but just so happen to look the same and there doesn't seem to be a way to distinguish between them. חילוני, התחיל וחלל don't really share a root, right?.--The cool numel (talk) 08:47, 12 October 2021 (UTC)[reply]

I don’t see how the root of חילוני(khiloni, secular) can be ח־ל־ל‎, while that of חילון(khilún, secularization) is ח־ל־ן‎‎. I guess this is a typo. If we had pages for these roots, we could document several unrelated meanings like we do for other homonymous terms, such as fluke.  --Lambiam 04:30, 13 October 2021 (UTC)[reply]
@Lambiam: I'm pretty sure the root of חילון‎ is ח־ל־ן‎‎, as it's derived from חילוני‎ which is in turn just the root ח־ל־ל‎ with the pattern קִטְלוֹנִי (like צבעוני). The thing I'm talking about is splitting categories like Category:Hebrew terms belonging to the root ח־ל־ן by meaning. --The cool numel (talk) 09:57, 13 October 2021 (UTC)[reply]
So I take it then the root is the inflectional root, not the etymological root. Doesn’t that make splitting categories by meaning much less interesting? IMO such splitting would best be done by creating subcategories of homonymous roots according to their different core senses, but deciding what these core senses are and recategorizing terms with homonymous roots accordingly will mean a lot of work for a very small bunch of active Hebrew editors.  --Lambiam 11:44, 13 October 2021 (UTC)[reply]

Adding DRAE links to all Spanish lemmas

There are currently ~18,500 lemmas with links to DRAE. There are an additional ~27,000 Spanish lemmas that do not currently have a DRAE link but do have a corresponding DRAE entry.

I can run a bot to add a "Further reading" section with a link to {{R:DRAE}} to the entries missing DRAE links. Would this be desirable or just annoying clutter? JeffDoozan (talk) 17:02, 13 October 2021 (UTC)[reply]

If you can match the entries accurately I don't see why it would be a problem. I routinely add them manually. – Jberkel 17:10, 13 October 2021 (UTC)[reply]
Huh, I expected more pages to have an entry. I think it's helpful! As I expand Spanish entries I could use it to filter out a set of "core" Spanish words to work on. Ultimateria (talk) 17:14, 13 October 2021 (UTC)[reply]
Only if the bot checks that the target of the link is a real definition. Today I saw several French entries where people added {{R:TLFi}} but the web site has no definition. Vox Sciurorum (talk) 18:26, 13 October 2021 (UTC)[reply]
Yes, it does. JeffDoozan (talk) 18:39, 13 October 2021 (UTC)[reply]
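
For readers curious what such a bot edit involves, here is a minimal sketch in Python. It is not the actual bot code: the function name, the `has_drae_entry` flag (standing in for the check that DRAE really has an entry for the word), and the simplified section handling are all assumptions made for illustration. It takes the wikitext of one Spanish language section and adds `* {{R:DRAE}}` under a "Further reading" header only when the link is missing.

```python
def add_drae_link(section_text: str, has_drae_entry: bool) -> str:
    """Return the Spanish section's wikitext with a DRAE reference added.

    Assumptions (hypothetical, simplified): section_text holds only the
    Spanish language section, and "Further reading" is a level-3 header.
    """
    link = "* {{R:DRAE}}"
    header = "===Further reading==="

    # Skip pages with no DRAE entry or that already carry the template.
    if not has_drae_entry or "{{R:DRAE" in section_text:
        return section_text

    lines = section_text.rstrip("\n").split("\n")
    if header in lines:
        # Insert the link right below the existing header.
        lines.insert(lines.index(header) + 1, link)
    else:
        # Otherwise create the section at the end of the language section.
        lines.extend(["", header, link])
    return "\n".join(lines) + "\n"
```

Running the function a second time on its own output leaves the text unchanged, which is the property a bot needs so that repeated runs do not stack duplicate links.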
Did the bot run on all forms? I added one earlier manually: Special:Diff/62116035/64262193 – Jberkel 19:47, 17 October 2021 (UTC)[reply]
Also, could you adapt it to work with {{R:TLFi}}? – Jberkel 08:24, 18 October 2021 (UTC)[reply]
The bot did not run on all forms, only on pages with entries containing a lemma. The page you edited was skipped because it previously contained only a verb form. If anyone is interested, I could generate a list of pages where the DRAE has a lemma but we have only a form.
I'll see what I can do with {{R:TLFi}} but I can't promise anything. JeffDoozan (talk) 15:19, 18 October 2021 (UTC)[reply]
Yes, such a list would be useful, thanks! Especially the adjectives often exist only as participles, presumably autogenerated at some point. – Jberkel 20:53, 19 October 2021 (UTC)[reply]
Here's a list of the 2,935 pages where we have Spanish forms that have corresponding DRAE lemmata. JeffDoozan (talk) 18:39, 20 October 2021 (UTC)[reply]
@Jberkel: ~10,000 TLFi links are being added right now, it should be complete within a day. Here's a list of the 2,743 entries where we have forms but TLFi has a lemma. JeffDoozan (talk) 01:12, 29 October 2021 (UTC)[reply]
This is very much appreciated and I whole-heartedly endorse this. I work on Spanish here and I add this to all entries I make. —Justin (koavf)TCM 06:23, 27 October 2021 (UTC)[reply]

The phrasebook is in dire need of rules.

(Not referring to the CFI, that's another topic.) Coming from languages that are both gendered and have polite forms, the translation boxes in most phrasebook entries are a mess. It's completely random whether:

  • ...only the polite, only the familiar or both versions are present.
  • ...these polite/familiar forms are qualified as such, whether this qualification comes before or after the entry and whether this qualification is called polite/familiar or formal/informal.
  • ...plural phrases are present.
  • ...all these forms are consistently present both in their male as well as their female forms (if applicable) and how those forms are annotated.
  • ...what the order of all these forms is.

My suggestions:

  • Decide whether to call it polite/familiar or formal/informal and then apply this consistently. See the inconsistencies in are you allergic to any medications
  • Split the translation box into two distinct ones in most articles (where applicable), one for familiar, one for polite forms. Languages that don't have this feature could either be automatically completed using a bot that copies over entries between the boxes or alternatively they could be barred from one of the boxes (maybe by introducing a new {{trans-top}} that only accepts languages with politeness distinctions).
    • If the above point doesn't happen, at least define a consistent scheme. Should the qualifier come before or after? Should entries without qualifiers in languages with politeness distinctions be allowed? What should come first?
  • Disallow plural translations.
  • Decide whether gender should be expressed using the gender parameter of {{t}} or using {{qualifier}}, then apply this consistently. See the inconsistencies between e.g. are you religious and are you single.

--Fytcha (talk) 02:36, 14 October 2021 (UTC)[reply]

I agree with all of this. But it's worth noting that in many languages, politeness and formality are not the same thing. In Korean, you can be politely informal and non-politely formal. Tibidibi (talk) 04:40, 14 October 2021 (UTC)[reply]
In that case, as I don't think it is within the scope of a phrasebook to give impolite phrases (except perhaps for phrases that are explicitly/obviously impolite), I would suggest that we stick with formal and informal and avoid any distinctions between politeness and impoliteness. Andrew Sheedy (talk) 05:53, 14 October 2021 (UTC)[reply]
I also agree with all the above, with the caveat that some languages, like Korean, have both a formal/informal and polite/familiar distinction. As you say, we can choose the most relevant one (I would probably keep polite/familiar for Korean too, since formal/informal is a distinction more pertinent to more restricted scenarios, but I guess Korean editors will make the call on that. Just a note: non-polite means "familiar" and doesn't mean impolite.). Sartma (talk) 09:12, 14 October 2021 (UTC)[reply]
From how I read it, you two are in disaccord on whether formal/informal or polite/familiar should be the primary divider of the translation boxes. What is your opinion on this @Tibidibi? If a normal phrase like how are you had only two boxes, would you want them to be formal/informal or polite/familiar? Note that we could always provide all four combinations of (formal+polite), (informal+polite), (formal+impolite), (informal+impolite) by the use of the appropriate {{qualifier}}s; the question is merely which distinction is more useful and semantically more sensible. Fytcha (talk) 16:44, 10 November 2021 (UTC)[reply]
@Chuck Entz Can I change Wiktionary:Phrasebook and the entries accordingly or is a formal vote necessary? It's a bit of a radical change so I don't want to do it unilaterally; OTOH people seem to not care much about the phrasebook and this proposal. --Fytcha (talk) 14:37, 2 November 2021 (UTC)[reply]

Major opportunity for us to step in for word of the year

Heads up that OED are slipping. It's our time to strike. —Justin (koavf)TCM 16:36, 14 October 2021 (UTC)[reply]

"...observing that 'worms are all over the place' and 'everybody loves a good worm.' Well, I'm sold. Ultimateria (talk) 16:52, 14 October 2021 (UTC)[reply]
In a way it would be funnier with the computing sense of worm (something like a virus), since I can imagine somebody really out of touch thinking this was a "new" hi-tech word of the 21st century! Equinox 10:13, 15 October 2021 (UTC)[reply]

New SOP policy idea

I propose adding a new SOP test at WT:Idioms that survived RFD. It would have a caption like "Terms whose parts are substitutable, but with which only a few variations greatly predominate. For instance, the word "air" in air resistance can be switched out for "wind," "snow," "water," "fluid," and others; but "air resistance" is the only widely used and attested form." (A better writer could improve some of the wording.) Accordingly, I would name the test WT:AIR RESISTANCE/WT:AIR, although there are probably other entries to which this logic has been applied in RFD discussions (Talk:idle threat comes to mind). There are also the ongoing discussions about rumor has it and puré de batata.

As a community, this is a justification that has previously won the day, so it makes sense to codify it. In addition, all of our SOP policies are essentially advisory and open to great interpretation (there are no bright-line rules), and I don't think this test would depart from that tradition. Lastly, this policy would finally bring us one step closer to a more fleshed-out approach to handling set phrases and common collocations. Thoughts? Imetsia (talk) 17:36, 14 October 2021 (UTC)[reply]

Your idea sounds great, I like it. The reason why I'd advocate for the inclusion of articles such as air resistance isn't because they're so indecipherable (let's be honest, you really can guess what it means based on the parts) but because:
  • It is the canonical collocation to express this idea. There might be other SOPs that convey the same meaning but this one is the one that's actually used.
  • The article serves many other purposes other than just explaining the idea, such as providing translations, coordinate terms, hyponyms etc.
Your proposal shifts the focus of SOP discussions a bit away from the question "Can its meaning be guessed based on the parts?" to "Is it the principal (i.e. most widespread) collocation to express this concept?", which is a change I welcome with open arms. Fytcha (talk) 18:06, 14 October 2021 (UTC)[reply]
Support. This seems like a good idea. We need some way of including collocations and fixed expressions, anyway. Andrew Sheedy (talk) 19:36, 14 October 2021 (UTC)[reply]
I now agree that we should have a firm basis for including entries for strong set phrases -- combinations that are explicable as SoP, but in practice overwhelmingly predominate over other possible ways of saying the same thing by word substitution of synonyms (however we can best define this idea). While we are looking at this policy area, I also believe that we should have a firm basis for including SoP phrases that are particularly hard to understand from the parts if one does not already know which of many possible meanings to combine together -- another argument that is often made at RFD. Mihia (talk) 21:36, 14 October 2021 (UTC)[reply]
On the second suggestion, I think we'd have to firmly pin down whether there is enough of a multitude of "possible meanings to combine together" for a term to not be SOP. This seems quite hard to establish clearly through policy. Talk:amico per convenienza comes to mind. On the first try, it passed RFD because of just this justification, though the vote was later overturned. (To me, the argument that it was SOP was a slam dunk, and it shocked me that so many users initially disagreed.) So I do not disagree with the idea in principle, but we would have to adjust the dials just right to ensure we are neither over- nor under-inclusive. Is there really an administrable standard we can come up with to achieve just this result? Imetsia (talk) 22:01, 14 October 2021 (UTC)[reply]
I think both ideas are equally hard to precisely codify because there will always be an element of subjectivity. I think we just have to accept this, and establish the broad policy and let borderline or argued cases go to RFD. I think that examples of phrases that have passed RFD on the stated grounds, as we have done with other cases, are very helpful. FTR, a recent one that was undeleted on the second ground is track meet. Mihia (talk) 10:08, 15 October 2021 (UTC)[reply]
I oppose including common collocations because they are common collocations. We can use {{ux}} to illustrate the more common uses. Vox Sciurorum (talk) 23:30, 14 October 2021 (UTC)[reply]
I oppose the subjectivity of the idea. Although “idiomaticity” is close friends with commonness.
The real question should be technical utility, with cross-language perspectives (which most who want to have a say on a term don’t have, naturally since our language knowledges are limited by our origins in particular language communities). And it wasn’t even about the utility of the term alone in the case of air resistance, but people apparently wanted it as a model for other types of resistance (so we do not have to create them but look in this entry how to construct them, very remarkable). But you are unable to form a reasonable rule or guideline from this example. A particulari ad universale non valet consequentia. (Case law is bad and a meaningless Anglo-fetish.) Fay Freak (talk) 00:37, 15 October 2021 (UTC)[reply]
The guideline I've formulated seems quite reasonable to me. Why do you disagree? It's readily administrable and provides a good general principle that can be applied not mechanically, but by using sound judgment and discretion. Just like every other example on WT:Idioms that survived RFD, this is not a hard and fast rule, and it includes an element of subjectivity. Editors constantly disagree about the application of SOP policies; some are more permissive on the issue of term inclusion, and others are more conservative. This is not an exception to that rule. It fits in perfectly with every other advisory rule we've ever put forward about idiomaticity and SOP-ness. Imetsia (talk) 15:55, 15 October 2021 (UTC)[reply]
I don't support individual entries for mere common collocations. I think we can find a conceptual division, albeit slightly grey and subjective, between common collocation and strong set phrase. Mihia (talk) 10:12, 15 October 2021 (UTC)[reply]
The idea may have merit if we can formulate a solid objective criterion, but I cannot resist pointing out that air resistance is a poor example. The term denotes a physical force, expressible in the unit newton. In general, designers try to minimize air resistance. The term wind resistance as commonly used (pace M–W) is an entirely different species, the ability to stand up to wind damage,[1] a highly desirable property (except for the sets of disaster flicks such as Twister).  --Lambiam 10:18, 15 October 2021 (UTC) Addition: In English the first component of such a compound can be the subject or the object of the action. In French you can see the distinction in the preposition used: résistance de l'air versus résistance au vent.  --Lambiam 10:39, 15 October 2021 (UTC)[reply]
As solid and objective a criterion as possible, yes, but it will never be mechanically objective, such that anyone can apply a rule and will always come up with the same answer. If we had only mechanically objective CFI criteria then we would never need RFD discussions. Mihia (talk) 13:14, 15 October 2021 (UTC)[reply]
I agree with Mihia's comment right above. In addition, is air resistance really as poor an example as you argue? M-W, as you point out, has a definition much more in line with that of "air resistance." Even if you claim it's not the most used meaning, you must accept that it is a meaning. And what about the other substitutes like "snow," "water," and "fluid?" (I haven't checked these on my own, but maybe you can make a case for your position based on these). Imetsia (talk) 15:55, 15 October 2021 (UTC)[reply]
“Fluid resistance” is a more general term than “air resistance”. It is the resistance experienced by a body in motion, relative to a surrounding fluid. Usually the fluid is air, but when something else, the term “air resistance” is not appropriate. “Snow resistance”, “water resistance” and “wind resistance” generally refer to the ability to resist, or protect against, the intrusion or harmful effects of said phenomena or substances; having good wind resistance means the same as being windproof.  --Lambiam 10:44, 16 October 2021 (UTC)[reply]
@Lambiam: OK, I agree now that snow and water resistance do not fall under the same family of meanings as "air resistance." But I don't agree when it comes to wind resistance. "Wind resistance" definitely does have a similar meaning which is used quite commonly ([2], [3], [4] just for starters). According to the wiki article you linked, there's also "wave resistance," under the same family of meaning. So what would you think about the proposed policy if we switched "wind, snow, water, and fluid" with simply "wind, fluid, and wave?" Imetsia (talk) 18:29, 16 October 2021 (UTC)[reply]
Can I just point out that I think that people here are talking about potentially two different things. The first is whether words with different meanings can be substituted to create a parallel phrase, for example "air resistance" changed to "wave resistance", and the second is whether synonyms can be substituted to produce an equally idiomatic way of saying the same thing, e.g. per Fytcha's comment "It is the canonical collocation to express this idea. There might be other SOPs that convey the same meaning but this one is the one that's actually used." (my emphasis). Mihia (talk) 19:34, 16 October 2021 (UTC)[reply]
The aim of my comment regarding the example “air resistance” was to point out that it is not a felicitous example to illustrate the proposed test, and equally infelicitous to serve as its name. A better example might be the collocation disaster preparedness; while its synonyms catastrophe preparedness and disaster readiness have been used, it is clearly[5] the winner of the “canonical collocation” (con)test.  --Lambiam 20:24, 16 October 2021 (UTC)[reply]
I wonder whether we can come up with something a bit punchier than "disaster preparedness". What about "human rights"? Mihia (talk) 21:30, 16 October 2021 (UTC)[reply]
Would that be an example that fits your second criterion (synonym substitutability) but not the first (parallel phrases)? For synonym substitutability, I'm guessing we have, e.g., "people rights," "mankind rights," and similar. But are there any parallel phrases? The only ones I can think of are either [ADJ]+rights or [possessive]+rights, which don't really fit. Imetsia (talk) 23:40, 16 October 2021 (UTC)[reply]
"human rights" is supposed to be an example of something that would pass the first test, the clear predominance of one way to say something over other candidates involving word substitutions, such as those you mention. A parallel phrase would be animal rights. Despite our defining this as, essentially, the rights of animals, it again passes the first test because of its overwhelming predominance over e.g. "creature entitlements" or whatever. "human rights" and "animal rights" are examples of what I would call strong set phrases explicable as SoP. Mihia (talk) 08:15, 17 October 2021 (UTC)[reply]
Could we then just have two tests, one for parallel phrases and the other for synonym substitutability? "Air resistance" seems like a good candidate for the parallel-phrases test, while "human rights" passes the synonym-substitutability test. (The parallel phrase you mention is one for which we have an entry, so I don't think it's a great example -- isn't the idea to list parallel phrases that wouldn't be entryworthy, thus showing that the original term is in fact a set phrase?) Honestly, "air resistance" has the synonym, as argued above, of "wind resistance." So it could also work for the second case. But if we really want a more shining example of synonym substitutability, I suppose we could just include both of the tests. What do you think? Imetsia (talk) 16:18, 17 October 2021 (UTC)[reply]
My desire would be a rule to explicitly allow strong set phrases / fixed expressions even if explicable as SoP (and also, on a separate point, a rule to explicitly allow SoP combinations that are particularly hard to understand from the parts). Of course, the problem is how best to define "set phrase" or "fixed expression" (or, at least, the sort that we would want to include). The idea that "It is the canonical collocation to express this idea", aka (more or less) non-synonym-substitutable, will apply in some cases, perhaps not all. I am less clear how much additional help the "no parallel phrases" test would be. My suggestion, if we want a basis to include entries that we presently wouldn't, or that would presently be of unclear eligibility, is to compile as big a list of these entries as possible, so that we can check whether the proposed test(s) are adequate (and at the same time verify that the tests do not allow entries that we would not want to include, of course). A good source of these entries would probably be previous RFD discussions. Another possibility would be simply to say that we "allow strong set phrases or fixed expressions even if SoP" and, where disputed, let it be debated case-by-case what these are. Mihia (talk) 17:53, 17 October 2021 (UTC)[reply]
@Mihia: Could you lay out the full text of your proposed WT:HUMAN RIGHTS test? I'd like to start an informal vote below about both of them (since the discussion seems to have stalled at this point), so I'd like the full text. Imetsia (talk) 20:44, 19 October 2021 (UTC)[reply]
@Imetsia: Actually, of the two I would probably in the end choose WT:ANIMAL RIGHTS as perhaps slightly less susceptible to objections that it is not wholly explicable as SoP. Unfortunately I don't have a full proposal at the moment except the one that I mentioned, namely "allow strong set phrases or fixed expressions even if explicable as sum-of-parts", which I think may not fly as people may reasonably ask "how do we tell what is a strong set phrase or fixed expression"? That is the difficult part. I am of the opinion, as I alluded to above, that before making a concrete votable proposal the wording should be tested against as many actual examples as possible, which might be obtained from the imagination or from failed (or even passed) RFD candidates, to ensure not only that desired phrases pass but also that undesired ones fail. Compiling such a list is something that I have had on an "eventual to do" list, but haven't got round to yet. Mihia (talk) 16:31, 20 October 2021 (UTC)[reply]
  • I have a comment about the practical implementation of this idea, if it should go ahead. At WT:CFI it says "An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components [...] See Wiktionary:Idioms that survived RFD for other examples." We cannot therefore just plonk a "set phrase" test at Wiktionary:Idioms that survived RFD, as initially suggested, since quite likely the meaning of a set phrase can be easily derived from the meaning of its separate components. Mihia (talk) 17:27, 15 October 2021 (UTC)[reply]
In fact, the same could be said about some other tests at Wiktionary:Idioms that survived RFD, such as the "tennis player" test. It seems that this problem is a pre-existing slight muddle of the wording in these sections. Mihia (talk) 17:35, 15 October 2021 (UTC)[reply]
Support on my end. AG202 (talk) 21:47, 15 October 2021 (UTC)[reply]
Some dictionaries include a separate section of common collocations involving some term in their entry for that term. For examples, see the online Cambridge Dictionary and Collins. I think this would be a good alternative for us too.  --Lambiam 11:43, 16 October 2021 (UTC)[reply]
A separate section would be counterproductive. Using {{ux}} does the job better; remember that there’s the |t= parameter!— thus: {{uxi|en|sick burn|t=a particularly cutting insult}}. ·~ dictátor·mundꟾ 23:08, 16 October 2021 (UTC)[reply]
It is not clear to me why you expect this to be counterproductive. It seems a better alternative, at least to me, than introducing another vague exception to the non-SOP rule. The |t= parameter is explicitly intended for English translations of usage examples on foreign entries, not for glossing. While one or two {{ux}}es – which per policy should be grammatically complete sentences – will generally suffice to demonstrate usage of a term, I can easily imagine a handful of associated common collocations.  --Lambiam 10:16, 17 October 2021 (UTC)[reply]
@Lambiam: The parameter t= is used to translate Early Modern English and dialectal English quotes whose mutual intelligibility with Modern Standard English is low. Youth slang and other suchlike jargons also do depart from Modern Standard English, and hence have I no qualms about any misuse of the parameter. Sociolinguistically, any non-Standard variety is ‘foreign’, or else I would not have been blocked for using non-Standard English to write definitions. ·~ dictátor·mundꟾ 15:04, 17 October 2021 (UTC)[reply]
Out of interest, how would you define the test, or criterion, that would allow us to keep "nature[-]lover" against the arguments that it is SoP? Mihia (talk) 08:40, 17 October 2021 (UTC)[reply]
nature lover is a collocation, unlike SoPs like wine lover and nature person. ·~ dictátor·mundꟾ 15:04, 17 October 2021 (UTC)[reply]
I fear that "is a collocation" will be far too permissive for our purposes. Mihia (talk) 17:24, 17 October 2021 (UTC)[reply]
What if I told you that idiomaticity is exclusively essentiated by comparative grounds? Like there is name names, this is easily parsed as the sum of its parts, but if you look at its German translation Ross und Reiter nennen you are soothed. An ἰδίωμα (idíōma) is there by its being ἴδιος (ídios) in contradistinction to other, more hands down ἰδιώματα (idiṓmata) (there is no peculiarity without a general mass to other from). Because judging by commonness within a community runs into the sorites paradox, too obviously and frequently. “Collocation” is just a rephrasing of the same commonness idea. Fay Freak (talk) 19:27, 17 October 2021 (UTC)[reply]
  • Support per dictátor·mundꟾ. Collocations should be allowed where the collocation itself is substantially more likely to be used than any substitution, where the collocation is a commonly used rhyming pair of terms or alliterative pair of terms, or where at least one term in the collocation has multiple common meanings, but the usage in the collocation overwhelmingly intends one of those meanings (particularly where it is not the most common meaning of the term). bd2412 T 02:04, 18 October 2021 (UTC)[reply]
Without any further stipulation, the test "where the collocation itself is substantially more likely to be used than any substitution" would apparently allow "white cat" as substantially more likely than e.g. "ivory feline", or "chair leg" as substantially more likely than e.g. "stool limb", while I personally would not want to include either of those. I'm sure examples such as these abound. This is why we need to carefully check exactly what the letter of the rule would and would not allow. Mihia (talk) 21:00, 20 October 2021 (UTC)[reply]
I would have zero problems including chair leg as an entry. As for color-noun combinations, they are obvious enough that we might as well append a rule saying no "color noun" terms unless the meaning is something other than a common noun of that name and of that color. bd2412 T 04:55, 11 November 2021 (UTC)[reply]
@Mihia: Is white cat really so much more common than black cat or white noise? Fytcha (talk) 10:56, 11 November 2021 (UTC)[reply]
Not necessarily, but I don't see the connection to the topic. We include black cat and white noise because those have idiomatic meanings. The same doesn't apply to white cat (as far as we know; if it did then we would include it, no problem). Mihia (talk) 18:23, 12 November 2021 (UTC)[reply]

Voting to elect members to the Movement Charter drafting committee is now open (October 12 - 24)

Voting to elect members to the Movement Charter drafting committee is now open. In total, 70 Wikimedians are running for 7 seats in these elections.

Voting is open from October 12 to October 24, 2021.

We are piloting a voting advice application for this election. It helps show which candidates hold positions similar to the choices entered.

According to the set up process, the committee will initially consist of 15 members in total. 7 members elected in this process, 6 members selected by Wikimedia affiliates, and 2 members appointed by the Wikimedia Foundation. Up to 3 additional members may be appointed by the committee, and steps may be taken to replace members as needed.

More details and the voting link are on Meta.

Please feel free to let me know if you have any questions about this process.

Xeno (WMF) (talk) 01:47, 15 October 2021 (UTC) (Movement Strategy & Governance Team, Wikimedia Foundation)[reply]

(Disclosure: I'm a candidate.) The election closes in 17 hours (at 12:00 UTC). The referenced Charter to be drafted is basically going to be a constitution for Wikimedia, binding on all our supporting organizations, and outlining the formation of new governance structures. --Yair rand (talk) 19:13, 24 October 2021 (UTC)[reply]

Wiktionary:Votes/2021-10/Standardising wording for showing cognates

I recently created this vote, for consistency and standardisation. Looking for feedback, concerns, comments, etc. Svartava2 (talk) 16:49, 16 October 2021 (UTC)[reply]

The nuisance of a lot of edits in my watch list may exceed the small benefit. Other than that, I understand the proposal to be replacing all instances of the five strings before {{cog}} with a single one of them, and leaving all other uses of {{cog}} alone. Thus typos like "cognate witth" and alternate wording like "from the same origin as ..." would be untouched. I suggest leaving "include", "with", and "compare" alone and replacing "to" and "of" with "with". Which is not on the list of options. Note that include implies additional unlisted cognates. It is not correct for a bot to replace include with anything else. Vox Sciurorum (talk) 17:02, 16 October 2021 (UTC)[reply]
@Svartava2, a better formulation of the vote might just be to standardize cognates like this: (1) require "Cognate with" for full cognates, (2) allow "Cognates include" when multiple cognates exist, and (3) allow "Compare" for "non-full cognates" (i.e., per Richard, terms that "are semantically similar or etymologically related"). If the vote were so phrased, it would have the typical consensus-building problems of any omnibus vote. Different users would like and dislike different parts of the proposal, and few will embrace it in full, leading then to a mixed opposition that ultimately tanks the vote. To avoid this, I do agree with Vox's solution above: a simple vote to replace "'to' and 'of' with 'with.'" Imetsia (talk) 15:59, 17 October 2021 (UTC)[reply]
@Imetsia: I don't understand (1) above. It contradicts (2). 'Compare' is appropriate for when the relationship is unclear or a parallel formation can be seen. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]
OK, let me rephrase it: (1) require "Cognate with" when only one full cognate exists, (2) allow "Cognates include" when there are multiple, and (3) allow "Compare" "when the relationship is unclear or a parallel formation can be seen." I also like your wording better, so thanks for the suggestion! Imetsia (talk) 17:12, 17 October 2021 (UTC)[reply]
@Imetsia: Would you allow 'cognate with' if there were descendants of the cognate given? --RichardW57 (talk) 19:18, 17 October 2021 (UTC)[reply]
@RichardW57: I don't really understand your question. Maybe an example could help? Imetsia (talk) 19:22, 17 October 2021 (UTC)[reply]
@Imetsia: Suppose all that we knew of the cognates of Greek θεός were Latin fānum and the latter's English borrowing fane. Would you allow us to describe the Greek word as 'Cognate with Latin fānum', or would we have to write 'Cognates include Latin fānum'?
I would allow "Cognate with." So I guess even my revised suggestion is too imprecise. I mean to say that one must use "Cognate with" rather than of or to if they wish only to include one cognate (even if others may exist). If, however, one wants to include multiple cognates they can use "Cognate with X and Y" if X and Y are the only cognates that exist; or use "Cognates include X and Y" if X and Y are only two of the many cognates that exist. Imetsia (talk) 20:07, 17 October 2021 (UTC)[reply]
Scratch that. I don't see the reason to make that distinction. I guess we could just deprecate "Cognates include." Imetsia (talk) 20:09, 17 October 2021 (UTC)[reply]
@Imetsia: 'Cognates include' does suggest that there is no need to list them all. --RichardW57 (talk) 21:45, 17 October 2021 (UTC)[reply]
Sure, but there isn't really an urgent need for that to be suggested. It's already understood that we shouldn't list out every single cognate in every case. And simply using "Cognate with" neither implies that the list is exhaustive nor that it's only a subset of the possible cognates. It implies nothing in this respect. Imetsia (talk) 22:48, 17 October 2021 (UTC)[reply]
  • Different phrasings can mean different things: When I say "Cognates include", I imply there are more cognates. When I say "Cognate with", I imply that there aren't, or that those aren't known. When I say "Compare" I usually mean that the words aren't full cognates, but are either semantically similar or otherwise etymologically related. I would like to keep this freedom to choose the most accurate and nuanced wording. Thadh (talk) 22:44, 16 October 2021 (UTC)[reply]
    @Vox Sciurorum, Thadh: "Compare" is often used just before {{cog}}, for true cognates also sometimes (eg. ਗੁਝਾ). My understanding is that "cognate with" or "cognates include" doesn't really imply that only that many cognates are there unless there is an "and". For example, Cognate with LANG term, LANG2 term2, LANG3, term3 or Cognate with LANG term, LANG2 term2, LANG3, term3 and Cognates include LANG term, LANG2 term2, and LANG3, term3 or Cognates include LANG term, LANG2 term2, and LANG3, term3. Another example - Marathi थुंकणे (thuṅkṇe); it says "cognate with" but doesn't include the Urdu cognate given at 𑀣𑀼𑀓𑁆𑀓𑀇. Svartava2 (talk) 03:45, 17 October 2021 (UTC)[reply]
    Everybody has their own stylistic choices, and I respect their choice even if I wouldn't necessarily make it myself. And AFAIK {{cog}} may be used for partial cognates, like Saterland Frisian Bäidenstied and German Kinderzeit (only the second part of the compound is etymologically related), but maybe I misunderstand when {{cog}} must be used? Thadh (talk) 09:05, 17 October 2021 (UTC)[reply]
    'Cognates include' would be odd for a complete list, even if it doesn't preclude it. While I appreciate that there is now push back against 'Wiktionary is not a paper dictionary', as space on mobile phones is limited, 'cognate with' also invites padding with a complete list of cognates, or a complete list of cognates of a particular type. Note that we use 'cognate' in a wider sense than some other dictionaries, by including words related by borrowing. --RichardW57 (talk) 14:08, 17 October 2021 (UTC)[reply]
    @Thadh I believe {{cog}} is only to be used when the terms derive from a single term in the source language; like Pali sarīra, Prakrit 𑀲𑀭𑀻𑀭 both from Sanskrit शरीर. As for Saterland Frisian Bäidenstied, {{noncog}} would be more appropriate; using {{cog}} there is like {{cog}} for Prakrit 𑀅𑀡𑀼𑀕𑀘𑁆𑀙𑀇 (aṇugacchaï) and Pali avagacchati (where the prefix is different and w/o prefix Prakrit 𑀕𑀘𑁆𑀙𑀇 (gacchaï) is a true cognate of Pali gacchati). Svartava2 (talk) 13:28, 17 October 2021 (UTC)[reply]
    @Svartava2: {{noncog}} is a bit strong for partially cognate words; plain {{mention}} would be better. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]
    {{m}} doesn't load a hyperlinked language name, though. But I do think we may want to be more lax with the usage of {{cog}}, because using {{ncog}} for false cognates is not uncommon. Thadh (talk) 17:08, 17 October 2021 (UTC)[reply]
    @Thadh: {{m+}} does. —⁠This unsigned comment was added by RichardW57 (talkcontribs).
    @RichardW57: It doesn't link the language name. Thadh (talk) 19:45, 17 October 2021 (UTC)[reply]
    @RichardW57 There is always a chance of any cognate list being incomplete; but do we always use "cognates include […]"? Do you really think that the 42,100 (approx.) uses of "Cognate with" are 100% complete with not even a single cognate missing? No, I don't think so. "Cognate with" ≠ "Cognate [only] with". Re: "we use 'cognate' in a wider sense" - some editors do, while some don't. I don't. In case of a borrowing, I prefer showing other (borrowed) words in other languages from the same etymon as "Compare {{ncog|LANG|term}}", for example, diff (initially which said "cognate" added by Kutchkutch). Svartava2 (talk) 15:41, 17 October 2021 (UTC)[reply]
    'Cognates include' tells the user and other readers that the list is not intended to be complete. (Quoting from a dozen Zhuang dialects does not seem useful, unless we're looking at a recent borrowing.) 'Cognate with' does not reveal the author's intention. --RichardW57 (talk) 16:46, 17 October 2021 (UTC)[reply]
  • There appear to be grammatical issues to handle in any automated processing. While it seems safe to replace 'Cognate to' with the classier 'Cognate with', merging of 'cognate of' runs the problem that 'cognate' here is a noun. 'Cognates include' actually includes a verb, and there may therefore be grammatical issues as well as a loss of connotation. Would the bot know not to change quotations? --RichardW57 (talk) 14:08, 17 October 2021 (UTC)[reply]
    well yes you're right, that would also need some attention. I think we could deal with this with the help of some list like user:benwing2/pra-sc. Svartava2 (talk) 16:00, 17 October 2021 (UTC)[reply]
  • Someone please delete this shitty, nonsensical vote! It does not help us in any wise. (@Metaknowledge) ·~ dictátor·mundꟾ 15:31, 17 October 2021 (UTC)[reply]
    It isn't nonsensical; per Imetsia: “A discussion on whether to incorporate some other text in the {{cog}} template by default is a better place to start.” The vote may be hurried a bit, so I removed its starting date (for now); let's do some more discussion regarding this. Svartava2 (talk) 15:53, 17 October 2021 (UTC)[reply]
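
For anyone wanting to gauge what the narrower suggestion above (replace "to" and "of" with "with", leave everything else alone) would touch, here is a minimal Python sketch. It is not an existing bot: the function name and the pattern are hypothetical, and it assumes that only text immediately followed by {{cog| should be changed. It skips "a cognate of" and "the cognate of", where "cognate" is a noun, and it leaves "Cognates include", "Compare" and typos such as "cognate witth" untouched.

```python
import re

# Hypothetical pattern: "Cognate to"/"Cognate of" directly before {{cog|...}}.
# The two negative lookbehinds skip noun uses such as "a cognate of".
COGNATE_TO_OF = re.compile(
    r"(?<!\b[Aa] )(?<!\b[Tt]he )\b([Cc]ognate) (?:to|of)(?= \{\{cog\|)"
)


def standardise_cognate_wording(wikitext: str) -> str:
    """Rewrite 'Cognate to/of {{cog|' as 'Cognate with {{cog|' and nothing else."""
    return COGNATE_TO_OF.sub(r"\1 with", wikitext)


if __name__ == "__main__":
    samples = [
        "Cognate to {{cog|la|fānum}}.",
        "A cognate of {{cog|de|Haus}}.",
        "Cognates include {{cog|la|fānum}}.",
    ]
    for line in samples:
        print(standardise_cognate_wording(line))
```

Because the match requires {{cog| right after the preposition, running text and most quotations would not be affected, but a real run would still need a human-reviewed list of exceptions.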

Bot to generate Spanish forms

I'm playing around with a bot to generate Spanish forms and I wanted to solicit some feedback concerning the "best" way to declare a form of. To start with, it'll just be generating forms of nouns and adjectives.

Below is a list of the templates/parameters I would propose for the given situations. Given that these will be bot generated, I'm preferring templates that may generate the most helpful categories or other meta data without regard for how unwieldy their parameters may be.

Plural of a masculine/feminine adjective (verde -> verdes)

head: {{head|es|adjective form|g=m-p|g2=f-p}}

gloss: {{adj form of|es|verde||p}} -> plural of verde

Masculine plural of adjective (rojo -> rojos)

head: {{head|es|adjective form|g=m-p}}

gloss: {{adj form of|es|rojo||m|p}} -> masculine plural of rojo

Feminine of adjective (rojo -> roja)

head: {{head|es|adjective form|g=f}}

gloss: {{adj form of|es|rojo||f}} -> feminine of rojo

Feminine plural adjective (rojo -> rojas)

head: {{head|es|adjective form|g=f-p}}

gloss: {{adj form of|es|rojo||f|p}} -> feminine plural of rojo

Plural of a masculine/feminine noun (dentista -> dentistas)

head: {{head|es|noun form|g=m-p|g2=f-p}}

gloss: {{noun form of|es|dentista||p}} -> plural of dentista

Plural of a masculine noun (doctor -> doctores)

head: {{head|es|noun form|g=m-p}}

gloss: {{noun form of|es|doctor||p}} -> plural of doctor

Feminine equivalent of a masculine noun (doctor -> doctora)

head: {{es-noun|f}}

gloss: {{female equivalent of|es|doctor}} -> female equivalent of doctor

Plural of a feminine equivalent of a masculine noun (doctora -> doctoras)

head: {{head|es|noun form|g=f-p}}

gloss: {{noun form of|es|doctora||p}} -> plural of doctora

Plural of a masculine noun (naranjo -> naranjos)

head: {{head|es|noun form|g=m-p}}

gloss: {{noun form of|es|naranjo||p}} -> plural of naranjo

Plural of a feminine noun (manzana -> manzanas)

head: {{head|es|noun form|g=f-p}}

gloss: {{noun form of|es|manzana||p}} -> plural of manzana

Are there other cases I should consider or anything else anyone would like to see in a bot generated form entry (etymology, IPA, etc)?

Note: Some of the default head/gloss lines have been edited to reflect the suggestions below. I decided to keep the gender/plural declarations in the headword definition because most entries already have them and they would be difficult to add later but easy to remove.
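
To make the mapping above concrete, here is a minimal Python sketch of how the head and gloss lines could be assembled. It is only an illustration and not the bot's actual code: the function names `head_line` and `gloss_line` are hypothetical, and the template names and parameters are copied from the list above rather than checked against the template documentation. Female equivalents such as doctora are left out because they are treated as lemmas.

```python
def head_line(pos: str, genders: list[str]) -> str:
    """Build the headword line, e.g. {{head|es|adjective form|g=m-p|g2=f-p}}."""
    params = []
    for index, gender in enumerate(genders, start=1):
        key = "g" if index == 1 else f"g{index}"  # g, g2, g3, ...
        params.append(f"{key}={gender}")
    return "{{head|es|" + pos + " form|" + "|".join(params) + "}}"


def gloss_line(pos: str, lemma: str, tags: list[str]) -> str:
    """Build the definition line, e.g. {{adj form of|es|rojo||m|p}}."""
    template = {"adjective": "adj form of", "noun": "noun form of"}[pos]
    return "{{" + template + "|es|" + lemma + "||" + "|".join(tags) + "}}"


if __name__ == "__main__":
    # verde -> verdes: one plural form shared by both genders
    print(head_line("adjective", ["m-p", "f-p"]))
    print(gloss_line("adjective", "verde", ["p"]))
    # rojo -> rojas: feminine plural
    print(head_line("adjective", ["f-p"]))
    print(gloss_line("adjective", "rojo", ["f", "p"]))
```

Keeping the gender codes and the inflection tags as separate arguments mirrors the note above: the headword genders can be dropped later without touching the gloss.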

JeffDoozan (talk) 22:08, 16 October 2021 (UTC)[reply]

Was there a decision on whether {{es-IPA}} is safe enough to add to all entries? If so, the bot should add pronunciation. On the bigger issue, I don't think all Spanish forms need their own pages. The list should be made by a human including common words but not rare words. Vox Sciurorum (talk) 22:47, 16 October 2021 (UTC)[reply]
Don't worry, I'm not going off on an anti-red-link campaign. I think there are tasks that are better suited to a bot than a human and generating forms seems like one of them. I'm open to input on how to apply this: perhaps only generating forms for lemmas that are DRAE attested that don't contain an obsolete/disused/antiquated qualifier. Additionally, the bot could monitor a page where humans could add lemmas that they deem form-worthy and the bot can save them the labor of creating them manually. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]
@Vox Sciurorum: I asked this question recently and unfortunately, es-IPA is not 100% foolproof yet. I can get a citation if you need. —Justin (koavf)TCM 06:53, 27 October 2021 (UTC)[reply]
See what the convention is for Portuguese. One may not be better than the other, but two similar languages that often have identically spelled cognates should use the same wording. Vox Sciurorum (talk) 13:01, 17 October 2021 (UTC)[reply]
I've seen all of the variations that I posted above without an obvious consensus, so I thought it would be good to brainstorm to see if there are any nuances I've missed. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]
  • No preference on templates for verde et al. I don't think e.g. "masculine plural of hombre" makes sense; I'd say don't mention the gender of nouns in the definition line except for female equivalents. doctora isn't exactly a noun form; we currently treat female equivalents as lemmas (as with alternative forms), and I agree with that format. On verdes et al, I'm weakly against including the gender/number in the headword line.
  • Something I would love to see you do with this bot is find instances of bluelinks with missing parts of speech. E.g. if a page has an adjective and a masculine noun, the plural of the noun will often exist while omitting the adjective form of the same spelling. It happens a lot with verb forms and nouns in -a, -e, -o.
  • As for es-IPA, I think it's ready for anything but modern borrowings and especially long words. I'm not 100% sure when secondary stress is used, but I suspect it's present with multisyllabic prefixes, compounds, and some long words. BTW, thanks for adding the DRAE links! Ultimateria (talk) 17:11, 17 October 2021 (UTC)[reply]
    Thank you for the feedback, especially regarding doctora, which I've adjusted above.
    Finding missing parts of speech in bluelinks is one of the motivations for writing this, as even frequently used forms can go unnoticed for a long time, like the missing adjective form alegres, which this bot will handle easily. Another part is detecting orphaned forms that reference lemmas that have been removed. JeffDoozan (talk) 18:04, 17 October 2021 (UTC)[reply]
    The stress on -mente adverbs isn't coded into {{es-IPA}}: the stress on normalmente, for example, goes on the "mal" syllable, just as in normal. QuickPhyxa (talk) 22:25, 18 October 2021 (UTC)[reply]


hi please creat a bot to creat plural of English names I creat some Amirh123 (talk) 12:03, 18 October 2021 (UTC)[reply]

@Amirh123: you can't even write a complete sentence- why are you creating entries? You were blocked for this three years ago. If you keep doing it, the next block may be permanent. Chuck Entz (talk) 12:27, 18 October 2021 (UTC)[reply]

Effect of Apple’s iCloud Private Relay

SGrabarczuk (WMF) (talk) 21:34, 18 October 2021 (UTC)[reply]

  • As use increases the admins may want to disable disabling of account creation when blocking an IP address. Vox Sciurorum (talk) 20:45, 19 October 2021 (UTC)[reply]

Over-eager abusefilter rule?

I was trying to add a question at the information desk and tripped alarms. I was expressing surprise the equivalent to #redirect took so much work here, and I unwisely muttered a fnord and got a rude surprise:

A brief description of the abuse rule which your action matched is: Bad redirect

by, in the middle of the text, saying:

Does that mean you create a redirect article that *isn't* a #redirect (heaven forfend!), but ...

Now of course here I've used &num; rather than '#'. I don't want to add more entries to my permanent record.

Still, isn't the mention of !#!redirect *anywhere* in text kinda too broad a rule?

63 Bad redirect Disallow, Tag Enabled 04:27, 4 October 2019 by Erutuon (talk | contribs) Private

I see User:Erutuon has been away for some 2 weeks. Anyone else want to take a look? (And is searching for "interface-admin" in Wiktionary:Administrators a reasonable approximation to the set of possible 'someone's?) Shenme (talk) 06:28, 19 October 2021 (UTC)[reply]

I looked through the hits on the filter and they generally seem to be accidents or generally bad edits. There is some amount of spam where people try to redirect to other websites, which won't actually do anything, but maybe that is what Erutuon was trying to avoid. Maybe the filter should be updated to only prevent new and unregistered users from typing #REDIRECT elsewhere on the page. - TheDaveRoss 12:41, 19 October 2021 (UTC)[reply]
Erutuon only edited it. I wrote it. The problem it addressed was vandals adding a redirect to an existing page to effectively make it go away. It needs to be fixed so it only detects functioning redirects, but it addresses a real problem. I'd rather not get rid of it. Chuck Entz (talk) 13:57, 19 October 2021 (UTC)[reply]
I edited it again to make it a little less eager. The OP's attempted edit on the Information Desk would now not trigger it anymore, but all of the other edit filter hits that I looked at would. — Eru·tuon 20:36, 19 October 2021 (UTC)[reply]
Thank you. I'll redirect my attention elsewhere.   :-)   Shenme (talk) 04:11, 21 October 2021 (UTC)[reply]
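
For illustration only, the difference discussed above between a functioning redirect and a mere mention of the word can be sketched in Python. This is not the text of filter 63, which is private and would be written in the AbuseFilter rule language rather than Python. The pattern and function names are hypothetical, and the sketch assumes that a redirect only takes effect when the directive opens the page text.

```python
import re

# Assumption: a redirect only functions when "#REDIRECT [[Target]]"
# is the first thing in the page text (case-insensitive).
FUNCTIONING_REDIRECT = re.compile(
    r"^\s*#redirect\s*:?\s*\[\[[^\[\]]+\]\]", re.IGNORECASE
)

# The broad check: the word appears anywhere in the text.
ANY_MENTION = re.compile(r"#redirect", re.IGNORECASE)


def is_functioning_redirect(wikitext: str) -> bool:
    """True only when the page text opens with a redirect directive."""
    return FUNCTIONING_REDIRECT.match(wikitext) is not None


def mentions_redirect(wikitext: str) -> bool:
    """True when '#redirect' occurs anywhere, even in running prose."""
    return ANY_MENTION.search(wikitext) is not None
```

A rule built on the first function would leave a question that merely mentions the directive in running prose untouched, while still catching an existing page whose content has been replaced by a redirect.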

Automatically generating form-of entries?

Hi. Is there any way to automatically generate form-of entries, or does one have to manually go create the page and fill out the correct details for each inflection? The language I'm thinking of doing this for has some subtleties regarding accentuation, but I'll ignore that for now and just ask about e.g. Italian or French verb conjugations. 05:34, 20 October 2021 (UTC)[reply]

French and Italian inflected forms have been added by a bot: SemperBlottoBot.  --Lambiam 09:35, 20 October 2021 (UTC)[reply]

Help needed for bor cleanup

I request that someone prepare a list of words in Indo-Aryan languages deriving from Sanskrit that use only {{bor}} (i.e., no other specific templates). This will make it easier for me to replace {{bor}} with {{lbor}} (or {{slbor}}, or in a few cases to correct it to {{inh}}). A list will make the cleanup easier; otherwise it is very difficult to search through the entire list at CAT:X language terms borrowed from Sanskrit. Pinging @Benwing2, Erutuon. ·~ dictátor·mundꟾ 15:21, 21 October 2021 (UTC)[reply]

@Surjection: would it be possible for you to prepare such a list for me? Thanks for any help. ·~ dictátor·mundꟾ 15:21, 22 October 2021 (UTC)[reply]
@JeffDoozan, SemperBlotto: Could either of you help me prepare a list? Maybe you could do it using a bot. Thanks for any help. ·~ dictátor·mundꟾ 17:31, 23 October 2021 (UTC)[reply]
@Inqilābī: I don't know if this is something I can generate, but there are a couple of things you could clarify to make this request easier to fulfill: which languages qualify as "Indo-Aryan languages deriving from Sanskrit" and exactly which other templates qualify as "no other specific templates"? If you want, say, a list of all Punjabi entries that have {{bor}} but not {{lbor}} anywhere in the entry, I can make that for you pretty easily. If, however, you're looking for a list of 10+ different languages, considering only templates that appear in the Etymology section, and excluding 10+ templates, that's significantly more work. JeffDoozan (talk) 17:47, 23 October 2021 (UTC)[reply]
@JeffDoozan: Thanks for your interest! Yes, I should clarify things. So, I would like a list of words of all Indo-Aryan languages that are categorized as only borrowed (using {{bor}}) from Sanskrit. However, you can of course choose to make separate lists language-wise, i.e., separate lists for Bengali, Hindi, Punjabi, etc. Entries that already use the specific {{lbor}} and {{slbor}} are not to be included. And also, Sanskrit includes any specific chronolect of Sanskrit as well— Classical, New, etc. Hope this helps. ·~ dictátor·mundꟾ 18:05, 23 October 2021 (UTC)[reply]
@Inqilābī: If you can give me a list of all of the language codes you want searched ("bn" for Bengali, "hi" for Hindi), I can make you a list of all entries that include {{bor}} but not {{lbor}} or {{slbor}}. JeffDoozan (talk) 18:15, 23 October 2021 (UTC)[reply]
@JeffDoozan: Here: as, bn, or, bho, inc-oas, awa, mag, bra, gu, hi, ur, kfr, ks, kok, mr, ne, pa, sd, inc-ogu, omr, pi, si .
Besides Category:Terms borrowed from Sanskrit, also do check other categories like Category:Terms borrowed from Classical Sanskrit, Category:Terms borrowed from New Sanskrit, etc. for any possible usages of {{bor}}. ·~ dictátor·mundꟾ 21:00, 23 October 2021 (UTC)[reply]
Here's your list. It turns out that there are no entries whatsoever that use {{bor|??|sa}} and also {{slbor|??|sa}} or {{lbor|??|sa}}. If you look at the stats for as (Assamese), you'll see it has 176 uses of {{bor}} and 12 uses of {{lbor}}, but there is no intersection of the pages containing {{bor}} and the pages containing {{lbor}}. (This unsigned comment was added by JeffDoozan (talk | contribs).)
@JeffDoozan: Thank you so much!! But I see you omitted some languages: inc-oas, awa, mag, bra, kfr, kok, inc-ogu, omr. ·~ dictátor·mundꟾ 14:43, 24 October 2021 (UTC)[reply]
@Inqilābī: They show 0 results because they don't contain any entries with a {{bor|??|sa}}. JeffDoozan (talk) 14:48, 24 October 2021 (UTC)[reply]
I spoke too soon: some of them do include {{bor}}. I think I know what the problem is; I'll get you a new list. JeffDoozan (talk) 14:52, 24 October 2021 (UTC)[reply]
Fixed JeffDoozan (talk) 15:00, 24 October 2021 (UTC)[reply]
@JeffDoozan: Sorry to bother again: would it be possible to make the list an automated one, similar to Category:etyl cleanup, so that upon fixing one entry, it goes off the list? I would ideally want that, otherwise the cleanup is going to be very strenuous for me. ·~ dictátor·mundꟾ 15:31, 24 October 2021 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── @Inqilābī: The list is generated from the wikimedia database dump that is generated twice a month so you won't get live updates, but I'll try to remember to keep it updated for you and maybe automate it someday. Ping me if it's ever more than 4 weeks old and I'll refresh it for you. JeffDoozan (talk) 16:54, 24 October 2021 (UTC)[reply]
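
The selection described in this thread (entries that use {{bor}} from Sanskrit but neither {{lbor}} nor {{slbor}}) can be sketched in Python as follows. This is not JeffDoozan's actual script, which works from the database dump; the function name, the in-memory `pages` mapping, and the default source code list are assumptions made for illustration, and codes for specific chronolects of Sanskrit would have to be added to `source_codes` by whoever runs it.

```python
import re
from collections import defaultdict

# Matches {{bor|LANG|SOURCE...}}, {{lbor|...}} and {{slbor|...}} and
# captures the template name, the language code and the source code.
TEMPLATE = re.compile(
    r"\{\{\s*(bor|lbor|slbor)\s*\|\s*([^|{}]+?)\s*\|\s*([^|{}]+?)\s*[|}]"
)


def bor_only_pages(pages, lang_codes, source_codes=("sa",)):
    """Return {lang: [titles]} for pages whose borrowings from the given
    source use only {{bor}}, never {{lbor}} or {{slbor}}.

    `pages` is a hypothetical mapping of entry title -> wikitext.
    """
    result = defaultdict(list)
    for title, text in pages.items():
        seen = defaultdict(set)  # language code -> template names used
        for name, lang, source in TEMPLATE.findall(text):
            if lang in lang_codes and source in source_codes:
                seen[lang].add(name)
        for lang, names in seen.items():
            if names == {"bor"}:
                result[lang].append(title)
    return dict(result)
```

Grouping the output by language code mirrors the per-language lists requested above. Because the input is a snapshot, a fixed entry only drops off the list when the snapshot is regenerated.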

Using Template:head to populate Category:English N-letter words

In Wiktionary:Beer_parlour/2021/June#Categorization_bot, User:Suzukaze-c stated that {{head}} could be used to easily populate Category:English three-letter words (which is currently sparse). I would like to propose that {{head}} be used to populate the categories for English one-letter, two-letter, and three-letter words. (I lack the permissions to implement this myself.) The only explicit counterargument mentioned in the previous discussion (though I'm open to more) is that it is difficult to browse or search through a very long list of categories at the bottom of a page. If consensus is against populating these categories, then I'll happily create RFDs for them. - excarnateSojourner (talk|contrib) 03:13, 22 October 2021 (UTC)[reply]

When it comes to implementing, it should be noted that N-letter words must "have meaning(s) beyond their component letters that are neither names nor abbreviations", so the part of speech and possibly capitalization of a term will have to be examined to determine whether it fits the categories' criteria. - excarnateSojourner (talk|contrib) 03:19, 22 October 2021 (UTC)[reply]
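
As a rough sketch of the check described above, the following Python function decides whether a term would be placed in one of the three categories. It is not how {{head}} is implemented (that would be a Lua module); the function name and the set of excluded parts of speech are hypothetical stand-ins for "names and abbreviations", and capitalization is not examined here.

```python
NUMBER_WORDS = {1: "one", 2: "two", 3: "three"}

# Hypothetical set: parts of speech treated as names, abbreviations,
# or bare letters, which do not count toward the category criteria.
EXCLUDED_POS = {"proper noun", "letter", "abbreviation", "initialism", "acronym"}


def n_letter_category(term, parts_of_speech):
    """Return the category name for a short English term, or None.

    A term qualifies only if it has one to three letters and at least
    one part of speech that is not in EXCLUDED_POS.
    """
    if not term.isalpha() or len(term) not in NUMBER_WORDS:
        return None
    if all(pos in EXCLUDED_POS for pos in parts_of_speech):
        return None
    return f"English {NUMBER_WORDS[len(term)]}-letter words"
```

For example, `n_letter_category("cat", ["noun"])` yields the three-letter category, while a term whose only part of speech is "proper noun" yields `None`.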

Talk to the Community Tech



We, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will begin on 27 October (Wednesday) at 14:30 UTC on Zoom, and will last an hour. Click here to join.


  • Become a Community Wishlist Survey Ambassador. Help us spread the word about the CWS in your community.
  • Update on the disambiguation and the real-time preview wishes
  • Questions and answers


The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (all points in the agenda except for the questions and answers) will be given in English.

We can answer questions asked in English, French, Polish, Spanish, German, and Italian. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

We hope to see you! SGrabarczuk (WMF) (talk) 23:00, 22 October 2021 (UTC)[reply]

Old Prussian Macrons

Looking through Category:Old Prussian lemmas, I noticed two things: all the terms with macrons I've looked at so far are at the spellings with the macron, and all links to Old Prussian entries with macrons I've seen so far are redlinks. It turns out that Module:languages/data3/p is explicitly set to strip macrons.

The obvious question: do we remove the macron-stripping parameter from the module, or do we move all the macron forms to macronless ones and add a head parameter with the macron?

I would also mention that there are a lot of non-lemmas in Category:Old Prussian lemmas, which shows that the headwords have had very little attention since before we made the lemma/non-lemma distinction. It's not uncommon to see one or two edits by humans in 2006 or 2007 and nothing else but bot edits in the edit histories. Chuck Entz (talk) 00:17, 23 October 2021 (UTC)[reply]

Thank you for bringing this up. The Old Prussian entries on Wiktionary could definitely benefit from cleanup and consistency.
Many Old Prussian words are only attested in non-lemma form, e.g. only the accusative plural tūsimtons (thousands) is known. We could go one of two ways: we could either attempt to reconstruct the original form (which would probably belong in the Reconstruction: namespace, similar to how some Gothic and Latin terms are there), or we could stick to only describing the attested forms. I personally wouldn't want to venture far into reconstruction (since I'm not a linguist), although if the work has already been done somewhere else and is simple to incorporate, then I wouldn't mind including it.
Another issue is orthography. As with a lot of old languages, every text had its own unique way of spelling things. That's not a huge problem in itself, since we can just choose one main form arbitrarily and make the other forms alternate. What requires more thought is the fact that a lot of words have an actual orthography, influenced by German, and a reconstructed orthography, based on Balto-Slavic phonology. Macrons are one part of this, but not the entire thing. Macrons were actually used in the Enchiridion, but I'm not sure if any other texts used them. Another idea was that stress is indicated by the presence of doubled consonants preceding a vowel, but I digress.
For example, smoy (person) is attested in the Elbing vocabulary, but the reconstructed form is zmūi. We probably want to make sure terms are in the original form, or at least be aware of this nuance. For smoy, we already have the original form. But, for instance, I believe our entry ēizwa ("wound") is actually a reconstruction of the original eyswo.
(Actually, I'm not sure if either ēizwa or zmūi are attested anywhere, although looking at Kortlandt's version of the Enchiridion, I'm not seeing them.)
Something to definitely keep in mind is that the dictionary at https://wirdeins.twanksta.org/ is for a reconstructed, revived version of Prussian. For example, if you type in "telephone", you'll get "telepōns" as a result. There actually is a way to tell which words are real and which aren't. You have to click the head word, and then you'll see "telepōns <32> masc [Telephon MK]". "MK" are the initials of the person who wanted to revive Prussian. Any words with "MK" in them, or certain other initials corresponding to people involved in the project perhaps (but I've only ever seen MK), are completely out of scope for Wiktionary. Whereas, if you type in "person", you'll get "zmōi <64> masc [Smoy E 187]", where "E" indicates that the word appears in Elbing, and "Smoy" is the original attested form, while zmōi is their reconstruction.
Any other editors with interest in Old Prussian or Balto-Slavic historical linguistics in general may want to take a look at this discussion. 11:01, 23 October 2021 (UTC)[reply]
Macrons: As they are part of one regular spelling, wouldn't they belong in the title? Compare:
  • Latin, Greek, Germanic (OHG, MHG, MLG, Anglo-Saxon): Macron isn't regularly used in writing, but is used in some dictionaries to indicate vowel length. That's why titles don't have macrons.
  • Baltic (Lithuanian, Latvian): Macrons are used in writing and hence are also used in titles.
Reconstructions: Indeed, reconstructions belong in the reconstruction namespace.
--Myrelia (talk) 16:27, 24 October 2021 (UTC)[reply]
[Edited for brevity 20:29, 24 October 2021 (UTC)] Lithuanian and Latvian do use macrons/ogoneks in their standard orthography to indicate vowel length, but I think it's worth noting that like Greek (etc.) they also have special diacritical marks only used in dictionaries to indicate pitch accent that we don't include.[reply]
I think it could be reasonable to include the macrons in titles, but only for words where it is attested (probably a subset of words from the Enchiridion), not for every form where some people have tried to guess where the stress would have been. 19:09, 24 October 2021 (UTC)[reply]
The problem I see is that we haven't been very consistent in marking where and in what form these are attested. There are no doubt a number where the original document had them marked as long, but there are the words for sodium and iodine (now in rfv) that show macrons, but can't possibly have been even attested- with or without macrons. If only one source has macrons, why are there so few macronless entry names in Category:Old Prussian lemmas?
At any rate, we can't stay with the status quo. Old Prussian muti is a good illustration: it's defined as "Alternative form of mūti", but there's no way to go to mūti; the template strips the macron and treats it as a self-link. In most cases, templates have redlinks to a non-existent macronless form rather than linking to an existing macron form. Simply put, you can't use a template to link to an Old Prussian entry with a macron.
As I said, there are only two solutions: disable macron-stripping (easy, but inconsistent with how we handle other languages) or add a {{head}} parameter with the macron form to all entries with a macron, then move them to the macronless spelling (time-consuming, but there are only a couple hundred entries at most). I'm tempted to just implement the second solution.
I would suspect that a lot of people who wrote Prussian entries just didn't do their due diligence. They might have looked up the words in that revived-Prussian dictionary or something similar and just added them, without checking in what form they were attested. Either of those solutions could work as a stopgap measure. Ideally we could go through all the words and figure out the form in which they were attested, but that would take an awful lot of effort. 20:29, 24 October 2021 (UTC)[reply]
"macrons in titles, but only for words where it is attested": yeah, that is going by actual spelling in the sources.
muti/mūti: If mūti is the actual spelling in a source, then the entry mūti should stay and the templates/modules be fixed.
"so few macronless entry names": Possibilities:
--Myrelia (talk) 21:34, 24 October 2021 (UTC)[reply]
Re the first possibility: It looks like the Enchiridion (3rd Catechism) is the most voluminous source: 132 pages of text, vs. ~6 pages each for the other two Catechisms, ~800 words in the Elbing vocabulary, and various fragmentary texts.
If anyone wants to look into this further, I think a good source is prusistika.flf.vu.lt, although it is in Lithuanian (with some German glosses). It is based on Mažiulis's dictionary and has very detailed entries, including various inflected forms, 'normalized'/'reconstructed' forms, etymology, and links to the original usages. E.g., for abasus ("cart") (which we currently list under abazzus): [6].
By the way, if we want to follow that dictionary, they seem to include macrons in headwords if it is attested: īmt, but exclude them if it is not: smoy (note that the entry does include the normalized form with the macron, prefixed by an asterisk to indicate it's a reconstruction: "*zmōi̯"). 22:44, 24 October 2021 (UTC)[reply]
I just went through five arbitrary Prussian lemma entries near the start of the alphabet, and all of them were in the normalized/reconstructed form:
These are not cherrypicked. If we aim to only use attested forms as headwords, we have our work cut out for us. 00:03, 25 October 2021 (UTC)[reply]
I have my doubts about Category:prg:Chemical elements in general. Some of them would be straightforward enough for someone of that era to get right, but at least two are obviously modern inventions and there may be modern guesswork involved in the identification for others. We definitely should remove the atomic numbers and descriptive stuff: they can get that at the English entries, and it may be implying more precision than the sources merit. Chuck Entz (talk) 01:35, 25 October 2021 (UTC)[reply]
I've gone through all the entries and added |head= parameters. I did so because they do no harm as long as they match the entry title, and because it removes one obstacle to moving rather than changing the module. I won't mind if everybody decides not to move- I also checked all the headwords and added sortkeys, so if we don't move them, all the categories generated by {{head}} won't sort with all the macron forms at the end. Either way, I will have wasted a little bit of my time, but that's fine with me.
Re: muti/mūti: the way to deal with this in the macronless option is to have two noun sections next to each other: one with the macron in the headword and one without.
While going through the entries, there were a few entries that weren't nominative singular, but were formatted like a lemma. I'm guessing that these are only attested in the inflected forms. Another oddity is unds, which is defined as the masculine singular of undan. I've noticed that pretty much all of the accusative singulars end in -n, so someone may have assumed that undan is one of those, though it's said to be an alternative form of wundan, which is in turn the gloss in the Elbing vocabulary for "Wasser".
I should also mention a "Usage notes" section at drūwis, which discusses standards for capitalization of the names of religions. I'm guessing this only applies to modern revived Old Prussian, since a dead language attested mostly in vocabulary lists doesn't really have usage in that sense. I would suggest that we remove it.
This does bring up the question of capitalization, however. All of the proper nouns seem to be capitalized and the common nouns seem to be lowercase. I really have doubts as to whether this was the practice in all the original manuscripts. Still, if the difference is due solely to the linguistic background of the mostly second-language speakers who wrote the words down, we might want to standardize capitalizations to avoid a random mess. Chuck Entz (talk) 01:11, 25 October 2021 (UTC)[reply]
I've removed macron-stripping from the module. If we decide we want to move the macron forms to macronless, it can be easily changed back. At least now the links work. Chuck Entz (talk) 04:36, 25 October 2021 (UTC)[reply]
That's great. BTW: diff caused the issue, and was changed without explanation or taking care of the entries. --Myrelia (talk) 09:33, 25 October 2021 (UTC)[reply]
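
To make the muti/mūti problem concrete, here is a small Python sketch of what macron-stripping does to a link target. It is only an illustration: the real behaviour is configured in the Lua data module mentioned above, and the function names and the `strip` flag here are hypothetical.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON


def strip_macrons(text: str) -> str:
    """Remove macrons while leaving other diacritics in place."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def link_target(display_form: str, strip: bool = True) -> str:
    """Page title a link would point to for a given displayed form.

    With strip=True the macron form can never be reached by a link,
    which is the situation described for mūti above.
    """
    return strip_macrons(display_form) if strip else display_form
```

With stripping on, a link written as mūti resolves to the page muti, so an entry titled mūti is unreachable from templates; with stripping off, the link goes to the title exactly as written.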

Reference Placement

Where should inline references go? Normally the correct placement is after the information presented, but there are a number of awkward cases.

  1. Information in the headword line. Some, such as @Inqilābī, interpret that footnote link location as being for the whole L3/L4 section. I interpret such a reference as being for the information on the headword line. For longer headword lines, this could be confusing, as for some languages there may be several items on the headword line, e.g. simple pasts and past participles for English.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
  2. Senses. For less-documented languages, it may be permissible to cite a dictionary. How are dictionaries supposed to be cited? For some very simple instances, it may be possible to quote a dictionary (or thesaurus) that is in the public domain, though I suspect that may be dispreferred.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
  3. General sources. There seems to be some debate on how to cite a dictionary entry for the whole of an L3/L4 entry. I would argue that the citations should be atomised to the individual parts of the Wiktionary entry, and that where there is more information to be found in the cited dictionary, it is more appropriately referenced under 'Further Reading'. Inqilābī prefers to put the footnote link on the headword line. Note that the <ref> tag has an attribute name to allow multiple links to the same footnote. --RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
  4. Further quotations. Some of the links to the Pali Text Society dictionary are justified by its listing of locations where the words are used, which we should in the fullness of time add to Wiktionary. (I'd rather someone else did the blinking work - I haven't managed to reduce it to a simple mechanical task.) I think that hints for further extensions to the article should go under 'Further Reading'.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]
  5. Inflections. Or should we just use non-lemma entries to give sources? There may be some issues where forms don't independently meet the Criteria for Inclusion (CFI). Ideally, we could use footnotes in inflection tables, but for now that is largely an aspiration rather than a real capability.--RichardW57 (talk) 01:05, 23 October 2021 (UTC)[reply]

Category:Proto-Chinookan language

This category seems very odd to me, and it was created by a bot. Question from a complete layman: is there really a "Proto-Chinookan language"? I see no evidence of that online. Could someone knowledgeable in this check it out for validity? Thanks. PseudoSkull (talk) 18:54, 25 October 2021 (UTC)[reply]

I'm not knowledgeable in this particular matter but I do find scientific attestation on the internet: [7], [8], [9]. Fytcha (talk) 19:01, 25 October 2021 (UTC)[reply]
It was added by @-sche: diff. DTLHS (talk) 19:30, 25 October 2021 (UTC)[reply]
Chinookan languages are one of the self-evident families in North America, and no one is in doubt about its validity. So there must have been a common proto-language. However, Proto-Chinookan has not yet been reconstructed except for person and tense/aspect-marking prefixes in two papers by Silverstein. Since Sapir, comparatists have been more eager to find evidence for the inclusion of Chinookan in the Penutian stock, rather than doing the necessary homework for Proto-Chinookan first. –Austronesier (talk) 10:09, 26 October 2021 (UTC)[reply]
Interesting; as Austronesier says, it's a 'real' protolanguage (the Chinookan languages are clearly descended from a common source, and a few references refer to it), but I no longer recall (years later) why I added it, since no entries make reference to it (maybe I was going to mention it in etymologies and got sidetracked). The references on it are sparse; no objection if someone wants to remove it. - -sche (discuss) 22:14, 14 November 2021 (UTC)[reply]

Vote: New page-protection level

Proposing a new protection level in between autoconfirmed and template editor. The autoconfirmed right is easily obtained: per the details at w:WP:AUTOC, it requires just 4 days and 10 edits. Disruption by autoconfirmed users exists and isn't that rare, as can be seen with [10], [11], [12] and many others. A new protection level, stricter than autoconfirmed protection, would be used to prevent such disruption.

[Note: If both options pass, the one with greater SUPPORT:OPPOSE ratio will be implemented.]


  • Starts: 16:09, 26 October 2021 (UTC)
  • Ends: 16:09, 10 November 2021 (UTC)

Option 1: Extended confirmed user group

Creating a new right named extendedconfirmed, like w:WP:XC on Wikipedia, granted automatically after 30 days and 500 edits. This would enable an extended-confirmed protection level.


  1. Support, as proposer. Svartava2 (talk) 16:09, 26 October 2021 (UTC)[reply]
  2. Support. It could be a good protection-level that admins place on heavily-vandalized pages. And Wikipedia has it too. We should also look into other user permission groups to condition voting at RFD, etc. as I had suggested at Wiktionary:Beer parlour/2021/March § Dentonius. But we can worry about that in a later discussion. Imetsia (talk) 17:02, 26 October 2021 (UTC)[reply]
    Changed to abstain, now that we have the second option below, which I think is better. Imetsia (talk) 15:23, 27 October 2021 (UTC)[reply]
  3. Support--Jusjih (talk) 21:44, 1 November 2021 (UTC)[reply]


  1. Oppose. Unsafe move. Trusted people who are prolific editors can be made admins or template editors, if need be. Also, it would be very problematic if Wonderfool is able to edit protected pages. ·~ dictátor·mundꟾ 14:57, 27 October 2021 (UTC)[reply]
  2. Oppose Out-of-process vote. Not publicized as regular vote, etc. DCDuring (talk) 17:20, 27 October 2021 (UTC)[reply]
  3. Oppose We have a process for votes. This pseudo-vote is not following our process. What else will Svartava2 do that ignores our established processes? This whole exercise reads to me as disrespect for the Wiktionary community.
By not following the process, this pseudo-vote has not been publicized, and does not last the indicated one month. Proper editor notification and proper timelines are even more important when it comes to our infrastructure, such as the permissions system.
I cannot support this kind of procedural abuse. ‑‑ Eiríkr Útlendi │Tala við mig 17:49, 27 October 2021 (UTC)[reply]


  1. Abstain Seems redundant with the autopatroller capability. Both identify non-admins with a large number of good edits. Is it technically possible to set pages to be editable by admins and autopatrollers? Vox Sciurorum (talk) 17:18, 26 October 2021 (UTC)[reply]
    @Vox Sciurorum No, currently it isn't possible to protect a page so that only autopatrollers and admins can edit it. Added the option for this now (see below). Svartava2 (talk) 17:54, 26 October 2021 (UTC)[reply]
  2. Abstain: I think option 2 is better. PUC – 12:02, 27 October 2021 (UTC)[reply]
  3. Abstain per PUC. Imetsia (talk) 15:23, 27 October 2021 (UTC)[reply]

Option 2: Autopatroller user group

Adding an option for protection which would allow only autopatrollers and admins to edit the protected page, which currently doesn't exist. Autopatrollers are generally trusted users, unlikely to intentionally disrupt pages. Related discussion: User talk:Bhagadatta#Lahore.


  1. Support, as proposer. Plus, this would make the autopatroller group a bit more useful. Svartava2 (talk) 17:54, 26 October 2021 (UTC)[reply]
  2. Support; I think this new protection level would, for some pages, strike the right balance between too little and too much. PUC – 18:05, 26 October 2021 (UTC)[reply]
  3. Support. Imetsia (talk) 19:22, 26 October 2021 (UTC)[reply]

# Support. There are certain pages I can't edit despite having over 10,000 edits over 6 years, simply because they're targets for vandalism. Andrew Sheedy (talk) 04:43, 27 October 2021 (UTC)[reply]

I have stricken my vote, not because I do not support this, but because I interpreted this as an opinion poll, but now see that it has been presented as a vote. My support still stands, but I agree with those below who are saying that this needs a formal vote. Andrew Sheedy (talk) 02:57, 28 October 2021 (UTC)[reply]
  1. Support -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴) 13:09, 27 October 2021 (UTC)[reply]

# Support This seems like the right approach to me. DCDuring (talk) 14:21, 27 October 2021 (UTC)[reply]

  1. Support--Jusjih (talk) 21:44, 1 November 2021 (UTC)[reply]
    @Jusjih: This vote was cancelled on 28 October; see Wiktionary:Votes/2021-10/Autopatroller-level page protection. J3133 (talk) 10:57, 2 November 2021 (UTC)[reply]


  1. Oppose. There are also autopatrollers who are disruptive, controversial editors. Our current protection level is fine: WT:Autopatroller is not supposed to be a right. The proposer himself was recently made an autopatroller, and is eager to go on a vandalism spree. ·~ dictátor·mundꟾ 14:57, 27 October 2021 (UTC)[reply]
    Why not propose removal of autopatroller right for such a (non)contributor? DCDuring (talk) 15:39, 27 October 2021 (UTC)[reply]
    No protection level is entirely failproof: some administrators have also been controversial editors in the past. It's simply a question of cost-benefit analysis: will allowing more (experienced) editors to edit certain pages be worth the increased risk of controversial edits on those pages by said editors? On the whole, I'd say the answer is yes. The autopatroller status is not given lightly, and we can trust that autopatrollers are people who know what they're doing and should consequently be free to edit (almost) any page. If we then notice that even that protection level is not enough on one page or another, we could either increase the protection level for that page or revisit the question of the autopatroller status of the editor, as DCDuring suggests.
    The other risk I see is that of an admin starting to use the new protection level too liberally, on pages that should really be editable by all auto-confirmed users, not only autopatrolled ones. But that appears unlikely to me, so again, I think the benefits outweigh the costs. PUC – 16:21, 27 October 2021 (UTC)[reply]
    Inqilabi, when will you stop making nonsensical and illogical comments? I initially proposed only the extended confirmed option, but later Imetsia suggested the autopatroller one on Discord, and I agreed to include that; it has nothing to do with my recent whitelisting. I'm not a non-contributor. The 2 admins who played a role in whitelisting are sensible, unlike you. Just because of our recent controversies and disputes, please do not make every page our battle ground, like you also did at diff. For your inappropriate comments, you should be blocked to be taught a lesson. Svartava2 (talk) 16:59, 27 October 2021 (UTC)[reply]
    Even if this is a plot to gain access to a protected page, the admin who prescribed the meaning or etymology of the word is still going to be watching the page to protect it against dissenting views. This proposal belongs in a discussion among the admins. This vote is a waste of time. Only the admins will be deciding how to use any protection mechanism. As far as I know there is no existing voter-approved policy saying that the some list of protection levels is exhaustive, so nothing to vote to amend. Vox Sciurorum (talk) 18:23, 27 October 2021 (UTC)[reply]
    @Inqilābī: I see no reason to impugn the motives of another contributor based on speculation and personal feelings. There is no evidence to allege that Svartava has created this vote pretextually so he can "go on a vandalism spree." Argue the issue based on its merits rather than by attacking the person who created it. At this point, your ad hominems have become quite annoying, disruptive, and probably blockworthy. Imetsia (talk) 19:47, 27 October 2021 (UTC)[reply]
    TBH, having watched some of their interactions from afar, I wouldn't say Svartava is beyond reproach either; so without condoning Inqilabi's ad hominems I can see where they're coming from. But yes, it's regrettable that personal feelings are getting in the way.
    In general, Inqilabi and Svartava strike me both as overly enthusiastic teenagers, who have knowledge and sometimes good ideas (the idea discussed here being, in my opinion, one of them) but should learn to be more patient and accommodating. And that's me saying this. PUC – 23:00, 27 October 2021 (UTC)[reply]
    You people are mistaken, I always try my best to keep the project from harm. This was my alert to the community about the danger posed by Svartava, the only person with whom I have a standoff, caused by Svartava’s impatience and ill will. ·~ dictátor·mundꟾ 10:33, 28 October 2021 (UTC)[reply]
    Grow up please. We had a much better position in each other's eyes before a few disputes ruined it. I was patient and good willed until I accepted your dictatorship. When I disputed it, I was no longer so. My attempt to get a clean start with account user:Svartava was in good faith; I had even informed a check user to ensure that I wasn't doing anything wrong. After that you thoroughly convinced yourself that I was a vandal, which I clearly wasn't. All I would like to say is that this isn't the right discussion to talk about this; you could start a new discussion on BP or my talk page. I humbly request you to keep any discussion free of unrelated and off topic comments. Svartava2 (talk) 16:06, 28 October 2021 (UTC)[reply]
  2. Oppose Out-of-process vote. Not publicized as regular vote, etc. DCDuring (talk) 17:21, 27 October 2021 (UTC)[reply]
  3. Oppose We have a process for votes. This pseudo-vote is not following our process. What else will Svartava2 do that ignores our established processes? This whole exercise reads to me as disrespect for the Wiktionary community.
By not following the process, this pseudo-vote has not been publicized, and does not last the indicated one month. Proper editor notification and proper timelines are even more important when it comes to our infrastructure, such as the permissions system.
I cannot support this kind of procedural abuse. ‑‑ Eiríkr Útlendi │Tala við mig 17:49, 27 October 2021 (UTC)[reply]
@Eirikr: If you saw this as a non-binding poll, as I do, what would be your opinion about the proposal? PUC – 23:15, 27 October 2021 (UTC)[reply]
@PUC: It says right there at the top that this is a vote. This has many of the trappings of a formal binding procedural vote, without following the actual procedure for such a vote. I cannot view this as just a straw poll.
That aside, even as a poll, I think this is poorly constructed. It's not clear why we need either option. It's not clear why we wouldn't just censure the users causing trouble. This seems like an ill-defined solution in search of a problem. ‑‑ Eiríkr Útlendi │Tala við mig 00:20, 28 October 2021 (UTC)[reply]
Some may depend on a vote appearing in the vote box, which appears on the watchlist. Some valuable contributors may not have the time to waste on long-winded BP discussions. We often have votes when someone makes (or proposes to make) a change which some think is not widely supported and which has implications for multiple entries. If this isn't a vote, it should not purport, as in the heading, to be a vote. If it is a vote, we have a framework for such things, discussion first, possibly a poll, then a vote. I, for one, would really like to hear from a broader group of contributors, including those who are active patrolers, but I don't want to waste their time on whatever this is: proposal or poll or vote. DCDuring (talk) 01:03, 28 October 2021 (UTC)[reply]



This is NOT A VALID VOTE. If we want a vote, we follow the rules, make a vote page etc. DCDuring (talk) 15:35, 27 October 2021 (UTC)[reply]

@DCDuring: This would of course lead to the implementation of the option which has consensus and is valid. Taking important decisions on BP is precedented. The voting at BP is also precedented, see wiktionary:Beer_parlour/2021/February#Splitting_WT:RFVN and the creation of WT:Requests for verification/CJK. Svartava2 (talk) 16:59, 27 October 2021 (UTC)[reply]
The better call would have been to present the idea and ask for input, instead of setting up a vote here. —Μετάknowledgediscuss/deeds 19:33, 27 October 2021 (UTC)[reply]
I don't think there's clear-enough guidance on when to create votes and when simple discussions will suffice. Sometimes votes are created when unnecessary (the plus-templates and Prakrit-lects votes are prime examples), while at other times no vote takes place when some editors argue it should. In this case, we're voting on whether to add (not change) something to the existing state of things on a matter that does not call for a policy decision (i.e. we don't need to edit anything in our guidelines for this addition to work). I'd say a BP vote is enough for most such cases. If only there were a body with the power of judicial review to ordain whether or not a vote is required on a case-by-case basis... Imetsia (talk) 19:47, 27 October 2021 (UTC)[reply]
@Imetsia, this is a proposed decision affecting our permissions structure, affecting what our userbase can and cannot do. Any change to the permissions system has, by its very nature, a very broad impact. This is not a trivial change. As a non-trivial change, this proposal should follow established process for gaining consensus as broadly as possible. At present, that process is spelled out at Wiktionary:Voting policy. ‑‑ Eiríkr Útlendi │Tala við mig 20:49, 27 October 2021 (UTC)[reply]
We've created a new role before without any formal vote: Wiktionary:Beer parlour/2018/November § Mover role. Admittedly the impact was more limited than that of the current proposal, but still. PUC – 23:15, 27 October 2021 (UTC)[reply]
  • Vote cancelled. Some people are opposing simply because this vote is on the beer parlour and not on a separate page. I will create a proper vote today of 20 days, which will have only the second option since it looks like this is the one preferred. Svartava2 (talk) 06:13, 28 October 2021 (UTC)[reply]
    The proper vote is here. Svartava2 (talk) 16:06, 28 October 2021 (UTC)[reply]
    This vote still makes no change in policy. It's like voting to bell the cat, aspirational but not productive. Vox Sciurorum (talk) 16:11, 28 October 2021 (UTC)[reply]
    Now if that vote passes, somebody will implement that. Svartava2 (talk) 16:22, 28 October 2021 (UTC)[reply]
    @Svartava2: I would suggest delaying the vote for a week or two. There's no rush. Let people emit ideas, objections here or on the vote talkpage, etc. PUC – 18:33, 28 October 2021 (UTC)[reply]
    I actually thought the exact same thing before creating it: can/should a vote start right after its creation? But since this idea seems to have consensus, I started it anyway. Whatever, I have created this vote just for the people who were not accepting its legitimacy for it being on BP. Now since the vote has started and votes have been cast, I think it's best to keep it going. Svartava2 (talk) 19:57, 28 October 2021 (UTC)[reply]

How to deal with this case? (pitch-accent paradigms in Lithuanian)

I was editing the entry abatija (abbey), and ran into an issue. This noun has two alternative accent paradigms, abãtija (1) and abatijà (2). In total, there are 6 forms with 3 distinct pronunciations that would be written abatija:

  • abãtija - nominative paradigm 1 (lemma)
  • abãtija - instrumental, paradigm 1
  • abãtija - vocative, paradigm 1
  • abatijà - nominative, paradigm 2 (lemma)
  • abatijà - instrumental, paradigm 2
  • abatìja - vocative, paradigm 2

Currently, I think all but the last are indicated properly, since they appear under a bold headword that lists the alternatives of "abãtija" or "abatijà". I'm not sure how to indicate the "abatìja" pronunciation correctly. Would it be proper to create a new ===Noun=== heading just for that pronunciation? These things are a bit of a mystery to me. 16:19, 26 October 2021 (UTC)[reply]

That depends on what you want to do with it. For example, if you want to give a pronunciation section for that form, it's a good idea to split the section into "Pronunciation 1" and "Pronunciation 2" (compare Afar awka, even though the issue is different there) or "Etymology 1" and "Etymology 2" (compare Finnish napsaa). If you don't, you could just ignore it and leave it to the reader to find the accentuation in the inflection section. Thadh (talk) 23:46, 27 October 2021 (UTC)[reply]

I'm thinking of making more navbox-style templates for topics. Thoughts?

In the style of {{table:colors}}, I'd like to add more templates that navigate between sequences or tightly-bound lists of entries. By “tightly-bound”, I mean fairly manageable and non-arbitrary sets of terms like states of the United States, rather than broad topics that would fill up a huge navbox like emotions. E.g. to start with, I'm thinking of times of day with the sequence being something like:

  • midnight
  • daytime
  • dawn/twilight
  • daybreak/sunrise
  • morning
  • midday/noon
  • lunchtime
  • afternoon
  • teatime
  • dinnertime
  • sunset/dusk
  • evening/twilight
  • suppertime
  • nighttime

Thoughts on the general idea of having more navboxes in See also sections or on the specific examples I've given here? —Justin (koavf)TCM 03:23, 27 October 2021 (UTC)[reply]

I support this. Highly valuable for anyone learning English (and any other language, if you're able to implement it more broadly, which would be great). Andrew Sheedy (talk) 04:44, 27 October 2021 (UTC)[reply]
I support this too, and for other languages as well, per Andrew Sheedy. Plus it can help spot gaps in our coverage. PUC – 12:01, 27 October 2021 (UTC)[reply]
I'd support, per all the above. Imetsia (talk) 15:26, 27 October 2021 (UTC)[reply]
I support this. I've seen the color table many times and I've always enjoyed having it available. You can ping me when they're ready and I will translate them into my languages. Fytcha (talk) 22:45, 27 October 2021 (UTC)[reply]
I’d support for things that are clearly non-arbitrary sets like states of the USA, but oppose for the example given of times of day. The identification and semantic division of times of day is highly culturally dependent and so poorly suited for a one-size-fits-all list; individual languages should have individual, manually customized lists in such cases, not a common template. Shoehorning extremely culturally specific notions like ‘teatime’ into a universal list of times of day in all languages is a bad idea. This is also a major problem with the color table template, which, for many languages, provides an extremely misleading picture of how those languages actually conceptualize and divide up the color space. — Vorziblix (talk · contribs) 20:46, 29 October 2021 (UTC)[reply]
@Vorziblix: Do you like the idea of language-specific versions of that time-of-day template? I.e. not a translated table like the color one but just one-off templates? Do you think that would be useful? —Justin (koavf)TCM 04:44, 30 October 2021 (UTC)[reply]
I think language-specific versions would be good (for example, the times of the day in Spanish or Portuguese would need to include madrugada), but it would also be good to indicate variations within a table for a given language. For instance, "teatime" could be put in parentheses or marked with a qualifier. Likewise, you could list two variations in a single box in the table: "dinnertime (US), suppertime (Canada)", for instance (obviously, adapted to cover all English-speaking countries). I do think it's valuable to have tables like these, even if some elements are variable. Andrew Sheedy (talk) 16:40, 30 October 2021 (UTC)[reply]
@Koavf: Yes, I’d support one-off language-specific lists, with items chosen as appropriate to each particular language. I do think lists are helpful in that form. (In fact, for the color-table template, I made a couple such one-off versions for languages where the original was wholly unsuitable, e.g. Template:table:colors/egy. Unfortunately there are many more languages that still use the common table despite its unsuitability for them.) I also agree with @Andrew Sheedy that including variations within a particular language is a good idea; in many cases we already do this, albeit quite messily (e.g. at Template:list:Gregorian calendar months/sh/Latn). — Vorziblix (talk · contribs) 13:49, 1 November 2021 (UTC)[reply]


I made template:table:USA, template:table:USA/en, and inserted it into Indiana. I guess now that I'm thinking about it, the trick is getting alphabetical order for different languages... Seems tricky. Any thoughts? —Justin (koavf)TCM 05:22, 30 October 2021 (UTC)[reply]

@Andrew Sheedy: @Imetsia: @Fytcha: @Vorziblix: For visibility. —Justin (koavf)TCM 05:23, 30 October 2021 (UTC)[reply]
It seems like most of these navigational tables are of two kinds: those with a clear sequence (temporal: days of the week, months of the year, zodiac symbols; or spatial: solar system) and those with a more-or-less arbitrary sequence (card suits, chess pieces [these could start with the lowest value and move to the highest but it does the opposite now]). None of them are alphabetical. The only way to really arrange U. S. states like this is by admission to the Union but that is not a very intuitive listing. —Justin (koavf)TCM 05:57, 30 October 2021 (UTC)[reply]
Looks good. I'm not sure how you would automate an alphabetization for other languages, since templates aren't really my thing. Andrew Sheedy (talk) 16:32, 30 October 2021 (UTC)[reply]
I agree, looks good. To alphabetize them you’d probably need to rewrite the template in Lua. Unfortunately there’s no simple solution using just MediaWiki parser functions or the like (that I’m aware of, at least). — Vorziblix (talk · contribs) 13:49, 1 November 2021 (UTC)[reply]
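For what it's worth, the alphabetization problem raised above can be sketched outside the template system. The following Python snippet is only an illustration of the logic, not something a wiki template can run: it assumes the translated names are available as a plain list, and the function names (`collation_key`, `alphabetize`) and the sample Spanish names are made up for the example. It sorts by the base letters with diacritics stripped, which is an approximation; languages that treat an accented letter as a separate letter of the alphabet (e.g. Spanish ñ) would need their own ordering rules.

```python
import unicodedata


def collation_key(name):
    """Approximate sort key: casefolded base letters, then the original name as a tie-breaker."""
    # Decompose accented characters (e.g. "í" becomes "i" + combining acute accent).
    decomposed = unicodedata.normalize("NFD", name.casefold())
    # Drop the combining marks so that accented letters sort with their base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, name)


def alphabetize(names):
    """Return the names sorted by the approximate key above."""
    return sorted(names, key=collation_key)


# Illustrative sample only: a few Spanish names of U.S. states.
spanish_states = ["Nuevo México", "Misuri", "Hawái", "Nueva York", "Míchigan"]
print(alphabetize(spanish_states))
```

A plain code-point sort would place "Míchigan" after "Misuri", which is the kind of mismatch a per-language table would run into.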

Heads up that I propagated this template to all entries per the above encouragement. Alphabetizing for other languages is still a big consideration that I'm not competent to bother with. I'll continue making more of these and post them for feedback. —Justin (koavf)TCM 23:20, 14 November 2021 (UTC)[reply]

What is Wiktionary:Requested_entries_(Swiss_German)?

I thought Swiss German is treated as a part of Alemannic German, which already has Wiktionary:Requested_entries_(Alemannic_German). Is it a remnant from the time Swiss German was treated as a separate language? Fytcha (talk) 22:40, 27 October 2021 (UTC)[reply]

@Fytcha: Once merged into Wiktionary:Requested entries (Alemannic German), I think it can be speedily deleted. —Μετάknowledgediscuss/deeds 18:28, 29 October 2021 (UTC)[reply]
@Metaknowledge: Done. Fytcha (talk) 18:44, 29 October 2021 (UTC)[reply]

Template idea: {{der?}}

Having just done some maintenance in a language that I don't know, I was asking myself why there isn't a template exactly like {{der}} only that it places the entries in a hidden category (similar to how etyl does it) so that they can be checked and replaced with {{bor}}/{{inh}} by people that maintain that language. Fytcha (talk) 16:56, 29 October 2021 (UTC)[reply]

@Fytcha: We have {{etystub}} and {{rfe}} for that. In this case, I'm pretty sure it's an inheritance, but in other cases you could write something like "Ultimately from" and then {{etystub}}. Thadh (talk) 17:24, 29 October 2021 (UTC)[reply]
@Thadh: They both produce visible text in the article however. I'm not sure if I'm comfortable deploying this en masse. There are lots of articles using {{der}} as the first derivational step that need cleanup by somebody knowledgeable in the language. Fytcha (talk) 18:21, 29 October 2021 (UTC)[reply]
That is the problem with en-masse replacement of {{etyl}} by {{der}}... There's not much we can do about that except restoring {{etyl}} or adding these bulky templates. Thadh (talk) 18:38, 29 October 2021 (UTC)[reply]
A simple approach is to define {{der?}} in such a way that "{{der?|L1|L2|...}}" expands to "{{etyl|L2|L1}} {{m|L2|...}}". If the point is merely being able to find applications, just define {{der?}} as a synonym of {{derive}}, and use "What links here" + Show transclusions / Hide links on page Template:der?.  --Lambiam 19:36, 29 October 2021 (UTC)[reply]
Someone who doesn't know how to view the documentation of templates might not understand the difference between {{der}} and {{der?}}. For translations, we have {{t}} vs {{t-check}}. Maybe something more like {{der-check}} or {{der-chk}}? This would also be useful for those working on {{etyl}} cleanup, so they don't have to choose between doing nothing and adding {{der}}. For that matter, it could be used to mass-replace all remaining instances of {{etyl}} in entries so we can finally deprecate the template for all languages. Chuck Entz (talk) 20:20, 29 October 2021 (UTC)[reply]
Good points, I agree that {{der-check}} would be a better name. Fytcha (talk) 20:27, 29 October 2021 (UTC)[reply]
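To make the expansion suggested by Lambiam above concrete, here is a rough Python sketch of the text substitution it describes: "{{der?|L1|L2|...}}" becoming "{{etyl|L2|L1}} {{m|L2|...}}". This is not how MediaWiki templates are implemented; the function name `expand_der_check` is hypothetical, and the parsing is deliberately naive (it assumes a single call with no nested templates or links containing pipes).

```python
def expand_der_check(call):
    """Expand a '{{der?|L1|L2|...}}' call into '{{etyl|L2|L1}} {{m|L2|...}}'.

    Naive sketch: assumes no nested templates or piped links inside the call.
    """
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call: " + call)
    name, *params = text[2:-2].split("|")
    if name != "der?" or len(params) < 3:
        raise ValueError("expected {{der?|L1|L2|term...}}: " + call)
    # L1 is the language of the entry, L2 the source language.
    entry_lang, source_lang, *rest = params
    return "{{etyl|%s|%s}} {{m|%s|%s}}" % (
        source_lang, entry_lang, source_lang, "|".join(rest)
    )


print(expand_der_check("{{der?|en|la|aqua}}"))
```

The language codes and the term in the usage line are placeholders chosen for the example.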

Best way to nominate a list of pages for deletion

In playing around with the form creation/validation bot, I've identified 80 pages that contain only a Spanish form-of definition that either references a non-existent lemma/part of speech, or references a valid lemma that does not list the given form in its header. Here's the list. Can I just have the bot add a {{d}} tag to every page, or would that create a bunch of pings or otherwise complicate life for the mods? JeffDoozan (talk) 21:15, 29 October 2021 (UTC)[reply]

@JeffDoozan Some of those should have entries like írrito for írrita, but otherwise I'd send them en masse to RFV or RFD respectively. I've seen that happen a few times. AG202 (talk) 03:29, 31 October 2021 (UTC)[reply]
No need for that, I've taken care of them all. I only created a few pages (including one stub, marucho); most were inflections of misspellings or entries that failed RFV/RFD. Thanks for posting it rather than tagging everything. Ultimateria (talk) 04:54, 1 November 2021 (UTC)[reply]
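The two checks described at the top of this thread (the lemma does not exist, or the lemma exists but does not list the form) could be sketched roughly as follows. This is not JeffDoozan's bot, whose code isn't shown here; the data layout (two plain dictionaries), the function name `find_suspect_form_pages` and the sample entries are all assumptions made for illustration.

```python
def find_suspect_form_pages(form_pages, lemma_forms):
    """Return (form, reason) pairs for form-of pages that fail either check.

    form_pages:  maps a form-of page title to the lemma it points at.
    lemma_forms: maps an existing lemma to the set of forms it lists.
    """
    suspects = []
    for form, lemma in sorted(form_pages.items()):
        if lemma not in lemma_forms:
            # Check 1: the referenced lemma has no entry at all.
            suspects.append((form, "lemma missing"))
        elif form not in lemma_forms[lemma]:
            # Check 2: the lemma exists but does not list this form.
            suspects.append((form, "form not listed by lemma"))
    return suspects


# Hypothetical sample data; "perroz" is an invented misspelling.
form_pages = {"írrita": "írrito", "casas": "casa", "perroz": "perro"}
lemma_forms = {"casa": {"casas"}, "perro": {"perros"}}
print(find_suspect_form_pages(form_pages, lemma_forms))
```

As AG202 notes, a "lemma missing" hit such as írrita may call for creating the lemma rather than deleting the form, so the output is better treated as a review list than a deletion list.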

Deletion of Wiktionary:Wanted entries

I'm here to announce that this project has failed RFD and I'm in the process of exporting the links to the appropriate Category:Requested entries pages before deleting, and I'll store the remaining links in a userpage. You've probably already noticed the banner missing from Recent changes and your Watchlist; this is why. Ultimateria (talk) 01:53, 1 November 2021 (UTC)[reply]

Unfortunately WT:REE was already a very large page and this has made it much larger. Should we split it into 26 pages by first letter now? People add far more words than they define or remove. Equinox 21:06, 7 November 2021 (UTC)[reply]
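As a rough illustration of the split by first letter proposed above, here is a small Python sketch. It assumes the requests are available as a plain list of terms; the function names and the "other" bucket for terms that don't start with a Latin letter are choices made for the example, not an existing convention.

```python
import string
import unicodedata
from collections import defaultdict


def bucket_for(term):
    """Return the A-Z bucket for a term, or 'other' if it doesn't start with a Latin letter."""
    stripped = term.strip()
    if not stripped:
        return "other"
    # Decompose the first character so that "É" files under "E".
    first = unicodedata.normalize("NFD", stripped[0])[0].upper()
    return first if first in string.ascii_uppercase else "other"


def split_requests(terms):
    """Group requested terms into per-letter lists, each sorted case-insensitively."""
    buckets = defaultdict(list)
    for term in terms:
        buckets[bucket_for(term)].append(term)
    return {letter: sorted(items, key=str.casefold) for letter, items in buckets.items()}


# Invented sample requests.
sample = ["zebra crossing", "apple pie", "Éclair", "3D printer", "aardvark"]
print(split_requests(sample))
```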

November 2021

The sorry state of gsw

I want to majorly change the policies for this lect and I need input over at Wiktionary_talk:About_Alemannic_German#The_sorry_state_of_gsw. Fytcha (talk) 17:04, 1 November 2021 (UTC)[reply]

Category:Tuoba language

I made this as it was a redlink, but it's not already included in Module:languages, Module:languages/canonical names, and Module:languages/code to canonical name (it also evidently doesn't have an ISO code or even a Glottolog entry). I want to be conservative about editing those modules so I'm providing transparency so someone else can add it more thoughtfully. —Justin (koavf)TCM 17:29, 2 November 2021 (UTC)[reply]

Macedonian secondary imperfectives

In Macedonian, if an imperfective verb is derived from a perfective verb which is itself derived from an imperfective verb, provided that there are no semantic shifts across the derivational chain (e.g. that the perfective verb does not add any meaning beyond perfectivity (such as an inchoative, diminutive or intensive meaning) to the original imperfective verb), the secondary imperfective combines the telic meaning of the perfective verb with grammatical imperfectivity and consequently appears in a limited range of contexts compared to both the perfective and the primary imperfective verb, namely the following:

1. Narrative present (where perfective actions are being recounted, possibly ones which were actually completed in the past, but expressed as imperfective forms to create an impression that they are ongoing and involve the reader or hearer)

2. Iterative or habitual contexts (where there is a series of sub-events which are individually perfective in that they attain their telos, but which comprise a global event which is not itself perfective; this sense of the secondary imperfectives is not to be confused with lexically frequentative verbs whose basic semantic content implies that an action is repeated several times in succession, e.g. with verbs like "knock" or "hum")

3. Irrealis clauses which allow or require an imperfective verb, whereas the context requires a perfective (telic) meaning.

Unlike the primary imperfective verbs, the secondary ones cannot be used in canonical present context, where the time of the event overlaps with the time of speech, e.g. in contexts where English would use a progressive construction like "I am cleaning".

There might be more uses of the secondary imperfectives than the three enumerated above, but I have not researched the subject well enough to come up with an exhaustive account.

An example:

  • скока - primary imperfective - to jump (to be in the process of jumping or to habitually jump, with an atelic meaning, i.e. with the endpoint of the event being neither asserted, implied nor denied)
  • скокне - primary perfective (to jump, with a semelfactive, instantaneous, telic meaning, with explicit assertion of the endpoint)
  • скокнува - secondary imperfective, used in contexts such as (authentic English examples which could be translated with the Macedonian secondary imperfective):
    Narrative present: "Enter MONKEY leisurely, looking about, throws up one or two things, then jumps in the box" (stage directions from a 1875 play)
    Iterative, habitual: "When an exception occurs, the processor always jumps to this instruction address, regardless of the cause" (from a modern IT textbook)
    Irrealis: "Intuitively, the sentence is not true if uttered in our world, whether the speaker actually jumps out of the window or not" (from a modern linguistics textbook)

My question is: how can I label these secondary imperfectives on Wiktionary? Currently, I've labelled them {{lb|mk|iterative}} to group them together in a list, but I think that some more meaningful label should be used, especially since {{auto cat}} has printed a misleading explanation at Category:Macedonian_iterative_verbs and since "iterative" can be confused with the frequentative meaning of verbs like knock, as mentioned above. A label that reads "secondary imperfectives" would also be unsatisfactory, since imperfectives derived from perfectives derived from imperfectives with a change of meaning are also technically secondary imperfectives, but do not display the properties laid out above. For example, in the derivational chain оди (to go, walk) > изоди (to walk until the end of a route, to complete a route) > изодува, the secondary imperfective is just a normal imperfective counterpart to изоди, because the meaning changes during the derivation of изоди from оди. Consequently, изодува does not mean "to go, walk" in narrative, habitual or irrealis contexts, and must be systematically distinguished from cases like скокнува. Finally, not labelling secondary imperfectives in any way is no good, because translating both скока and скокнува as "to jump" would mislead readers unfamiliar with Slavic languages.

The phenomenon in question should exist in other Slavic languages too, so perhaps Slavic-speaking contributors might be able to contribute more pertinently than others. Martin123xyz (talk) 08:26, 3 November 2021 (UTC)[reply]

@Martin123xyz: I don't think Russian has this construction (unprefixed derivations are already rare, and double derivations don't exist as far as I know), but I think {{lb|mk|frequentative}} is what fits best from what I understand. Do I understand correctly that second imperfectives denote repeated action over different occasions? Iterative means repeated action on one given time, rather than spread out over time. Thadh (talk) 09:21, 5 November 2021 (UTC)[reply]
@Thadh Thank you for the reply. Russian definitely has secondary imperfective verbs, i.e. imperfective verbs derived from a perfective verb itself derived from an imperfective verb (тратить > утратить > утрачивать; казать > показать > показывать). It's just that in most such derivational chains, the meaning changes at some point, mostly during the derivation of the perfective verb. Consequently, the secondary imperfective is not semantically equivalent to the primary imperfective, as is the case with the Macedonian secondary imperfectives at Category:Macedonian iterative verbs. I proposed починять as a Russian verb of this kind, since it seems to mean the same as чинить, but the fact you have not picked up on it leads me to assume that I was on the wrong track.
It should be noted that prefixation vs. suffixation is irrelevant. In Macedonian there are also cases like гори > изгори > изгорува (perhaps comparable to Russian гореть > сгореть > сгорать?), where the perfective is prefixed rather than suffixed, but the secondary imperfective still exhibits the properties listed above (narrative, habitual/frequentative and irrealis use), with the same semantics as the basic гори. This is different from cases like бие (to beat) > убие (to kill, perfective) > убива (to kill, imperfective), where the meaning changes during the perfectivization, such that убива and бие are two imperfective verbs with different meanings and a full range of imperfective uses, whereas гори and изгорува are two imperfective verbs with the same meaning such that only the former has a full range of imperfective uses, whereas the latter is restricted and statistically rarer.
Going back to the term "iterative", if you say that "frequentative" means repeated over a longer period, that's fine - we can use that term. However, we will not capture the narrative and the irrealis use discussed above. Could this be resolved by writing a Macedonian-specific definition for "frequentative", to be displayed at the top of the category of Macedonian frequentative verbs? As for verbs like "to knock", "to hum" and "to hop", would they then be labelled "iterative"?
Currently, the Wiktionary entries for хаживать and сиживать describe them as "iterative" rather than "frequentative", even though they refer to events spread out over time, contrary to what you are now proposing. Should this be changed? Martin123xyz (talk) 09:38, 5 November 2021 (UTC)[reply]
@Martin123xyz: According to en.wp (which we mostly use for grammatical explanations), frequentative is spread out repetition and iterative is momental repetition. I don't know what the options are for language-specific category explanations (you should ask others), but in the worst case scenario you could just say that's a grammatical feature that you expect readers to know, or you could create a dedicated template that redirects to a Macedonian appendix explaining this feature and automatically adds the cat.
Re Russian: yes, most prefixed (im)perfectives carry a certain change in meaning, like сгорать means "to burn up", so the second imperfective carries that meaning. Words like "починять" don't exist and would mean the same thing as the regular imperfectives. Thadh (talk) 09:50, 5 November 2021 (UTC)[reply]
Thank you for the discussion. I will look into defining "frequentative" and "iterative" for Macedonian somewhere. As for починять, there is a link to it at чинить and a page at the Russian Wiktionary with a real quote. It has also been included in several dictionaries, as you can see here. It is for this reason that I mistook it for an existing word. Martin123xyz (talk) 09:57, 5 November 2021 (UTC)[reply]
Oh, huh, guess I learned a new word today. Looks like an archaism, and it seems to mean the same thing as чинить, without any frequentative/iterative meaning attached to it apart from the regular imperfective functions. Thadh (talk) 10:14, 5 November 2021 (UTC)[reply]
Then you agree that this починять is a secondary imperfective not only in terms of its position in the derivational chain, but also in terms of how usable/normal it is. This is the case with all the verbs from Category:Macedonian iterative verbs which must somehow be subordinated to their primary imperfective counterparts. If we just create an entry for изгорува and write that it means "burn", foreigners might think that it is the basic word for "burn" and start saying things like "??селото изгорува" for "селото гори" (the village is burning). Similarly, if we just create an entry for починять and write that it means "repair", foreigners might wrongly use it everywhere where they should be using чинить instead. This is the reasoning that prompted me to start labelling Macedonian secondary imperfectives and start the discussion. For the Russian cases, perhaps usage labels like "archaic", "rare term" and the like suffice, but because in Macedonian, the secondary imperfectives also have a special grammatical status and are perfectly natural in the examples I gave in my original post (starting with the one with the monkey), a grammatical label is necessary. Martin123xyz (talk) 11:00, 5 November 2021 (UTC)[reply]

We need more and better labels (and glossary entries) for the higher register

At present, there are (literary), (formal), (solemn), which are good labels but don't cover the entirety of the higher registers. What's missing are labels (and corresponding glossary entries) for educated/erudite speech (Bildungssprache) and a general higher register (gehobene Sprache). I've heard this complaint uttered by at least one other editor before and I think there are non-standard labels like (lofty), (posh), (exalted) in circulation right now. What's more, loads of German articles currently lack the proper denotation of them belonging to the higher register which probably goes back to the lack of adequate labels in general. Fytcha (talk) 13:21, 3 November 2021 (UTC)[reply]

I agree, but I'm a bit divided on this. Clarity and order are usually nice things, but too much bureaucratic paternalism (of the authors) can also be discouraging. The user, on the other hand, regards online reference works as a tool to be used and does not appreciate the work of art. The engineer is always inculcated: "as filigree as necessary, not as possible! The work is sufficient if it fulfills its purpose." And a user who knows how to look up is not too stupid to think and understand. So I have no problem finding labels in completely different categories after or in front of a term. I prefer that to inventing 3 different formats.
On the one hand, the attributes, like the text examples, should be easy to understand, even for people with 30% language skills. Well and good. On the other hand, sometimes more precision doesn't hurt. Not every phrase from fairy tales, for example, is automatically correct under dated or obsolete. Freedom is needed here because it is impossible to foresee all cases. I therefore advocate recommendations instead of hard rules and ask for a sense of proportion with regard to hair-splitting and usefulness - especially with terms that hardly anyone looks up. And have respect for the work of generations before us, including those before the invention of the Internet. Everyone who reinvents the bicycle should have a good reason for doing so.
But it can't hurt to create a correspondence table showing which word fields could be used in which superordinate expression. You can perhaps orientate yourself on existing, recognized dictionaries. A bot could try to gently unify different things that mean the same thing. (rare, seldom, rarely, uncommon, unusual) -> rare. You have to agree beforehand whether to abbreviate or not. I am in favor of obs. instead of writing obsolete. Short wins.
And if an input mask offers a selection of attributes when editing, these will certainly be used more often. With all the goodwill, I always have problems finding formatting templates and instructions at the right time. And it will be the same for many committed helpers. There is no ill will in this; the help system could still be expanded a little. Herr de Worde (talk) 17:53, 3 November 2021 (UTC)[reply]
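The correspondence table suggested above, where a bot maps several labels with the same meaning onto one canonical label, could be sketched roughly as follows. This is only a minimal illustration in Python: the table contents come from the (rare, seldom, rarely, uncommon, unusual) -> rare example, and the names CANONICAL_LABELS, normalize_label and normalize_labels are hypothetical, not existing Wiktionary modules or bot code.

```python
# Hypothetical correspondence table: variant label -> canonical label.
# Only the "rare" group from the discussion is filled in here.
CANONICAL_LABELS = {
    "rare": "rare",
    "seldom": "rare",
    "rarely": "rare",
    "uncommon": "rare",
    "unusual": "rare",
}


def normalize_label(label: str) -> str:
    """Return the canonical label, or the trimmed label itself if it is unknown."""
    key = label.strip().lower()
    return CANONICAL_LABELS.get(key, label.strip())


def normalize_labels(labels):
    """Normalize a list of labels, dropping duplicates while keeping order."""
    seen = set()
    result = []
    for label in labels:
        canonical = normalize_label(label)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result
```

Unknown labels are passed through unchanged, which matches the call for recommendations rather than hard rules: the table only unifies what editors have already agreed means the same thing.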
I'm inclined to agree. Wiktionnaire regularly uses terms like "soutenu" for an educated register (though "formal" could be our approximate equivalent, I suppose). English tends to have fewer distinctions when it comes to formality, which makes it hard to come up with universally understandable terms. That's likely why we haven't implemented them. Andrew Sheedy (talk) 18:44, 3 November 2021 (UTC)[reply]
I would also be on board with language-specific labels. A language usually has more precise terms to meta-describe its own registers and jargons than outsider languages (in this case English). See e.g. the already mentioned terms Bildungssprache, gehobene Sprache, soutenu, or things like 敬語, Burschensprache. Fytcha (talk) 18:50, 3 November 2021 (UTC)[reply]
Russian lexicography makes wide use of the label просторечие (prostorečije, simple speech), to denote a non-geographical register that should be avoided by educated speakers in most contexts, which doesn't really correspond to "informal". Allahverdi Verdizade (talk) 13:49, 5 November 2021 (UTC)[reply]
@Allahverdi, Fytcha In Russian we have typically rendered просторечье as {{lb|ru|low|_|colloquial}} for want of a better term; this classifies under CAT:Russian informal terms although perhaps it should do something else. Benwing2 (talk) 06:03, 10 November 2021 (UTC)[reply]
@Allahverdi Verdizade Oops. Benwing2 (talk) 06:04, 10 November 2021 (UTC)[reply]
Which is exactly why I think that forcibly merging the categories "informal terms" and "colloquial terms" for all languages was a naughty thing to do. Logically, the antonym of "informal terms" is "formal terms", right? However, "formal terms" is not the opposite of просторечье. Allahverdi Verdizade (talk) 12:56, 10 November 2021 (UTC)[reply]
@Fytcha: I use both Wiktionary and DWDS (in German) to look up German words. One issue with DWDS is that the labels are rather cryptic to me as an English speaker. I've resorted to keeping a mini-dictionary just for German dictionary labels, and it's one of the main advantages of an English dictionary that I don't have to keep referring to an additional list of words that don't often appear in everyday language. The words I've collected so far are: abwertend, derb, papierdeutsch, spöttisch, gespreizt, umgangssprachlich, bildlich, übertragen, salopp, gehoben, landschaftlich, veraltend, scherzhaft, fachsprachlich, & vertraulich. I don't think there are any which don't have a reasonably close English equivalent. RDBury (talk) 14:08, 19 November 2021 (UTC)[reply]
Interesting, maybe we could collect them in some appendix. – Jberkel 14:43, 19 November 2021 (UTC)[reply]

Proposal: add a colon in {{audio}} if the third argument is present

See for instance house#Pronunciation. All other pairs in the pronunciation sections are formatted like <Key>: <Value>, so the audio stepping out of line in that regard is really inconsistent. To answer any concerns about double colons, I can run a quick grep on the Wiktionary database dump or we could just run a bot to remove trailing colons from the third argument. Note that editors are manually adding the colon to the third argument. Fytcha (talk) 16:29, 3 November 2021 (UTC)[reply]

  1. Support Martin123xyz (talk) 12:31, 4 November 2021 (UTC)[reply]
  2. Abstain I don't have strong feelings on this but I will say that the Key/Value thing isn't visually jarring for me, since the other entries are "Text: Text" and this is "Text: Media". —Justin (koavf)TCM 15:53, 4 November 2021 (UTC)[reply]
  3. Support I had thought of doing this myself; I agree it looks better with a colon. Benwing2 (talk) 04:39, 10 November 2021 (UTC)[reply]
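
The check for existing trailing colons mentioned in the proposal could be sketched along these lines. This is a rough Python illustration, not an actual bot: it assumes simple {{audio|lang|file|caption}} calls with no nested templates or piped links inside, and the function names and example file names are hypothetical.

```python
import re

# Matches a simple {{audio|...}} call; nested templates are not handled.
AUDIO_RE = re.compile(r"\{\{audio\|([^{}]*)\}\}")


def positional_args(raw: str):
    """Split the template body on '|' and keep only positional arguments."""
    return [part.strip() for part in raw.split("|") if "=" not in part]


def captions_with_trailing_colon(wikitext: str):
    """Return third arguments of {{audio}} that already end in a colon."""
    hits = []
    for match in AUDIO_RE.finditer(wikitext):
        args = positional_args(match.group(1))
        if len(args) >= 3 and args[2].endswith(":"):
            hits.append(args[2])
    return hits


def strip_trailing_colon(caption: str) -> str:
    """Remove trailing colons so the template can add its own."""
    return caption.rstrip().rstrip(":").rstrip()
```

Run over a dump, the first function would list the captions a bot needs to clean up before the template starts adding the colon itself; captions containing piped links or "=" would need a real wikitext parser instead of this split.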

Request for those who work on Samoan or Tongan (or anyone else)

I made matalafi and want to make sure that it looks correct to others. Thanks. —Justin (koavf)TCM 18:30, 4 November 2021 (UTC)[reply]

I'm very surprised that you don't even know how to format an English etymology! —Μετάknowledgediscuss/deeds 18:37, 4 November 2021 (UTC)[reply]
I want to be conservative about "borrowed", "descended from", "adapted from", "derived from", "cognate of", etc. so I don't use any structured data or templates but standard running text in English. Someone else who knows better can fix it. Cf. all the problems with the etyl template: no need to introduce more errors. —Justin (koavf)TCM 19:44, 4 November 2021 (UTC)[reply]
This is the wrong approach. Never use running text like this. If you don't know something, don't add it (or add a request template like {{rfe}} with your guess, or ping someone like me). This just goes under the radar and forces someone else to fix your mess. —Μετάknowledgediscuss/deeds 20:55, 4 November 2021 (UTC)[reply]
Another way to think of it is that having nothing there for etymology is a mess that needs to be fixed. Yes, templates should be used whenever possible but something is better than nothing. In the future, I'll add {{rfe}} along with the running text. —Justin (koavf)TCM 00:55, 5 November 2021 (UTC)[reply]

Names of people commonly referred to by their surname

For example, at Gandhi, there is a sense “Mohandas Karamchand Gandhi”; similarly at Hitler, there is the sense “Adolf Hitler”. The problem is that there are countless people with the surname who are referred to in this way, for example, Indira Gandhi (the WP article itself refers to her many times as simply Gandhi). This also doesn't seem like dictionary stuff. While it is logical to include words like Gandhian in a dictionary, because it means "relating to that particular person with surname Gandhi", their etymology can be given as From {{w|Mohandas Karamchand Gandhi}} {{suf|en||an}}. Hence, I do not think that we should include such senses. —Svārtava [tcur] 04:43, 5 November 2021 (UTC)[reply]

I disagree with eliminating these senses. If a figure has become famous enough that they have subsumed the meaning of the word, they should be included as one of its definitions. Here, "Mahatma Gandhi" is essentially the definition of Gandhi. When someone refers to "Gandhi," it can safely be assumed they are referring to Mahatma. That's useful information to include in a dictionary. Imetsia (talk) 15:55, 5 November 2021 (UTC)[reply]
My inclination is to include these when (a) the last name is used without any prior reference to the first name (people are expected to know which Hitler is being referred to; not so with Lincoln--could be Abraham Lincoln, but more context is required to make it clear); (b) the name is used in a general way outside of a given subject field (is "Montessori" unambiguously Maria Montessori outside of pedagogical literature?); (c) the name is only used this way in reference to one person (for instance, I think "Gandhi" without context always means Mahatma, not Indira Gandhi). Andrew Sheedy (talk) 16:49, 5 November 2021 (UTC)[reply]
I agree with most of this except that I think Lincoln also refers to Abraham Lincoln in most situations (at the very least in Europe). Thadh (talk) 17:02, 5 November 2021 (UTC)[reply]

Proposal: New abuse filter for Rhyme categories

As we have some pretty strict rules as to the permissible characters in {{IPAchar}} and its derivatives, it is only logical that we apply those same standards to the titles of rhyme category pages and as such trigger an abuse filter whenever somebody tries to create a rhyme category page containing such an impermissible character. See for instance the erroneously created page Category:Rhymes:Polish/aga. I'm not sure whether there needs to be special care taken on the part of bots or whether an abuse filter is enough. @Benwing2 as the owner of User:WingerBot. Fytcha (talk) 15:39, 5 November 2021 (UTC)[reply]

An abuse filter doesn't seem like the correct method- I would suggest putting these rules in a module and throwing an error. DTLHS (talk) 18:03, 5 November 2021 (UTC)[reply]
I agree with User:DTLHS here although it has to be done carefully so as not to disallow legitimate rhymes. Benwing2 (talk) 06:00, 10 November 2021 (UTC)[reply]
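
The "rules in a module that throws an error" approach could look roughly like this. It is a minimal Python sketch of the idea only: the actual list of permissible characters lives in Wiktionary's own modules and is not reproduced here, so INVALID_TO_IPA is a hypothetical table containing just a few well-known look-alikes, and check_rhyme is a made-up name.

```python
# Hypothetical table of easily typed characters that are not the IPA symbols
# they resemble, mapped to the IPA character that was probably intended.
INVALID_TO_IPA = {
    "g": "\u0261",  # ASCII g (U+0067) vs IPA script g (U+0261)
    ":": "\u02d0",  # ASCII colon vs IPA length mark (U+02D0)
    "'": "\u02c8",  # ASCII apostrophe vs IPA primary stress (U+02C8)
}


def check_rhyme(rhyme: str) -> None:
    """Raise ValueError if the rhyme contains a known impermissible character."""
    bad = sorted({ch for ch in rhyme if ch in INVALID_TO_IPA})
    if bad:
        suggestions = ", ".join(
            f"{ch!r} -> {INVALID_TO_IPA[ch]!r}" for ch in bad
        )
        raise ValueError(
            f"Invalid IPA character(s) in rhyme {rhyme!r}: {suggestions}"
        )
```

A deny-list like this only flags characters that are known to be wrong, which keeps legitimate rhymes from being rejected; a stricter allow-list would need the full character inventory agreed on first.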

Other names for Narua

The Narua language (nru) seems to have many names. Wikipedia calls it "Na", while there is also "Naxi", "Mosuo" and "Moso". I originally made a direct request at Module talk:languages/extradata3/n for these to be added, but Surjection didn't feel comfortable unilaterally adding these names. Any thoughts, objections, comments? This, that and the other (talk) 02:44, 6 November 2021 (UTC)[reply]

@Surjection No-one seems to care. Any objections to adding them? As far as I can tell, this only impacts the display of the table on Category:Narua language, but I may well be missing something. This, that and the other (talk) 05:23, 24 November 2021 (UTC)[reply]

Usefulness of translations for most Latin present participles

I was recently astonished to find that Latin present participles contained translations unlike German present participles that merely redirect to the infinitive by means of Template:present participle of (like schwimmend). The entries dedicated to the German infinitive and to the Latin 1st person singular present form offer comprehensive and detailed translations. Only a few Latin present participles have a meaning that evolved beyond the mere translation with the English -ing gerund (like repens, colens etc.), but even those can be equipped with two sections, one with the aforementioned template and one with the additional meaning (like the German entry laufend). Is there any justification for the widespread use of full-fledged translations for those verb forms instead of the Template:present participle of? Bogorm converſation 10:52, 7 November 2021 (UTC)[reply]

Agreed that it is (in most cases) unnecessary to provide a translation. Is it worth going to the trouble of deleting them, though? The Nicodene (talk) 21:25, 8 November 2021 (UTC)[reply]

Translingual animal emojis

Looking through some of the animal emojis, I've noticed that @Kephir and @Koavf have deleted quite a few of these entries (🐥, 🐪, 🐫, 🐭) while leaving many others up that lack attestation just as much (click on the ones I've listed and scroll to the left or right, you'll come across many, many emoji entries that consist of nothing more than the Unicode name, e.g. 🐬). I personally don't really care about what we do: I don't think these entries are particularly useful but they're not harmful either. The only peculiar thing is the difference in enforcement. We should either allow them all without attestation (and as such recreate the already twice deleted ones I've listed above) or get rid of the others too. I don't like inconsistency. --Fytcha (talk) 21:37, 8 November 2021 (UTC)[reply]

I think they should all be kept and we should have entries on all emoji characters and sequences. I just deleted them due to consensus. —Justin (koavf)TCM 01:16, 9 November 2021 (UTC)[reply]
They should all be deleted. DTLHS (talk) 01:17, 9 November 2021 (UTC)[reply]
  • IFF these develop specific senses beyond just the literal representation of the animal (or whatever), then those specific senses might be includable. Otherwise, delete. ‑‑ Eiríkr Útlendi │Tala við mig 01:38, 9 November 2021 (UTC)[reply]
I am of the opinion that we should provide an entry (or at least a redirect) for every printable Unicode codepoint, with a description and obvious definitions (🐫 = camel, ☂ = umbrella). It seems like an eminently useful thing for a modern Internet dictionary to do. Most other sites that have a page per codepoint are fully automatically generated. I'm fully aware that CFI as it stands does not necessarily allow this, but that doesn't shift my opinion. This, that and the other (talk) 11:25, 9 November 2021 (UTC)[reply]
Meh. I bet all this will get more complicated over time, e.g. Apple changing the gun emoji to a water-pistol in response to complaints about violence; will the meaning of the codepoint change from "gun" to "water pistol"? And that "information kiosk woman" who has somehow become a complaint emotion on Twitter. Ha, emoji in Unicode were always a bad idea. I don't regard explaining 🐫 as "camel" as a definition, more of a technical mapping. Not to mention the appallingly-thought-out flags based on ISO codes, not properly allowing for historical flags and future changes to flags. Equinox 11:33, 9 November 2021 (UTC)[reply]
We have Translingual entries for Han characters that lack a definition, so there is precedent for that. (Incidentally I've always wondered why the backstory-type info for Han characters is under the "Chinese" header. There must be a good reason, but it's lost on me.) Flag emojis, well, that's a whole different ball game (the pedant in me is obliged to point out that they're technically not codepoints). This, that and the other (talk) 11:50, 9 November 2021 (UTC)[reply]
IMO, the ones like 💁, 🍑, 💅 that have acquired special meanings beyond their Unicode character names are among the emojis most worthy of entries, since someone could be unaware of the significance, or possibly even have a font where it renders differently, and therefore look for the meaning. I'm not sure what someone looking up 🐬 would expect to find, though, other than the obvious "dolphin". 01:50, 15 November 2021 (UTC)[reply]
I'd say they should all be deleted unless they have some additional idiomatic senses, as discussed above. We don't have technical characters and just list their Unicode names as their definitions either, and this isn't far removed from that. — surjection ⟨??⟩ 21:38, 9 November 2021 (UTC)[reply]
I'm open to the possibility of putting them all in an appendix and setting up redirects to that appendix in case someone tries to look them up. But I think the Mainspace should only have the idiomatic ones. Andrew Sheedy (talk) 22:49, 9 November 2021 (UTC)[reply]
I "sub-commented" above. I feel it's in our interests to be able to "say something" about every character; however, I don't think it's worth creating special entries to say that 🐫 is a camel (when that is a reiteration of Unicode standards rather than an actual definition). We already have some nice templates. Under current rules (and pretending this was RFD; of course it isn't) I would say delete to these; however, in the long term, I wouldn't object to some automatic "Unicode page" thing that merely shows the codepoint, numbers, and official name, etc. Because someone will look these things up. It just isn't lexicography. Equinox 07:44, 10 November 2021 (UTC)[reply]
And yes I know we already have the special template that shows what a character is. But right now it isn't a page until it's created, and usually this would require a meaning beyond the "picture of a camel". Oh well you get the idea. Equinox 07:45, 10 November 2021 (UTC)[reply]

Looking for Community Consensus

There is an inactive discussion on Wiktionary talk:Administrators. If this subject is not an allowed topic, please remove kindly. GareginRA (talk) 19:47, 9 November 2021 (UTC)[reply]

Complaints about administrators can be brought to the Beer Parlour (i.e. here), as can discussions about language-specific policies. But since Armenian looks to me like chicken scratches drawn with an Etch A Sketch I will defer to the Armenian administrators on policy and blocking related to that language. Vox Sciurorum (talk) 18:04, 10 November 2021 (UTC)[reply]

Make Template:rhymes point to the new category pages

@Surjection I see you've done a lot of good work moving info (e.g. the syllable count) from the old Rhymes:... pages to the {{rhymes}} template. I think we should change {{rhymes}} to point to the new category pages instead of to the old Rhymes:... pages, and eventually delete the latter. Thoughts? Apologies if this has been discussed already, I've been gone for a couple of months and don't see any such proposal in the Beer Parlour in Sep, Oct or this month. Benwing2 (talk) 04:43, 10 November 2021 (UTC)[reply]

It has been discussed before at Wiktionary:Beer parlour/2021/August#Retiring Rhymes:. Out of my eight-step program, basically only the first step has now been done. — surjection ⟨??⟩ 10:10, 10 November 2021 (UTC)[reply]
@Surjection Thanks. However, I think it's too conservative to defer changing the {{rhymes}} template to step 7; at this rate, this will never happen, and people will still feel the need to manually update the old Rhymes: pages. I'd instead suggest moving the Rhymes:... pages to the appendix, like you suggest, and then going ahead and changing the {{rhymes}} templates to point to the category space (or switching the order of these two steps). Once we change the {{rhymes}} template, we can delete any of the old Rhymes: pages that don't have any extra information on them (by bot if done carefully), which makes it clearer which Rhymes: pages need to manually have info moved to the corresponding Category: page. BTW it should also be possible, I think, to autogenerate intermediate pages like the category equivalent of Rhymes:Italian/u-. Benwing2 (talk) 04:08, 12 November 2021 (UTC)[reply]
I don't have the time to look into it in greater detail, but I'm not strictly opposed to any plans to expedite the change. There are other considerations to be taken as well, such as the "Rhymes" link on the main page. — surjection ⟨??⟩ 11:44, 12 November 2021 (UTC)[reply]

Sudovian (Narew Baltic) language: code, orthography?


There is a manuscript called "the pagan speeches of Narew" (which I have copied here), and you can read more about it on Wikipedia. Basically, it's a dictionary of ~200 words in an unknown Baltic language, copied by an amateur (the original source is lost).

Many scholars believe that the language attested is Sudovian/Yotvingian (language code: xsv), but others claim that it is e.g. a dialect of Lithuanian with strong Germanic (Yiddish) influence. A lot of Latvian etymologies here include {{cog|xsv|...}}, and some other sources about Baltic etymology just list it as Sudovian/Yotvingian, although the Altlitauisches etymologisches Wörterbuch (ALEW) calls it "narewisch" ("nar." for short) to be agnostic. This raises the question, should the code xsv be used? I think it could be appropriate, but maybe a disclaimer could be added that the language is uncertain. I'm not sure that Category:Undetermined language is the right way to go though.

Although the more well-known extinct Baltic language, Old Prussian, is a bit of a quagmire, I think that Sudovian/Narew Baltic doesn't have to end up like that. The main problems with Old Prussian are outright neologisms, confusion over normalized/reconstructed forms, and lack of standardized orthography.

Like Old Prussian, there are people who have created Sudovian neologisms or reconstructions, e.g. the "Suduva" website, which states: "Today, the Prussian language is enjoying a revival [...]. Perhaps a restored Sūdovian-Yotvingian language [...] will also fare as well.". Even lt.wiktionary.org has a bunch of neo-Yotvingian entries: "Category:New Yotvingian words", e.g. lt:wendorėdas is not attested. Since there is only one source for Narew Baltic (not even definitely Sudovian) words as far as I am aware, except for reconstructions based on toponyms, the issue of telling whether a word is attested is pretty easy to resolve.

That still leaves orthography as an issue. The actual script used in the Narew manuscript is based on Polish, so the characters ż and ł occur in some words. Moreover, an s-character that looks more like ſ or ʃ is used. (ALEW uses ſ, but uses a font that looks like ʃ. The original papers explaining the document used ʃ. lt.wiktionary.org and some other sources use s.) I'm not sure how faithful we want to be to the original writing format.

Thanks, 23:57, 11 November 2021 (UTC)[reply]

Suggestion on Malay: Soft-redirecting Jawi entries to Latin

Now, we have separate full-fledged entries for the same Malay word in both Jawi (Arabic script) and Rumi (Latin script). For example, we have ribu and ريبو‎, both defining the word as "thousand". I suggest soft-redirecting the less commonly enquired Jawi entries to their equivalents in Rumi, like how we soft-redirect entries in simplified Chinese to traditional Chinese (e.g. 单位 to 單位), or entries in hiragana to kanji when the word is more commonly spelt in kanji (e.g. たんい to 単位). Jonashtand (talk) 14:20, 14 November 2021 (UTC)[reply]

Support. This will help to reduce redundant (and potentially also divergent) information about the same thing. Basic information (word class, meaning) should IMO however remain visible in the Jawi entry, even if it is just one click away. I have done something similar (with the kind help of User:Fenakhay) for Makassarese Lontara entries. –Austronesier (talk) 18:26, 14 November 2021 (UTC)[reply]
You can add a gloss to the link, like:
  1. Jawi spelling of tupai (squirrel).
Vox Sciurorum (talk) 14:56, 15 November 2021 (UTC)[reply]
We are doing that ↑ with the template {{ms-jawi}}, unless we are lazy because they are a lot. You can help. It will be appreciated. --Octahedron80 (talk) 00:46, 16 November 2021 (UTC)[reply]
@Octahedron80 OK. So there has been a consensus that Jawi entries are to be soft-redirected to Rumi entries? Should we write this in WT:About Malay?
May someone write the consideration please. --Octahedron80 (talk) 00:35, 21 November 2021 (UTC)[reply]

Punjabi pairī̃

Why should I be unable to find the transliteration pairī̃ for Punjabi for 'in the foot'? The Punjabi noun ਪੈਰ (pair, foot) is in Wiktionary, and a sparse declension is shown for it. I wanted to link to it from Wikipedia to explain English pairin ("Gurmukhi subscript") (includability TBD). --RichardW57m (talk) 13:04, 15 November 2021 (UTC)[reply]

Wikipedia states that the locative/instrumental case is now considered vestigial and is mostly confined to a few set adverbial expressions. It looks like an IP removed both the locative/instrumental and ablative cases from the main Punjabi noun declension table template in this 2017 edit. I could not find any discussion about the change. Apparently nobody has cared enough until now to complain about the cases' absence. 00:40, 16 November 2021 (UTC)[reply]
I suspect the 'IP' was actually @AryamanA. Anyway, it seems that the answer is that it now counts as a derived term, and would be a lemma of its own. Or are there objections to that approach? --RichardW57 (talk) 08:18, 17 November 2021 (UTC)[reply]
I don't think AryamanA was the IP, given that the two talked on the anon editor's talk page. Anyway, I'm not sure whether it would be a noun form or a lemma adverb, or both. Wiktionary:About Punjabi doesn't exist. Perhaps someone else would know the policy here, or you could just be bold with what makes the most sense to you. 21:42, 17 November 2021 (UTC)[reply]
I think I should be bold, it's just that the quotations won't be very good, and might even be wrong. The form is common enough. I brought the question here because it's a matter of policy, but we don't seem to have a community of editors of Punjabi. --RichardW57 (talk) 20:14, 18 November 2021 (UTC)[reply]
The situation reminds me of the illative case in Lithuanian, which we also don't provide in declension tables, despite still being used in spoken language. However, I don't think the templates ever included that. (Edit: on the other hand, Hindi's vocative is considered "obsolete" according to Wikipedia yet we provide that form of nouns. I wonder if a compromise of providing the terms, but with a little asterisk and footnote at the bottom of the table, would be okay. Demo here.) 00:40, 16 November 2021 (UTC) edited at 21:42, 17 November 2021 (UTC)[reply]

{{female equivalent of}}

A user has been mass-replacing {{female equivalent of}} with {{n-g|feminine equivalent of}} in (mostly) German entries. The reason they are doing this is because these nouns can not only be used to refer to female people but also to other female nouns in similes/metaphors for which they provide correct examples in their edit messages: Lebensgefährtin, Komplizin. It's worth noting that the documentation of {{female equivalent of}} states: "It is used for nouns which occur in pairs for different natural genders of the referent,". The word Lebensgefährtin is overwhelmingly used in the sense of Lebensgefährte while referring to a woman but it is also true that it can and has been used for merely grammatically female entities in the context of metaphors and other rhetorical devices. The annoying thing about their replacement is that the word doesn't show up anymore in Category:German_female_equivalent_nouns, a category bearing the description "German nouns that refer to female beings with the same characteristics as the base noun." which indisputably applies to Lebensgefährtin, it's just that in addition to referring to female beings, it may also refer to grammatically female entities in general. If we were to go with the logic this editor is applying, then the above category would be rather sparse.

Can we discuss and come to a consensus on this? I'm really not a fan of the status quo; I believe that these terms belong in that category and I don't see any reference to metaphorical senses over at Lebensgefährte either (I think there's even a policy against that), so why honor the rare metaphorical use of Lebensgefährtin by removing it from the category? If we really wanted to honor it, the way to go in my opinion would be to add a second sense below {{female equivalent of}}. Fytcha (talk) 14:47, 15 November 2021 (UTC)[reply]

This anon's (perhaps B-Fahrer (talk • contribs) ?) main problem seems to be the use of "female" instead of "feminine". There used to be a template {{feminine equivalent of}}, but it was deleted with a redirect to {{female equivalent of}} (see related discussion). As pointed out there, we already have {{feminine of}} which can be used in these cases. Removing entries from the category is not ok, I wasn't aware the edits had this side-effect. – Jberkel 15:25, 15 November 2021 (UTC)[reply]
Revert these edits and add additional senses if necessary. Ultimateria (talk) 16:11, 15 November 2021 (UTC)[reply]
@Ultimateria: In that case, tweaking the definition of {{female equivalent of}} might be helpful. The only reason why I hesitated to roll back those changes was because of the wording "It is used for nouns which occur in pairs for different natural genders of the referent, one referring to a male individual and another referring to a female individual.". Fytcha (talk) 16:17, 15 November 2021 (UTC)[reply]
@Ultimateria: I've reverted them but the editor reverted my reverts. To reiterate: All those words are grammatically the female equivalent but they are not exclusively used to address entities with a natural gender of female. We should either 1. change the documentation of {{female equivalent of}} so as to make it clear that it is the grammatically female form without necessarily having to only refer to entities of female natural gender or 2. create a new template that we can paste into all these articles as a second sense. However, writing out a {{ngd}} and applying a category really can't be the solution (inconsistent, error prone, much more typing) and the longer we wait now, the more we will have to fix later. --Fytcha (talk) 23:09, 17 November 2021 (UTC)[reply]
@Fytcha: There is nothing wrong with the template. Option 1 doesn't work; see my comments below about productora. Just because it refers to a company doesn't mean it isn't also "a productor who is female". If I had to guess, B-Fahrer is assuming that stubs with "female equivalent of X" as the only definition are as complete as they'll ever be, but doesn't realize that they are missing senses (even though BenWing made it clear in the RFD discussion a year ago and I mentioned it below a couple days ago). I'll revert the rest of their edits. Ultimateria (talk) 02:15, 18 November 2021 (UTC)[reply]
Your reverts were reverted again (with snarky summaries). In the majority of these cases {{female equivalent of}} is correct as the primary sense, but some need to be checked individually. – Jberkel 13:53, 18 November 2021 (UTC)[reply]
@Ultimateria: Unfortunately, half your edits were reverted again so now it's all very inconsistent again. Some articles now look like this: Unterstützerin. Is there any value in having both female equivalent and feminine equivalent as different senses? To cite them separately perhaps? B-Fahrer's only argument seems to be that female doesn't apply to words that are only grammatically female but sexless regarding the natural gender (which is debatable, see sense 4 of female; which is why I initially proposed changing the wording in the documentation of {{female equivalent of}} because it currently does bolster his claim). The changes made to Unterstützerin are equivalent to changing the article Unterstützer to having three senses: 1. supporter (male human) 2. supporter (human of unspecified sex) 3. supporter (sexless entity). I don't see any value whatsoever in doing this. Even if we decided that we wanted female and feminine as two separate senses (I hope not), {{n-g}} is not the way to go. Fytcha (talk) 20:45, 20 November 2021 (UTC)[reply]
@Fytcha: No value. I have no idea how that's supposed to be parsed; I'll remove the feminine equivalent definition from that page. I'll clean up the rest of these entries and deal with B-Fahrer if they come back. I did block one of their IPs for 24 hours for edit warring and invited them to participate in the discussion. Ultimateria (talk) 02:02, 22 November 2021 (UTC)[reply]
I think there are a few forms where a special treatment is warranted (like Herstellerin, Klägerin), but a usage note is probably better than having confusing separate senses. – Jberkel 10:57, 22 November 2021 (UTC)[reply]
@Jberkel: I haven't touched the pages with quotes yet because I'm undecided on how to handle them. How about the definition "female equivalent of X" with the note "May also refer to non-human entities of feminine grammatical gender"? It explains the situation, but I'm hesitant to propose anything that could probably be spread across thousands of pages, when the information belongs in a grammar and not a dictionary IMO. I guess I can live with it. Thoughts on the wording? Ultimateria (talk) 00:28, 23 November 2021 (UTC)[reply]
@Ultimateria: Is reintroducing {{feminine equivalent of}} a possibility? We could just enable it for German for the time being and harshly demand citations for every use of the template, which would alleviate the issues brought up in the RfD (namely that it is just used interchangeably with {{female equivalent of}} with no semantic distinction). Fytcha (talk) 16:21, 25 November 2021 (UTC)[reply]
@Jberkel: Thanks for linking to that discussion. By the way, I don't think {{feminine of}} may be used here. Its documentation states: "This template should be used when there is no singular/plural distinction, or this distinction is irrelevant." I also think that it's likely that that user is behind the IP: The IP started editing right about when the user stopped and additionally that user was very vocal in the discussion surrounding exactly these templates. Fytcha (talk) 16:26, 15 November 2021 (UTC)[reply]
As pointed out before, not just by me but also e.g. @Mahagaja (here), "female" simply is incorrect as dozens of examples (provided in version history and sometimes in entries) prove.
And in Romance languages too, it's often about gender than sex (see here for more if needed).
And as for the category, although it only fits partially (for a limited usage, when the -in term refers to living beings like humans or some animals), it can also be added manually by adding [[Category:German female equivalent nouns]] to the bottom of the entry.
--18:40, 15 November 2021 (UTC)
Re: "it's often about gender [more] than sex": those are simply separate senses, and we already treat them as such. A feminine term for a type of company is obviously not the "equivalent" of anything, it just means that that specific definition is not covered by the "female equivalent" template. I've rearranged Spanish productora to preserve your quotes while matching the page to our formatting norms. (Unfortunately the English glosses weren't great, so I had to improvise.) Ultimateria (talk) 00:11, 16 November 2021 (UTC)[reply]
Re: ""female" simply is incorrect" See female: 4. (grammar, less common than 'feminine') Feminine; of the feminine grammatical gender. --Fytcha (talk) 23:09, 17 November 2021 (UTC)[reply]

"sufficient", "ample" etc. as determiners

(This issue has arisen out of the RFD for the adjective sense of "enough".) It appears to me that the underlined words in the following contexts, and perhaps some other similar words too, are not adjectives, as we (and other dictionaries) presently imply, but are in fact determiners.

we have sufficient bread
we have ample bread
we have adequate money

This is on the basis that these words do not describe what kind of bread, or kind of money, as adjectives should. Certainly, if we believe that "enough" in "enough bread" is a determiner, then there seems no reason why e.g. "sufficient" in "sufficient bread" should not also be a determiner (yet in "a sufficient reason" it could be construed as an adjective).

However, before I make potentially quite wide-ranging changes along these lines, please say whether you agree/disagree that we should add determiner sections for words in usages such as these (there is in fact already a determiner section at "sufficient", but it is presently an oddity dealing only with pronoun-like usage). Mihia (talk) 18:35, 15 November 2021 (UTC)[reply]

CGEL (2002) (which calls the word class determinatives) lists enough and sufficient as sufficiency determinatives. The authors mention that enough and sufficient can appear in "fused head" constructions, which seems to be a requirement for something to be a determinative, e.g.
You've said enough to convince me.
I don't have much money with me, but sufficient for a taxi.
I don't think ample and adequate can be used in such constructions.
When used as a determinative, sufficient has a quantifying sense. Otherwise it seems like an adjective. I think enough may not ever be an adjective in current English, at least not in my idiolect. DCDuring (talk) 16:35, 16 November 2021 (UTC)[reply]
@DCDuring: Thanks for that information, which confirms my own feelings about "sufficient" at least. To me, "I have ample for a taxi", implying ample money, is fine, while "I have adequate for a taxi" is a bit more marginal. I did originally think that other similar words ("quantifying" determiners presently listed as adjectives) would come to light, but so far I haven't been able to think of any. Mihia (talk) 19:01, 20 November 2021 (UTC)[reply]
In fact, another example appears to be scant. In the usage example for sense #1, allegedly an adjective, "Mary had scant reason to believe John", it seems that Mary did not have reason that was "scant", but in fact had little reason, and indeed even the definition of #1, "Very little, very few" is that of a determiner. Use of "scant" in the "fused head" construction seems somewhat unusual, but even so I readily found e.g. "Grace smiled, while inside her heart fury brewed and boiled, and it had scant to do with Brenda", "Cap had scant to say but he began to do John small kindnesses in return", "As a philosopher of religion I have had scant to offer", etc. Mihia (talk) 22:17, 20 November 2021 (UTC)[reply]
Those cites, if durable, are enough for inclusion, but should some of these be marked rare for now? Sufficient seems to clearly have become a determiner. I wonder how long ago. It's not easy for words to break into the relatively closed sets of function words. DCDuring (talk) 15:31, 21 November 2021 (UTC)[reply]
I would say that "sufficient" has always been what it is and meant what it does, but of course traditionally all(?) determiners were called adjectives, so I think it's a question of the terminology catching up to the meaning rather than the meaning changing. Ditto for "ample" and "scant". "adequate" is perhaps slightly more borderline, but even here I see two interpretations of e.g. "We have adequate supplies", one where "adequate" describes a property of the supplies (adjective), and one where it describes the quantity of supplies (determiner). Furthermore the determiner uses e.g. "she has scant reason", or "she has ample reason" seem to me to be routine uses of the words. The only cases mentioned that I see as not common/normal are the "fused head" uses of "scant" and "adequate". Mihia (talk) 18:46, 21 November 2021 (UTC)[reply]

can the page for "reexida" be removed?

The creation of the page "reexida" was an accident and it's a spelling mistake for the Catalan word "reeixida". Kyning (talk) 04:02, 16 November 2021 (UTC)[reply]

Done.   AugPi 04:29, 16 November 2021 (UTC)[reply]

Section statistics

I made a page with some statistics about existing sections. It shows how many times a section is used, and at what level. The tables are split into Languages, POS sections mentioned in WT:POS, other sections mentioned in WT:ELE, and other Nonstandard sections.

The list of Nonstandard sections gives some interesting insight into possible additions to WT:ELE: At the top of the list is Statistics, with 27643 entries, which looks like just a bunch of information about the popularity of a given name. Meanwhile, Trivia is explicitly allowed, but only used 48 times.

Other non-sanctioned sections with more than 1000 uses include: Compounds, Readings, Idiom (explicitly disallowed), Derived characters, Alternative scripts, and Adjectival noun.

Per WT:POS, a number of section names are explicitly forbidden. Each group is listed below, followed by the matching sections found and their use counts:

  • Abbreviation, Acronym, Initialism
    Abbreviation (5), Abbreviations (383)
  • “(POS) form”: Verb form, Noun form, etc.
    Affixed forms (537), Runic forms (7), and many more with just a handful of uses
  • “(attribute) (POS)”: Transitive verb, Personal pronoun, etc. (with the exception of Proper noun)
    Adjectival noun (1389), Verbal noun (408), Dependent noun (37), Stative verb (23)
  • Cardinal number, Ordinal number, Cardinal numeral, Ordinal numeral
    Ordinal number (519)
  • Clitic, Gerund, Idiom
    Clitic (25), Gerund (75), Idiom (7266), Idioms (428)

I'm not making any suggestions here, but maybe someone with more knowledge than I have can use this to propose something concrete. In the meantime, it's proven to be helpful for catching some existing typos. If a given section has fewer than 100 uses at any level, its name is clickable and you can see which pages it's used on. JeffDoozan (talk) 01:16, 19 November 2021 (UTC)[reply]
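
The kind of tally described above (how often each section name is used, and at what heading level) can be sketched roughly as follows. This is not the script that generated the statistics page; the function name count_sections, the regular expression, and the sample entry are all illustrative assumptions, and a real run would have to read page text from a database dump rather than from an inline string.

```python
import re
from collections import Counter

# Matches a wikitext heading such as "===Noun===". The backreference \1
# requires the same number of "=" signs on both sides of the title.
# (Assumption: headings sit alone on their line, as in typical entries.)
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def count_sections(wikitext):
    """Return a Counter keyed by (section name, heading level)."""
    counts = Counter()
    for match in HEADING_RE.finditer(wikitext):
        level = len(match.group(1))  # number of "=" signs, e.g. 3 for ===Noun===
        name = match.group(2)
        counts[(name, level)] += 1
    return counts


# Hypothetical entry text, for illustration only.
sample = """==English==
===Noun===
====Derived terms====
===Verb===
==Spanish==
===Noun===
"""

for (name, level), n in sorted(count_sections(sample).items()):
    print(f"L{level} {name}: {n}")
```

Keying the counter on both name and level is what lets a tally like this flag a section that usually appears at one level but occasionally turns up at another, which is one way typos and misplaced headings could surface.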

Nice. Just FYI, the display of the last section, "WT:ELE", screws up in my browser (Edge) such that not all the numbers are visible. However, if I reduce the zoom from 100% to 80% it is all visible. It may be a browser bug. On a point somewhat related to the list, present usage of sections "Derived terms", "Related terms" and to some extent "Coordinate terms" is presently a random and inconsistent mess across many English-language entries. I sometimes wonder whether we should auto-merge at least "derived" and "related" since it seems impossible to enforce usage of these to be kept to what the documentation stipulates. Mihia (talk) 20:25, 20 November 2021 (UTC)[reply]
I don't know why it's not displaying for you, it's just a wiki table wrapped with {{rel-top}} and {{rel-bottom}} to make it collapsible. If anyone knows of a better way to display it, I'm happy to make changes. JeffDoozan (talk) 14:39, 21 November 2021 (UTC)[reply]
The columnised table may be "too much" for browsers. I looked at it in Chrome and the section that breaks in Edge displays OK, but other sections are wrong, with overlapping and clipped text. Mihia (talk) 23:25, 21 November 2021 (UTC)[reply]
Are you using custom CSS or an extension that might be altering the table? The table should only be 7 columns wide, it fits comfortably on screen even on my phone's browser. JeffDoozan (talk) 01:06, 22 November 2021 (UTC)[reply]
As far as I know, I have not customised anything. When I say "columnised table", I mean that the whole table is run across two vertical columns on the page, side by side, reading all the way down the left column, then back up to the top of the right column and down again, so 14 columns in all across the screen. In certain cases the right page column (seven table columns) overlaps the left page column and/or is right-clipped so that content is not visible. Mihia (talk) 10:28, 22 November 2021 (UTC)[reply]
That's weird, on my devices I see just a single 7 column table with nothing alongside it. I wonder if the skin you're using has some effect. If you open the page in a new private browser window, where you're not logged in to Wikimedia, does it still show up as a "columnised table" for you? Is anyone else seeing this? JeffDoozan (talk) 00:29, 23 November 2021 (UTC)[reply]
Yes, it still displays the same. Don't the "rel-top" .. "rel-end" templates always create this two-column format? What do you see below? Do you not see two columns?
To me, it looks as if applying this two-column layout to a table is "too much" for browsers (well, Edge and Chrome anyway), or cannot be made to fit, even though probably you have not actually done anything "wrong". I don't know why you would not be seeing two columns though. Mihia (talk) 15:31, 23 November 2021 (UTC)[reply]
You're absolutely right, I do see two columns with your example, but not on my page (using Firefox). I didn't realize that {{rel-top}} split the data into two columns. I've removed the {{rel-top}} and just added <div> tags to apply the same formatting without the two column weirdness. I hope it's more readable now. Thank you for your help troubleshooting this. JeffDoozan (talk) 00:27, 24 November 2021 (UTC)[reply]
Yes, it all looks to display correctly for me now. Mihia (talk) 13:03, 26 November 2021 (UTC)[reply]
Thank you for generating this list. I've had a lot of fun doing cleanup thus far.
The only proposal I could think of right now is codifying that parts of speech may appear in their plural under a ====Derived terms==== header as is the case in e.g. ق_ط_ف. This only really applies to Arabic roots from what I've seen so it should be written into Wiktionary:About_Arabic but on the other hand, there's no real reason why other languages' derived terms sections may not also be categorized by the part of speech (there's already =====Compounds===== after all) apart from the fact that lemmas usually don't have so many derived terms of so many different parts of speech that this is merited. Fytcha (talk) 19:42, 24 November 2021 (UTC)[reply]

Is there a reason why we don't have a template for interlinear glosses?

It would be useful for some {{ux}}es, see for instance the examples that I wanted to clean up in yardli or gardidi. Wikipedia seems to already have a template for that: Template:Interlinear. Fytcha (talk) 02:36, 19 November 2021 (UTC)[reply]

I guess the reason is we're a dictionary, not a grammar book. Interlinears help to understand grammar, but are next to useless to understand words. MuDavid 栘𩿠 (talk) 03:57, 19 November 2021 (UTC)[reply]
But they would help with understanding examples and quotations if one had only a shaky grasp of the language. --RichardW57 (talk) 05:00, 19 November 2021 (UTC)[reply]
@MuDavid: They should just be one more possible mode of translation ({{ux}} already allows literal translation in addition to idiomatic translations) which could be very useful for documenting some of the rarer and more obscure languages. If I were to go through a textbook of a rare language that contains examples in interlinear, wouldn't I want to conserve this information here on Wiktionary too? Fytcha (talk) 20:50, 20 November 2021 (UTC)[reply]

Clarify what web pages count as "permanently recorded" for WT:ATTEST

After this post, I made my own proposal to officially define Internet-Archived pages as "permanently recorded", and it failed with a tie, just like a similar proposal in 2012. From the discussion, it appears likely that there's a supermajority of voters who would support some reform in this direction; the tricky part is nailing down a specific compromise that would get enough support. I have little faith in my ability to craft one, both because I'm not very active on Wiktionary and because I ultimately failed to understand some of the details of the objections, but I hope somebody else tries their own proposal for this. —Kodiologist (t) 21:36, 19 November 2021 (UTC)[reply]

What about applying those laxer attestation criteria only to internet and gaming jargon terms as a first step and seeing how it goes? Fytcha (talk) 21:59, 19 November 2021 (UTC)[reply]
The reasons for people's "oppose" votes are there to see, but my impression is that what counts as "permanently recorded" was not much of an issue. I think the main concerns were around whether allowing entries on the basis of three Internet-sourced attestations risks opening the floodgates to lots of crap. While the present CFI wording in fact does not clearly exclude such entries, I think the feeling amongst objectors was that if we are doing anything around this wording, we ought to codify stronger "reliable source" requirements for Internet material. I think this is the area that needs addressing. Mihia (talk) 18:49, 20 November 2021 (UTC)[reply]

Example sentences copied?

Am I correct in understanding that the various pages using Template:RQ:ja:Xin Shidai Ri-han Cidian (101 transclusions at the moment, according to Toolforge) have copied usage examples from that dictionary? I'm not sure if that is a violation of copyright – it may well be perfectly legal, and even consistent with Wikimedia policies – but it feels wrong to me.

If I understand correctly, 新時代日漢辭典 (Xin Shidai Ri-Han Cidian) is a Chinese-Japanese bilingual dictionary. It appears to me that some portion of those 101 pages, which I think are all Japanese entries, contain usage examples copied from the dictionary. I am not suggesting that there is anything wrong with the examples as examples. But it feels contrary to norms around intellectual property to copy example sentences from that dictionary into this one, even with proper attribution. It seems to me that de minimis quotations from a range of sources would be better than multiple quotations taken from a published dictionary. Am I alone in feeling this way? Or, am I mistaken about what is going on here?

@Onionbar, Suzukaze-c, Benwing, I think you have been involved with the pages that include the template. Apologies if you're not interested in this discussion.

Cnilep (talk) 03:37, 21 November 2021 (UTC)[reply]

@Cnilep: Talk:相応しい Suzukaze-c (talk) 03:40, 21 November 2021 (UTC)[reply]

Extended mover

I would like to have the extended mover right, so that I can move pages without redirects. It's quite a useful right to have. I will use this tool responsibly and rationally, and should too many doubts/controversies arise over the pages I move, I will willingly surrender this. —Svārtava [tcur] 16:23, 26 November 2021 (UTC)[reply]

Support. Good editor and, judging by their contributions, they could use these rights. Those redirects can sometimes really be annoying. Fytcha (talk) 16:37, 26 November 2021 (UTC)[reply]
Oppose. He was given this right on a temporary basis before and abused it so quickly that Surjection had to remove it. —Μετάknowledgediscuss/deeds 20:35, 27 November 2021 (UTC)[reply]

Talk to the Community Tech: The future of the Community Wishlist Survey

We, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will take place on 30 November (Tuesday), 17:00 UTC on Zoom, and will last an hour. Click here to join.


  • Changes to the Community Wishlist Survey 2022. Help us decide.
  • Become a Community Wishlist Survey Ambassador. Help us spread the word about the CWS in your community.
  • Questions and answers


The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (all points in the agenda except for the questions and answers) will be given in English.

We can answer questions asked in English, French, Polish, Spanish, German, and Italian. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

We hope to see you! SGrabarczuk (WMF) (talk) 20:03, 26 November 2021 (UTC)[reply]

Etymologies of usernames page in the Wiktionary namespace

A number of years back, I created User:PseudoSkull/Etymologies of usernames, which has been extensively used by the community since its creation. I believe it's useful enough and has enough community interest that it could be moved to the Project space. That would make it easier to find, and would be more inviting to edit than a userspace page. Any objections? PseudoSkull (talk) 16:22, 27 November 2021 (UTC)[reply]

The project space is for pages that serve the project. You called it "useful", but it actually doesn't serve the project in any way — it's actually just interesting. I don't think we should fill up project space with content about ourselves — it's fine where it is, and you can even add a notice encouraging people to edit it if you so choose. —Μετάknowledgediscuss/deeds 20:30, 27 November 2021 (UTC)[reply]

Gender-neutral Spanish neologisms (amigx, maestrx, etc.)

We've had entries for Latinx/latinx, Chicanx/chicanx, lxs, novix, amigx, and probably a few others for many years now. Recently, however, the gender-neutral x in Spanish has reached a level of acceptance (within very limited circles) where we are starting to see entire journal articles, academic papers, or even books[14] in which all generically-gendered words are replaced by their gender-neutral equivalents. It's still a very limited phenomenon, but at this point it's relatively easy to find three durable citations for almost any common gendered Spanish word in an x form. My questions are:

  1. Should we start creating entries for all these words now that they pass WT:CFI?
  2. Should we modify {{es-noun}} and similar templates to accommodate these forms?
  3. Some of the existing x entries are marked "informal", but I'm not sure that's accurate at this point. Would "uncommon" be the best usage label?
  4. Should we create a special category for them?
  5. Should we create a standardized usage note for them?

Nosferattus (talk) 05:09, 28 November 2021 (UTC)[reply]