Wiktionary:Beer parlour/2022/August: difference between revisions

Revision as of 16:40, 28 August 2022

“Chinglish” should be added to Module:labels/data/lang/zh!

“Chinglish” means poorly translated English with inappropriate Chinese elements. Although Chinglish is a kind of English (and is already in Module:labels/data/lang/en), it should also be counted as Chinese (and be added to Module:labels/data/lang/zh), because some Chinglish words belong to Chinese rather than English, for example “un頂able” and “gay裡gay氣”. Beefwiki (talk) 15:05, 1 August 2022 (UTC)[reply]

Is it OK to create a WIP module?

Normally, I would put such an unfinished module in a user subpage of mine, especially as I'm a new module writer, but is it possible to invoke Lua code not written in a Module namespace? If not, would it be OK to make a page in the global Module: namespace? Thanks, Kiril kovachev (talk) 10:21, 3 August 2022 (UTC)[reply]

@Kiril kovachev You could make Module:User:Kiril kovachev or a subpage of that page. This, that and the other (talk) 11:51, 3 August 2022 (UTC)[reply]

Ah, thanks very much, will do. Kiril kovachev (talk) 14:13, 3 August 2022 (UTC)[reply]

Collocations Namespace

One issue I face, and WF too if memory serves, is words with many collocations (abonament is an example of a Polish entry with a medium-high amount). Would there be value in having a special namespace akin to our thesaurus? We can keep inline and the header for words with minimal amounts, but seeing as things such as collocations dictionaries exist, there might be some sense to it. Vininn126 (talk) 22:36, 3 August 2022 (UTC)[reply]

Honestly, I don’t feel like I’ve seen them enough to justify having a separate namespace (yet). Maybe a collapsible table would be better? AG202 (talk) 00:08, 4 August 2022 (UTC)[reply]

As someone who regularly adds lots of collocations, I think this would have many upsides, such as clearer sorting of collocations by POS. Perhaps by facilitating it more we will see them more. I also have considered having inline collocations be collapsible after a certain amount, always showing three with a button that says "see more". I asked for this in the grease pit but got ignored... Vininn126 (talk) 08:53, 4 August 2022 (UTC)[reply]
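Editorial sketch of the "always show three, then a 'see more' button" idea above. This is only an illustration in Python, not an existing Wiktionary module or gadget; the function names and the plain-text "see more" marker are hypothetical.

```python
from typing import List, Tuple


def split_for_display(collocations: List[str], visible: int = 3) -> Tuple[List[str], List[str]]:
    """Split a collocation list into an always-shown part and a collapsed part."""
    return collocations[:visible], collocations[visible:]


def render(collocations: List[str], visible: int = 3) -> str:
    """Render the shown items, plus a 'see more' marker only when something is hidden."""
    shown, hidden = split_for_display(collocations, visible)
    lines = list(shown)
    if hidden:
        # Stand-in for the proposed button; a real implementation would be a collapsible element.
        lines.append(f"[see more ({len(hidden)})]")
    return "\n".join(lines)
```

Entries with three or fewer collocations would render unchanged, so the collapsing only affects the long lists under discussion.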

Also, this is an extreme example, but I can extract around 500 collocations for absolutnie from a corpus. I doubt we'd want that many, but I mean I don't think having them on the page is a good idea, even in a collapsible box. Vininn126 (talk) 17:07, 4 August 2022 (UTC)[reply]

Also I'd like to ping @BigDom and @Thadh as people who have shown interest in this topic before. Vininn126 (talk) 09:06, 4 August 2022 (UTC)[reply]

Putting anything in a different namespace makes it less likely to be seen. I say just make them collapsible. Ultimateria (talk) 22:37, 5 August 2022 (UTC)[reply]

Abstract Nouns

I just reverted edits by @Skiulinamo to the category modules because they caused module errors in our noun categories. They were, however, trying to address what may very well be a legitimate problem: abstract nouns aren't included in our category structure. Category:English abstract nouns is currently a hand-coded one-off thing that probably could use some reworking.

Skiulinamo created at least one new abstract noun category that now has an error due to my reverts, which raises the question: how should we treat abstract nouns? And how can we implement it without trashing our category structure? Chuck Entz (talk) 14:19, 4 August 2022 (UTC)[reply]

I was working to create a parent category for Category:Proto-Germanic abstract nouns. I suppose I could have hard-coded it but I didn't see any issue with going the extra mile to add a catboiler instead. (Chuck, you also unfortunately reverted my changes to the vrddhi gerund cats) --Skiulinamo (talk) 20:10, 4 August 2022 (UTC)[reply]

Category:English abstract nouns currently has 39 members, while I estimate that there are tens of thousands of English abstract nouns. There are already 9,758 English words suffixed with -ness, the vast majority of which denote an abstract concept. The concept (a term not included, although it is itself an intangible concept) is hard to apply; are terms for measurable quantities like weight or temperature abstract concepts? Is proverb an abstract concept? Assiduous attempts to populate this severely underpopulated category will inevitably lead to a great deal of arbitrariness, so I wonder which is the lesser harm, keeping it or zapping it. --Lambiam 16:06, 5 August 2022 (UTC)[reply]

It does seem like a can of worms (abstract noun?). Weight and temperature have some definitions that are not "abstract". Some definitions of abstract noun have it that the referent of an abstract noun is "a non-material, non-perceptible entity". (This would be an improvement over our existing definition IMO.) But that definition doesn't really help with items that are outside our normal terrestrial experience (black hole, absolute zero). Is thing an abstract noun? DCDuring (talk) 16:27, 5 August 2022 (UTC)[reply]

If you have a look at Category:Proto-Germanic abstract nouns, it's a silo for abstract noun suffixes, so in cases like this, it's pretty straightforward, i.e. canned wormless. Category:Turkish nouns has a pretty built-out system. --Skiulinamo (talk) 21:22, 5 August 2022 (UTC)[reply]

Common gender in Portuguese

According to @Ultimateria, it does not exist. But in Portuguese Wiktionary there's a common-gender template used in all common-gender words entries. Is comum aos dois gêneros any different from common gender? Tazuco (talk) 22:30, 5 August 2022 (UTC)[reply]

Yes, see w:Grammatical_gender#Gender_contrasts for an overview of gender contrasts that align with our practices on English Wiktionary, including the common gender of certain Germanic languages. I believe this is just an issue of semantics; I don't think our current masculine-feminine labels for Portuguese should change. Ultimateria (talk) 22:36, 5 August 2022 (UTC)[reply]

That article does not mention Portuguese; that section has had a citation needed template since 2013. I wonder why English Wiktionary is something apart. Another issue is demonstrative pronouns: in enwikt isso is classified as neuter, while in Portuguese dictionaries (not ptwikt) it's masculine. So what's the truth? Tazuco (talk) 22:51, 5 August 2022 (UTC)[reply]

Thanks for pointing out that inconsistency. I'd argue for changing isso et al to masculine, unless anyone has a reason not to. (Notifying Ungoliant MMDCCLXIV, Daniel Carrero, Jberkel, Svjatysberega, Cpt.Guapo, Munmula, Koavf): Ultimateria (talk) 17:24, 6 August 2022 (UTC)[reply]

The enwiki page on Spanish grammatical gender also classifies Spanish as having common gender. Is that wrong? There don't seem to be many sources. Tazuco (talk) 22:41, 6 August 2022 (UTC)[reply]

There are some nouns in Portuguese that can be either masculine or feminine, such as those ending in -ente/-ante (e.g. estudante, participante, residente) – those are what we call comum aos dois gêneros because they don't have a specific masculine or feminine form. As to isto and isso, they are indefinite neutral forms of este/esta and esse/essa, but are treated as masculine when referred to by adjectives (e.g. isto é bom). - Munmula (talk) 06:07, 7 August 2022 (UTC)[reply]

The term "common gender" in Wiktionary (code c) refers to a specific gender in certain languages, such as Dutch and Danish, that is the historical merger of masculine and feminine. It does *NOT* refer to terms that can be either masculine or feminine according to the sense. There is a special code mfbysense for these terms. Benwing2 (talk) 03:17, 8 August 2022 (UTC)[reply]

IP Editor Reality Check

See Special:Contributions/191.95.0.0/16.

This block of IP addresses geolocates to Colombia, and has been extremely busy in the Reconstruction namespace since February. Among the proto-languages edited: Proto-Hellenic, Proto-Italic, Proto-Indo-Iranian, Proto-West Germanic, Proto-Celtic, Proto-Semitic, Proto-West Semitic, and Proto-Turkic. I can't even guarantee this is everything, because the WHOIS utility says they have access to a /13 address space, and Special:Contributions can only show blocks of /16, which is 1/8 of the total.
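Editorial sketch of the /13 versus /16 arithmetic: a /13 has three fewer prefix bits than a /16, so it contains 2^3 = 8 blocks of /16 size. The Python below enumerates them with the standard `ipaddress` module. It assumes the /13 is the aligned block containing 191.95.0.0/16; the WHOIS record itself is not quoted in the discussion, so the exact allocation is an assumption.

```python
import ipaddress

# The range linked in the post above.
visible = ipaddress.ip_network("191.95.0.0/16")

# Assumption: the WHOIS /13 is the aligned /13 that contains the /16 above.
allocation = visible.supernet(new_prefix=13)

# Each /16 here would need its own Special:Contributions lookup.
blocks = list(allocation.subnets(new_prefix=16))

print(f"{allocation} contains {len(blocks)} blocks of /16")
for block in blocks:
    marker = "  <- range linked above" if block == visible else ""
    print(f"{block}{marker}")
```

Under that assumption, checking all the editing from this allocation would mean repeating the contributions lookup for each listed /16.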

This all strikes me as a bit... ambitious..., though I don't have the background to spot all the potential errors. It does bother me that they're adding IPA templates to reconstructions of languages that haven't been spoken in thousands of years- if at all. It's hard enough to reconstruct Latin pronunciation, even with descriptions of it by Latin grammarians. Proto-languages are educated guesses based on comparing incomplete data from descendants, requiring all kinds of arbitrary choices and rather vague about chronology and regional variation. Try coming up with a pronunciation that covers Los Angeles, Dallas, Atlanta, Louisville, Chicago, Boston, London, Manchester, Dublin, Sydney, Pretoria and New Delhi from Elizabethan times to the present, and you'll have an inkling of the uncertainties involved.

At any rate, this needs attention from a lot of people to check all of these edits in such a wide range of proto-languages. Pinging @Mahagaja, Fenakhay, (Notifying Wikitiki89, ZxxZxxZ, Ruakh, Qehath, Mnemosientje, Isaacmayer9, Metaknowledge, Rua, Wikitiki89, Benwing2, Mnemosientje, The Editor's Apprentice): to start with. Chuck Entz (talk) 05:09, 7 August 2022 (UTC)[reply]

@Chuck Entz I don't know about the declensions but I'm skeptical in general of pronunciations of reconstructed languages. I think we should consider blocking the IP range from edits to the mainspace and Reconstruction spaces until we can get them to discuss their changes. Benwing2 (talk) 05:25, 7 August 2022 (UTC)[reply]

I'm not thrilled about including pronunciation section in proto-language reconstructions either – especially not for Proto-Celtic, where we don't even know where stress fell – but this IP is definitely not the first or only person to add them. Unless we have a policy prohibiting them, we're going to have difficulty persuading people to knock it off. —Mahāgaja · talk 06:33, 7 August 2022 (UTC)[reply]

Should we develop such a policy? I've asked people to remove pronunciation sections from Proto-Japonic entries, for instance, not least as there is no clear consensus on several of the vowel values. A clear policy that covers all proto-languages would be a good idea, I think. ‑‑ Eiríkr Útlendi │ Tala við mig 07:45, 7 August 2022 (UTC)[reply]

I personally don't really see any issue with phonemic pronunciation of most proto-languages: If we can reconstruct the word, we should be able to provide the IPA values of the signs we are using. I know it gets tricky with languages like PIE, but overall reconstructed IPA pronunciations seem quite straightforward. Thadh (talk) 09:23, 7 August 2022 (UTC)[reply]

@Thadh: Well, no; often we can't provide the IPA values of the symbols we use. The symbols stand for phonemes, but the reconstructed realization of those phonemes is often controversial, for example the PIE laryngeals or the Proto-Semitic fricatives, not to mention the Proto-Celtic stress placement and Proto-Japonic vowel qualities already mentioned. I would say the About page (WT:AINE, WT:ASEM, etc.) should include information on how the symbols are to be pronounced, including info on any controversies or uncertainties, but IPA info should not be included on the reconstruction pages themselves. —Mahāgaja · talk 14:40, 7 August 2022 (UTC)[reply]

@Thadh, Mahagaja I agree with Mahagaja here; the symbols used in reconstructions are convenient approximations to what the sounds might have been but there are significant disputes concerning a large fraction of the symbols in the majority of reconstructions. Furthermore because the reconstructed symbols are supposed to represent phonemes already, it's IMO not clear what adding IPA gains you other than expressing one particular researcher's view on what the symbols stand for. Benwing2 (talk) 03:14, 8 August 2022 (UTC)[reply]

@Benwing2: I think it gains you exactly the same thing as any phonetically written language would with a phonemic pronunciation: It gives you an idea how the language is/was approximately pronounced. The phonemic pronunciation of a language like Dutch can be determined from the spelling and stress alone 99% of the time, even more so with German. Does that mean we don't need to give phonemic pronunciations there, either? Thadh (talk) 08:35, 8 August 2022 (UTC)[reply]

@Thadh Actually it's very hard to automatically convert German spelling to pronunciation; I have written an experimental module to do this and it runs to almost 3,000 lines and requires respelling in many situations. Dutch may be easier, but for a language like Proto-Germanic as normally written, it's pretty simple to do. But the bigger issue is that we really don't know many details of how proto-languages were pronounced. E.g. for Proto-Slavic we don't know even the phonemic vowel length/stress/tone in many cases, much less how these were actually pronounced; nor do we know very well whether and when various consonants were phonetically palatalized, what the quality of the /v/ phoneme was, how ť ď ь ъ y, yat or the nasal vowels were pronounced, how e and ě were distinguished, etc. We only have general ideas, e.g. that the yers were quite short and respectively front and back and high-ish, likewise that the nasal vowels were nasalized and front/back, etc. It is purely a guess, for example, that the nasal vowels were mid vowels, and that the yat vowel was maybe something like /æ/, and there is a lot of disagreement in scholarly sources about all these issues. By giving a pronunciation, we're necessarily taking a position in all these debates. Do we really want to do this? For attested languages, it's totally different; there's a definite way they were pronounced and we can ask a native speaker about this. Benwing2 (talk) 04:00, 9 August 2022 (UTC)[reply]

Resource for Italian languages and gestural communication

Interesting book on non-verbal communication: https://www.themarginalian.org/2012/12/13/bruno-munari-speak-italian-gestures/. —Justin (koavf)❤T☮C☺M☯ 20:37, 7 August 2022 (UTC)[reply]

Chinese Glyph n headers and template `{{Han-glyphref}}`

A small number of Chinese entries use headers Glyph 1, Glyph 2, etc. apparently to include multiple distinct glyphs/glyph forms under one entry. The entries: Search results for ""Glyph 1"". The headers are level 3 with nested level 4+ contents for each glyph; for example in 径. Most of these headings use {{Han-glyphref}} to indicate which glyph is being described. In a handful of cases, {{Han-glyphref}} is being used without the headers, along with nonstandard ordering and nesting of other headers: 凿, 屮, 彐, 蘗, 𣏒.

I have not been able to find documentation for the usage of these headers nor for the template. Would it be safe to refactor these pages to conform to the standard layout? If so, what would the best approach be? BKalmar (talk) 11:44, 11 August 2022 (UTC)[reply]

@BKalmar: I think this should be refactored to conform to standard layout, but we'd have to check case by case. — justin(r)leung _{{ (t...) | c=› }} 16:02, 11 August 2022 (UTC)[reply]

Shanghainese IPA update and Wugniu Romanisation

Wiktionary's romanisation for Shanghainese has been relatively rarely used by speakers and learners of SH. I, as well as several other editors, would like to update the romanisation to one that is most commonly used and redo some of the IPA.

The current IPA in use is way too narrow to properly represent the speech of all speakers of SH. Notable issues include the use of /ɜ/, devoicing all voiced consonants, as well as the incredibly idiolectal /vʷ/. The update aims to solve all of these issues so that the IPA can cater to the speech of all SH speakers.

The current romanisation was made before the widespread adoption of Wugniu. Wugniu has now been around for about six years and has since gained widespread adoption. Scores of learners have adopted it, and it is one of the two romanisations covered by the Rime IME, along with the MiniDict romanisation. Wugniu is also relatively versatile in that it aims to be usable for all Wu lects and hence, if adopted for SH, would pave the way for future expansions into other lects. (Also, frankly speaking, the fact that the current scheme has a pentagraph is enough reason to replace it)

I have consulted with people teaching Northern Wu and those educated in Northern Wu linguistics including but not limited to @Musetta6729 and created a scheme for the update to follow. A prototype module has been prepared by @Manishearth at a user page, and has been improved upon by @Wpi31 in a sandbox. We aim to completely phase out the WT romanisation and replace it with Wugniu, though this can be changed if necessary.

Looking beyond, Suzhounese and Wenzhounese modules are to come, and the data has been gathered for SZ. Wugniu information on WZ only draws from one source, so another discussion is to be held regarding how to deal with it. Information about Hangzhounese is already prepared and a module can be made whenever one wants.

For further reading, I have a user page regarding the change and a page for every single lect covered by Wugniu’s romanisation in the works. If there are any questions, feel free to ask. 義順 (talk) 15:22, 11 August 2022 (UTC)[reply]

@ND381: Thanks for starting off the discussion. I'm very much in support of shifting to something that's more widely used. I think a few things need to be ironed out for transition into the new system:

Display - do we completely remove Wiktionary romanization in display, or do we want to keep it for legacy?
Input - how do we plan on switching the input from Wiktionary romanization to Wugniu? Other than changes in {{zh-pron}}, it would also involve changes in {{zh-x}} usage, translation tables ({{t}}, {{t+}}) and possibly uses in {{zh-l}} (though I haven't seen a lot of that). This probably requires some bot action (@Fish bowl, would you be able to help?)

I also want input from other editors who work on Wu (who may not be aware of the discussion that's been going on mostly on Discord) @Atitarev, Thedarkknightli, ChromeGames, Mteechan. — justin(r)leung _{{ (t...) | c=› }} 15:42, 11 August 2022 (UTC)[reply]

Thank you for notifying other Wu editors. The following are merely my opinions:

1. My plan was to keep Wiktionary in the expanded zh-pron menu and only display the Wugniu in the collapsed version. The scheme has seen some use outside of Wiktionary and I believe it may be beneficial to keep it around so that in the future the origin of this romanisation would be easier to find

2. In an idealistic sense I would be all for switching to Wugniu completely; however, this may require a lot of work. Therefore I would not be against keeping the input as WT but changing the display to Wugniu. Alternatively, we can keep both (rename them {{w-wt}} and {{w-wg}} mayhap?), though that may pose other problems.

I am somewhat on the fence as to whether or not {{zh-x}} should switch to Wugniu display. On one hand, this is probably the common romanisation nowadays (excluding faux Pinyin), like Jyutping or POJ, but on the other hand, it is nowhere near as common as them due to Shanghainese not having large amounts of government support and what we will display not being "proper" Wugniu, as sandhi chains will only be able to display the tone in the head.

On the topic of sandhi chains, how would you like to see it displayed? I have a few ideas, do share more if you have them (the first in each line is my preferred choice):

Notate left prominent sandhi with a dash: eg. khu⁵-i 可以 / ⁶le-se 來三 / ciq-kuen⁷ 結棍 ?

Notate right prominent sandhi with a plus: eg. me¹+hau⁵ 蠻好 / ⁶lau+gaon⁶ 老戇 ?

It may also be beneficial for those who have not been to their official website to take a quick look around 義順 (talk) 16:12, 11 August 2022 (UTC)[reply]

I am relatively unfamiliar with most of the specifics regarding wiktionary's templates and modules so take this with a grain of salt - but I personally would not be opposed to keeping the wiktionary romanisation for posterity and having two sets of input {{w-wt}} and {{w-wg}} since I myself am unfamiliar with the wiktionary scheme and find it somewhat hard to use, though I understand if having two input modules presents excessive technical challenges.

Regarding the "properness" of only displaying the tone in the head in {{zh-x}} - I would like to say instead that, how to actually notate tone (if at all) in large bodies of romanisation has always been a bit of a controversy of sorts. Writing out the tone for every single character can be arguably a bit redundant and hard to read at times due to the overarching prominence of tone sandhi and the substantial amount of words whose tone sandhi do not match up with the component character's individual tones (even more so for something such as Suzhounese potentially in future). Among some of the more influential romanisation systems (most of which were created online during the 2000s), the Wu Society (Wu-Minidict) recommended not writing the tone at all and just dividing the characters into vague meaning- and tone sandhi-based blocks. Wugniu being largely decentralised it is not uncommon either for people to omit the tones altogether when writing or to mark tone in their own ways (including simply giving the tone sandhi pattern for a word at times), from what I have seen.

I would prefer marking left-prominent sandhi such as ⁶le-se 來三 out of the three ways proposed mainly because I find it both easier to read and more aesthetically pleasing. With right prominent sandhi (or at least the more phonemic type that wiktionary takes into account afaik) I would suggest having the original tone on the left character too - i.e. as in me¹+hau⁵ 蠻好 since this type of tone sandhi does take the original tone of the character into account and so marking the left character's tone in the display might be helpful in this case.

— @Musetta6729 Musetta6729 (talk) 09:11, 12 August 2022 (UTC)[reply]

@ND381, Musetta6729: Thanks for your replies. For input, I don't see why we should have both Wiktionary and Wugniu in input because that's redundant unless they give us different information. In a transitional period, we could perhaps have |w-sh= (or something like that) as the new format in Wugniu so that the old format could still be there as |w= in some entries without breaking the whole template, but the end goal would be to remove all old format |w= if we are to transition to Wugniu for input. — justin(r)leung _{{ (t...) | c=› }} 15:58, 15 August 2022 (UTC)[reply]
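Editorial sketch of the transitional input idea above: prefer the new parameter when an entry has it, and fall back to the old one otherwise, so that unconverted entries keep working. This is Python for illustration only, not the actual {{zh-pron}} module; the parameter names `w-sh` and `w` come from the comment above, and the function name and return labels are hypothetical.

```python
from typing import Mapping, Optional, Tuple

# Parameter names as floated in the discussion; neither is confirmed as final.
NEW_PARAM = "w-sh"  # Wugniu input (proposed)
OLD_PARAM = "w"     # Wiktionary romanisation input (existing)


def pick_shanghainese_input(params: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (scheme, value) for the Shanghainese reading, or None if none is given."""
    new_value = params.get(NEW_PARAM, "").strip()
    if new_value:
        # A converted entry: the new parameter wins even if the old one is still present.
        return ("wugniu", new_value)
    old_value = params.get(OLD_PARAM, "").strip()
    if old_value:
        # An unconverted entry: keep working, and let a bot find it later.
        return ("wiktionary", old_value)
    return None
```

Entries that still resolve to the old scheme are exactly the ones a conversion bot would need to visit before the old parameter could be removed.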

The way I see it is that there are a couple steps here:

Show Wugniu in the panel but keep showing Wiktionary by default. Wugniu can show up by default or be under the expanded section
Show Wugniu as the default, with Wiktionary under the expanded section
Use Wugniu as the input to the template
Perhaps fully deprecate the Wiktionary one and stop showing it

We can probably start with Step 1 or 2 since it just requires updating a couple templates. We let it bake for a few months and then discuss moving to 3. ManishEarth Talk • Stalk 23:20, 16 August 2022 (UTC)[reply]

Change at Lexico

A big change is afoot, according to a message on the website "We will be closing the Lexico.com website and redirecting it to Dictionary.com starting August 26, 2022". This has implications for references from them. DonnanZ (talk) 23:35, 11 August 2022 (UTC)[reply]

Anyone fancy writing a bot/script to find all our links to them and make archive.org or archive.is copies, at least in cases where we're <ref>-ing them for something specific and not just listing them in Further reading? (Or if they're moving all the content, perhaps just updating our templates to link to the new place / new name etc will work.) - -sche (discuss) 02:00, 12 August 2022 (UTC)[reply]

(And if we don't have a template that is amenable to this, now is the chance to introduce one.) Equinox ◑ 21:41, 12 August 2022 (UTC)[reply]
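Editorial sketch of the link-finding half of such a script, in Python. It scans a page's wikitext for Lexico URLs, notes whether each sits inside a `<ref>…</ref>` pair, and builds a Wayback Machine lookup address. The regular expressions, function names, and the 20220825 timestamp (the day before the announced redirect) are assumptions for illustration; fetching pages, requesting new archive captures, and editing entries are deliberately left out.

```python
import re
from typing import List, NamedTuple

# Assumed URL shape; stops at whitespace and common wikitext delimiters.
LEXICO_URL = re.compile(r"https?://(?:www\.)?lexico\.com/[^\s\]|}<]*", re.IGNORECASE)
# Matches <ref> and <ref name="...">, but not <ref ... /> or <references>.
REF_BLOCK = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.IGNORECASE | re.DOTALL)


class LexicoLink(NamedTuple):
    url: str
    in_ref: bool          # True when the link is used as an inline reference
    archive_lookup: str   # Wayback Machine address for a capture near the chosen date


def wayback_lookup(url: str, timestamp: str = "20220825") -> str:
    """Build a Wayback Machine address asking for the capture closest to the timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def find_lexico_links(wikitext: str) -> List[LexicoLink]:
    """List every Lexico URL in the wikitext, flagging those inside <ref> tags."""
    ref_spans = [m.span(1) for m in REF_BLOCK.finditer(wikitext)]
    links = []
    for match in LEXICO_URL.finditer(wikitext):
        in_ref = any(start <= match.start() < end for start, end in ref_spans)
        links.append(LexicoLink(match.group(0), in_ref, wayback_lookup(match.group(0))))
    return links
```

Sorting the results by `in_ref` would separate the citations that need an archived copy from the Further reading links that might only need a template update.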

Admin inactivity

Hi, I happened to notice that there are a number of admins that would be considered inactive [1] per Wiktionary:Administrators#Removal_for_inactivity. Could a bureaucrat do the appropriate removals, since on this wiki they have the ability to do so? Rs chen 7754 01:31, 12 August 2022 (UTC)[reply]

@Chuck Entz my fave bureaucrat might look at these. Equinox ◑ 21:40, 12 August 2022 (UTC)[reply]

Assyrian Neo-Aramaic transliterations

Hi! (Tagging @Antonklroberts and @Shuraya, since it looks like they're the most active editors for this language)

I've noticed that Assyrian Neo-Aramaic transliterations don't really follow Wiktionary transliteration rules for that language (my understanding is that they are the same as for Wiktionary:Classical_Syriac_transliteration). See for instance:

ܠܸܫܵܢܵܐ (liššānā): transliterated as "lišana", but should be "leššānā"
ܚܘܼܠܡܵܢܵܐ (ḥulmānā): transliterated as "ḵulmānā ", but should be "ḥulmānā"
ܐܲܚܵܐ (aḥḥā): transliterated as "aḵā", but should be "aḥḥā"

etc.

Is there anything that can be done to improve the situation? Unfortunately, I don't have the technical skills, but it would really be great if we had a transliteration module or something to ensure consistency (like the Arabic one). Assyrian Neo-Aramaic consistently writes vowel signs, so it shouldn't be too impossible? Sartma (talk) 20:53, 12 August 2022 (UTC)[reply]

I agree, I’ve been trying to push to use Classical Syriac as the standard transliteration because of the different varieties of Assyrian Neo-Aramaic. There are different transliterations used but I believe we should add different dialects’ versions of words and plurals on the page instead of assuming one dialect to be the standard. Shuraya (talk) 20:59, 12 August 2022 (UTC)[reply]

Hello there, although I understand why Classical Syriac is sometimes considered a Standard Assyrian, it’s not consistently corresponding enough with the rest of the regional dialects and the already standardised dialect that we have called “Iraqi Koine”, I am going through quite a few articles and tweaking the transliterations slightly. I don’t think that going full Classical Syriac mirroring is the way to go because Assyrian Neo-Aramaic doesn’t even descend from Classical Syriac in the first place. In the future the ܠܸܫܵܢܵܐ article should actually be “lišānā”.

Antonklroberts (talk) 05:52, 13 August 2022 (UTC)[reply]

@Antonklroberts, @Shuraya: Thank you for your comments!

I believe some terminological clarifications are in order (sorry for not addressing that on my first message). For languages not written with the Latin alphabet we generally have 2 options: transliteration or transcription.

With transliteration we mean merely a letter to letter correspondence, a "conversion" from one script to another. Transliterations might not say much about pronunciation. "leššānā" gives exactly the same information as ܠܸܫܵܢܵܐ‎ (liššānāˀ‎), just in a different script. All pronunciation related issues should be dealt with in the Pronunciation section. There one can give all the different dialectal pronunciations. So when I wrote about following transliteration rules for Classical Syriac, I only meant that we should use the transliteration table I linked above, generally used for all languages using a Syriac alphabet.

On the other hand (and what I think @Antonklroberts is talking about), one can give a transcription. A transcription would be the representation of how a word sounds. In this case it wouldn't make sense to use Classical Syriac rules, since Classical Syriac is a different language, with a different pronunciation.

My suggestion would be to avoid a transcription (representation of sounds) and instead stick to the transliteration (script to script conversion), especially since different dialects might have different pronunciations, so it's better to stick to the letters, not to the sounds.

I don't know enough about Syriac dialects to suggest what to do with different plurals, etc, but why don't you two guys create a WT:AAII page, where you can discuss and agree on common rules for Assyrian Neo-Aramaic? (you can take Arabic as a reference WT:AAR). That would be a good place to specify that the variant given in aii entries is the Iraqi Koine, give a transliteration table you agree on, talk about different dialects, etc.

I for one would love to have a WT:AAII page to refer to! Sartma (talk) 11:39, 13 August 2022 (UTC)[reply]

That sounds like a good idea but I’m not sure how to create or edit one myself! Shuraya (talk) 11:54, 13 August 2022 (UTC)[reply]

@Antonklroberts, @Shuraya: It's just like any other Wiktionary page. You can write on it whatever you want, organise it how you think is better, and you can use the discussion page to discuss things and agree on what you want to do. I created the page for you (see WT:AAII), but you guys have to fill it up with content. Sartma (talk) 15:39, 13 August 2022 (UTC)[reply]

@Antonklroberts, @Shuraya: You could start with deciding whether you want to go for a transliteration or a transcription, and create a table like the one they have for Arabic. Sartma (talk) 15:42, 13 August 2022 (UTC)[reply]

Tagging also: 334a, Fenakhay, Metaknowledge, Fay Freak. You guys might want to help/have something to say? Sartma (talk) 15:52, 13 August 2022 (UTC)[reply]

Template:short for - redundant?

Do we need this template? Can somebody please provide three English definitions where neither {{clipping of}} nor {{ellipsis of}} is applicable? I clicked around on the what links here page but I only found definitions that should be changed to one of those templates. — Fytcha〈 T | L | C 〉 02:36, 13 August 2022 (UTC)[reply]

Yes, we do. I prefer it to the other two you mentioned, which I never use. DonnanZ (talk) 08:35, 13 August 2022 (UTC)[reply]

Okay, but what's the difference? Saying you prefer it doesn't explain what the difference is in function. Vininn126 (talk) 10:01, 13 August 2022 (UTC)[reply]

I have added an entry for short for, which might help you to understand this phrase. DonnanZ (talk) 13:28, 13 August 2022 (UTC)[reply]

I understand the meanings (I am an English native...) but I'm talking about the functions of the templates. Vininn126 (talk) 13:30, 13 August 2022 (UTC)[reply]

Misleading. A native speaker of American English is not an English native, from Stoke-on-Trent perhaps? DonnanZ (talk) 16:57, 13 August 2022 (UTC)[reply]

🙄 Vininn126 (talk) 17:04, 13 August 2022 (UTC)[reply]

@Donnanz: It's less specific than {{clipping of}} and {{ellipsis of}}. Are there actually definitions using {{short for}} where these two are not applicable? — Fytcha〈 T | L | C 〉 13:36, 13 August 2022 (UTC)[reply]

Fytcha has slapped an RFD on short for. Is he denying this phrase exists? Or trying to get his wicked way? DonnanZ (talk) 14:32, 13 August 2022 (UTC)[reply]

If you READ the RFD you'd see that it says SOP, might be better as a collocation. Vininn126 (talk) 14:37, 13 August 2022 (UTC)[reply]

In my mind, 'Wake' is neither an ellipsis nor a clipping of Wake Island or Wake County. --Geographyinitiative (talk) 14:54, 13 August 2022 (UTC)[reply]

That one is definitely an ellipsis, if you look at other words in the category. Vininn126 (talk) 15:05, 13 August 2022 (UTC)[reply]

@Geographyinitiative: OK. What are the grounds on which you make this claim? Equinox ◑ 15:06, 13 August 2022 (UTC)[reply]

Whatever you fuckers say (I love you really), apparently people aren't happy with the terms. We have got Wiktionary:Glossary and I think we could expand that. For example, is a "clipping" (not in glossary) something where we cut off the end, like "pedo" for "pedophile"? Or what if we cut off the first part? etc. Equinox ◑ 15:09, 13 August 2022 (UTC)[reply]

I think that is accurate, but I'm not sure that clipping is limited to just the ending. I also am under the impression that an ellipsis is when a multi-word term is shortened by X amount of words. Vininn126 (talk) 15:12, 13 August 2022 (UTC)[reply]

Yeah, our definition suggests it. Also reminds me of the "poetic" gloss, which has no relation to (post-)modern poetry at all but is really a way to say "weird, formal, nonce, I hate Edmund Spenser" and would never be used on anything newer than about 1900. Equinox ◑ 15:14, 13 August 2022 (UTC)[reply]

I have a kind of confused view on this issue because there are deeper concerns and confusions wrapped up in the issue for me. Of the three alternatives (short for/ellipsis/clipping), "short for" was the least worst alternative but none of these terms correctly describe the relationship. I have used 'ellipsis' elsewhere: Zhenbao/Zhenbao Island, but I like that less because it doesn't communicate in plain English in a way that I would immediately understand if I were coming to the website new. I never use 'ellipsis' or 'clipping' in this manner outside Wiktionary, because my deeper core issue is that, for me, short for/ellipsis/clipping only applies between two bona fide words. Since 'Wake County' would not be considered an actual word to the normal reader (as I understand them), there is some other kind of relationship between it and 'Wake'. --Geographyinitiative (talk) 15:24, 13 August 2022 (UTC)[reply]

So your issue is with linguistic jargon as a whole? Vininn126 (talk) 15:26, 13 August 2022 (UTC)[reply]

Let's break this down into separate issues: one of them is that you (maybe) don't want to use words that everyday users don't understand (I see this all the time: "ambitransitive" would barely pass CFI). And indeed most people who aren't word nerds probably won't understand "ellipsis" (but "clipping" is probably fairly obvious, and "short for" is totally obvious). If there is an issue with terminology, fine, but we need to talk about that one separately. Equinox ◑ 15:28, 13 August 2022 (UTC)[reply]

The word/concept "ellipsis" was part of my high school education. I don't think it's any more rocket science than transitive/intransitive honestly. — Fytcha〈 T | L | C 〉 15:32, 13 August 2022 (UTC)[reply]

Oh wait hold on, what I really mean above is that, to me, "short for" can apply between a word and a multi-word phrase, whereas 'ellipsis' and 'clipping' seem to be confined to word-word relationships to me. But that may not be the state of how these words are used in academia. --Geographyinitiative (talk) 15:35, 13 August 2022 (UTC)[reply]

And honestly the whole point of the glossary is to allow us to use these jargonistic terms. If the explanations are too little satisfying we should update that, instead of making a brand-new template. Vininn126 (talk) 15:37, 13 August 2022 (UTC)[reply]

Be tolerant of the poor English language skills of us USers (not allowed to say Americans). Ellipsis is probably too technical for most US college grads and is probably thought to be the plural of ellipse by many. DCDuring (talk) 21:19, 13 August 2022 (UTC)[reply]

And yet donnanz is from England and couldn't handle the terms... Curious... Vininn126 (talk) 21:21, 13 August 2022 (UTC)[reply]

Actually, they're from New Zealand, though they live in England. They also have a long history of ignoring logical and technical arguments to focus on their personal tastes. Chuck Entz (talk) 22:52, 13 August 2022 (UTC)[reply]

I was more going off of the Great Britain English native claim in their babble box. Vininn126 (talk) 22:54, 13 August 2022 (UTC)[reply]

To put the record straight, NZers use British English and British spellings. DonnanZ (talk) 11:15, 14 August 2022 (UTC)[reply]

Having all three is redundant. I'm sympathetic to the point that "short for" is plainer English, but is it too much less precise to use? Someone could say e.g. is "short for" exemplī grātiā but I don't think that's what the template is for. Still, if people are wedded to "short for", could we (in the other direction) merge "ellipsis of" into it? Meh. No matter what we do, we still have to resolve the fact that some people aren't maintaining a distinction between any of these templates, e.g. wildcat is glossed as a clipping of wildcat cartridge when it'd be better to speak of it as either short for or an ellipsis of the longer phrase, since we define clipping as being for removing syllables of a word, not removing words from a phrase. - -sche (discuss) 15:55, 13 August 2022 (UTC)[reply]

What about 1600 Penn? The word "Pennsylvania" is clipped and the word "Avenue" is ellipsed, so both apply. On the other hand, the term BZ reaction doesn't involve either clipping or ellipsis. 24.137.99.97 16:08, 13 August 2022 (UTC)[reply]

Chappy is another example; the -y prevents it from being a clipping in a strict sense, and it has elements of an ellipsis too. All of these entries came from the first page of Category:English short forms and I didn't even have to look hard. 24.137.99.97 16:23, 13 August 2022 (UTC)[reply]

(PS I will get out of this conversation here (out of my intellectual league), but I want to say that if clipping and ellipsis are used to describe the Wake County-Wake relationship, I expect there will be complaints and confusion for as long as that stands as policy. I personally don't care because I see it as something that will eventually be fixed, so if I want to make a "short for" entry, I will use ellipsis or clipping to get to my general goal/intent for an entry. It's one of those "don't let perfection be enemy of the good enough" things for me. I reiterate that I support whatever conclusions are reached.) Geographyinitiative (talk) 16:32, 13 August 2022 (UTC)[reply]

Definitely redundant, irrespective of the snark from DonnanZ. Theknightwho (talk) 17:15, 13 August 2022 (UTC)[reply]

It's not redundant as long as there are terms that don't neatly fit into "clipping" or "ellipsis", as illustrated above. 24.137.99.97 18:07, 13 August 2022 (UTC)[reply]

A word can certainly be both, but I’m not sure I agree that Chappy isn’t a true clipping. The change of spelling isn’t hugely relevant. There are other terms which “short for” covers, too, such as contractions. There’s a reason we group all of these under shortenings (see Category:English shortenings). The category Category:English short forms is currently completely useless, though, as it’s a random assortment of terms that may or may not be easily categorisable into other categories, and if they aren’t, they should really just go straight into “shortenings”. There’s certainly no need for the subcategorisation, though. Theknightwho (talk) 19:14, 13 August 2022 (UTC)[reply]

What what what? Wiki is short for Wikipedia but is neither a clipping nor an ellipsis. Bureaucracy at its best. Why make things complicated? Just describe it as a clipping. 109.40.242.91 21:28, 16 August 2022 (UTC)[reply]

Straw poll: quotation marks around the linked term in reference templates

Some reference templates look like this:

term in Source identification

Other reference templates look like this:

"term" in Source identification

There are also other appearances, but this is about the first one vs. the second one; it is not about whether the quotes are plain or curly or what kind of quotes they are since there are also other kinds.

We could unify the appearance if we find consensus for unification, hence this straw poll. Having a unified appearance is better if it can be achieved. I think that even a 60% majority, or 50%+ at large turnout, should be decisive; to achieve that, if there is going to be no 2/3 supermajority but only a plain majority, a subsequent vote could confirm the winner if voters agree that a plain majority is good enough for what is a matter of taste, not a matter of accuracy. There has been some back-and-forth concerning quotation marks, and this would be resolved. The subsequent vote, which I have called an amplification vote, seems like a lot of bureaucracy, but it would be the price to pay for the lack of a policy setting a lower majority threshold for polls on cosmetic issues. Or there could be no subsequent vote and editors could feel free to make switches even if supported only by a plain majority in a poll. Either way, it is preferable to solve the issue if possible, as it has probably been outstanding for over a decade, and where there is a will there is a way, even if it takes a bit of bureaucracy.

The supporters of quotation marks say they should be there to mark the use-mention distinction. The opposers say that the term is already doubly typographically marked (blue color and link icon) and that adding a third mark is just more visual noise; the use-mention distinction does not present any significant barrier to anything: we place terms in lists of terms such as derived terms without quotation marks, yet these are mentions, not uses; the boldface headword term is without quotation marks, yet this is a mention, not a use; etc. Having no quotation marks obviates the choice of quotation mark style: there was some disagreement on which style to use on Ancient Greek reference templates, from what I remember. Other reasoning will probably appear in the poll; this is a poll and a request for comments.

--Dan Polansky (talk) 18:29, 14 August 2022 (UTC)[reply]

Support term without quotation marks

Support Though I'll be ready to switch if this is too split. Either one beats both. brittletheories (talk) 08:18, 18 August 2022 (UTC)[reply]
Support Per the intro at the top of this poll; I hate visual noise. I admit that there may be reference templates where the target term is not linked, but these should be a tiny minority and they do not serve well as further reading. I give in to 55% majority and perhaps even less given sufficient turnout. --Dan Polansky (talk) 15:14, 18 August 2022 (UTC)[reply]
Support The link is already colored so I don't think the quotes add any additional information or value, but I admit that I had never noticed or cared that they existed until you brought it up here. JeffDoozan (talk) 19:22, 18 August 2022 (UTC)[reply]

Support term with quotation marks

You can use "# {{support}}".

Support Consistency with unlinked terms and {{cite-book}}, etc. J3133 (talk) 08:30, 18 August 2022 (UTC)[reply]
Yeah, I weakly prefer/support this. In this particular situation it doesn't seem to matter much whether there are or aren't quotation marks, although they do help a little bit to set the term apart, but having quotation marks here allows for consistency with other situations where there's more need to set the term apart in some way, like when it's not set apart by being a link, and/or when it comes at the end of a ===Further reading=== that someone has formatted with the word at the end (Example Dictionary, page 3, "word"). - -sche (discuss) 09:35, 18 August 2022 (UTC)[reply]
The location of the word is another thing: it should be the first item, IMHO. This used to be our practice nearly everywhere, but unfortunately, someone started to push for putting the word in the middle, hence the current partial inconsistency. But this is only about the templates that have the form shown at the top of the poll, where the word is the first item. --Dan Polansky (talk) 15:19, 18 August 2022 (UTC)[reply]
Support This seems most appropriate, as we're mentioning the term and not using it. —Justin (koavf)❤T☮C☺M☯ 16:08, 18 August 2022 (UTC)[reply]
As I said, we're mentioning terms in list items such as derived terms as well, and there is no issue with lacking quotation marks. The double typography (bluelink, icon) already does all the marking job more than sufficiently. --Dan Polansky (talk) 16:14, 18 August 2022 (UTC)[reply]
But that marks something different: that it's a link. They are just completely different things semantically. I'd be in favor of including more instances of quotation marks, but I don't see these as inconsistent. —Justin (koavf)❤T☮C☺M☯ 16:38, 18 August 2022 (UTC)[reply]
My main point is that it is not always true that mentions are or should be in quotation marks. The reader stands no chance to think it is a use of the term so there is no actual function of the quotation marks, merely a received dogma from those who say that mentions should always be in quotation marks. --Dan Polansky (talk) 16:47, 18 August 2022 (UTC)[reply]
As I just wrote above you, I don't see an inconsistency here with some mentions having them and some not. If they're used in running text or in a phrase, that is different to me than a list (e.g.). —Justin (koavf)❤T☮C☺M☯ 16:54, 18 August 2022 (UTC)[reply]
I am not raising inconsistency; I am raising lack of compulsion to use quotation marks. But it is a matter of taste, and if you actually like them, they are fine. My point is that the use-mention distinction has no force on our use of quotation marks there; they are optional. --Dan Polansky (talk) 17:15, 18 August 2022 (UTC)[reply]
Support Thadh (talk) 17:48, 18 August 2022 (UTC)[reply]
Support, per reasons already given. AG202 (talk) 20:49, 18 August 2022 (UTC)[reply]
Support, but only if we'll be able to choose what kind of quotation marks we want depending on the language. English quotation marks on Japanese entries would be a no for me (i.e., not this: "言わす"). They would have to be Japanese quotation marks (these: 「言わす」). At the moment we have English quotation marks on Sumerian Cuneiform entries, for example, and they just don't look right (see for instance 𒂼𒅈𒄄). I'd like to consider other options for that too.
Support; looks most natural to me. The issue that the unsigned opinion just above me brings up (customizing the kind of quotation marks per language) is not hard to solve; you just need a table somewhere (e.g. in Module:languages/extradata2 etc.) that specifies the per-language quotation marks. Benwing2 (talk) 06:44, 24 August 2022 (UTC)[reply]
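
The per-language table suggested in the comment above could look roughly like the following. This is only a minimal sketch, written in Python for illustration rather than as an actual Lua data module; the names `QUOTE_MARKS`, `DEFAULT_MARKS` and `quote_term`, and the specific mark choices per language code, are assumptions and not existing Wiktionary data.

```python
# Hypothetical per-language quotation-mark table. The language codes and the
# marks assigned to them are illustrative assumptions, not Wiktionary data.
QUOTE_MARKS = {
    "en": ("\u201c", "\u201d"),  # curly double quotes
    "ja": ("\u300c", "\u300d"),  # Japanese corner brackets
}

# Fallback used for any language code without its own entry (an assumption).
DEFAULT_MARKS = ("\u201c", "\u201d")


def quote_term(term: str, lang_code: str) -> str:
    """Wrap a term in the quotation marks configured for the language code."""
    opening, closing = QUOTE_MARKS.get(lang_code, DEFAULT_MARKS)
    return f"{opening}{term}{closing}"
```

A reference template would then only need to pass the language code along with the term, and languages without an entry would fall back to the default pair.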

Delay of the 2022 Wikimedia Foundation Board of Trustees election

You can find this message translated into additional languages on Meta-wiki.

Hi all,

I am reaching out to you today with an update about the timing of the voting for the Board of Trustees election.

As many of you are already aware, this year we are offering an Election Compass to help voters identify the alignment of candidates on some key topics. Several candidates requested an extension of the character limitation on their responses expanding on their positions, and the Elections Committee felt their reasoning was consistent with the goals of a fair and equitable election process.

To ensure that the longer statements can be translated in time for the election, the Elections Committee and Board Selection Task Force decided to delay the opening of the Board of Trustees election by one week - a time proposed as ideal by staff working to support the election.

Although it is not expected that everyone will want to use the Election Compass to inform their voting decision, the Elections Committee felt it was more appropriate to open the voting period with essential translations for community members across languages to use if they wish to make this important decision.

The voting will open on August 23 at 00:00 UTC and close on September 6 at 23:59 UTC.

Best regards,

Matanya, on behalf of the Elections Committee

Mervat (WMF) (talk) 20:56, 15 August 2022 (UTC)[reply]

Wrong entry

Hi, I made a mistake due to not being used to the new editor UI. Can someone please remove Speciaal:Zoeken? TIA H. (talk) 08:17, 16 August 2022 (UTC)[reply]

In the future you can mark it with {{d}} or have it moved. Vininn126 (talk) 08:39, 16 August 2022 (UTC)[reply]

Certain Labels as LDL's

One thing I have been pondering about lately is treating certain varieties of given languages as LDL's. This would include dialects of Polish, which often have very few works written on them, let alone quotes, but these varieties are very much real. This would also affect Middle Polish, which is categorized as a variant of Modern Polish with a label, instead of as a separate L2, as there would be a lot of repetition. However, documents from that era are not abundant.

Having said that, I wonder what the ramifications would be of such a change. We'd obviously have to be careful which variants we apply this to, if we decide that such a thing would be beneficial. Vininn126 (talk) 09:53, 16 August 2022 (UTC)[reply]

The same is true for Armenian dialects, the majority of which have now disappeared with only one or two works describing them. I think we should blanket-classify all dialects of all languages as LDLs or add a note to WT:WDL saying it concerns only the standard written variety. Non-dialects like Middle Polish can be handled case by case. Vahag (talk) 10:35, 16 August 2022 (UTC)[reply]

I've suggested this here. P U C – 19:17, 16 August 2022 (UTC)[reply]

I’m keen on this, yes. There’s no reason some regional English dialectal term should be held to a higher standard simply because it’s part of a large language, even though the region it’s used in might be less than a million people, for example. There’s also the simple fact that dialects are often much harder to attest anyway - especially historically. Theknightwho (talk) 19:29, 16 August 2022 (UTC)[reply]

Strong support on my end. The current policy makes it a lot harder to cite usages of minority lects that may only appear in dictionaries or in mentions, ex: Louisiana French (not Louisiana Creole which is another thing). AG202 (talk) 21:19, 16 August 2022 (UTC)[reply]

I feel the same way about English seamstresses', lumbermen's, cobblers', DARPA researchers', etc. vocabularies. Let them all be LDLs. There are probably more users of each such vocabulary than there are of, say, Torres Strait Creole. DCDuring (talk) 21:44, 16 August 2022 (UTC)[reply]

Hm? What does Torres Strait Creole have to do with this, and wouldn't having more users give them less of a rationale for being an LDL (not that they necessarily shouldn't)? AG202 (talk) 01:10, 17 August 2022 (UTC)[reply]

I am pretty sure that there are more users of lumbermen's terms than there are speakers of some things we call languages (e.g. Torres Strait Creole), let alone dialects. Lumbermen don't normally get published in newspapers, books, or scholarly journals; nor do they much use UseNet or Twitter. Thus our attestation criteria are certainly biased against including their vocabulary, just as they are against LDL vocabulary. I would like to see more effort to include such terms using the same criteria that we use for LDLs: minimal attestation, inclusion in specialized glossaries, etc. DCDuring (talk) 01:21, 17 August 2022 (UTC)[reply]

I think that's a good idea. There are small-but-stable communities with longstanding terminology that is often very difficult to attest in a durably archived way. Theknightwho (talk) 13:48, 17 August 2022 (UTC)[reply]

While I'm sympathetic to this idea — it's undesirable that when deciding whether to merge a dialect into its parent language, we don't just have to consider linguistic factors but also the fact that it might mean deleting most of our entries by suddenly requiring them to have three uses instead of one — the obvious danger is that this would, especially with DCDuring's suggestion of allowing jargons, just remove the WDL three-cite standard entirely. Almost any word we can find one cite of can be claimed to be a dialectalism of whatever region its author is from, or whatever jargon the work's topic is associated with. And how would you prove that a term isn't a dialectalism / jargon word but instead a rare but non-dialect-specific word, if there's only one cite of it? - -sche (discuss) 04:51, 17 August 2022 (UTC)[reply]

This would imply one citation be enough. The three citations rule, however, is needed to avoid trashing the dictionary with idiosyncratic coinages. The crux is that the rules demand sources to be reliable and durably archived. An unfortunate consequence would be arguments about the reliability of certain citations, for example when there is no standard body orthography to follow or when the meaning is unclear. One citation at foreslay seems to be express nonsense (WT:TR), no less from a professor with a penchant for poetry; the word could refer to sley (of reed), meaning that the spelling can make a difference. If credentials alone don't make for a reliable citation, it's usually the editor who needs to be reliable at the least. This is in line with the premise that LDLs are decided case by case, by reliable editors.

So, there can be no catch-all rule to cover the Middle Polish issue. It does have the advantage of being a relic, but the current method implies that the words may still be found in elevated registers. 109.40.240.91 05:31, 17 August 2022 (UTC)[reply]

I believe any quotations would have to follow our current CFI - as in being an obvious representative of that definition. And 109 brings up a good point. We already have to use our judgement with WDL's and even then we can't always agree. Vininn126 (talk) 08:59, 17 August 2022 (UTC)[reply]

Although I am very much a proponent of making dialects citable, I must agree with @-sche: There is no objective way to determine what is a language and what isn't, nor is there any way to determine where a historical variety stops and the modern variety starts. As such, I propose the following:

Any print cite before 1800 (or any other similar cut-off date, but one true for all languages) would be subject to the same requirements as extinct languages.
Dialects/Jargons will need to be defined per-language using scientific research as a back-up and forming a well-defined dialect tree beforehand. The dialects will then need one use, but that use will need to be either in a text completely written in that dialect, or mentioned in a scientific work that specifically addresses this dialect, after the work has been accepted as authoritative by the language's community.

This way, we are conscious of the sources we use for the languages, and also don't dabble too much in OR. Thadh (talk) 09:37, 17 August 2022 (UTC)[reply]

I think this is (somewhat unfortunately) the smarter approach. I am not a fan of the blanket approach. I think within each language it should be determined based on what (little) research there is, using said research as a backup for our claims on dialectology. I am unsure about using 1800 as a cutoff, but given the history of the printing press there might be a certain logic to that. I still think it might be easier based on linguistic historical divisions as opposed to a date (i.e. Middle x, archaic/obsolete x, etc). Vininn126 (talk) 09:42, 17 August 2022 (UTC)[reply]

Our CFI for LDLs actually stipulate that “the community of editors for that language should maintain a list of materials deemed appropriate as the only sources for entries based on a single mention”. Apparently this is not actually applied much that I know of, but applying this rule could solve many of the issues raised above. — This unsigned comment was added by MuDavid (talk • contribs) at 09:47, 17 August 2022 (UTC).[reply]

Well, without such a list we need two mentions, which does help. It's unfortunate that they don't need to be independent, nor in any sense reliable. However, LDLs only need one use, and the applied quality issue there is of durability. I don't know how we challenge for reliability. Formally, there is no list of acceptable sources for use, and there is always a risk of spelling mistakes (cerebellar interferences?), and typographical errors if typeset. --RichardW57m (talk) 12:58, 17 August 2022 (UTC)[reply]

I could maybe see allowing dialectalisms with one mention in a reference work (to establish that it is particular to that dialect) and one use (to establish that it is/was actually in use and correctly transcribed by the mention). A big danger with allowing just one use is that even if a work is completely written in X dialect, it's impossible to tell whether its hapaxes are nonces by the author (which we voted long ago to stop allowing even when the author was well-known) or actual dialectal terms, unless we also have a mention in a reference work about the dialect assessing it as a dialectal term. In turn, a danger with allowing just one mention (with no use) to attest a term is that there are a lot of 1700s and 1800s works documenting ostensibly-then-current but now-unused English dialectal terms which are otherwise completely unattestable, so we have no way of knowing whether one was correct in transcribing the word or assigning it to a particular dialect, but I have occasionally found cases where multiple such works mention a word (and assign it to some dialect) in sufficiently different forms that it suggests one of them is misspelling it, or where there seems to be some other kind of error (and we've seen at RFV that dictionaries sometimes copy erroneous ghost senses or words from each other); if we let words in with one mention and no use, we have no way to discern such errors or weed them out. (Distinguishing authorial or work-specific idiosyncratic definitions — or outright errors, think how many (grammar)-jargon reference works define a proper noun as a noun that's capitalized — from genuine use also seems hard if we're talking about jargon terms.) - -sche (discuss) 16:53, 17 August 2022 (UTC)[reply]

I'd be more sympathetic to allowing just-one-use dialectalisms in Polish or Armenian, if that's what the community of Polish or Armenian language editors think is best there, than in e.g. English, or German (where the "dialects" are often languages which have their own codes already), because of the issues outlined above, so maybe this should be a language-by-language thing. (I agree with Thadh we'd need to make sure the dialects are recognized dialects; consider the ongoing discussion about how our "Westrobothnian" entries aren't a thing but just a few hobbyists' invention.) - -sche (discuss) 17:03, 17 August 2022 (UTC)[reply]

I don't think you can generalise by language like that. I'm certain that a fair proportion of users here with a very high-level of English proficiency would not be able to understand a Geordie speaker, for instance, and I include native speakers in that. Doesn't mean I think it should have its own language code, though. Theknightwho (talk) 17:34, 17 August 2022 (UTC)[reply]

So the working proposal as it stands is to modify the CFI in a way such that a specific community of editors within a language can come to a consensus whether certain lects within it should be LDL or WDL? Vininn126 (talk) 18:27, 17 August 2022 (UTC)[reply]

Only problem here is that looking at WT:WDL, a lot of these languages do not have active communities at all. AG202 (talk) 18:33, 17 August 2022 (UTC)[reply]

And also like a lot of potentially big changes to something decided by a smaller group. Vininn126 (talk) 18:45, 17 August 2022 (UTC)[reply]

Breathing life into this thread again.

One possible proposal: have a vote proposing to change WT:Attest to have a clause saying "a community of editors may choose to set a variety of a language as an LDL and provide a list of potential resources on the appropriate About Page by coming to a consensus (on said about page)."

Proposal 2: Just set all non-standardized/non-official varieties as LDL's. I still think providing a list of resources here would be a good idea. I prefer option 1. Vininn126 (talk) 08:06, 22 August 2022 (UTC)[reply]

Transcription of foreign names in Chinese compounds/derived terms

In the derived terms section of 特, 克, 爾, 蘭, etc. there are many transcriptions of foreign names, which are usually transcribed based on some rules, such as the one documented at w:Transcription_into_Chinese_characters#Transcription_table. Often these names only take the phonetic value of the characters, and contribute very little, if anything, to the entry. They end up cluttering the section and hiding the genuine derived terms. What should we do with them? (Removing them? Use a category instead? A separate table? Keeping them?)

A few details on which I would like to see further input:

How should we determine which words are to be removed/changed? Some of the names also convey meaning since they are PSM, and somehow probably should be included. (e.g. 碧仙桃, 翡冷翠, also 首爾 is a partial PSM)
Should the scope of this change include transcriptions that predate the standardised method, e.g. names of some European countries (英吉利, 法蘭西), short forms (英國, 法國), or names from the surrounding non-Sinosphere areas (天竺, 暹羅, etc.)?
What about compounds of these? (e.g. 凡爾賽條約, 荷蘭水)

(Note: this problem was briefly discussed on Discord, and we generally agree that something should be done about the situation, but we're not sure about the exact details, and it's perhaps better to have a further in-depth discussion here, where more people can participate)

(Notifying Atitarev, Tooironic, Fish bowl, Justinrleung, Mar vin kaiser, RcAlex36, The dog2, Frigoris, 沈澄心, 恨国党非蠢即坏, Michael Ly): -- Wpi31 (talk) 13:19, 16 August 2022 (UTC)[reply]

@Wpi31: I think a good alternative would be making a separate sub-entry for it under a different etymology maybe? With the definition of "used as transliteration for foreign languages", then placing all the compounds under that. --Mar vin kaiser (talk) 13:37, 16 August 2022 (UTC)[reply]

@Wpi31: I was also thinking of something similar to Mar vin kaiser. PSMs would be left in the relevant section rather than the transliteration section. — justin(r)leung { (t...) | c=› } 14:05, 16 August 2022 (UTC)[reply]

Yeah, I think separate tables could be used for the different senses. So the ones that are derived from transcriptions of foreign names can be put in a separate table. The dog2 (talk) 14:20, 16 August 2022 (UTC)[reply]

I have proposed creating a category for these “Common characters used in transliteration” in the Discord server; perhaps that could be implemented in place of what currently exists? Though of course there would be lots of more specific things to iron out first — 義順 (talk) 14:23, 16 August 2022 (UTC)[reply]

@ND381: While I think that would be a good idea, I don't think it would be a full replacement for the current practice because we would lose the information of which compounds use those characters. I think there could be a template that could be used to generate a definition similar to the one that Mar vin kaiser suggested and at the same time categorize the character into the category you proposed. — justin(r)leung { (t...) | c=› } 14:50, 16 August 2022 (UTC)[reply]

@justinrleung, ND381, Mar vin kaiser I have made a mockup of this idea here based on 托. The code for template will be {{n-g|Used in transcription of foreign names.}}[[Category:Chinese characters used in transcription of foreign names]], which will be created later (but I'm not sure what the template should be called, so please do make suggestions).

I am wondering whether we want to keep words such as 托辣斯 under the first etymology, since these words are not proper nouns, but the use of 托 in these words does not carry its original meaning, only its sound. (or edge cases like 烏托邦 which are phono-semantic matching)

What do you think of this solution? (Personally, I don't feel like duplicating the pronunciation table would be something ideal) --Wpi31 (talk) 13:52, 18 August 2022 (UTC)[reply]
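
The template body described in the mockup above (a non-gloss definition followed by a category link) can be sketched as a small function that assembles the wikitext. This is written in Python purely for illustration; the function name `transcription_definition` is hypothetical, and the category name is left as a parameter because it had not yet been agreed on.

```python
def transcription_definition(
    category: str = "Chinese characters used in transcription of foreign names",
) -> str:
    """Build the wikitext quoted in the mockup: an {{n-g}} line plus a category."""
    # The non-gloss definition text is copied from the proposal above.
    gloss = "{{n-g|Used in transcription of foreign names.}}"
    # The category link is appended directly after the definition.
    return f"{gloss}[[Category:{category}]]"
```

Keeping the category as an argument means the sketch stays valid whichever category name is eventually chosen.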

@Wpi31: Actually, I just remembered that we already have a template {{zh-used2|t}} which would work. I don't think the compounds list needs to only include proper nouns. As for duplicating the pronunciation table, I think not all the readings in etymology 1 would be used (thuh in Hokkien is unlikely to be used, for example), so it might be useful to have separate pronunciation tables. — justin(r)leung { (t...) | c=› } 15:06, 18 August 2022 (UTC)[reply]

Interesting find on that template, I have updated the mockup accordingly, and also made some other changes. But would the template have categorization, as suggested above? (or should we add that to the template now?)

Also I suppose a good rule of thumb of what to move to the new tables would be whether the compound word only uses the character's pronunciation or not?-- Wpi31 (talk) 02:31, 19 August 2022 (UTC)[reply]

@justinrleung since there isn't any further response on this matter, could I therefore assume that some sort of consensus was reached, and hence start implementing the changes (the ones based on your suggestions)? Wpi31 (talk) 15:19, 26 August 2022 (UTC)[reply]

@Wpi31: I think so. The categorization isn't part of {{zh-used2}} yet, but it shouldn't be hard to add on to the template. I guess we would just need to agree on what to call the category. Probably CAT:Chinese characters used in transcription of foreign words, which would be a category under CAT:Chinese terms by usage? — justin(r)leung _{{ (t...) | c=› }} 18:32, 26 August 2022 (UTC)[reply]

It would be good for any implementation to have an optional qualifier of some kind, as there have historically been various systems (for e.g. Manchu etc.). Theknightwho (talk) 19:32, 26 August 2022 (UTC)[reply]

@theknightwho: That's probably a good idea, since there have been several systems for transcriptions, and it makes sense to distinguish them. Most of the ones in scope are modern transcriptions based on the Xinhua system (which should probably be subdivided by the original language), meanwhile there are also equivalents used in Taiwan and other variants. Further back in the 19/20th centuries there are ones orthographically borrowed from Japan and ones that are based on the phonology of Cantonese and Shanghainese. Then there are the historical transcription systems for Sanskrit, Tibetan, Mongolian, Khitan, Manchu/Jurchen, Thai, etc., some of which are still used in modern times…… Sometimes each dynasty would have a different system, but there aren't too many words from these languages to begin with, plus it is difficult and time-consuming to tell which exact system the word falls into, so I think it is sufficient to only divide them by language.

@justinrleung: For the categories, I agree on CAT:Chinese characters used in transcription of foreign words. Then the categorisation for the various systems (if they were ever implemented) would be subcategories of that, say CAT:Chinese characters used in transcription of English words for example, and maybe with CAT:Chinese characters used in modern transcription of English words and CAT:Chinese characters used in historical transcription of English words as subsubcategories, where the cutoff point between the two would be same as the switch from Classical Chinese to Modern Standard Chinese. For minor languages without a specific transcription method, I don't think there is the need to put them into subcategories, unless the transcription follows the phonology of other languages (most of the time English) instead. For modern transcriptions, I don't think we need to make further subcategories, perhaps with the exception of the Xinhua ones. Wpi31 (talk) 03:05, 27 August 2022 (UTC)[reply]
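The category scheme described above can be sketched in a few lines of Python. This is only an illustration of the proposed naming, not an existing module: the function name `category_chain` and its parameters are hypothetical, and the category names are the ones suggested in this thread, none of which had been created at the time of writing.

```python
# Hypothetical helper: builds the chain of category names proposed in this
# thread, from the root category down to the most specific subcategory.
ROOT = "Chinese characters used in transcription of foreign words"
PERIODS = ("modern", "historical")


def category_chain(language=None, period=None):
    """Return the category names from the root down to the most specific one."""
    chain = [ROOT]
    if language is None:
        # Minor languages without a specific transcription method stay in the root.
        return chain
    chain.append(f"Chinese characters used in transcription of {language} words")
    if period is not None:
        if period not in PERIODS:
            raise ValueError(f"period must be one of {PERIODS}")
        chain.append(
            f"Chinese characters used in {period} transcription of {language} words"
        )
    return chain
```

For example, `category_chain("English", "modern")` yields the root category, the English subcategory, and the modern-English sub-subcategory, in that order.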

Special characters tool spacing

The new spacing in the special characters tool is really annoying, IMO. Is there a discussion on this? Skiulinamo (talk) 21:21, 18 August 2022 (UTC)[reply]

Wiktionary:Grease pit/2022/August#The_combining-diacritic_section_of_the_IPA/enPR_character-input_menu_is_borked. I agree it would be good if it were possible to find a functional middle ground between "excessively spaced" and "unclickably zalgo'd on top of each other". - -sche (discuss) 21:58, 18 August 2022 (UTC)[reply]

Meaning of underlined formatting in Wiktionary Thesaurus entries

Can someone please remind me what is the meaning of underlined formatting in Wiktionary Thesaurus entries? No doubt it is already explained somewhere that my searches are just failing to uncover. In a list of synonyms, for example, a few of them will be lightly underlined with a thin underline. Quercus solaris (talk) 17:36, 19 August 2022 (UTC)[reply]

Do you mean such as the example of "beer" here? In that case, it's because there is a tooltip, so you can hover your cursor over it to get additional information. —Justin (koavf)❤T☮C☺M☯ 17:41, 19 August 2022 (UTC)[reply]

Aha, that was it, thanks. I did not realize it because the popups from the widget preferences are so prominent whereas the other kind of popups are quite small. Quercus solaris (talk) 20:32, 19 August 2022 (UTC)[reply]

Political and non-durably archived online blogs in Further Reading section

[2] How do others feel about this? WT:EL#Further reading clearly states: "This section may be used to link to external dictionaries and encyclopedias, (for example, Wikipedia, or 1911 Encyclopædia Britannica) which may be available online or in print." — Fytcha〈 T | L | C 〉 18:34, 19 August 2022 (UTC)[reply]

Can you clarify what you mean by this being "political"? —Justin (koavf)❤T☮C☺M☯ 19:08, 19 August 2022 (UTC)[reply]

Often an academic blog will contain some etymological assay, and the format of a blog is neutral towards the quality of its content. The linked entry layout section hardly allows one to deduce a rule about a type of publication being disallowed. Neither is this “Language Log” political in its general direction, though often too tight to gain agreement, nor in the specific post—but it is enough to link it at conspiracy and it has little use at conspiracy theory. Fay Freak (talk) 19:44, 19 August 2022 (UTC)[reply]

Reading that section strictly, it seems that that link would be better served in the References section rather than Further Reading. AG202 (talk) 20:16, 19 August 2022 (UTC)[reply]

It's not clear to me what is "political" about the blog (just the fact that users of the word it discusses include politicians?); Breffni O'Rourke is a linguist, Mark Liberman is a linguist, and Language Log is a well-known language blog.
From my perspective, the main reason not to include this link in further reading is not politics or durability, it's that it doesn't add anything as just a link at the end of the entry, with no explanation of why it's included, to a short blog post that does little more than restate what sense 6 of conspiracy already says. Perhaps we should try, perhaps I will try, to write some simple usage notes about the scope of conspiracy and conspiracy theory and how some speakers confuse them, using Language Log as one ref for that, along with the fact that citations of such use exist and are under the relevant sense, and along with other refs for other aspects, e.g. this article pointing out that "the phrase is regularly used to describe fringe views that do not involve alleged conspiracies [while simultaneously] people tend to avoid using the phrase when describing conspiracy claims embraced by the mainstream, even when those ideas are highly dubious [like] popular narratives about terrorism". - -sche (discuss) 20:34, 19 August 2022 (UTC)[reply]

Adding it as a reference is probably stronger and has more utility, but until/unless it's added as an explicit reference for a certain claim, I think it's hi-value and useful to have this as a link. Language Log also has good comments as well. —Justin (koavf)❤T☮C☺M☯ 21:15, 19 August 2022 (UTC)[reply]

Almost any academic source discussing something that touches on politics will reflect the views of the authors and participants, which usually reflects their class(?) interests. But such sources can have useful content, as this one seems to, notwithstanding the customary gratuitous solidarity-reinforcing expressions of hostility toward favorite targets. DCDuring (talk) 23:56, 19 August 2022 (UTC)[reply]

On second thought, I was probably jumping the gun when I called it political. Still, it should definitely not be included in the Further Reading section (that's just not what we generally do in that section; the policy page also clearly states that). It can however be changed to a reference if it is used to support a specific claim within the article (which doesn't appear to be the case for conspiracy theory though). See also WT:EL#References. — Fytcha〈 T | L | C 〉 13:12, 21 August 2022 (UTC)[reply]

WT:EL does not say what cannot be included in this section, it just gives examples of what can be. —Justin (koavf)❤T☮C☺M☯ 18:34, 21 August 2022 (UTC)[reply]

Hyphenated Forms

For all Wiktionary's vaunted notions of descriptivism, I can play a child's video game and encounter alternative forms that I can easily predict Wiktionary will not have an entry for; Internet Archive will have three cites for these alternative forms too (see my edits on the History on these pages). You can do it too- any time you notice a word with a hyphen in your daily life, it's a fifty-fifty chance Wiktionary will not have it.
antiindependence anti-independence
autocorrection auto-correction
electrochemically electro-chemically
extrajudicial extra-judicial
extravehicular extra-vehicular
halfway half-way
hypergrowth hyper-growth
microprocessor micro-processor
recalibrate re-calibrate
sociotechnological socio-technological
suborbital sub-orbital
superhot super-hot
rearm re-arm --Geographyinitiative (talk) 14:11, 20 August 2022 (UTC)[reply]

Well, go ahead and add them as alt forms. Personally I'm more interested in adding words we lack entirely. Note that, over time, English has tended to drop the hyphen from a lot of words (e.g. type-writer and good-bye are now rare forms). Equinox ◑ 17:39, 20 August 2022 (UTC)[reply]

(Relatedly I think it's stupid that we delete hyphenated attributive forms but allow hyphenated non-attributive forms. Why not have soft or at least hard redirects for them all?) - -sche (discuss) 02:25, 21 August 2022 (UTC)[reply]

As far as I'm concerned, the attributive forms should've never been deleted. They're formulaic, sure, but so are plurals and we still include those. Binarystep (talk) 19:26, 21 August 2022 (UTC)[reply]

Some of the discussion above indicates that alternative spellings of words might sometimes be noticed but passed over. This seems like a glaring flaw in Wiktionary: you wouldn't want to omit some forms of words in a bona fide descriptivist dictionary enterprise- all forms (with qualifying cites) would be included. I propose that a script or mechanical method for detecting words that may have alternative forms with hyphens in them be created, and that the results be dumped onto some page or something. Maybe there are certain categories that have words that are likely to have hyphenated alternative forms. Any accidental systemic bias affecting coverage of alternative forms on Wiktionary needs to be mitigated to some degree! --Geographyinitiative (talk) 15:49, 24 August 2022 (UTC)[reply]
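A minimal sketch of the kind of script proposed above, assuming a plain list of existing English entry titles is available (for instance from a database dump). The prefix list is taken from the examples earlier in this thread; the function name and the `min_stem` cutoff are hypothetical. It only reports solid forms whose hyphenated counterpart has no entry, and every result would still need qualifying cites before an entry is created.

```python
# Hypothetical prefix list, drawn from the examples listed in this thread.
PREFIXES = ("anti", "auto", "electro", "extra", "half", "hyper",
            "micro", "re", "socio", "sub", "super")


def hyphen_candidates(titles, prefixes=PREFIXES, min_stem=3):
    """Return (solid, hyphenated) pairs where the hyphenated form has no entry."""
    existing = set(titles)
    missing = []
    for title in sorted(existing):
        for prefix in prefixes:
            if not title.startswith(prefix):
                continue
            stem = title[len(prefix):]
            # Skip titles that are already hyphenated and very short stems,
            # which mostly produce noise (e.g. "red" -> "re-d").
            if stem.startswith("-") or len(stem) < min_stem:
                continue
            candidate = f"{prefix}-{stem}"
            if candidate not in existing:
                missing.append((title, candidate))
    return missing


if __name__ == "__main__":
    sample = ["suborbital", "sub-orbital", "rearm", "superhot", "red"]
    for solid, hyphenated in hyphen_candidates(sample):
        print(f"{solid}: no entry for {hyphenated}")
```

A purely string-based check like this over-generates (for example, it would propose "re-ally" from "really"), so the output is best treated as a worklist to be checked against citations rather than a list of entries to create.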

Galician Reintegrationism and how Wiktionary deals with it

Wiktionary's appendix on Reintegrationism states that Reintegrationism is treated here in Wiktionary as an alternative spelling, and that's okay. However, I feel like there are a few more considerations to be made that aren't covered by the appendix yet and I think that, in a way, it's not really treated as alternate spellings in practice.

In cases like corazón and coraçom, they work fine as alternates of each other. But then, seeing as Reintegrationist Galician has its own system for conjugations and things like that, I think this 'alternatives' approach sort of breaks down. There are pretty much 0 pages made for Reintegrationist spellings for verbs -- they should most definitely be made and I'm sure we can all agree on that, especially with Wiktionary's "vaunted notions of descriptivism" as someone else put it, but then, how do we approach that?

Do we add them as alternates in the main conjugation tables and get things like "amaban/amavam"? But then what about verbs where the infinitives aren't the same, like pasar/passar? The tables would have to be overhauled so they convert the spellings from one orthography to the other, but that sounds like a lot of effort.
Do we make a new conjugation table for it? But then what about verbs where the infinitives do stay the same in both orthographies, like amar? Would cases like that just get two tables? And what would passavam#Galician actually look like? Making an entire table of links that all point to descriptions like "alternative form of XYZ" doesn't seem too great nor very equal at all. This goes for the previous point too; showing two forms and making sure that one is clearly an alternate version of the other form instead of them both being alternatives of each other isn't particularly amazing.
Norwegian has two spellings as well, and they're treated here as if they were separate languages, in a way. In both cases, the orthographies are regulated by associations and their dictionaries. ...But out of the two languages, only one has both of its spellings be 100% official. But then again, in both Norwegian and Galician, one of the spelling variants is only used by a minority and seen as (slightly?) harder to comprehend by the majority... Could this approach work here?

The appendix also says that "usually, the traditional[sic] spelling is the main article". Does that mean it's OK to create the reintegrated spelling page as main article in cases where previously, neither spelling was registered in the Wiktionary?

I'm sure I could have worded this entire post better, but I think all of these are matters worth discussing. How do we want to treat Reintegrationism here? Because despite what the appendix says, there's a really clear difference in treatment between the two variants here at the moment; it's very different from how, say, European and Brazilian Portuguese variants are treated. There aren't even proper tags set in place that talk about whether the reader is looking at a Reintegrationist/AGAL/International spelling or an Isolationist/ILG/RAG one (there are tags like that for EU and BR Portuguese). There's no template set in place for the Estraviz dictionary the way there are some for English dictionaries or for the RAG dictionary. 191.255.100.241 21:14, 20 August 2022 (UTC)[reply]

Do you think that Galician should be treated as a dialect of Portuguese? For what it's worth, I also think that our treatment of Norwegian standards as essentially different languages is off-base as well. —Justin (koavf)❤T☮C☺M☯ 21:23, 20 August 2022 (UTC)[reply]

Wiktionary sees Galician as a separate language, and at a political level, it is one at the moment, I feel. Considering how there's even a Galician Wikipedia/Wiktionary, I think it'd be best (and simpler) to respect that and keep Galician separate from Portuguese here. 191.255.100.241 22:51, 20 August 2022 (UTC)[reply]

With varying orthographies, the first issue is, seeing as though the reintegrationist orthography is the minority usage by a lot, I'd be a bit more concerned about citing it in the first place (though TIL that Galician is not a WT:WDL). As for the conjugation tables, I'd point to the 1990 differences in French orthography and how the different spellings, ex: arguer vs argüer have different conjugation tables (illustrated much better on fr.wikt). I personally wouldn't make them alternative forms of each other as one clearly is official and has more usage from the general population. Dialectal forms are often already considered alternative forms, and there've been conversations recently about the bloat that happens when definitions and data are repeated. The case of Norwegian is truly an outlier; in my honest opinion, there really shouldn't be 3 separate L2 headers for Norwegian, Norwegian Bokmål, and then Norwegian Nynorsk; they really should've just been labels under a Norwegian header, but alas, consensus said otherwise. (Meanwhile, on the complete other end of the spectrum, we have how Chinese is treated...) AG202 (talk) 21:44, 20 August 2022 (UTC)[reply]

It's true that the reintegrationist orthography is the minority one, but it's not a mere dialectal thing nor, from my experience, is it actually that rare when compared to official Galician. It's weird that Galician isn't a WT:WDL (TIL that too!); I wonder if its status as an actual separate language being debatable is a reason for it not being there. Perhaps it's just that nobody's tried to put it there yet? But I digress. Reintegrationism really isn't a dialect as it's not regional at all and despite the disparity in usage between the two forms, it's not like you can count AGAL spelling users on your fingers -- it's a little like Norwegian Nynorsk users; a minority, but not a teeny tiny one. There are several websites utilizing the reintegrated form, some people on social media do it too, and even then, Wiktionary includes misspellings as separate entries. Among the three options I talked about in the original post and the ones you mentioned/hinted at, I'm really undecided as to which one should/could be used -- maybe there's even some other option I didn't think about -- but keeping it as-is or not including it at all should be off the table. 191.255.100.241 22:51, 20 August 2022 (UTC)[reply]

Oh yeah, I'm in support of separate entries, and that's the current practice. I was mainly focusing on whether or not the entries would be full lemmas with definitions and such or if they'd just use {{alternative spelling of}} or something similar, similar to English centre vs. center, Portuguese oxigénio vs. oxigênio, or Korean 여 (yeo) vs. 녀 (nyeo). You're right they're not dialectal forms, though, and there is a difference. AG202 (talk) 01:22, 21 August 2022 (UTC)[reply]

First of all, that appendix is the work of a single user from India who has no background in the language, and has a less-than-stellar edit history. Normally, I would refer you to a page called Wiktionary:About Galician, which would have an explanation of the practices decided on by the community of Galician editors- but there isn't one. This is odd, since we have Galician editors who have been regular, prolific contributors for a very long time. I'll ping a couple of them @Sobreira, Vivaelcelta.

At any rate, we tend to defer to the community of editors for a given language on such matters, except in rare cases where there are larger issues of how to treat lects that some treat as dialects and others treat as separate languages: Serbo-Croatian vs. Serbian/Croatian/Montenegrin, Norwegian vs. Norwegian/Norwegian Bokmål/Norwegian Nynorsk, Albanian vs. Tosk/Gheg, and Malay vs. Malay/Indonesian come to mind. I suspect that no one in a position to have an informed opinion has really given much thought to the issue as far as addressing it on Wiktionary is concerned. Chuck Entz (talk) 23:15, 20 August 2022 (UTC)[reply]

AFAICT, it seems like this can be handled the same way as French reformed vs unreformed spellings or Moldovan Cyrillic vs Latin-script Romanian, where each attested(!) spelling has an entry (with all the others defined as alternative forms, of whatever stripe, of whichever main form) and its own inflection table; we could make a specific template for "reintegrationist spelling of..." like we have "superseded spelling of...", specific form-ofs for Moldovan Cyrillic spellings, etc. For cases where the reintegrationist vs classical spellings of the lemma are the same but the conjugations differ, we could have multiple tables; compare e.g. balneum (there must be better examples of entries with multiple inflection tables but it's hard to find offhand). If the classical(?) spellings are more common than the reintegrationist spellings, they should be the lemmatized spellings (or vice versa). The Norwegian situation, where the least standard terms from dialects which don't fit either Nynorsk or Bokmal orthography are the (only) ones presented as pure ==Norwegian==, is comical but liked by some Norwegian contributors, so whatever. - -sche (discuss) 02:23, 21 August 2022 (UTC)[reply]

I have similar issues with inflections of Pali in both the Thai and Lao scripts, where Thai has two live spelling systems and Lao has many spellings. I've set the inflection template options up for them so that they can generate multiple systems; whether they do so depends on the options in the specific call. For verbs, I can merge the combinations of two stems in the same table; this ability was originally provided for cases where the verb is a free mix of two stems in a particular tense. I think I've overdone the merger of the two systems in the past, and I'm now tending to set up individual inflection tables for the different writing systems.

Now, one could set up individual conjugation tables for non-integrationist and integrationist orthographies, but for Portuguese one could consider merging them in general but having separate tables for the imperfect, where the spellings massively diverge. Footnotes in the table would help for the other cases; see examples for the contracted perfects of Latin verbs like amo. --RichardW57 (talk) 16:48, 21 August 2022 (UTC)[reply]

I will only comment on a portion of the issues:

Norwegian is not a good analogy to use. First, the cases are not parallel (Nynorsk is an official orthography, the reintegrationist orthography isn't; also the issue of reintegrationist vs. traditional Galician orthography is highly political and touches on whether Galician is a separate language from Portuguese, which has no parallel in Norwegian). Secondly, the current state of Norwegian in Wiktionary is (IMO, and I know many people agree) rather stupid, and should not be used as an example of good practice of anything.
In regards to "usually, the traditional[sic] spelling is the main article" implying that it's OK to create the reintegrated spelling as the main article if no article exists, this is definitely not the case, because it would give undue weight to the reintegrated spelling, which (as others have also observed) is very much in the minority.
I personally think the best way to handle the spelling differences is to treat reintegrated spellings as alternative spellings using a specific template that generates reintegrationist spelling of ... or similar, just as User:-sche mentioned above. In cases like amar where both spellings are the same, I think it's fine to have a reintegrationist conjugation following the traditional one, as long as both are collapsed by default, so they don't take up lots of space. But someone would have to create the appropriate conjugation module or templates, which is not an insignificant amount of work. In general, things boil down to the interest of individual editors; the lack of reintegrationist spellings would indicate that there aren't any Galician editors interested in this particular spelling system. Benwing2 (talk) 03:25, 22 August 2022 (UTC)[reply]
Please create an account. There are several knowledgeable IP editors currently, and it's impossible to keep track of who is who if you don't create an account. By creating an account, people will be able to leave you messages, thank you, etc. You'll get more editing privileges and a lot more respect. Benwing2 (talk) 03:25, 22 August 2022 (UTC)[reply]

I've made an account, as asked. Your comments all make sense to me, and I agree with what -sche (as well as you and AG202) proposed. If, so far, no one's shown interest in adding those pages, I'm happy to be the first one to -- though, indeed, templates ("reintegrationist spelling of" and support for Estraviz) and conjugation modules need created, and I'm not skilled or experienced enough to do it myself. I'm obviously new to Wiktionary itself, too: Assuming a consensus has been reached (three users proposed roughly the same thing)... what now? MedK1 (talk) 01:23, 26 August 2022 (UTC)[reply]

@MedK1 Thanks for creating an account. I can probably create the reintegrationist spelling of template but adding a conjugation module is a significant amount of work, and I don't have time for it currently. As for support for Estraviz, what would that look like? Benwing2 (talk) 06:18, 26 August 2022 (UTC)[reply]

I was thinking something like other already-existing dictionary templates such as {{R:DDLG}}, {{R:TILG}}, {{R:TLPGP}} or {{R:Priberam}}; a {{R:Estraviz}} that'd return something along the lines of "“página” in Dicionário Estraviz de galego". MedK1 (talk) 19:57, 26 August 2022 (UTC)[reply]

@MedK1 I created {{gl-reintegrationist spelling of}} with shortcut {{gl-reinteg sp}}, along with {{R:gl:Estraviz}} (it is preferred to include the language name in reference templates; we should rename the existing templates appropriately). Please edit Template:R:gl:Estraviz and fill in the bibliographic info; I looked at the site but I'm not quite sure what to include (e.g. the year). Benwing2 (talk) 19:05, 27 August 2022 (UTC)[reply]

Note also, this puts the spelling into a category Category:Galician reintegrationist forms; this is consistent with how the generic {{spelling of}} template works. Benwing2 (talk) 19:06, 27 August 2022 (UTC)[reply]

@Benwing2: ga is Irish, gl is Galician. Chuck Entz (talk) 19:33, 27 August 2022 (UTC)[reply]

@Chuck Entz Ahh fuck, thank you, I will fix. Benwing2 (talk) 19:52, 27 August 2022 (UTC)[reply]

@Benwing2 Thank you for adding the templates! About the bibliography, I think only adding the year should be fine. I was looking around some templates for references on how to format the year properly when adding it, and noticed that on some of them, the year the dictionary was compiled is written out as a range, like "2013–2022", whereas in other cases, it's written out as a single year... The current version of the Estraviz dictionary was published in 2014, but new features (a verb conjugator to be specific) were added to it as recently as 2019 and I'm confident new words are added every now and then since there's a button to e-mail the people behind the dictionary about it. So with that in mind, I marked the year as "2014", but maybe it should've been "2014–2019" or "2014–2022" instead? MedK1 (talk) 20:26, 27 August 2022 (UTC)[reply]

@MedK1: I think just 2014 is fine; the range is appropriate if the dictionary itself gives a range of years, otherwise we should go with what the dictionary says. Benwing2 (talk) 20:52, 27 August 2022 (UTC)[reply]

Borrowings from Parent and Child

Do we have a policy for borrowings from parent and child when we use {{desctree}} in the parent? Under its 'Descendants' section, do we show both the direct and indirect routes? This is particularly significant for SE Asian languages, where words may have been borrowed from Sanskrit or Pali, but it can be difficult to tell which. Often the word may have been borrowed from both. There's another complication in Thai, where the word can show many signs of having been borrowed from Pali, but then had its spelling Sanskritised. I would expect there to be a policy, for it can be difficult to tell whether some English words were borrowed from French or from Latin. --RichardW57 (talk) 16:58, 21 August 2022 (UTC)[reply]

Borrowings from a parent are listed like any other borrowings, no? That's how e.g. Latin abdōmen lists French abdomen (with bor=1). Descendants sections can list indirect borrowings, like Latin homō → Old Spanish omne → Spanish hombre → French hombre (notice not all of those are present in the wikitext of homō, it fetches some further descendants from the immediate descendants' entries). Where a word's etymology is unclear, I don't know what you could do beyond list it in either neither or both applicable places with some unc=1 qualifier. If you can tell it's from Pali but Sanskritized I think that'd be treated as coming from Pali in the descendants sections/tables, and then you'd explain the Sanskritization in its etymology section; at least, this is what's done for German-spelling-influenced Yiddish borrowings like mensch, fleischig, milchig. - -sche (discuss) 19:14, 21 August 2022 (UTC)[reply]

I wasn't talking about borrowing from a parent. For the first example, English abdomen says the word is borrowed from Middle French, but that lemma hasn't been entered. The Latin entry contradicts the English entry, and says that the English word is borrowed from Middle French. I think that if the borrowing from Middle French had not coincided with the Latin form, the Latin form would have replaced it. Now, if the English word is borrowed from Latin as well as from Middle French, should the English word show up in the entry for the Latin in two words - once as a direct borrowing from Latin, and again as a borrowing from Middle French? --RichardW57 (talk) 20:10, 21 August 2022 (UTC)[reply]

@RichardW57 I think you mean that the Latin entry says the word was borrowed directly from Latin. A lot of pages, however, are sloppy in listing descendants, and list indirect borrowings as if they're direct, so you should trust the Etymology section of the borrowed page. I agree here that if the spelling of the borrowing from Middle French had differed from Latin, the spelling would likely have been Latinized. I don't think we have a specific policy about this sort of thing. This is not specific to SE Asian languages, but happens with all Classical languages with descendants that are also literary languages, and it happens not only w.r.t. a third language but w.r.t. the child itself; tons of Italian terms were inherited from Latin but then had their spelling adjusted to be more Latinate where it had diverged. Benwing2 (talk) 00:02, 22 August 2022 (UTC)[reply]

template editor permission please

Please grant me template editor permission. My contributions can be found here. I typically make small changes, and all are tested on my private fork of Wiktionary that I use for academic purposes. PS. I also happen to have had template editor permission on Wikipedia since 2013. Thanks. Dpleibovitz (talk) 19:12, 21 August 2022 (UTC)[reply]

User:Zomilai

This person has decided to start a Zolai-to-Zolai dictionary on their user page, with the idea being that anyone can add entries and definitions. I'm not sure what language it is, but I believe it's something similar to what we have as Zou (language code zom), a Sino-Tibetan language or group of languages.

First of all, this is completely separate from English Wiktionary's mission to build an English-language dictionary of terms in all languages. It also will probably trigger an abuse filter if any IPs or new accounts try to edit it. I have nothing against promoting little-known local languages, but this seems to be the wrong place and the wrong method.

The question, then, is what do we do about this? Chuck Entz (talk) 21:50, 21 August 2022 (UTC)[reply]

Dude, the guy has just 2 edits! Why bring this up here? Dunderdool (talk) 21:52, 21 August 2022 (UTC)[reply]

I think you're getting a little paranoid actually, Chuck. First, mass-protecting thousands of pages in case anyone wanted to move them, and second, worrying about this... Dunderdool (talk) 21:54, 21 August 2022 (UTC)[reply]

I'm not exactly trying to incite the villagers to storm a castle with torches and pitchforks. I'm asking what we should do. As I said, there's nothing wrong with what they want to accomplish, but this doesn't seem like the right way to do it. Chuck Entz (talk) 22:15, 21 August 2022 (UTC)[reply]

We don't provide free hosting, so a little bit of off-topic or tangential content on your user page is fine, but not volumes of it. The current amount of content is fine, but if what he's trying to do is work in some other language than English, he should go to incubator: and if what he's trying to do is show some differences and similarities between closely-related dialects using English for an explanation, that should be an appendix. —Justin (koavf)❤T☮C☺M☯ 22:26, 21 August 2022 (UTC)[reply]

Copying

All our Kayan/Padaung entries appear to have been copied, definitions and all, from Webonary. The material is under copyright. What to do? This, that and the other (talk) 02:11, 22 August 2022 (UTC)[reply]

@This, that and the other My thoughts:

What is the license for Webonary content? This isn't obvious from reading the copyright link you included. Is it user-submitted content? If so, it might be OK depending on who submitted the content (e.g. if the creator of the content on Webonary is the same person who submitted the Wiktionary content).
Who created the entries? Were they all created by User:咽頭べさ? Whoever it is needs to be aware of the issues in copying from other dictionaries.
Many of the definitions are common English single words, like "princess", "late" or "sound". These surely cannot be copyrighted. For multiword definitions like "reap the consequences of past misdeeds", it's less obvious. From looking at some of the Wiktionary entries, I don't see any usexes or other obviously-copyrightable things.
User:BD2412 is a lawyer who might be able to comment more.

Benwing2 (talk) 03:00, 22 August 2022 (UTC)[reply]

Also, these entries need to be cleaned up somewhat, e.g. the name of the language needs to agree in the header and categories. Benwing2 (talk) 03:01, 22 August 2022 (UTC)[reply]

Webonary asserts a pretty definitive copyright - there is no "license". Everything with more than a scintilla of creative input copied from that source must be deleted here. For the one-word definitions, I would see if there are any other sources that use a different word or words, to ensure that there is no other conceivable most direct definition. bd2412 T 03:04, 22 August 2022 (UTC)[reply]

I did not copy the website dictionary you mentioned; I copied from the free Kayan Dictionary 2016 phone software. So let me ask you a question: User Octahedron80 is not an expert in the Mon language and does not understand Mon vocabulary. User Octahedron80, like me, took the Mon vocabulary from another dictionary, yet you don't say anything about User Octahedron80, so why do you want to blame me alone? If you don't want me to be on this English Wiktionary, speak openly and I will leave. --咽頭べさ (talk) 03:45, 22 August 2022 (UTC)[reply]

@Benwing2:

It seems that Webonary (run by SIL) hosts various dictionaries/glossaries, each with its own copyright statement. It doesn't appear to be user-editable; the credited compilers seem to be a group of native speakers and linguists.
Yes, all created by that user.
For the one-word-definition entries, I was concerned whether some kind of "database right" existed that prevents the copying of bodies of definitions. However, the entry that concerned me the most was ba#Padaung, with a range of info and usexes that has been directly copied from Webonary. (The quality of the entry is also poor - see verb sense 3 and its usex - but that is another matter.) There are a few usexes in other entries too.

@BD2412 I have not been able to identify other dictionaries for this language. Apparently the Catholic Church has published one, but it's very difficult to identify even any libraries that might hold it (and I'm not sure I can be bothered tbh). What's more, the Webonary project seems to use an orthography that is promoted by the Kayan Literature Committee and has not previously been used by other dictionaries or texts.

@咽頭べさ we welcome your contributions to Wiktionary, but it is essential that you respect copyright in making your contributions - this is a fundamental principle of wiki editing. The Android app you refer to is provided by Webonary, so is subject to the same copyright requirements. If other users have breached copyright by directly copying definitions, examples and so on from another dictionary, please provide examples and evidence so we can deal with that problem as well. This, that and the other (talk) 03:50, 22 August 2022 (UTC)[reply]

@This, that and the other, BD2412 Thank you. Looks like User:Chuck Entz already deleted some of the obvious copyright violations. It didn't occur to me what you've mentioned about the orthography, but it seems it could be an issue as well. Benwing2 (talk) 03:59, 22 August 2022 (UTC)[reply]

To start with, I deleted the 8 entries with usage examples, including ba. It looks like all but maybe one or two had the usage examples copied verbatim, but I didn't see English translations of the usage examples on Webonary so those probably were original rather than copied. Chuck Entz (talk) 04:12, 22 August 2022 (UTC)[reply]

I was working on Mon Wiktionary alone, so I edited it in a hurry, so I forgot to add the reference.--咽頭べさ (talk) 04:07, 22 August 2022 (UTC)[reply]

@咽頭べさ adding references to these entries will be a big help. However, please remember not to make direct copies of other dictionaries' definitions (unless they are extremely simple and obvious), usage examples, and so forth. Most of the time, this is a breach of copyright. You can only directly copy another dictionary when that dictionary is very old and out of copyright, or it has been explicitly released by its author under a free license like Creative Commons. This, that and the other (talk) 05:35, 22 August 2022 (UTC)[reply]

[Invitation] Join the Movement Strategy Forum

Hello everyone,

The Movement Strategy Forum (MS Forum) is a multilingual collaborative space for all conversations about Movement Strategy implementation.

We are inviting all Movement participants to collaborate on the MS Forum. The goal of the forum is to build community collaboration, using an inclusive multilingual platform.

The Movement Strategy is a collaborative effort to imagine and build the future of the Wikimedia Movement. Anyone can contribute to the Movement Strategy, from a comment to a full-time project.

Join this forum with your Wikimedia account, engage in conversations, and ask questions in your language.

The Movement Strategy and Governance team (MSG) launched the proposal for the MS Forum in May 2022. There was a 2-month community review period, which ended on 24 July 2022. The community review process included several questions that resulted in interesting conversations. You can read the Community Review Report.

We look forward to seeing you at the MS Forum!

Best regards,

Movement Strategy and Governance team Mervat (WMF) (talk) 15:54, 22 August 2022 (UTC)[reply]

Template editor permission request

I want to edit some protected modules (such as Module:bo-pron and Module:languages/data*). Please grant me template editor permission. Thanks. 沈澄心 ✉ 05:23, 23 August 2022 (UTC)[reply]

The 2022 Board of Trustees election Community Voting period is now open

You can find this message translated into additional languages on Meta-wiki.


Hi everyone,

The Community Voting period for the 2022 Board of Trustees election is now open. Here are some helpful links to get you the information you need to vote:

Try the Election Compass, showing how candidates stand on 15 different topics.
Read the candidate statements and answers to Affiliate questions
Learn more about the skills the Board seeks and how the Analysis Committee found candidates align with those skills

If you are ready to vote, you may go to SecurePoll voting page to vote now. You may vote from August 23 at 00:00 UTC to September 6 at 23:59 UTC. To see about your voter eligibility, please visit the voter eligibility page.

Best,

Movement Strategy and Governance

This message was sent on behalf of the Board Selection Task Force and the Elections Committee
Mervat (WMF) (talk) 12:15, 23 August 2022 (UTC)[reply]

Rename 'LANG words suffixed with SUFFIX' -> 'LANG terms suffixed with SUFFIX'

It's always bothered me that affix categories use "words" in them instead of "terms", as every other similar category does. I propose renaming these categories to have "terms" in them. Not only will this make the category names more consistent, sometimes the terms in these categories aren't even single words. For example, in Category:English words suffixed with -ism, we have Archie Bunkerism, ca' cannyism, Church Slavonicism, etc. which aren't single words, as well as things like Church-of-Englandism that are only arguably single words. If people agree with this, I can do the renaming by bot without too much difficulty. Benwing2 (talk) 06:39, 24 August 2022 (UTC)[reply]

Support. I assume the use of "words" rather than "terms" was to prevent people from adding suffix categories to multiword compounds like animal magnetism, but, as you said, there are plenty of suffixed entries that consist of multiple words despite not being compounds. Binarystep (talk) 08:18, 24 August 2022 (UTC)[reply]
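The proposed rename amounts to a simple title mapping. A minimal sketch of that mapping follows, assuming the category titles follow the pattern given in the heading; the function name `renamed_title` and the regular expression are hypothetical illustrations, not the code of any existing bot, and the affix kinds other than "suffixed" are an assumption about which sibling categories would be covered.

```python
import re
from typing import Optional

# Hypothetical pattern for titles such as
# "Category:English words suffixed with -ism".
# Only "suffixed" is named in the proposal; the other kinds are assumed here.
AFFIX_CATEGORY = re.compile(
    r"^(?P<prefix>Category:.+?) words "
    r"(?P<kind>prefixed|suffixed|infixed|interfixed|circumfixed) "
    r"with (?P<affix>.+)$"
)


def renamed_title(title: str) -> Optional[str]:
    """Return the proposed new title, or None if the title is not an affix category."""
    match = AFFIX_CATEGORY.match(title)
    if match is None:
        return None
    return f"{match['prefix']} terms {match['kind']} with {match['affix']}"


if __name__ == "__main__":
    print(renamed_title("Category:English words suffixed with -ism"))
    print(renamed_title("Category:Sociology"))
```

A real bot run would also need to move the category pages and update whatever templates or modules generate the category names; that part is not sketched here.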

Landing page for Sociology

Hi all - this follows on from a conversation I had on the information desk. I wondered if it was possible to create a landing page (my terminology as I am not sure what the correct Wiki version of it should be called) for Sociology to help navigate users to specific resources found on Wiktionary around the subject.

The main reason I am seeking this is that in Wikidata the Wiktionary entry for sociology itself is not allowed to be linked to the wider Sociology Wikidata project, and similarly related pages, such as Wiktionary's sociology categories, already have their own Wikidata entries, so cannot be linked to help guide users. It would be grand to have a Wiktionary resource that could be used as a central page and linked to the wider Wikidata project for Sociology so that these can be better connected.

There seems to be a broad use of different kinds of naming across the wiki projects to achieve this, including prefixes like "Subject:Sociology", "Portal:Sociology", as well as creating stand alone pages that are just addressed as "Sociology". I think as well having a central page for Sociology also makes sense as dictionaries outside of generalist ones are normally thematically presented, so grouping together links to key entries, categories, and other relevant information for Sociology on this project would be beneficial to reflect the thematic entries for sociologically-related terms. This suggestion may also be beneficial for other areas as well, such as having landing pages for other areas like Physics or Graphic Design, for example, where users can be signposted to relevant resources.

It would be great to see what other people think of this and potentially experiment on a landing page design/ layout for wider group consensus/ approval. Jamzze (talk) 09:12, 24 August 2022 (UTC)[reply]

I guess the main question for me is if you're trying to list out and explain the relationships between various sociology terms so that someone can be oriented for the jargon of that field or if you're trying to give someone an overview and introduction to sociology as a topic of study. If it's the latter, that is more appropriate for Wikiversity. —Justin (koavf)❤T☮C☺M☯ 17:50, 24 August 2022 (UTC)[reply]

The former. Having a page to essentially summarise/ list top-level key terms, point users towards categories of interest to sociology, etc. — This unsigned comment was added by Jamzze (talk • contribs).

Then I think that can in principle be the sort of thing in the appendix namespace, as long as it's terminology-focused. I don't think we have anything quite like that at the moment and would be interested to see what others think. —Justin (koavf)❤T☮C☺M☯ 21:13, 24 August 2022 (UTC)[reply]

@Koavf, the closest I can think of are the various Category pages, but I'm not sure if we have any categories for sociology as a subject. ‑‑ Eiríkr Útlendi │^{Tala við mig} 22:16, 24 August 2022 (UTC)[reply]

Category:Sociology, but that doesn't serve the purpose that Jamzze has in mind of giving some kind of overview or direction, it's just an alphabetical listing. —Justin (koavf)❤T☮C☺M☯ 22:20, 24 August 2022 (UTC)[reply]

The Category:Sociology page itself could be updated to include header information as an overview, no? Or at least links to other pages that would provide that? ‑‑ Eiríkr Útlendi │^{Tala við mig} 22:26, 24 August 2022 (UTC)[reply]

I guess, but that would make it radically different than every other category. I don't think there would be a will for just one category page to be a kind of topical overview. —Justin (koavf)❤T☮C☺M☯ 15:36, 25 August 2022 (UTC)[reply]

@Koavf: I was later thinking further about this, more along the lines of including a link on the cat page to an appendix page with fuller detail. The category itself is, pretty much by definition, topical, no? But maybe I'm overthinking it. Wouldn't be the first time. :D ‑‑ Eiríkr Útlendi │^{Tala við mig} 17:37, 25 August 2022 (UTC)[reply]

Template:quote-wikipedia

I just removed a bunch of mainspace additions of this template by @StuckInLagToad that were basically a lazy way to provide filler.

I can understand some limited usage in citation pages for nonexistent entries as sort of a starter to focus ideas on what sort of more suitable content to look for. I don't, however, see the point in using it to support definitions in mainspace- there's a certain aspect of circularity to a wiki quoting from a sister wiki.

Another point: this template lists the Wikimedia Foundation as publisher (which it is), but the lack of any explanatory text gives the impression that the Wikimedia Foundation wrote the text quoted. All the citations I've seen in scholarly journals give the author(s) as "Wikipedia contributors", but even having the template say something like "Wikimedia Foundation, publisher" would be a good idea. Chuck Entz (talk) 22:53, 27 August 2022 (UTC)[reply]

Alright, I'll limit my use of this quoting template. I have no comment on your other point, but thanks for bringing it up. I'll use other quoting types in the future. StuckInLagToad (talk) 23:01, 27 August 2022 (UTC)[reply]

Let's talk Classical Syriac dotting

@Fenakhay, @Metaknowledge, @Fay Freak, @Antonklroberts, @Shuraya, who else?

I would like to take This ID's edits as an opportunity to talk about Classical Syriac dotting (vowel marking). I don't think we ever decided what system to use here on Wiktionary. The choice is between Western and Eastern vowels. Now, my understanding is that the most recent scholarly trend is to use the Eastern system (mainly because it allows one to distinguish /o/'s from /u/'s). My understanding is that it's always possible to move from the Eastern system to the Western, but not the other way round, so my personal preference would be for the Eastern system too. What do you guys think? — Sartma ^{【𒁾𒁉 ● 𒊭 𒌑𒊑𒀉𒁲】} 00:41, 28 August 2022 (UTC)[reply]

I prefer it for this reason too, and apparently less frequent editors from the region prefer it as well; otherwise I have just used either system arbitrarily, or whichever was in front of me or attested. Fay Freak (talk) 07:56, 28 August 2022 (UTC)[reply]

Secondhand Attestation of Extinct Languages

(Notifying Mahagaja, Vahagn Petrosyan, Wikitiki89, Brutal Russian, The Editor's Apprentice): We currently have a mainspace Illyrian entry for ῥίνος in CAT:E because Illyrian is set as a reconstruction-only language in the modules.

According to w:Proposed Illyrian vocabulary#Illyrian lemmas, this is only known from a sentence in an Ancient Greek work: Οί δέ λέγουσιν Ἰλλυριούς ῥινόν λέγειν τήν άχλύν, which I translate as "The Illyrians say 'ῥινόν' for άχλύν (ákhlún, “mist, fog”)" with both the Illyrian and its translation being in the (Ancient Greek) accusative, which ends with "-ν".

First of all, is this sufficient attestation to have an Illyrian term in mainspace, and secondly, can the lemma ῥίνος be inferred from "ῥινόν"? After all, it's Ancient Greek that has a nominative singular ending in -ος that corresponds to an accusative singular ending in -ον. I don't think we know enough about Illyrian grammar/morphology to say what the lemma form should be. To be fair, I should point out that the entry has a reference.

The reason I'm bringing this here is that I'm not sure whether to simply delete this, or to move it to the reconstruction namespace, and if the latter, to what spelling. Chuck Entz (talk) 16:02, 28 August 2022 (UTC)[reply]

I also see a discrepancy regarding the placement of the tonos. It doesn't bode too well... P U C – 16:10, 28 August 2022 (UTC)[reply]

The "Illyrian" stuff is too controversial. I would not keep this even as a reconstruction. A table listing all proposed Illyrian words, like the Wikipedia one you linked to, is sufficient. Vahag (talk) 16:40, 28 August 2022 (UTC)[reply]
