Wiktionary:Grease pit/2024/May

Lithuanian collation

I've been cleaning up some coding errors in Lithuanian entries, and I noticed that seven years ago @エリック・キィ had ineffectually added |sort=skesti to the headword line of Lithuanian skę̃sti. What problem was it hoped to address? The word currently gets sorted (except when separated from its tagging as Lithuanian) between "ske" and "skė", and in particular before "ski"; it looks as though sorting was much worse when the entry was created. It's possible that the hope was that it would be sorted between Lithuanian skėrys and Lithuanian skėtis, as would be done by Mimer SQL. (Notifying Agamemenon, Apisite, BigDom, GabeMoore, Insaneguy1083, Helrasincke, Hippietrail, RichardW57, Sławobóg, 70.175.192.217.) I have removed the ineffectual parameter from the page. RichardW57m (talk) 13:13, 1 May 2024 (UTC)

And we can find successive dictionary entries Lithuanian skersvėjis, Lithuanian skėsti, skęsti, sketera,[1] showing the problems with promoting secondary collating differences to primary differences. --RichardW57m (talk) 14:50, 1 May 2024 (UTC)[reply]
Where is the design rationale for our current Lithuanian collation? As far as I can tell, if it was controlled from Module:languages/data/2, it was implemented on or after 2 December 2022, in Module:lt-sortkey, which was deleted on 7 January 2023. Essentially: why are secondary collation differences promoted to primary in Lithuanian, whereas they are simply ditched in French, so that French e, é, è and ê sort the same, but Lithuanian e, ę and ė sort like completely different letters? Was it a conscious decision? I suspect the decision was taken by @Theknightwho, and it can always be justified by being better than what went before. --RichardW57m (talk) 17:25, 1 May 2024 (UTC)
@RichardW57m What do you mean by "problems with promoting secondary collating differences to primary differences"? Can you clarify? BTW I doubt User:Theknightwho intentionally made a decision to change Lithuanian sorting order. He did a lot of work restructuring the *handling* of sort keys, but AFAIK the intent was to preserve whatever sorting rules were already present. Benwing2 (talk) 23:58, 1 May 2024 (UTC)[reply]
@Benwing2 Richard is referring to the Unicode Collation Algorithm, which uses primary, secondary and tertiary weightings (secondary tiebreaks primary, and so on). I'd very much like us to use the UCA: implementing it would be a lot of work, but it would be a big improvement over the crude sort methods we generally use at the moment.
To answer @RichardW57m's question: the current sortkey isn't based on the UCA - with a handful of exceptions, our sortkey algorithms are very simple. Theknightwho (talk) 00:23, 2 May 2024 (UTC)[reply]
@Benwing2, Theknightwho: Well, someone made a change between 28 November 2022 and 15 December 2022 (see [2]), but the history is hidden in the now deleted Module:lt-sortkey. Was it perhaps @Octahedron80? --RichardW57 (talk) 06:48, 2 May 2024 (UTC)[reply]
I'm afraid Theknightwho's answer isn't an answer to my question. While our collations seem to be defined entirely by what the UCA would term primary keys, they tend to attempt to approximate the native sorting orders. It hasn't always been done - our Roman script Pali sorting bears no relationship to the usual sort order, for example, being mostly based on the foremost sorting order for each script. Someone should be making a decision on how to do the approximation - but they may not actually understand the subtlety of the secondary level. --RichardW57 (talk) 06:48, 2 May 2024 (UTC)[reply]
@RichardW57 There's only one edit to Module:lt-sortkey, when it was created on Dec 1 2022 by User:Theknightwho. It looks like prior to that there was no sorting algorithm defined for Lithuanian. The Lithuanian dictionary at [3] sorts ę as if it were e (e.g. gesti is directly followed by gęsti) but treats ė as a distinct letter, hence gežtis is directly followed by gėbelėti. I think you should figure out the correct sort order according to standard dictionaries, and then we can implement it. Benwing2 (talk) 07:07, 2 May 2024 (UTC)[reply]
@Benwing2 There had been a prior algorithm, but it was subsumed in the stripping of the three stress accents to generate page names. I haven't dug into the history of this stripping, but it may explain some of the oddities of Lithuanian templates if it was a later feature on Wiktionary.
Am I missing a trick with the LKZ web site? I can't see how to get a dictionary page from it, only dictionary entries. Short of ordering books, I could only find Lalis's dictionary and introductory pages. --RichardW57m (talk) 08:40, 2 May 2024 (UTC)[reply]
@RichardW57m You can e.g. type just g in the search box and hit enter, and down the left rail you'll see a list of all the entries starting with g, in sorted order. Benwing2 (talk) 08:56, 2 May 2024 (UTC)[reply]
@Benwing2: Thanks. Curious. The 'standard' Lithuanian collation as defined by CLDR and demonstrated (today) at [4] has gėbelėti before gežtis. --RichardW57m (talk) 10:19, 2 May 2024 (UTC)[reply]
@Benwing2: But remember Theknightwho's assessment above of the UCA (let alone the CLDR Collation Algorithm) being a lot of work. Last time I looked, the latter wasn't clearly defined. --RichardW57m (talk) 10:31, 2 May 2024 (UTC)[reply]
This LKZ list may not be reliably sorted. We have four consecutive items gabumas, gabūnas, gabuoti, gaburdalas, but the distinctly non-empty lists of words starting 'bub' and 'būb' do not overlap! Likewise for words starting gabu and gabū. At https://zodynas.vz.lt/terminaiRaidec.php, I found the index headings/links ABCČDEĖFGHIJKLMNOPRSŠTUŪVZŽ, but 'E' and 'Ė' and also 'U' and 'Ū' pointed to the same lists! Possibly that's a case of the author and the software having different ideas about Lithuanian sorting. The list for 'U' and 'Ū' contained examples of both as initial letter. --RichardW57m (talk) 13:10, 2 May 2024 (UTC)
@Benwing2:, (Notifying Agamemenon, Apisite, BigDom, GabeMoore, Insaneguy1083, Helrasincke, Hippietrail, RichardW57, Sławobóg, 70.175.192.217): I will need some help assessing what is a 'standard' Lithuanian dictionary. I've now found what looks like one at https://archive.org/details/lyberis-sinonimu-zodynas-2002/page/106/mode/2up; (Sinonimų Žodynas = "Dictionary of Synonyms") the page linked to makes it rather obvious that 'e', 'ę' and 'ė' are the same in some sense, and page 128 shows the intermingling of the three. Page 244 shows that 'u' and 'ū' have the same primary weight. Page 252 shows sameness for 'a' and 'ą'. Page 147 proclaims sameness for 'i', 'į' and 'y'. Page 528 proclaims the sameness of 'u', 'ų' and 'ū', though a demonstration for u ogonek will need more searching. --RichardW57m (talk) 15:00, 2 May 2024 (UTC)[reply]
@RichardW57m Implementing the UCA for general use on Wiktionary would be a lot of work because (a) it presents performance difficulties due to the size of the UCA dataset and complexity of tailorings, and (b) the sortkeys generated by the UCA aren't suitable as Wikimedia category sortkeys since they're numeric, which would mess up category headers. As such, any general implementation will need to solve that issue, too.
Implementing something which emulates the UCA for one language is probably not too difficult, though (like our other sortkey algorithms) it would only support the relevant diacritics for the language. Theknightwho (talk) 15:17, 2 May 2024 (UTC)
Using the UCA doesn't force the use of the DUCET. To be honest, most of the CLDR tailorings don't look too wonderful when applied to characters not used outside the language's known character range. I suggest the keys be loaded as strings (which is what we do now). The UCA implementation notes admit that the DUCET values are unfit for any system that expects keys to be C strings, and explain how to convert them.
Even now, starting a category display of Lithuanian terms at 'Y' causes the first letter header to be shown as 'I'. I think that sort of thing is inevitable for any letter that sorts out of code point order. --RichardW57m (talk) 16:10, 2 May 2024 (UTC)[reply]
I finally found the demonstration of 'u' and 'ų' having the same primary key in the sequence 'skùsti', 'skų́sti', 'skų́stis', 'skùtas' on page 440. Word-internal 'ų' is inconveniently infrequent. --RichardW57 (talk) 21:21, 2 May 2024 (UTC)
The only CLDR collation for Lithuanian gives "Bronius Piesarkas: Lithuanian-English Dictionary ISBN 9986-465-56-7" as its source, and this collation defines ogonek on 'a', 'e', 'i' and 'u' not to make a difference in primary weight, 'e' and 'ė' not to differ in primary weight, 'i' and 'y' not to differ in primary weight, and 'u' and 'ū' not to differ in primary weight. The only weird thing about this collation is that it defines the accentuation marks (acute, grave and tilde) to make secondary differences, and thus be more significant than capitalisation, which I don't believe. It's also saying that, inter alia, a difference of accent on the first syllable would outweigh a difference between 'e' and 'ė' in a subsequent syllable. That's probably refutable. --RichardW57 (talk) 22:02, 2 May 2024 (UTC)[reply]
@RichardW57 I hope this might be what you're looking for (excerpted from Ambrazas et al):
In dictionaries and other lists of words arranged in alphabetical order, a and ą, e, ę and ė, i, y and į, u, ū and ų are treated as if they were identical letters, even though they represent different sounds. Therefore the following alphabetical order is customary: aržùs, ąsà, asamblė́ja; ẽsti, ė́sti, èstiškas; įkélti, ìkrai, ýla, ìlgas.[2]
This appears to be confirmed by, for example, the Lyberis Lithuanian-Russian dictionary, where gadýnė is followed by gadìnimas, and gabėndinti is followed by gabẽnimas (under which we find e.g. gabénti, gabéntis).[3] Helrasincke (talk) 22:15, 4 May 2024 (UTC)
@Helrasincke @RichardW57 Yeah, this is easy enough to implement; if no one objects I'll go ahead and do that. Benwing2 (talk) 00:19, 5 May 2024 (UTC)
The only question is whether this is what people would prefer, as opposed to government directive. I've been surprised by the evidence that some would prefer to treat 'e' v. 'ė' and 'u' v. 'ū' as primary differences. We did have someone object to the CLDR treating 'i' v. 'y' as a secondary difference. The objection was dismissed as being based on ignorance, but I can't help wondering if there is a folk collation. --RichardW57 (talk) 14:06, 5 May 2024 (UTC)[reply]
Curiously enough, Cathy Wissink and Michael Kaplan get Lithuanian 'i' v. 'y' wrong on p. 11 of https://www.infitt.org/ti2002/papers/03WISSIN.PDF - carelessness, ignorance or misinformation? --RichardW57 (talk) 14:35, 5 May 2024 (UTC)
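
The customary order quoted from Ambrazas et al. in this thread can be sketched in Lua roughly as follows. This is a minimal illustration, not the code of any Wiktionary module: the names lt_sortkey and lt_sort are hypothetical, the input is assumed to be a lowercase page name with the stress accents already stripped, and the "~" suffix used to place č, š and ž directly after c, s and z is just one possible trick. Ties on the primary key are broken by the raw string, which is only a crude stand-in for a real secondary level.

```lua
-- Hypothetical sketch: primary-level sort key for Lithuanian page names.
-- Assumes lowercase input with the stress accents (acute, grave, tilde) already removed.
local primary = {
	["ą"] = "a",
	["ę"] = "e", ["ė"] = "e",
	["į"] = "i", ["y"] = "i",
	["ų"] = "u", ["ū"] = "u",
	-- Assumption: "~" sorts after every ASCII letter, so "c~" lands after all of "c..." and before "d".
	["č"] = "c~", ["š"] = "s~", ["ž"] = "z~",
}

-- Matches one UTF-8 encoded character (a lead byte plus any continuation bytes).
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

local function lt_sortkey(pagename)
	-- Characters missing from the table are left unchanged by gsub.
	return (pagename:gsub(UTF8_CHAR, primary))
end

local function lt_sort(words)
	local sorted = {}
	for i, w in ipairs(words) do
		sorted[i] = w
	end
	table.sort(sorted, function(a, b)
		local ka, kb = lt_sortkey(a), lt_sortkey(b)
		if ka ~= kb then
			return ka < kb
		end
		return a < b -- tie-break on the raw string
	end)
	return sorted
end
```

With such a key, skęsti falls between skėrys and skėtis, and gėbelėti sorts before gežtis, which agrees with the CLDR behaviour mentioned earlier in the thread.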

References

  1. ^ Antanas Lalis (1915) A Dictionary of the Lithuanian and English Languages, Chicago, page 325
  2. ^ Vytautas Ambrazas et al (2006) [1997] Lithuanian grammar, 2nd edition, pages 15-16
  3. ^ Antonas Lyberis (2005) Lietuvių-rusų kalbų žodynas [Lithuanian Russian Dictionary], page 182

Strange behaviour of translation template

The translations for verb play, sense "deal with a situation in a diplomatic manner", display OK, as you will see. However, if everything else in the "translations" section is deleted, leaving only the "deal with a situation in a diplomatic manner" part, then the translations become corrupted with funny characters, stuff like "Finnish: ⦃⦃t+¦fi¦hoitaa¦¦¦¦¦¦¦¦¦⦄⦄" etc., as you can see here in a test edit that I have now reverted. (Of course, I do not actually want to delete everything else in the section. I wanted to make another edit, which went wrong for the same reason, and in trying to work out where the problem lay, I successively deleted other sections, until nothing else was left, to arrive at the minimal case that exhibited the problem.) Any ideas? Mihia (talk) 17:39, 1 May 2024 (UTC)

@Mihia This is not a bug. The {{tt}} and {{tt+}} templates need to be surrounded by {{multitrans}} in order for them to work and not display "funny characters" (as you say). If you delete the surrounding call to {{multitrans}}, you need to convert {{tt}} back to {{t}} and {{tt+}} back to {{t+}}. But a better solution is just to not do this. You can also put the call to {{multitrans}} around each translation section instead of once around all of them, but that will partly negate the memory-saving benefits of {{multitrans}}. Benwing2 (talk) 23:53, 1 May 2024 (UTC)[reply]
To expand on Ben's point: the reason we originally used {{multitrans}} was to avoid hitting the 50MB memory limit on large pages (which is no longer much of a concern since the limit got raised to 100MB a few months ago), but it's still very useful because it helps with page loading times as well. water/translations would be totally unusable without it, for example. Theknightwho (talk) 00:26, 2 May 2024 (UTC)[reply]
Thanks. The way this is presently laid out, the start of "multitrans" is embedded within one "trans" block, and the end of it, "}}", I think I can now see, is embedded within another, so I must say that it is non-obvious to editors what is going on. It looks "obviously" as if each block is self-sufficient, and, e.g. that they can be reordered, but of course when I moved one block to the end this actually took it out of "multitrans" and broke it. Is there any way to lay this out more clearly? E.g. can "multitrans" be started at the very top on a separate line, and ended at the very bottom on a separate line? Mihia (talk) 09:29, 2 May 2024 (UTC)[reply]
@Mihia That was a kludge because the translation-adder originally freaked out if you put the multitrans opening at the top, and no-one wanted to touch it if we didn't have to since the translation-adder's quite finicky. That bug's since been fixed, but lots of pages still do the original workaround as no-one's sent a bot round yet. It wasn't seen as a priority originally, since multitrans was only used on a handful of pages, but as memory issues became worse it eventually came to be deployed on hundreds of pages. Theknightwho (talk) 15:20, 2 May 2024 (UTC)[reply]
I guess I understand from "since been fixed", then, that it is OK now to wrap "multitrans" around the whole section? (I can see that this does seem to work, but I wouldn't necessarily know if it is achieving the full memory-saving benefits.) Anyway, I've done it now, so please let me know if it's a problem. Thanks. Mihia (talk) 17:32, 2 May 2024 (UTC)
@Mihia Yes, AFAIK what you are suggesting is completely OK now. Benwing2 (talk) 18:58, 2 May 2024 (UTC)[reply]

Occitan template requests

Can anyone create an Occitan past participle template and other essential templates? Other Romance languages such as Asturian, Catalan, French, Galician, Italian, Portuguese and Spanish already have past participle templates. Thank you in advance. Flummont (talk) 02:06, 2 May 2024 (UTC)

I guess you could make {{oc-pp}} as a copy of {{ast-pp}} to start with, as that does not use Lua and should be easy to adapt. But I can see that Occitan morphology is not always as simple as adding letters onto a fixed stem. Ultimately what needs to be created is Module:oc-headword, as a copy (😢) of Module:ca-headword or similar, with changes to the Catalan-specific logic there. @Benwing2 is the expert here. This, that and the other (talk) 02:49, 2 May 2024 (UTC)[reply]
@This, that and the other Conceptually this is not so hard, but unfortunately I don't know that much about Occitan; and a complicating factor is the 6 or so different dialects, each of which (conceivably) forms its feminine and plural according to its own rules. Benwing2 (talk) 03:03, 2 May 2024 (UTC)[reply]
Looking at oc:parlar#Conjugason, dialectal differences may not be an issue for past participles (at least in these four dialects, and for regular first group verbs). I was going to say "I'm sure Flummont knows more", but it seems that this editor does not actually speak Occitan. This, that and the other (talk) 04:13, 2 May 2024 (UTC)[reply]
I took a look at the conjugations there. Unfortunately they only have Lengadocian conjugations for -ir/-er/-re verbs but the regular ones seem to be like parlar. However, the irregular ones (e.g. oc:Template:Conjugason/oc/leng/-odre-t, oc:Template:Conjugason/oc/leng/-dire) look to be more complex and probably differ dialect-to-dialect. Benwing2 (talk) 04:24, 2 May 2024 (UTC)[reply]

Two issues with transliteration categories

This, that and the other (talk) 02:32, 2 May 2024 (UTC)[reply]

@This, that and the other I agree with your first statement. I'll have to look into what's going on with honey. Benwing2 (talk) 03:08, 2 May 2024 (UTC)[reply]
@Benwing2 I guess the hidden cat issue could be fixed by changing Module:category tree/poscatboiler#L-448 to return false, but I don't really understand the logic here. Umbrella categories don't contain entries, so why would they ever need to be hidden? This, that and the other (talk) 04:06, 2 May 2024 (UTC)[reply]
@This, that and the other I think I wrote that code more or less mechanically. But in this case making that change wouldn't fix the issue because the 'Requests for ...' categories are all raw, so the first arm of the if-statement would apply. We need to make a change somewhere in Module:category tree/poscatboiler/data/entry maintenance, which generates its own umbrella categories, to not hide such categories. Benwing2 (talk) 04:16, 2 May 2024 (UTC)[reply]
Maybe [5] will fix it. This, that and the other (talk) 04:40, 2 May 2024 (UTC)[reply]
@This, that and the other Looks good to me. Benwing2 (talk) 04:58, 2 May 2024 (UTC)[reply]

Lua Modules variable sized arguments and the arg magic variable

Several Lua modules have been "corrected" or modified and returned to the old Lua 5.0 way of dealing with variable-sized arguments. As Scribunto currently uses Lua 5.1, and efforts are ongoing that may update the Lua engine to newer versions, I think we should stick to more modern ways of dealing with vararg functions.

This is also important for me, as I extract data from Wiktionary and the Lua version I rely on no longer understands this syntax.

@Theknightwho Module:languages now uses the (soon obsolete) arg variable in:

function export.addDefaultTypes(data, regular, ...)
	local n = arg.n
	local types = n > 0 and concat(arg, ",") or ""

Please, could you change it to:

function export.addDefaultTypes(data, regular, ...)
   local arg = {...}
   local n = select( '#', ... )

as a quick fix?

The same goes for Module:scripts, which was "corrected" recently, introducing the same problem:

for i = 1, arg.n do
    if not types[arg[i]] then 

from the more modern (older) version:

for _, type in ipairs{...} do
   if not self._type[type] then

Also please consider modernizing the Module:tables function:

function export.append(...)
	local ret, n = {}, 0
	for i = 1, arg.n do
		for _, v in ipairs(arg[i]) do

Thanks in advance, Dodecaplex (talk) 21:14, 3 May 2024 (UTC)[reply]

@Dodecaplex The instances where I’ve implemented this have been where it maximises performance, and I would not support changing that. The workaround that you suggest generates a performance hit, which is unacceptable for functions which are called many thousands of times, as happens with memoisation.
I really don’t see the point in caring about what is officially deprecated in Lua 5.1 while there is no chance of the functionality disappearing in the near future. I appreciate that it creates awkwardness for you, but the solution is for you to use a Lua 5.1 binary. Theknightwho (talk) 21:16, 3 May 2024 (UTC)[reply]
@Theknightwho As Lua 5.1 is used in Scribunto, arg is already deprecated.
In Lua 5.1, using arg in a function WILL imply the creation of a new table, which has a cost,
whereas using ... avoids the creation of the magic table named arg (which is always nil in Lua 5.1 when varargs are used). See for instance this stackoverflow answer.
My suggested changes were just a workaround, but if you want to avoid performance hits, then use ... directly, since it already contains the arguments (no overhead for creating the arg table):
select('#',...) instead of arg.n,
select(1,...) instead of arg[1],
select(2,...) instead of arg[2], etc. (well, it is not quite that easy, as select(2, ...) returns all the arguments from position 2 onwards; see details in the cited question)
This will perform better and will last longer... Dodecaplex (talk) 21:57, 3 May 2024 (UTC)
@Dodecaplex If you look at the instances where it's been used, a table has to be created anyway. It's possible to iterate with select, but that rapidly becomes much slower than iterating over a table. I appreciate it's deprecated in Lua 5.1, but that is a completely academic point, really - in practical terms, it just means it's not portable to Lua 5.2 or higher, but Wiktionary doesn't use 5.2 and there's no chance it will for the foreseeable future. If and when that changes, we can adapt the code. Theknightwho (talk) 23:16, 3 May 2024 (UTC)[reply]
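
For readers following the varargs discussion, here is a minimal sketch of the portable pattern being debated. It is not the code of Module:tables or Module:languages: pack and append are hypothetical names, and the body of append is only a guess at what the quoted loop does. The table is still created once per call (as noted above, a table is needed anyway for fast iteration), but the argument count comes from select("#", ...) rather than from the deprecated arg.n, so nil arguments are counted correctly and the code does not rely on the implicit arg table.

```lua
-- Hypothetical helper: capture the varargs together with their true count.
-- select("#", ...) counts every argument, including nils.
local function pack(...)
	return { n = select("#", ...), ... }
end

-- Hypothetical counterpart of the quoted loop: concatenates array-like tables.
local function append(...)
	local args = pack(...)
	local ret, n = {}, 0
	for i = 1, args.n do
		for _, v in ipairs(args[i]) do
			n = n + 1
			ret[n] = v
		end
	end
	return ret
end
```

Lua 5.2 and later ship table.pack, which builds the same kind of table; the hand-written pack above is only needed on Lua 5.1.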

Need help with creating Module:number list/data/vi

Hello guys! I've been planning on expanding the cardinal box of the Vietnamese cardinal number entries by creating Module:number list/data/vi based on the data given in the English Wikipedia article Vietnamese numerals. Similar to Module:number list/data/ko, the number list should include both the native Vietnamese and Sino-Vietnamese transliterations. Does anyone here know how to do this? Thanks in advance! ChemPro (talk) 05:53, 4 May 2024 (UTC)[reply]

@ChemPro I can try to help you with that, but first you should try to make the module by copying the Korean one and filling in the corresponding Vietnamese forms. Benwing2 (talk) 18:48, 4 May 2024 (UTC)[reply]
@Benwing2 Done! --ChemPro (talk) 08:04, 5 May 2024 (UTC)[reply]
@ChemPro What you did was about half done, but I tried to fix it up. It still needs some work. Benwing2 (talk) 00:11, 6 May 2024 (UTC)[reply]
@Benwing2 Hello, thanks for fixing the error. I have some remarks on what still needs to be improved or implemented:
  • For numbers which include the digit 1 from 21 to 91, the number 1 is pronounced as mốt.
  • Even though năm chục (50) is an alternative form to năm mươi, it is not used to construct numerals from 51 to 59 (the same principle applies to all the multiples of ten from 20 up to 90)
  • When the number 5 appears after 10 in the unit digit, the pronunciation changes from năm to lăm --ChemPro (talk) 07:42, 6 May 2024 (UTC)[reply]
  • When the number 4 appears after 20 in the unit digit, pronunciation changes from bốn to --ChemPro (talk) 07:52, 6 May 2024 (UTC)[reply]
    @ChemPro OK, thanks. I'll get to this tomorrow, going to sleep now :) ... Benwing2 (talk) 08:01, 6 May 2024 (UTC)[reply]
Some additional information on the rules mentioned above:
  • Exceptions to the rule of changing the pronunciation from năm to lăm are numbers ending in 05 (such as 105, 605, 9405, 39605).
  • In some Vietnamese dialects, the number seven (bảy) is also read as bẩy. I tried to implement it, but it somehow caused an error. --ChemPro (talk) 08:36, 6 May 2024 (UTC)
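
The mốt and lăm rules listed above can be sketched in Lua for the range 1 to 99. This is only an illustration with a hypothetical function name (vi_cardinal); it does not use the data format of Module:number list, it leaves out the rule for 4, the alternative forms (năm chục, bẩy), and the numbers ending in 05, which only arise above 99.

```lua
-- Hypothetical sketch: native Vietnamese cardinals for 1..99.
local units = { "một", "hai", "ba", "bốn", "năm", "sáu", "bảy", "tám", "chín" }

local function vi_cardinal(n)
	assert(n >= 1 and n <= 99 and n % 1 == 0, "sketch only covers integers 1..99")
	local unit = n % 10
	local tens = (n - unit) / 10
	if tens == 0 then
		return units[unit]
	end
	-- 10 is "mười"; the tens from 20 to 90 use "mươi".
	local head = tens == 1 and "mười" or (units[tens] .. " mươi")
	if unit == 0 then
		return head
	end
	local tail = units[unit]
	if unit == 1 and tens >= 2 then
		tail = "mốt" -- 21, 31, ..., 91
	elseif unit == 5 then
		tail = "lăm" -- 15, 25, ..., 95
	end
	return head .. " " .. tail
end
```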

Bot preventing the creation of a redirect

A Wiktionary bot is preventing me from turning the page ناـ into a redirect to نا. Every time I try, the edit is labeled as a potentially harmful action. The page formerly contained prefix entries for 4 languages. I appropriately moved them to نا. - Ash wki (talk) 10:08, 4 May 2024 (UTC)

@Ash wki: I have made the intended change to the page for you; surely the filter distinguishes us by user rights. But I think admins should give you autopatroller, as I have observed meticulously clean and reasonable edits from you. Note that according to the discussion Wiktionary:Beer parlour/2024/April § Arabic-script affixes, @Benwing2 wanted to move affixes anyway, probably to contain the character ـ, so you might want to put in your voice there. Fay Freak (talk) 11:35, 4 May 2024 (UTC)

@Fay Freak Thanks a lot. I'll look into the discussion, thanks. Ash wki (talk) 11:44, 4 May 2024 (UTC)[reply]

MY PAGE WONT UPLOAD

I WAS ONLY GIVING THE LINK TO MY WIKIPEDIA PAGE!!!! Lilly is cool (talk) 17:42, 4 May 2024 (UTC)[reply]

@Lilly is cool: that’s exactly why … — Sgconlaw (talk) 17:47, 4 May 2024 (UTC)[reply]
If you make it a wikilink, as in [[w:User:Lilly is cool]], the abuse filter won't think you're just linking to some random website. Chuck Entz (talk) 17:57, 4 May 2024 (UTC)[reply]

I keep getting logged out

Ever since I cleared my site data for Wiktionary to fix the translation-adder, I have to log in again on my PC every day. However, my other devices don't have this issue. Any ideas? Aaron Liu (talk) 18:31, 4 May 2024 (UTC)[reply]

@Aaron Liu Weird. When you log in, there's a button to remember the login for up to a month or so; do you have that checked? Also I've found that if you log out of any MediaWiki site, it logs you out of all of them. Benwing2 (talk) 18:46, 4 May 2024 (UTC)[reply]
I’ve had the same issue today, so it might be a server problem. Theknightwho (talk) 19:05, 4 May 2024 (UTC)[reply]
I didn't log out before clearing the data, and I have no problem at all staying logged in on my iPad. Aaron Liu (talk) 19:15, 4 May 2024 (UTC)[reply]
@Benwing2 I am still logged in on other wikis, so logging in is instant. I'll try logging out and back in again, thanks. Aaron Liu (talk) 20:46, 5 May 2024 (UTC)[reply]

Dialectal variation

I'm currently trying to improve the state of Polish dialects and subdialects, showing the hierarchy and what-not. Would {{dialect synonyms}} be the best option? The idea isn't exactly to show synonyms, but just different reflexes of the same word. Vininn126 (talk) 06:22, 5 May 2024 (UTC)[reply]

What is the difference between what you're describing and an altform? Nicodene (talk) 07:08, 5 May 2024 (UTC)[reply]
Synonyms, as I understand it, have different morphemes, alt forms don't; in short these would be alt forms. I'd be wary of placing dozens of these by village in a hard to structure alt forms section, it would be nice to organise subdialects by dialect and to be able to host many easily. But I'm wary as this seems to be for synonyms. Vininn126 (talk) 08:07, 5 May 2024 (UTC)[reply]
@Vininn126 This sounds like a job for Descendants. Can you not put them on the proto-page and direct other pages to that page? BTW this is related to my post (maybe in the BP) about creating a generalization of {{alt}}. We may need a generalization of {{desc}}/{{desctree}} as well for these purposes. In general I'm not a fan of the dialect synonyms approach of using a separate module for the actual synonyms/alt forms/etc.; that is the approach used by {{etymtree}} and it didn't work. Benwing2 (talk) 19:47, 5 May 2024 (UTC)[reply]
@Benwing2 They are indeed descendants. It could be possible to set up there, but there's also the issue of how to list them on the Polish page itself. The alternative forms section and {{alt}} might lead to a huge mess. Vininn126 (talk) 07:11, 6 May 2024 (UTC)[reply]
Soft redirect? Do we need to list every dialectal form on every page? Benwing2 (talk) 07:17, 6 May 2024 (UTC)[reply]
@Benwing2 That's not my intention. Dialectal forms would most definitely be listed on the Standard Polish reflex. My issue is how to do that. The forms themselves would also be soft redirects (although this sometimes gives pause, as they often don't have the same definitions; they don't have the same declensions either, but I can make templates for that). The issue is that if I have, for example, lekarz (which it might have?) with 6-10 forms and labels, it might not sound that bad, but I think giving it more structure would aid the reader, i.e. being able to organize subdialects by dialect and also maybe even geographically. Vininn126 (talk) 07:23, 6 May 2024 (UTC)
@Vininn126 It does sound like you want a generalization of {{desc}}/{{desctree}} for use in Alternative forms or whatever. I think structuring it the way that Descendants sections do it would work well. Benwing2 (talk) 08:03, 6 May 2024 (UTC)[reply]
I suppose I'll try that once I've done more work on the actual subdialects themselves. Vininn126 (talk) 08:05, 6 May 2024 (UTC)[reply]

< > (Unsupported titles/`lt` `gt`): broken labels

{{lb|mul|Internet slang}} results in “(Internet slang[[Category:Translingual internet slang|]])”. J3133 (talk) 15:27, 5 May 2024 (UTC)[reply]

@J3133 On which page? I tried this on a test page and it displays correctly. Benwing2 (talk) 19:42, 5 May 2024 (UTC)[reply]
@Benwing2: have you tried clicking on the wikilink in the title? I suspect it has something to do with interaction between characters in the pagename and the wikitext generated by the modules for the categories, though I have no clue whether it's the modules or the js or both that's doing it. Chuck Entz (talk) 20:31, 5 May 2024 (UTC)[reply]
@Chuck Entz Thanks, I see it now. Benwing2 (talk) 21:10, 5 May 2024 (UTC)[reply]
@Theknightwho Can you please take a look at this? This is happening in makeSortKey in Module:languages. The pagename passed in is < >, which is the actual pagename (rather than the "Unsupported titles" version), and line 1282 removes HTML tags, with the result that the sort key is an empty string, which is why the display is garbled. I don't know why you are removing HTML tags, but it clearly will interact badly with any pagename that looks like an HTML tag or contains HTML tags. Benwing2 (talk) 21:32, 5 May 2024 (UTC)
@Benwing2 Yes, this is something I’m aware of but a proper fix may not be straightforward, and may need to wait until the proper rewrite and disentanglement of the links and languages modules is ready. In the meantime, it should be possible to use the original input as a backup if the result is the empty string. Theknightwho (talk) 22:06, 5 May 2024 (UTC)[reply]
@Theknightwho Can you code this up? Benwing2 (talk) 22:25, 5 May 2024 (UTC)[reply]
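
The fallback suggested above (use the original input if stripping leaves an empty string) could look roughly like this. It is a sketch only: strip_html_tags and make_sort_key are hypothetical names, the tag pattern is a simplification, and the real makeSortKey in Module:languages does considerably more than this.

```lua
-- Hypothetical sketch of the proposed fallback; not the actual Module:languages code.
local function strip_html_tags(text)
	-- Simplified assumption: anything between "<" and ">" counts as a tag.
	return (text:gsub("<[^<>]+>", ""))
end

local function make_sort_key(pagename)
	local key = strip_html_tags(pagename)
	if key == "" then
		-- The whole title looked like a tag (e.g. "< >"): keep the original input.
		return pagename
	end
	return key
end
```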

Enabling collapsed quotations on Appendix: and Reconstruction: pages

I've already gone through every page in the Appendix and Reconstruction namespaces to remove bad uses of #*. Could one of our interface administrators change MediaWiki:Gadget-defaultVisibilityToggles.js#L-435 to include namespaces 100 (Appendix) and 118 (Reconstruction) so that quotations can display properly? (Before anyone asks, there are legitimate quotations on reconstructed entries, such as Reconstruction:Proto-Germanic/Harigastiz). Ioaxxere (talk) 16:55, 5 May 2024 (UTC)[reply]

The issue is some of our reconstruction only languages actually have quotations, if I understand correctly... Which means they aren't reconstruction only. Vininn126 (talk) 17:00, 5 May 2024 (UTC)[reply]
Yeah this is related to the issue with Proto-West-Germanic kamb being attested, which still hasn't been resolved. Benwing2 (talk) 19:40, 5 May 2024 (UTC)[reply]
 Done This, that and the other (talk) 10:00, 6 May 2024 (UTC)[reply]