Module talk:eo-sortkey


ĉ, ĝ, ĥ, ĵ, ŝ, ŭ

@Theknightwho: These letters do not sort correctly. E.g., aĉeti with {{eo-head}} adds [[Category:Esperanto lemmas|AĈETI]] (as seen using Special:ExpandTemplates), which should be ACETI. J3133 (talk) 16:14, 6 January 2023 (UTC)

@J3133 Done. Sorting is actually handled through Module:languages/data2; this sortkey module is obsolete. Theknightwho (talk) 20:18, 6 January 2023 (UTC)
@Theknightwho: The sortkey for ĝiri is GIRI (with a private-use character after G). J3133 (talk) 08:47, 7 January 2023 (UTC)
@J3133 I've just realised why this is: ĝ comes after g in the Esperanto alphabet, so I had set the sortkey to do precisely that. I don't think we should sort ĝ the same as g, as it just makes lists of Esperanto lemmas more difficult to parse.
The issue you initially raised stemmed from a more general problem following a change to Module:languages. That has now been fixed, which is why sorting has gone back to the way it worked before. Theknightwho (talk) 15:48, 7 January 2023 (UTC)
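For readers following along, the behaviour described above (ĝ sorting directly after g rather than together with it) can be sketched in a few lines. This is only an illustration in Python, not the code in Module:languages/data2; the private-use character `PUA` and the function name `sortkey` are assumptions, since the thread does not say which character is actually used.

```python
# Hypothetical private-use marker; the character Wiktionary actually uses
# is not stated in this thread.
PUA = "\uf000"

# Each accented Esperanto letter and the base letter it should sort after.
ACCENTED_BASE = {"ĉ": "c", "ĝ": "g", "ĥ": "h", "ĵ": "j", "ŝ": "s", "ŭ": "u"}


def sortkey(word: str) -> str:
    """Return an uppercase key in which ĝ becomes G + PUA, and so on."""
    out = []
    for ch in word.lower():
        base = ACCENTED_BASE.get(ch)
        # The marker sorts above every ordinary letter, so "ĝ..." lands
        # after all "g..." words but before any "h..." word.
        out.append(base + PUA if base else ch)
    return "".join(out).upper()


words = ["ho", "ĝi", "gusto"]
print(sorted(words, key=sortkey))
```

With this key, ĝi sorts after gusto and before ho, whereas a plain GI key would place it before gusto.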
@Theknightwho: Should the same sorting be specified for the digraphs ch/cx, gh/gx, hh/hx, jh/jx, sh/sx and ux used in H-system and X-system forms (and should they be in lemma categories)? J3133 (talk) 15:54, 7 January 2023 (UTC)
@J3133 I don't really have an opinion on whether they should be lemmas. I'm inclined towards sorting the digraphs in the order (e.g.) ĉ, ch, cx, so as to keep the different forms in a consistent order if they are sorted together. The one question I do have is whether these letter combinations can occur outside of digraphs, and (if so) whether there's a straightforward way to determine that. It's not a problem if not. Theknightwho (talk) 15:58, 7 January 2023 (UTC)
@Theknightwho X is only used in the digraphs, whereas with H these letter combinations can also occur outside digraphs. The entries I found using a Special:Search regex are bushaltejo (aŭtobushaltejo), ĉashundo, flughaveno, mashava, polichundo, senchava (senchaveco) and vangharoj. J3133 (talk) 16:10, 7 January 2023 (UTC)
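The kind of search described above can be approximated offline. The sketch below is an assumption about the approach, not the regex that was actually run in Special:Search: it flags any word containing c, g, h, j or s followed by h, and the word list is just the entries named in this thread plus two controls.

```python
import re

# Assumed pattern: a letter that forms an H-system digraph, followed by "h".
H_COMBINATION = re.compile(r"[cghjs]h")


def has_h_combination(word: str) -> bool:
    """True if the word contains a sequence that looks like an H-system digraph."""
    return bool(H_COMBINATION.search(word.lower()))


words = [
    "bushaltejo", "ĉashundo", "flughaveno", "mashava",
    "polichundo", "senchava", "vangharoj",
    "aĉeti", "ĝiri",  # controls with no such sequence
]
matches = [w for w in words if has_h_combination(w)]
print(matches)
```

A match only says the sequence is present; deciding whether it is a digraph or a morpheme boundary (as in flug-haveno) still needs information about the spelling system.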
@J3133 Hmm, tricky. I do have an idea, though: if these pages are consistently marked with the {{eo-spel}} template, then we can reliably determine whether they are H-system or X-system and take that into account. This will even work for ĉashundo/chashundo, as we'll know what the original spelling is, too. Doing this will require recreating Module:eo-sortkey, as it relies on an additional step that isn't possible via the standard sortkey method. However, I've implemented something similar for Zhuang, so there is precedent for doing this. See Module:za-sortkey, lines 81-96.
As with Zhuang, it won't be possible to do this in list sorts (e.g. {{col3}}) if the page hasn't been created (as there won't be any way to tell if it's H-system or not). In those instances, I'll default to assuming that it is H-system, given that these exceptions are relatively rare. I'll implement the same logic for X-system, on the off-chance that X does ever get used outside of digraphs. Theknightwho (talk) 16:27, 7 January 2023 (UTC)
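The proposal above can be sketched as follows. This is a rough Python illustration under stated assumptions, not the eventual Module:eo-sortkey: the three private-use markers, the `system` values ("standard", "h", "x", or `None` for a page that does not exist) and the optional `original` argument are all hypothetical stand-ins for whatever {{eo-spel}} would supply. Treating an unknown page as possibly using either digraph system is one reading of the default described above.

```python
import re

# Hypothetical private-use markers, ordered so that ĉ < ch < cx.
ACCENTED, H_FORM, X_FORM = "\uf000", "\uf001", "\uf002"

ACCENTED_BASE = {"ĉ": "c", "ĝ": "g", "ĥ": "h", "ĵ": "j", "ŝ": "s", "ŭ": "u"}
H_DIGRAPH = re.compile(r"([cghjs])h")
X_DIGRAPH = re.compile(r"([cghjsu])x")


def mark_accents(text, marker):
    """Replace each accented letter with its base letter plus `marker`, then uppercase."""
    return "".join(
        ACCENTED_BASE[ch] + marker if ch in ACCENTED_BASE else ch
        for ch in text.lower()
    ).upper()


def sortkey(word, system=None, original=None):
    """Build a sort key given what is known about the page's spelling system."""
    if system == "standard":
        # Known standard orthography: "gh" in flughaveno is just g + h.
        return mark_accents(word, ACCENTED)
    if original is not None and system in ("h", "x"):
        # The original spelling shows exactly which letters carry an accent.
        return mark_accents(original, H_FORM if system == "h" else X_FORM)
    key = word.lower()
    if system in ("h", None):  # unknown pages default to H-system
        key = H_DIGRAPH.sub(lambda m: m.group(1) + H_FORM, key)
    if system in ("x", None):
        key = X_DIGRAPH.sub(lambda m: m.group(1) + X_FORM, key)
    return mark_accents(key, ACCENTED)


entries = [("cxi", "x"), ("chi", "h"), ("ĉi", "standard"), ("cio", "standard")]
ordered = [w for w, s in sorted(entries, key=lambda e: sortkey(e[0], e[1]))]
print(ordered)
```

The sketch also shows the limitation mentioned for list sorts: `sortkey("flughaveno")` with no known system treats "gh" as a digraph, while `sortkey("flughaveno", "standard")` does not. Passing `original="ĉashundo"` for chashundo keeps the "sh" from being read as a second digraph.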