Raw links in [[Template:ga-noun]]

Fragment of a discussion from User talk:CodeCat

That's from an unfinished project of mine to remove the use of sort= from Wiktionary. I'll have to get back to work on that. There's some more info at User talk:Embryomystic.

03:18, 2 July 2013

@Metaknowledge, do you mean removing sort= from *all* entries? I'm intrigued, but I cannot think how that would work for JA entries, given that one "spelling", as it were, can often be read in multiple different ways, with each reading requiring its own sorting. C.f. , for instance, which has at least four readings, each requiring its own sorting.

The MediaWiki backend fails horribly at this, as the categorization mechanism only seems to accept the last sorting given on a page. For instance, should be listed in Category:Japanese_nouns under all four readings tazu, tsuru, tsu, and kaku, but it's only under kaku, as that's the last one on the page. I brought this up at mw:Help_talk:Category#Any_way_to_sort_under_multiple_sort_keys.3F, but there have been no replies in the past 16 months. I should probably file a bug, but I'm not really sure how to go about it.
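To make the symptom concrete, here is a toy model in Python of the behaviour described above. It is not MediaWiki's actual code; the function name and data layout are made up for illustration, and the only assumption is the one stated in this post: a page keeps a single sort key per category, and a later declaration replaces an earlier one.

```python
def collect_category_sortkeys(declarations):
    """Toy model (hypothetical, not MediaWiki code): one sort key per category,
    so each later declaration for the same category overwrites the earlier one."""
    sortkeys = {}
    for category, sortkey in declarations:
        sortkeys[category] = sortkey  # last declaration wins
    return sortkeys


# Four readings declared on one page, in page order.
declarations = [
    ("Japanese nouns", "tazu"),
    ("Japanese nouns", "tsuru"),
    ("Japanese nouns", "tsu"),
    ("Japanese nouns", "kaku"),
]

print(collect_category_sortkeys(declarations))
```

Only the final key, kaku, survives in this model, which matches what shows up in the category listing.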

07:03, 2 July 2013

It should probably work in the same way that automatic transliterations work. There would be certain languages for which it is available, and others for which it isn't. Even so, though, we can probably use some default rules that apply to all languages, like removing the initial - from suffixes.

11:51, 2 July 2013

@Eiríkr: Well, yes, that is the goal. Your multisort issue with Japanese is definitely a MW problem that I can't help with, but theoretically (in a world where that was not a problem) sort= would still be unnecessary for Japanese because we'd just use the automatic transliteration of the manually input hiragana readings as the sort key in the vast majority of cases. tl;dr: JA is crazy here, I probably won't touch it anytime soon.
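A rough sketch of that idea in Python, for illustration only. The table and function name are hypothetical and deliberately tiny; a real transliteration module would need the full kana inventory plus handling for digraphs, long vowels, and the sokuon, none of which is attempted here.

```python
# Hypothetical, deliberately incomplete kana table (just enough for the examples).
KANA_TO_ROMAJI = {
    "つ": "tsu",
    "る": "ru",
    "か": "ka",
    "く": "ku",
}


def reading_to_sort_key(reading):
    """Derive a sort key from a manually entered hiragana reading.

    Raises KeyError for any character outside the toy table, rather than
    guessing at a transliteration.
    """
    parts = []
    for ch in reading:
        if ch not in KANA_TO_ROMAJI:
            raise KeyError(f"no transliteration known for {ch!r}")
        parts.append(KANA_TO_ROMAJI[ch])
    return "".join(parts)


for reading in ("つる", "つ", "かく"):
    print(reading, reading_to_sort_key(reading))
```

The point is only that the sort key is computed from a reading the editor already typed, so nothing has to be entered twice.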

@CodeCat: Yes, although we need to be careful on scripts. For example, Hebrew script uses ־ instead and that would need to be stripped. Are you interested in dealing with that one, or at least figuring out in which module the logic should go?
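A minimal Python sketch of such a default rule with a per-script override. Everything here is an assumption for illustration: the table, the function name, and the use of the script code "Hebr" as a lookup key are all hypothetical, and the real logic would presumably live in a Lua module rather than in Python.

```python
HYPHEN = "-"
MAQAF = "\u05BE"  # Hebrew punctuation maqaf, the character mentioned above

# Hypothetical table: which leading affix markers to strip for each script.
LEADING_MARKERS = {
    "default": (HYPHEN,),
    "Hebr": (HYPHEN, MAQAF),
}


def default_sort_key(title, script="default"):
    """Strip one leading affix marker from a page title to get its sort key.

    Titles that consist of nothing but the marker are returned unchanged,
    and markers in the middle of a title are left alone.
    """
    markers = LEADING_MARKERS.get(script, LEADING_MARKERS["default"])
    for marker in markers:
        if title.startswith(marker) and len(title) > len(marker):
            return title[len(marker):]
    return title


print(default_sort_key("-ness"))
print(default_sort_key("\u05BE\u05D9\u05DD", script="Hebr"))
```

Keeping the markers in a table means a new script only needs a new entry, not new code, and scripts without an entry fall back to the default rule.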

15:45, 2 July 2013

@Μετάknowledge, that sounds good in those cases where a kana string is supplied. However, there are many instances of template calls like {{alternative spelling of}} (see 竹とんぼ, 眉尖刀, 滋籐 for examples) that have no kana string. How would this work without a sort param?

19:52, 3 July 2013

It can't work, at least not anytime in the immediate future. My long-term belief is that entries should be wrapped in something (presumably a template) so that each piece of data in an entry is only entered once. That would mean no sort keys for Japanese, no need to manually specify the genitive singular for every Latin noun, and tons of other benefits to editors. Unfortunately, that's not what we have to work with now, so I still won't touch Japanese sorting unless I have an idea or the situation changes.

20:19, 3 July 2013

That's some of what I was exploring a while back with a suggestion of reworking entry structures to use subpages for each language. Our current data model is horrifically abnormal (from a database standpoint), and that makes all kinds of things much more difficult or even impossible. I do terminology work in my day job, and we use actual database-backed tools with proper data models. Things that are trivial there are non-starters here. I'm no whiz DBA, but I know enough to know that "everything as text in one page" is ... I don't know what to call it, but "anti-model" might be the best term.

Anyway, I met all kinds of opposition to my suggestion, and it didn't (and still doesn't) strike me as all that radical; much luck to you for coming up with an approach for better data organization that garners a more positive response.

02:45, 5 July 2013