detect_script

Fragment of a discussion from User talk:Rua
CodeCat 16:31, 18 August 2013

I don't think this is going to work without some changes. It's also categorising entries where the language is "und", which will eventually include all 20 thousand entries in Category:term cleanup... And what happens if languages are written in a script for which no detection exists yet?

CodeCat 16:43, 18 August 2013

Yes, I've noticed that; "und" is excluded now. But there's no way to exclude those without scriptinfo.characters specified.

Z 16:54, 18 August 2013

So how do we prevent that? What we really want is, before doing any detection, to make sure that all of the language's scripts have detection available. If not, then don't do the detection, because we can't properly determine which script is really being used. Fallback wouldn't help in this case, as the proper script simply can't be determined.

CodeCat 16:58, 18 August 2013
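
The precondition proposed in the post above could be sketched roughly as follows. This is an illustration in Python rather than the Lua the modules are written in; the `SCRIPTS` table and the `can_detect` helper are hypothetical stand-ins, and only the "characters" field name comes from the discussion.

```python
# Hypothetical stand-in for Module:scripts: script code -> info table.
# A script supports detection only if it has a "characters" entry.
SCRIPTS = {
    "Latn": {"characters": "A-Za-z"},
    "Cyrl": {"characters": "\u0400-\u04FF"},
    "Glag": {},  # no "characters" yet, so detection is not available
}


def can_detect(lang_scripts, scripts=SCRIPTS):
    """Return True only if every script of the language has "characters"."""
    return all("characters" in scripts.get(code, {}) for code in lang_scripts)


# Detection would be attempted only when the check passes.
print(can_detect(["Latn", "Cyrl"]))  # both scripts have character data
print(can_detect(["Cyrl", "Glag"]))  # Glag has none, so skip detection
```

A script code that is missing from the table altogether is treated the same as one without "characters", which seems the safer reading of the proposal.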

The solution is simply to add the remaining scriptinfo.characters. Most of them have already been added; the remaining ones are mostly little-used scripts, and mostly ones that are not used alongside any other script, so the problem you describe would rarely happen.

Z 17:07, 18 August 2013

I did a search in Module:languages. Here is a list of all of the scripts that are used alongside other script(s) and don't have a "characters" field in Module:scripts yet (we should only be worried about these ones for now): Batk, Egyp, Mero, Bali, Ethi, Bugi, Dupl, CGK, Cans, Cyrs, Glag, Egyp, Egyd, Runr, Hmng, Jpan, Java, Knda, Mlym, Mend, Phlv, Phlp, Teng, Brah, Gran, Khar, Knda, Orya, Shrd, Telu, Tibt, Saur, Zzzz?, Mani, Sund, Tglg, Hani, Ogam

Z 17:25, 18 August 2013
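
A search like the one described in the post above could be approximated as follows. This is only a Python sketch: the table shapes, the `scripts_missing_characters` helper, and the sample entries are assumptions for illustration and do not reproduce the real contents of Module:languages or Module:scripts.

```python
def scripts_missing_characters(languages, scripts):
    """List scripts that share a language with another script and lack "characters"."""
    missing = set()
    for info in languages.values():
        lang_scripts = info.get("scripts", [])
        if len(lang_scripts) < 2:
            continue  # single-script languages are not a concern here
        for code in lang_scripts:
            if "characters" not in scripts.get(code, {}):
                missing.add(code)
    return sorted(missing)


# Made-up sample data, shaped like the two modules might be.
SAMPLE_LANGUAGES = {
    "lang1": {"scripts": ["Latn"]},
    "lang2": {"scripts": ["Latn", "Cyrl"]},
    "lang3": {"scripts": ["Cyrs", "Glag"]},
}
SAMPLE_SCRIPTS = {
    "Latn": {"characters": "A-Za-z"},
    "Cyrl": {"characters": "\u0400-\u04FF"},
    "Cyrs": {},
    "Glag": {},
}

print(scripts_missing_characters(SAMPLE_LANGUAGES, SAMPLE_SCRIPTS))
```

Collecting the codes in a set means each script is reported once, however many languages use it.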

I added character ranges for some more scripts. What should we do with "None" and "Zyyy"? We can do two things. Either they match on "." so that a language that uses one of them never triggers fallback, or they match on some nonexistent character only, so that there never is a match and fallback is always triggered.

CodeCat 18:37, 18 August 2013
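
The two options in the post above can be contrasted with a small Python sketch. Python's `re` module stands in for the Lua patterns the modules would actually use, and `first_match`, `MATCH_ANYTHING`, and `MATCH_NOTHING` are hypothetical names introduced only for this illustration.

```python
import re

# Option 1: "." matches any character, so the script always "wins".
MATCH_ANYTHING = re.compile(r".", re.DOTALL)
# Option 2: an empty negative lookahead can never succeed, so it never matches.
MATCH_NOTHING = re.compile(r"(?!)")


def first_match(text, candidates, fallback):
    """Return the first script code whose pattern is found in text, else fallback."""
    for code, pattern in candidates:
        if pattern.search(text):
            return code
    return fallback


# With option 1 the fallback is never reached for non-empty text.
print(first_match("abc", [("Zyyy", MATCH_ANYTHING)], "fallback"))
# With option 2 the fallback is always reached.
print(first_match("abc", [("Zyyy", MATCH_NOTHING)], "fallback"))
```

One detail worth noting: under option 1, empty text still falls through to the fallback, because "." needs at least one character to match.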

I think it should always trigger the fallback. Why do you think we should do this by adding a nonexistent character to Module:scripts? Why not simply do that with an "if" in detect_script?

Z 18:50, 18 August 2013
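
The "if" suggested in the post above might look something like the following. This is a Python sketch under stated assumptions, not the actual detect_script from the module: the signature, the `ALWAYS_FALLBACK` set, and the way "characters" is turned into a character class are all hypothetical.

```python
import re

# Script codes that should never be detected directly (assumed from the thread).
ALWAYS_FALLBACK = {"None", "Zyyy"}


def detect_script(text, lang_scripts, scripts, fallback):
    """Return the first script whose characters occur in text, else fallback."""
    for code in lang_scripts:
        if code in ALWAYS_FALLBACK:
            continue  # the explicit "if": these codes never produce a match
        chars = scripts.get(code, {}).get("characters")
        if chars and re.search("[" + chars + "]", text):
            return code
    return fallback


SAMPLE_SCRIPTS = {"Latn": {"characters": "A-Za-z"}}

print(detect_script("abc", ["Zyyy"], SAMPLE_SCRIPTS, "fallback"))
print(detect_script("abc", ["None", "Latn"], SAMPLE_SCRIPTS, "fallback"))
```

Handling the special codes in the function keeps Module:scripts free of placeholder data, at the cost of hard-coding the two codes in the detection logic.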