This module exports various general utility functions, which can be used by other modules.
Escapes the magic characters used in regular expression patterns by prefixing each one with "%". For example, "^$()%.[]*+-?|" becomes "%^%$%(%)%%%.%[%]%*%+%-%?%|".
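As a rough illustration of the escaping rule, here is a minimal Python sketch (not the module's own Lua code). The names pattern_escape and MAGIC_CHARACTERS are assumptions made for this example; the character set is taken from the example above.

```python
# Characters treated as "magic" in patterns, per the example above.
MAGIC_CHARACTERS = set("^$()%.[]*+-?|")


def pattern_escape(text: str) -> str:
    """Prefix every magic character in `text` with '%' and leave the rest untouched."""
    return "".join("%" + ch if ch in MAGIC_CHARACTERS else ch for ch in text)


if __name__ == "__main__":
    print(pattern_escape("^$()%.[]*+-?|"))
    print(pattern_escape("a.b-c"))
```

Ordinary characters pass through unchanged, so an escaped string can be embedded in a larger pattern and will match only its literal text.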
format_categories(categories, lang, sort_key, sort_base, force_output)
Formats a list (table) of category names. The output is a string consisting of all categories with [[Category:...]] applied to each one, and the given sort key added. If the current page is not in the main, Appendix or Reconstruction namespace, the output will be an empty string unless FORCE_OUTPUT is given. If no sort key is given:
- A default one is generated by using SORT_BASE (if given) or the current subpage name, and by removing hyphens from the beginning (so that suffixes can be sorted without a key).
- If sort-key rules are available for the given language, they are then applied to the default key to create a sort key that follows the rules for that language.
- If the final sort key ends up being identical to PAGENAME (which is the default key used by the software), then it is omitted entirely, so that it can be used in combination with DEFAULTSORT.
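The steps above can be sketched in Python as follows. This is only an approximation under stated assumptions, not the module's Lua implementation: the lang_sort_key callable stands in for LANG and its sort-key rules, and the keyword arguments namespace, subpage_name and page_name are hypothetical inputs that replace the page information the real function would look up itself.

```python
from typing import Callable, Optional, Sequence

# "" stands for the main namespace in this sketch.
OUTPUT_NAMESPACES = {"", "Appendix", "Reconstruction"}


def format_categories(
    categories: Sequence[str],
    lang_sort_key: Optional[Callable[[str], str]] = None,  # stand-in for LANG
    sort_key: Optional[str] = None,
    sort_base: Optional[str] = None,
    force_output: bool = False,
    *,
    namespace: str = "",     # assumed input: namespace of the current page
    subpage_name: str = "",  # assumed input: current subpage name
    page_name: str = "",     # assumed input: PAGENAME
) -> str:
    # Outside the allowed namespaces nothing is emitted unless forced.
    if namespace not in OUTPUT_NAMESPACES and not force_output:
        return ""

    if sort_key is None:
        # Default key: SORT_BASE if given, else the subpage name,
        # with leading hyphens removed so suffixes sort by their letters.
        base = sort_base if sort_base is not None else subpage_name
        sort_key = base.lstrip("-")
        # Apply the language's sort-key rules when they are available.
        if lang_sort_key is not None:
            sort_key = lang_sort_key(sort_key)
        # A key identical to PAGENAME is redundant, so it is dropped.
        if sort_key == page_name:
            sort_key = None

    # An empty or dropped key produces a plain category link.
    suffix = f"|{sort_key}" if sort_key else ""
    return "".join(f"[[Category:{name}{suffix}]]" for name in categories)
```

With these assumptions, calling format_categories(["English suffixes"], subpage_name="-ness", page_name="-ness") gives [[Category:English suffixes|ness]], because the hyphen-stripped key differs from the page name and is therefore kept.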
This function adds a "catfix", which is used on language-specific category pages to add language attributes and often script classes to all entry names. The addition of language attributes and script classes makes the entry names display better (using the language- or script-specific styles specified in MediaWiki:Common.css), which is particularly important for non-English languages that do not have consistent font support in browsers.
Language attributes are added for all languages, but script classes are only added for languages with one script listed in their data file, or for languages that have a default script listed in the catfix_script list in Module:utilities/data. Some languages clearly have a default script, but still have other scripts listed in their data file and therefore need their default script to be specified. Others do not have a default script.
- Serbo-Croatian is regularly written in both the Latin and Cyrillic scripts. Because it uses two scripts, Serbo-Croatian cannot have a script class applied to entries in its category pages, as only one script class can be specified at a time.
- Russian is usually written in the Cyrillic script (Cyrl), but also has Braille (Brai) listed in its data file. Russian therefore requires an entry in the catfix_script list, so that the Cyrl (Cyrillic) script class will be applied to entries in its category pages.
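The script-class rule described above can be sketched in Python like this. It is a minimal illustration under assumptions, not the module's code: the function name catfix_script_class, the dictionary form of the catfix_script list, and the language codes used as keys are all hypothetical.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical stand-in for the catfix_script list in Module:utilities/data.
CATFIX_SCRIPT = {"ru": "Cyrl"}


def catfix_script_class(
    lang_code: str,
    scripts: Sequence[str],
    catfix_script: Mapping[str, str] = CATFIX_SCRIPT,
) -> Optional[str]:
    """Return the script code to use as a script class, or None if there is none."""
    # An explicitly listed default script takes effect even if several scripts exist.
    if lang_code in catfix_script:
        return catfix_script[lang_code]
    # A language with exactly one listed script uses that script.
    if len(scripts) == 1:
        return scripts[0]
    # Several scripts and no listed default: no script class can be applied.
    return None
```

Under these assumptions, a language with Cyrl and Brai listed plus a catfix_script entry resolves to Cyrl, while a language with two scripts and no entry gets no script class.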
To find the scripts listed for a language, go to Module:languages and use the search box to find the data file for the language. To find out what a script code means, search for the script code in Module:scripts/data.