I am taking a WikiBreak (Sep 22 2014). Most recent work, and todos:

  • Implemented tr_latin_matching() in Module:ar-translit. This transliterates from Latin to Arabic, given the equivalent, possibly unvocalized, Arabic. Essentially, it matches the Latin transliteration against the Arabic and uses it to vocalize the Arabic (fill in the vowel diacritics) to the extent that it's not already vocalized. When vocalization fails, the page is put into Category:Arabic terms where vowel insertion failed. There are currently 76 entries in this category, which should be fixed one by one. Note that implementing automatic vocalization of headwords will add many more entries to this category. If you edit Module:ar-translit, you can turn on errors so that a failed vocalization causes a module error. This can be temporarily useful for seeing exactly where in which word the failure occurs, since the error shows the word along with the character and index in the word that can't be matched.
  • The failures have various causes, and I try to fix them in Module:ar-translit as I encounter them. One failure that still exists is that sometimes the Arabic includes an 'i`rāb vowel but the transliteration doesn't. I handle -un by allowing "" as a match, but this seems less palatable for -a (in -ūna), -i (in -āti) and -u (in diptotes, in place of -un). We need to special-case the end of the word and have a table just for it, with fallback to the regular table.
  • Currently, automatic vocalization is implemented in {{ar-linkify-bold}} and the places that call it, which means it handles plurals, feminines, etc., but not the main headword and not certain other places. We should create our own version of {{head}}, maybe {{ar-head}}, which uses {{ar-linkify-bold}}. It does not need to support the full feature set of {{head}}, just what is needed by {{ar-noun}}, {{ar-adj}}, {{ar-noun-head}}, {{ar-pos}}, etc. There should probably also be an equivalent {{ar-linkify}} that doesn't boldface the Arabic but otherwise does the same thing as {{ar-linkify-bold}}. To implement this, use one underlying function (currently implemented in Lua in Module:ar-utilities) that serves both templates and takes a param boldlink=1 to specify bolding of the link (the Arabic) but not the transliteration, as {{ar-linkify-bold}} currently does.
  • Note that there is also tr_latin_direct() in Module:ar-translit, which transliterates Latin to Arabic without the equivalent unvocalized text. At one point I considered allowing verbal nouns to be specified in Latin text and automatically transliterating them; this function could be helpful for that. Note that it needs some work (see the comments in the module).
  • As mentioned above, sometimes the Arabic includes an 'i`rāb vowel but the transliteration doesn't. This applies in particular to nouns and adjectives. One way to handle it would be for the noun and adjective templates to leave off the 'i`rāb vowels.
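
To make the matching idea and the proposed end-of-word table concrete, here is a minimal sketch in Python. It is not the code in Module:ar-translit (which is Lua): every name in it (vocalize, REGULAR, WORD_FINAL, VocalizationError) is hypothetical, the tables cover only a handful of letters, and it is simplified in ways the real module may not be (for example, it does not insert a fatḥa before a long-vowel alif). It walks the Arabic one character at a time, matches each character against the Latin, inserts a short-vowel diacritic where the Latin has one and the Arabic doesn't, and for the last character consults an end-of-word table first, falling back to the regular table.

```python
FATHA, DAMMA, KASRA, DAMMATAN = "\u064E", "\u064F", "\u0650", "\u064C"

# Regular table (illustrative subset): Arabic character -> Latin strings it may match.
REGULAR = {
    "ب": ["b"], "ت": ["t"], "ك": ["k"], "ن": ["n"],
    "ا": ["ā"], "و": ["ū", "w"], "ي": ["ī", "y"],
    FATHA: ["a"], DAMMA: ["u"], KASRA: ["i"], DAMMATAN: ["un"],
}
# End-of-word table: consulted first, and only for the last Arabic character.
# "" (always listed last) lets the transliteration omit the 'i`rāb ending.
WORD_FINAL = {
    FATHA: ["a", ""], DAMMA: ["u", ""], KASRA: ["i", ""], DAMMATAN: ["un", ""],
}
SHORT_VOWELS = {"a": FATHA, "u": DAMMA, "i": KASRA}
DIACRITICS = {FATHA, DAMMA, KASRA, DAMMATAN}


class VocalizationError(ValueError):
    """Raised when the Latin cannot be matched against the Arabic."""


def vocalize(arabic, latin):
    """Return `arabic` with short-vowel diacritics filled in from `latin`."""
    out = []
    j = 0  # current position in the Latin string
    for i, ch in enumerate(arabic):
        is_last = i == len(arabic) - 1
        # Prefer the end-of-word table for the final character, else fall back.
        candidates = (WORD_FINAL.get(ch) if is_last else None) or REGULAR.get(ch, [])
        for cand in candidates:
            if latin.startswith(cand, j):
                break
        else:
            # Report the word, the character and its index, as a debugging aid.
            raise VocalizationError(f"cannot match {ch!r} at index {i} of {arabic!r}")
        out.append(ch)
        j += len(cand)
        # Insert a vowel only if the Arabic doesn't already carry one here.
        next_is_diacritic = not is_last and arabic[i + 1] in DIACRITICS
        if ch not in DIACRITICS and not next_is_diacritic and latin[j:j + 1] in SHORT_VOWELS:
            out.append(SHORT_VOWELS[latin[j]])
            j += 1
    if j != len(latin):
        raise VocalizationError(f"unmatched Latin {latin[j:]!r} for {arabic!r}")
    return "".join(out)
```

With these tables, vocalize("كتاب", "kitāb") inserts a kasra after the first letter, and a word-final fatḥa (as in -ūna) or tanwīn (-un) is accepted whether or not the Latin spells it out, while the same diacritic in mid-word must still be matched by the regular table. The "" entries are kept last in each list because an empty string matches at any position.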

