Module:pi-decl/noun: difference between revisions
RichardW57 (talk | contribs) Added neuter stems in -as and masculine stems in -an to all scripts.
RichardW57 (talk | contribs) Isolated Brahmi - appears to be a fault in handling of supplementary codepoints.
Line 48:
elseif match(stem, sc("[สसসသ][[กฺक्ক্က်]$")) then -- Somehow fails if this test and next are combined!
ending = "as"
-- elseif match(stem, sc("[ᩈសස𑀲][ᨠ᩺ᨠ᩼គ៑ක්𑀓𑁆]$")) -- Fails except for Lana. Suspect problem in supplementary plane handling.
elseif match(stem, sc("[ᩈសස][ᨠ᩺ᨠ᩼គ៑ක්]$")) then
ending = "as"
elseif match(stem, "𑀲𑁆$") then
ending = "as"
elseif match(stem, "an$") then
Revision as of 19:41, 3 October 2018
The following documentation is located at Module:pi-decl/noun/documentation. Categories were auto-generated by Module:module categorization.
Purpose
This module provides inflection tables for Pali for nouns, adjectives and pronouns. For pronouns, one currently uses the interface for nouns, while for adjectives one uses separate invocations for each gender.
Some functions are exported from this module to service the testing of noun inflection. The module also provides utility functions for the conjugation of verbs.
Normal Use
The normal way to use this module is to invoke the template {{pi-decl-noun}}, which see for the interface. This invokes the exported function show.
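As a rough illustration of what show does with the template's parameters, the following plain-Lua sketch resolves the stem, ending and gender in the same order as the module source. The helper name resolveArgs is hypothetical and is not part of the module; because the sketch runs outside Scribunto, the page name is passed in explicitly rather than read from mw.title.

```lua
-- Sketch only: `resolveArgs` is a hypothetical helper that mirrors the
-- parameter handling at the top of export.show.
local function resolveArgs(args, pageName)
    -- The stem falls back to the name of the page carrying the template.
    local stem = args[1] or args["stem"] or pageName
    -- A nil ending means "detect the ending from the stem".
    local ending = args[2] or args["ending"]
    -- The gender has no fallback and must be supplied.
    local gender = args[3] or args["g"] or args["gender"]
    if not gender then
        error("A gender is required to display proper declensions.")
    end
    return stem, ending, gender
end

local stem, ending, gender = resolveArgs({ "buddha", g = "m" }, "ignored")
print(stem, ending, gender)
```

Positional parameters take precedence over the named ones, so an invocation that supplies both uses the positional value.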
Data tables
The primary data table for the inflections is the data module Module:pi-decl/noun/Latn, which contains the Latin script tables. These are supplemented by identically structured tables for each of the other supported scripts. If the table for a particular paradigm is missing from one of these, the table will be generated using the transliteration functions in Module:pi-Latn-translit. The data modules for the other scripts are:
- Module:pi-decl/noun/Thai
- Module:pi-decl/noun/Deva
- Module:pi-decl/noun/Brah
- Module:pi-decl/noun/Beng
- Module:pi-decl/noun/Sinh
- Module:pi-decl/noun/Mymr
- Module:pi-decl/noun/Lana
- Module:pi-decl/noun/Laoo
- Module:pi-decl/noun/Khmr
- Module:pi-decl/noun/Latn
With the exception of the masculine and neuter thematic nouns, the Thai and Lao tables are not used for declension with explicit vowels.
There is no such redundant table for the Chakma script.
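Judging from how the module source indexes these tables, each data module maps an ending to a gender to a list of sixteen cells: two per case, singular first, each cell being a list of alternative suffixes. The sketch below shows that shape in plain Lua. The suffix strings are invented for illustration and are not copied from any data module, and the helper cell is hypothetical.

```lua
-- Illustrative only: the suffix strings are made up for this example and are
-- not taken from Module:pi-decl/noun/Latn.
local pattern = {
    ["a"] = {
        ["m"] = {
            { "⌫o" },   -- row 1 singular (index 2*1 - 1)
            { "⌫ā" },   -- row 1 plural   (index 2*1)
            { "⌫aṃ" },  -- row 2 singular
            { "⌫e" },   -- row 2 plural
            -- the remaining six rows would follow in the same way
        },
    },
}

-- Return the list of alternative suffixes for one table cell, or nil when
-- the paradigm is missing. `cell` is a hypothetical helper.
local function cell(tbl, ending, g, row, plural)
    local paradigm = tbl[ending] and tbl[ending][g]
    if not paradigm then
        return nil
    end
    -- Singular cells sit at odd indices, plural cells at even indices.
    return paradigm[2 * row - (plural and 0 or 1)]
end

print(cell(pattern, "a", "m", 2, false)[1])
```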
Deliberately Exported Functions
The following Lua functions are exported by this module:
orJoin
joinSuffix
arrcat_nodup
present
show
Function orJoin
Function joinSuffix
The original idea was to share this function with the code for verb conjugation. However, the conjugation of verbs in the Thai and Lao scripts is more complicated, and there is therefore a more general function in use for verbs.
Function arrcat_nodup
Function present
Function show
Other exported functions
detectEnding
joinSuffixes
getSuffixes
modify
Algorithm
The paradigm to use is determined using the script of the stem, the ending of the stem (for which there are a few conventional values; see {{pi-decl-noun}}) and the gender of the stem. The script is always deduced from the stem itself, while the ending may be supplied explicitly (in Latin script) or deduced from the stem. The gender is always supplied explicitly. The deduction of the ending from the stem is performed by function detectEnding.
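The deduction can be pictured with the following plain-Lua sketch, restricted to Latin-script stems. It assumes UTF-8 source text and compares byte suffixes, whereas the module itself uses mw.ustring.match with per-script character classes. The function names endsWith and detectLatinEnding are hypothetical.

```lua
-- Sketch only: Latin-script stems, byte-wise suffix comparison on UTF-8 text.
local function endsWith(s, suffix)
    return s:sub(-#suffix) == suffix
end

-- Endings are tried in the same order as in the module; anything that
-- matches none of them is treated as a stem in -a.
local candidates = { "ā", "i", "ī", "u", "ū", "ar", "as", "an" }

local function detectLatinEnding(stem)
    for _, ending in ipairs(candidates) do
        if endsWith(stem, ending) then
            return ending
        end
    end
    return "a"
end

for _, stem in ipairs({ "buddha", "kaññā", "satthar", "manas", "rājan" }) do
    print(stem, detectLatinEnding(stem))
end
```

An ending passed explicitly to the template bypasses this step entirely, which is how conventional values that cannot be read off the stem are selected.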
The set of suffixes is obtained by function getSuffixes. This first attempts to load the paradigm from the data files. However, if the paradigm is unacceptable or missing, it will generate it itself. Paradigms from data files are only acceptable for some combinations of settings. At present, they are not acceptable for non-Roman scripts when using explicit vowels, except for the conventional ending 'ah', which denotes masculine or neuter nouns with stems in explicit -a. (The convention was chosen because the explicit vowel also represents the Sanskrit ending -aḥ.)
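A simplified sketch of this lookup-then-generate behaviour, in plain Lua, might look as follows. It assumes the data modules are available as an in-memory table (tables) and that the conversion step is passed in as a function (convert); both are stand-ins introduced for the example, and the acceptability rules for explicit vowels are not modelled.

```lua
-- Sketch only: `tables` stands in for the per-script data modules and
-- `convert` for the Latin-to-script conversion; both are hypothetical.
local function getSuffixes(tables, scriptCode, ending, g, convert)
    local own = tables[scriptCode]
    local applicable = own and own[ending] and own[ending][g]
    if applicable then
        return applicable, "data"       -- paradigm found in the script's own table
    end
    if scriptCode == "Latn" then
        return nil                      -- nothing to fall back to
    end
    local latin = tables.Latn and tables.Latn[ending] and tables.Latn[ending][g]
    if not latin then
        return nil
    end
    return convert(latin, scriptCode), "generated"
end

-- Toy data and a toy converter that merely tags each suffix with the script.
local tables = {
    Latn = { an = { m = { { "x" }, { "y" } } } },
    Thai = {},
}
local function fakeConvert(paradigm, scriptCode)
    local out = {}
    for i, forms in ipairs(paradigm) do
        out[i] = {}
        for j, suffix in ipairs(forms) do
            out[i][j] = scriptCode .. ":" .. suffix
        end
    end
    return out
end

local suffixes, origin = getSuffixes(tables, "Thai", "an", "m", fakeConvert)
print(origin, suffixes[1][1])
```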
When paradigms are generated internally, they are converted from Latin script to the required script and implicit vowel settings. This is implemented in function convert_suffixes.
The second stage of the generation, applicable to the Lao script only, converts the ablative and instrumental plural in -bhi to the correct forms where needed. The editor specifies the correct form using the parameter |liap=.
The third stage of the generation, also applicable to the Lao script only, converts the letter corresponding to <y> in the suffixes to the correct letter where needed. This setting is treated as orthogonal to the choice between using or not using implicit vowels.
The endings are then attached to the stem using the function joinSuffixes. This invokes function joinSuffix to apply the writing system-dependent rules for the attachment of suffixes. There is one user-controlled input to this process, the parameter |aa=, which is applicable to the Burmese and Tai Tham scripts.
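The script-independent core of the attachment can be sketched in plain Lua as below: a suffix may begin with one or more ⌫ markers, each of which deletes one character from the end of the stem before the rest of the suffix is appended. The sketch assumes UTF-8 text and replaces mw.ustring.sub with a byte pattern; the helper dropLastChar is hypothetical, the example suffixes are invented, and none of the Thai, Burmese or Tai Tham adjustments are reproduced.

```lua
-- Sketch only: plain Lua on UTF-8 text, without the script-specific rules.
local BACKSPACE = "⌫"

-- Remove the last UTF-8 encoded character: one non-continuation byte
-- followed by any continuation bytes at the end of the string.
local function dropLastChar(s)
    return (s:gsub("[^\128-\191][\128-\191]*$", ""))
end

local function joinSuffix(stem, suffixes)
    local output = {}
    for _, suffix in ipairs(suffixes) do
        local base = stem
        -- Each leading marker removes one character from the stem.
        while suffix:sub(1, #BACKSPACE) == BACKSPACE do
            base = dropLastChar(base)
            suffix = suffix:sub(#BACKSPACE + 1)
        end
        output[#output + 1] = base .. suffix
    end
    return output
end

print(table.concat(joinSuffix("satthar", { "⌫⌫ā", "⌫⌫āro" }), ", "))
```

The module source handles exactly one or two markers; the loop here accepts any number, which is a simplification made for the sketch rather than a description of the module.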
Next, the function modify is applied to add, remove or replace the forms generated so far in accordance with a list of modifications included in the invocation of {{pi-decl-noun}}.
Finally, the function present formats the list of forms for each combination of case and number. This formatting includes adding the transliteration, which is done in function orJoin. Function show then returns the inflection table for display on the page.
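The joining of alternative forms within one cell can be sketched in plain Lua as follows. The formatter argument stands in for the link and transliteration formatting that the module delegates to Module:links; it is a hypothetical parameter added so that the sketch runs outside Scribunto.

```lua
-- Sketch only: `formatTerm` is a hypothetical stand-in for the call to
-- links.full_link, which is not available outside Scribunto.
local SEPARATOR = ' <small style="color:#888">or</small> '

local function orJoin(list, formatTerm)
    local parts = {}
    for _, term in ipairs(list) do
        parts[#parts + 1] = formatTerm(term)
    end
    -- Alternatives are separated by a small grey "or".
    return table.concat(parts, SEPARATOR)
end

local function bracket(term)
    return "[" .. term .. "]"
end

print(orJoin({ "buddhena", "buddhasā" }, bracket))
```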
local export = {}
local links = require("Module:links")
local lang = require("Module:languages").getByCode("pi")
local gsub = mw.ustring.gsub
local match = mw.ustring.match
local sub = mw.ustring.sub
local u = mw.ustring.char
local genders = {
	["m"] = "masculine", ["f"] = "feminine", ["n"] = "neuter",
}
local rows = {
	"Nominative (first)", "Accusative (second)", "Instrumental (third)", "Dative (fourth)",
	"Ablative (fifth)", "Genitive (sixth)", "Locative (seventh)", "Vocative (calling)",
}
local sc = function(str) -- 'strip carrier' - allows more legible inclusion of combining marks in strings.
	-- Parenthesised so that only the string is returned. gsub also returns the
	-- number of substitutions, which match(stem, sc(...)) would otherwise
	-- receive as its third (init) argument.
	return (gsub(str, "[กकকကᨠគකk𑀓]", ""))
end
function export.detectEnding(stem)
	local ending
	-- Thai, Deva, Beng, Mymr, Lana, Khmr, Sinh, Latn and then Brah
	-- uses u() to prevent decomposition
	if match(stem, "[าा"..u(0x0906).."া"..u(0x0986).."ါာᩣᩤា"..u(0x17A4).."ා"..u(0x0D86).."ā]$")
		or match(stem, "𑀸$") or match(stem, "𑀆$") then
		ending = "ā"
	elseif match(stem, "[ิिइিইိဣᩥᩍិឥිඉi]$") or match(stem, "𑀺$") or match(stem, "𑀇$") then
		ending = "i"
	elseif match(stem, "[ีीईীঈီဤᩦᩎីឦීඊī]$") or match(stem, "𑀻$") or match(stem, "𑀈$") then
		ending = "ī"
	elseif match(stem, "[ุुउুউုဥᩩᩏុឧුඋu]$") or match(stem, "𑀼$") or match(stem, "𑀉$") then
		ending = "u"
	elseif match(stem, "[ูूऊূঊူဦᩪᩐូឨឩූඌū]$") or match(stem, "𑀽$") or match(stem, "𑀊$") then
		ending = "ū"
	elseif match(stem, "ar$") then
		ending = "ar"
	elseif match(stem, sc("[รरরရ][กฺक्ক্က်]$")) then -- Somehow fails if this test and next are combined!
		ending = "ar"
	elseif match(stem, sc("[ᩁរර𑀭][ᨠ᩺ᨠ᩼គ៑ක්𑀓𑁆]$")) then
		ending = "ar"
	elseif match(stem, "as$") then
		ending = "as"
	elseif match(stem, sc("[สसসသ][กฺक्ক্က်]$")) then -- Somehow fails if this test and next are combined!
		ending = "as"
	-- elseif match(stem, sc("[ᩈសස𑀲][ᨠ᩺ᨠ᩼គ៑ක්𑀓𑁆]$")) -- Fails except for Lana. Suspect problem in supplementary plane handling.
	elseif match(stem, sc("[ᩈសස][ᨠ᩺ᨠ᩼គ៑ක්]$")) then
		ending = "as"
	elseif match(stem, "𑀲𑁆$") then
		ending = "as"
	elseif match(stem, "an$") then
		ending = "an"
	elseif match(stem, sc("[นनনန][กฺक्ক্က်]$")) then -- Somehow fails if this test and next are combined!
		ending = "an"
	elseif match(stem, sc("[ᨶនන𑀦][ᨠ᩺ᨠ᩼គ៑ක්𑀓𑁆]$")) then
		ending = "an"
	else
		ending = "a"
	end
	return ending
end
function export.joinSuffix(scriptCode, stem, suffixes)
	local output = {}
	local term, io
	io = 1
	for _,suffix in ipairs(suffixes) do
		if match(suffix, "^⌫⌫") then --backspace
			term = sub(stem, 1, -3) .. sub(suffix, 3, -1)
		elseif match(suffix, "^⌫") then --backspace
			term = sub(stem, 1, -2) .. sub(suffix, 2, -1)
		else
			term = stem .. suffix
		end
		if scriptCode == "Thai" then
			term = gsub(term, "(.)↶([เโ])", "%2%1") --swap
		end
		if scriptCode == "Mymr" then
			term = gsub(term, "င္", "င်္")
			term = gsub(term, "(င်္)([ဝခဂငဒပ])(ေ?)ာ", "%1%2%3ါ")
			term = gsub(term, "္[ယရ]", { ["္ယ"] = "ျ", ["္ရ"] = "ြ" }) --these not need tall aa
			term = gsub(term, "^([ဝခဂငဒပ])(ေ?)ာ", "%1%2ါ")
			term = gsub(term, "([^္])([ဝခဂငဒပ])(ေ?)ာ", "%1%2%3ါ")
			term = gsub(term, "([ဝခဂငဒပ])(္[က-အဿ])(ေ?)ာ", "%1%2%3ါ")
			term = gsub(term, "္[ဝဟ]", { ["္ဝ"] = "ွ", ["္ဟ"] = "ှ" })
			term = gsub(term, "ဉ္ဉ", "ည")
			term = gsub(term, "သ္သ", "ဿ")
		end
		if scriptCode == "Lana" then
			term = gsub(term, "ᨦ᩠", "ᩘ")
			term = gsub(term, "^([ᩅᨣᨵᨷᨻ])(ᩮ?)ᩣ", "%1%2ᩤ")
			term = gsub(term, "([^᩠])([ᩅᨣᨵᨷᨻ])(ᩮ?)ᩣ", "%1%2%3ᩤ")
			term = gsub(term, "([ᩅᨣᨵᨷᨻ])(᩠[ᨠ-ᩌᩔ])(ᩮ?)ᩣ", "%1%2%3ᩤ")
			term = gsub(term, "᩠[ᩁᩃ]", { ["᩠ᩁ"] = "ᩕ", ["᩠ᩃ"] = "ᩖ" })
			term = gsub(term, "([ᨭ-ᨱ])᩠ᨮ", "%1ᩛ")
			term = gsub(term, "([ᨷ-ᨾ])᩠ᨻ", "%1ᩛ")
			term = gsub(term, "ᩈ᩠ᩈ", "ᩔ")
		end
		--[[if scriptCode == "Laoo" then
			term = gsub(term, "(.)↶([ເໂ])", "%2%1")
		end]]
		output[io] = term
		io = io + 1
	end
	return output
end
function export.orJoin(script, list)
	local output = ""
	for _,term in ipairs(list) do
		if output ~= "" then
			-- Separate alternative forms with a small grey "or".
			output = output .. " <small style=\"color:#888\">or</small> "
		end
		output = output .. links.full_link({lang = lang, sc = script, term = term})
	end
	return output
end
-- Transliteration function; assigned in export.getSuffixes before first use.
local to_script
-- convert Latin script inflections to another script
local convert_suffixes = function(stem, nstrip, suffixes, sc)
	local form, pre, altform
	local xlitend = {}
	local strip = string.rep("⌫", nstrip)
	for k = 1, #suffixes do
		xlitend[k] = {}
		form = export.joinSuffix('Latn', stem, suffixes[k])
		for ia, va in pairs(form) do
			altform = to_script(va, sc)
			-- Special handling is needed for a preposed vowel.
			pre = match(altform, "^[เโ]")
			if pre then
				xlitend[k][ia] = strip .. "↶" .. pre .. sub(altform, 3)
			else
				xlitend[k][ia] = strip .. sub(altform, 2)
			end
		end
	end
	return xlitend
end
function export.getSuffixes(scriptCode, ending, g)
	local pattern = require("Module:pi-decl/noun/" .. scriptCode) or nil
	local applicable
	if pattern[ending] then
		applicable = pattern[ending][g]
	else
		applicable = nil
	end
	if applicable then
		return applicable
	elseif 'Latn' == scriptCode then
		return nil
	else
		-- No paradigm in the script's own table: generate it from the Latin one.
		pattern = require("Module:pi-decl/noun/Latn") or nil
		to_script = require("Module:pi-Latn-translit").tr
		applicable = pattern[ending] and pattern[ending][g] or nil
		if not applicable then
			return nil
		elseif 'ar' == ending then
			return convert_suffixes('kar', 2, applicable, scriptCode)
		elseif 'as' == ending then
			return convert_suffixes('kas', 2, applicable, scriptCode)
		elseif 'an' == ending then
			return convert_suffixes('kan', 2, applicable, scriptCode)
		else
			return nil
		end
	end
end
function export.show(frame)
	local args = frame:getParent().args
	local PAGENAME = mw.title.getCurrentTitle().text
	local stem = args[1] or args["stem"] or PAGENAME
	local currentScript = require("Module:scripts").findBestScript(stem, lang)
	local scriptCode = currentScript:getCode()
	local ending = args[2] or args["ending"] or export.detectEnding(stem)
	local g = args[3] or args["g"] or args["gender"] -- for each gender only
	if not g then
		error("A gender is required to display proper declensions.")
	end
	if not genders[g] then
		error('Unrecognised gender "' .. g .. '"; expected m, f or n.')
	end
	local selectedPattern = export.getSuffixes(scriptCode, ending, g)
	if not selectedPattern then
		-- Fail with a readable message instead of indexing nil below.
		error('No declension pattern found for ending "' .. ending .. '" and gender "' .. g .. '".')
	end
	local output = '<div class="NavFrame" style="min-width:30%"><div class="NavHead" style="background:#d9ebff">Declension table of "' .. stem .. '" (' .. genders[g] .. ')</div><div class="NavContent">'
	output = output .. '<table class="inflection-table" style="background:#F9F9F9;text-align:center;width:100%"><tr><th style="background:#eff7ff">Case \\ Number</th><th style="background:#eff7ff">Singular</th><th style="background:#eff7ff">Plural</th></tr>'
	for i,v in ipairs(rows) do
		-- Cells alternate singular (2i - 1) and plural (2i) for each case.
		output = output .. "<tr><td style=\"background-color:#eff7ff;\">" .. v .. "</td>"
		output = output .. "<td>" .. export.orJoin(currentScript, export.joinSuffix(scriptCode, stem, selectedPattern[2 * i - 1])) .. "</td>"
		output = output .. "<td>" .. export.orJoin(currentScript, export.joinSuffix(scriptCode, stem, selectedPattern[2 * i])) .. "</td>"
		output = output .. "</tr>"
	end
	output = output .. "</table></div></div>"
	return output
end
return export