User talk:AryamanA/sandbox

From Wiktionary, the free dictionary

Help

@Kutchkutch So, I've done a little work. Can you check the above tables? I'm not sure how plural oblique works, so I haven't added it yet. —Aryaman (मुझसे बात करो) 20:03, 31 October 2017 (UTC)

@Aryamanarora: Wow, that looks amazing! Other than transliteration issues, विंचू is declined like that with an oblique marker and alternatively without any oblique marker. Kutchkutch (talk) 21:56, 31 October 2017 (UTC)
@Kutchkutch: The module MOD:mr-decl I wrote is really clunky lol. BTW, I can find no results for the ergative विंच्वाने (viñcvāne) but plenty for विंचूने (viñcūne). —Aryaman (मुझसे बात करो) 22:37, 31 October 2017 (UTC)
@Aryamanarora: Thanks, and I appreciate all the work you've done on Module:mr-decl! Since Module:mr-decl is the replacement for T:mr-decl-noun, that could mean I won't be able to contribute as much to the implementation side unless T:mr-decl-noun is still useful. Since it's called Module:mr-decl without '-noun', does that mean it would ideally handle adjective and pronoun declension too?
Module:mr-decl would ideally have the genitive case, but it's fine if it's not handled yet. At Template:mr-decl-noun, I was thinking of removing the locative case for simplicity since it's just a postposition, but that can be done easily at any time.
An old source from Google Books says the default way to decline u-final masculine words is without an oblique, and the oblique is an alternative for some words such as विंचू (viñcū). If the oblique is used for विंचू (viñcū), I see many search results for विंचवाला, विंचवांना, विंचवाने, विंचांनी, etc. without the virama/halant before व.
It looks like you used Dhongde & Wali to make Module:mr-decl, and Dhongde & Wali doesn't say much on plural cases other than Direct Plural. Also, it looks like the module can be told manually that a consonant-final feminine noun is in the exception class with elseif g == "f-e" then. Kutchkutch (talk) 01:42, 1 November 2017 (UTC)
@Kutchkutch: Okay, thanks for the help! I think I will make a function for alternate obliques. Also, I totally forgot about the genitive, I'll implement that as well. And yes, you're right about "f-e". —Aryaman (मुझसे बात करो) 10:37, 1 November 2017 (UTC)


@Aryamanarora Here's a list of rules if this project is to ever be completed. Couldn't think of a better place to put it. Kutchkutch (talk) 01:25, 12 November 2017 (UTC)
@Kutchkutch: Wow, this makes many things clear to me! I'll probably be unable to work much on it next week because of real life, but I'll do a bit tomorrow for sure. —Aryaman (मुझसे बात करो) 01:41, 12 November 2017 (UTC)
@Aryamanarora: Thanks for your willingness to continue! Figuring out how each paradigm works is confusing since direct singular, direct plural, oblique singular and oblique plural could all be different. Kutchkutch (talk) 10:51, 12 November 2017 (UTC)
@Aryamanarora: Perhaps a solution to deal with this confusion is to create 2x2 testcase tables for each regular paradigm so that I can show what they would look like. Kutchkutch (talk) 21:32, 12 November 2017 (UTC)
@Kutchkutch: This is amazing! The tables really help. —Aryaman (मुझसे बात करो) 12:19, 13 November 2017 (UTC)
@Aryamanarora: I can't thank you enough for the work you've done thus far on everything. It's the least I can do so that the focus is more on how to Luacise this rather than what to Luacise. I noticed that you've even 'adopted' this project to the point that you put {{currently|Marathi decl and translit}} on your userpage. Since User:Vinayak.patwardhan originally created Template:mr-decl-noun when I wasn't around, this seemed like a natural starting point.
I combined the direct plural and oblique charts in Dhongde & Wali into a single chart in the 'Regular Paradigms' section for clarity. Of course there may be ways to improve this guide, but it appears to be comprehensive for the most part. I'm still wondering if pronoun declension should be handled automatically or entered manually. Kutchkutch (talk)
@Kutchkutch: I've revamped Module:User:Aryamanarora/mr-decl so it's more flexible (now it uses regex to match stems). I've implemented the first rule, you can see at User:Aryamanarora/sandbox that माणूस (māṇūs) and वडील (vaḍīl) decline correctly now! —Aryaman (मुझसे बात करो) 21:36, 14 November 2017 (UTC)
@Aryamanarora: That's exactly what I was looking at right now! It looks great. Kutchkutch (talk) 21:40, 14 November 2017 (UTC)
@Aryamanarora: For long u, the schwa would ideally be deleted: देवळा (devḷā), but if that's not a priority right now then it's fine (along with the genitive). Kutchkutch (talk) 21:48, 14 November 2017 (UTC)
@Kutchkutch: Fixed. MOD:mr-translit was treating "ळ" as a diacritic (like "ं"). —Aryaman (मुझसे बात करो) 21:52, 14 November 2017 (UTC)
@Aryamanarora: Really? I didn't even notice. Kutchkutch (talk) 21:55, 14 November 2017 (UTC)
@Aryamanarora: I'm aware that you've indicated that you're 'getting ready for exams'. So I hope this isn't distracting you too much.
Now there appears to be an issue with पाल, रात्री, लाट. Kutchkutch (talk) 22:05, 14 November 2017 (UTC)
@Kutchkutch: Hmm, I think I'm done with editing for today TBH. I'll look at it tomorrow. Don't worry about it :) —Aryaman (मुझसे बात करो) 22:09, 14 November 2017 (UTC)
@Aryamanarora: Ok then, thanks for all the work today! I really appreciate it. Kutchkutch (talk) 22:12, 14 November 2017 (UTC)


@माधवपंडित IDK how much you know about Marathi declension (more than me, that's for sure), but you might be interested in this as well. —Aryaman (मुझसे बात करो) 21:47, 14 November 2017 (UTC)

@Aryamanarora: Sure, this looks great. Kutchkutch's detailed rules are also very helpful. The declension structure (and the associated schwa deletion) is very similar to Konkani; only the suffixes differ. As already pointed out, the feminine nouns aren't declining properly -- माधवपंडित (talk) 01:54, 15 November 2017 (UTC)
@माधवपंडित: Fixed now. (But what was wrong with लाट (lāṭ)?) —Aryaman (मुझसे बात करो) 02:03, 15 November 2017 (UTC)
@Aryamanarora: See the 'C-ending feminine exceptional paradigm (f-e)' section below. Oblique plural: लाटां-. Kutchkutch (talk) 02:06, 15 November 2017 (UTC)
@Kutchkutch: Okay, now it's fixed :) —Aryaman (मुझसे बात करो) 02:17, 15 November 2017 (UTC)
Thanks, I had no idea what was wrong. -- माधवपंडित (talk) 02:20, 15 November 2017 (UTC)
The oblique plural is the hardest 'box' to ascertain, but a quick search shows this is the case for लाट (lāṭ) since even Google suggests 'Did you mean: लाटांवर' when लाटेंवर is searched. Kutchkutch (talk) 02:22, 15 November 2017 (UTC)
There appears to be a delay between a change to a module and that change reaching the pages that use the module, so previewing a page is the way to see the latest version (the mobile version takes hours).
@Aryamanarora: The in भक्त appears to be missing.
Also, I realised that the oblique plural of शाळा is शाळां- instead of शाळें-. That was an error on my end. Dhongde & Wali don't say much about the oblique plural other than the . I wish I had caught that much earlier. Kutchkutch (talk) 03:00, 15 November 2017 (UTC)
@Aryamanarora: I originally thought C cluster-ending might be unnecessary, but this shows it is necessary.
[1] has an enormous list of words tagged with an example word to show how it's declined, which would be useful for the ambiguous cases if I can figure out the correspondences between their example words and Dhongde & Wali's example words. भक्त is an example word used for tagging in that list to show that a C cluster-ending masculine word is declined exactly like भक्त. There are other resources at [2], but I haven't figured out a use for them yet. Kutchkutch (talk) 22:30, 15 November 2017 (UTC)
@Kutchkutch: I was actually just now on that site, using their wonderful IndoWordNet tool to fill up the descendants for Sanskrit जानाति (jānāti). That list could actually be used in our code if C clusters are too irregular, much like how {{zh-new}} can automatically generate Mandarin and Cantonese pronunciations (which have no obvious rules) using a database. —Aryaman (मुझसे बात करो) 22:35, 15 November 2017 (UTC)
@AryamanA: (Off topic, but it appears your username has changed, which changed the title of this page…)
Thanks for suggesting the idea if it becomes necessary!
The tags in that list include समई (samaī), which perhaps suggests my guide may not be as complete as I originally thought, but I think the only difference is the lack of the halant/virama in the oblique marker ्या.
Another puzzling tag is वकील (vakīl), which suggests there could be more exceptions to the final closed syllable rule in which case your suggestion might come in handy. However, in that list वकील (vakīl) is the tag for वडील (vaḍīl) so perhaps the list didn't account for the final closed syllable rule.
The module even in its current state is a spectacular wonder. I know it takes a lot of patience to code, and you've clearly shown that. As I realise how the additional paradigms work, I can update the guide. Kutchkutch (talk) 23:46, 15 November 2017 (UTC)
@Kutchkutch: Out of curiosity I Googled वकलाला (vaklālā) and only got 3 hits and "Did you mean: वकिलाला (vakilālā)". So the stem does change, but I wonder if it is irregular because it's a Persian-derived word.
But then I looked up वडलाशी (vaḍlāśī), and got very few hits. I tried वडिलाशी (vaḍilāśī) and got many thousands of hits. That makes me think the stem is just weakened, not entirely lost, and maybe वडलाशी (vaḍlāśī) dropping the vowel entirely is a colloquial or a recent variant (I only got hits after 2015 for it).
I'll probably implement alternative forms eventually since it seems Marathi is undergoing some standardization right now.
(I changed my username only so that it's not my full name, and User:Aryaman was taken) —Aryaman (मुझसे बात करो) 00:34, 16 November 2017 (UTC)
I really enjoy programming actually, and this is a great way to apply what I've learned. I don't mind it at all. —Aryaman (मुझसे बात करो) 00:36, 16 November 2017 (UTC)
@AryamanA: I think you're right about वडील (vaḍīl). And you're also right about वकील (vakīl), 'i' is never deleted since it's a Perso-Arabic borrowing. Kutchkutch (talk) 02:42, 16 November 2017 (UTC)
@AryamanA: Earlier, I misunderstood the tag for वडील (vaḍīl). Compounds containing वडील (vaḍīl) such as वाडवडील have the <वकील (vakīl)> tag and वडील (vaḍīl) itself has the <पाटील (pāṭīl)> tag. The more I look at that list, the more the large number of exceptions seems to favour integrating features of the tagging system. What I'll do now is try to combine the advantages of both the Dhongde & Wali system and the tagging system.
Calling it a tagging system probably doesn't affect the testcases already implemented because it's just another way of looking at the same system with more descriptive names for each paradigm. For example, <देऊळ (deūḷ)> is a tag in the tagging system and पाऊल (pāūl) has the <देऊळ (deūḷ)> tag, so since the <देऊळ (deūḷ)> tag is already implemented, perhaps nothing needs to be done about it. For an ambiguous case such as consonant-ending feminine nouns, 'f-e' is currently the tag used to tell the module to use the exceptional paradigm. Some of the identical paradigms in Dhongde & Wali are merged into a single tag. For example, खांब (khāmba) and भक्त (bhakta) in Dhongde & Wali are effectively the same paradigm so the tagging system merged them into the <भक्त (bhakta)> tag.
The User_talk:AryamanA/sandbox#CFILT_Tags section is not done yet, but it's a preview of what such a system would look like. Kutchkutch (talk) 04:04, 16 November 2017 (UTC)
@Kutchkutch: I will start implementing this tomorrow morning (if you're in India, that'll be in your evening). —AryamanA (मुझसे बात करें • योगदान) 02:43, 17 November 2017 (UTC)
@AryamanA: Hi! Thanks for letting me know! Kutchkutch (talk) 02:46, 17 November 2017 (UTC)
@AryamanA: As already mentioned, the Dhongde & Wali system appears to work well for the unambiguous and common paradigms. The 'CFILT_Tags' system is perhaps a 'Version 2' of the Dhongde & Wali system for the ambiguous and less common paradigms for exception handling. It's interesting to think about how a module could handle those. The tags in the last section are ones that I'm still thinking about. If there's a use for them, I'll add information about them. If they're not useful they can be ignored because there are a few duplicate tags in the CFILT list. Kutchkutch (talk)
@AryamanA: Thanks for your work on the Eyelash today! The schwa is currently still in the transliteration of the oblique for चेहरा and नोकरी. The Eyelash is an orthographic way of indicating that the र has moved to the onset of the following syllable with another consonant following it before the next vowel. Kutchkutch (talk) 22:10, 17 November 2017 (UTC)
@AryamanA: The tagging system is now looking a bit daunting. I like the comparisons with Chinese. Kutchkutch (talk) 01:30, 18 November 2017 (UTC)

Rules

{{User:Kutchkutch/mr-decl-4|direct singular|direct plural|oblique singular|oblique plural}}

direct singular direct plural
oblique singular oblique plural

Stem changes

Final Closed Syllable: If masculine or neuter stem ends with stem-C₁VC₂ with V = /ə/, /iː/, /uː/, then V → ∅ / stem-C₁ __ C₂

m-aC: दगड (dagaḍ) → दगडा- (dagḍā-)
दगड दगड
दगडा- दगडां-
m-īC: वडील (vaḍīl) → वडला- (vaḍlā-)
वडील वडील
वडला- वडलां-
m-ūC: बेडूक (beḍūk) → बेडका- (beḍkā-)
बेडूक बेडूक
बेडका- बेडकां-
n-īC: हरीण (harīṇ) → हरणा- (harṇā-), plural हरणे- (harṇe)/हरणं- (harṇa)
हरीण हरणे,हरणं
हरणा- हरणां-
n-ūC: माणूस (māṇūs) → माणसा- (māṇsā-), plural माणसे- (māṇse)/माणसं- (māṇsa)
माणूस माणसे,माणसं
माणसा- माणसां-
Exceptions: Words that have no sounds preceding C₁VC₂, such as जग (jag), जीव (jīv), बूट (būṭ).
No spelling change, only transliteration changes: m-aC दगड (dagaḍ).
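
The spelling side of this rule can be sketched as follows. This is an illustrative Python sketch, not the code of Module:mr-decl (which is written in Lua); the function name drop_final_closed_vowel is hypothetical. It covers only the ी and ू cases, since the schwa case changes the transliteration but not the spelling.

```python
import re

# Devanagari consonants क..ह (U+0915..U+0939).
CONS = "\u0915-\u0939"

# (anything)(C1) + long ī or ū vowel sign + (C2) at the end of the stem.
# The leading ".+" implements the exception for words with nothing before C1VC2.
FINAL_CLOSED = re.compile(rf"^(.+[{CONS}])[\u0940\u0942]([{CONS}])$")


def drop_final_closed_vowel(stem: str) -> str:
    """Return the stem with the vowel sign of a final closed syllable removed."""
    match = FINAL_CLOSED.match(stem)
    if match is None:
        return stem  # schwa (no spelling change) or an excepted short word
    return match.group(1) + match.group(2)


# Oblique singular = changed stem + ा (U+093E).
for word in ["वडील", "बेडूक", "माणूस", "जीव", "दगड"]:
    print(word, drop_final_closed_vowel(word) + "\u093e")
```

Words that keep the vowel despite matching the pattern would need a separate exception list on top of this.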

If stem ends with stem-V₁V₂C with V₂ = /uː/, then uː → ʋ / stem-V₁__C

देऊळ (deūḷ) → देवळा- (devḷā-)
देऊळ देवळे,देवळं
देवळा- देवळां-
पाऊस (pāūs) → पावसा- (pāvsā-)
पाऊस पाऊस
पावसा- पावसां-
Exceptions: English borrowings such as 'town' and 'gown'.
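
A minimal Python sketch of this rule, under the assumption that the exceptions are supplied by the caller as a set of words (the name glide_long_u and the exceptions parameter are hypothetical, and this is not the module's Lua code):

```python
import re

CONS = "\u0915-\u0939"  # Devanagari consonants क..ह

# Independent ऊ (U+090A) that is not word-initial and stands before a single
# final consonant. The lookbehind keeps a word-initial ऊ untouched.
LONG_U_BEFORE_FINAL_C = re.compile(rf"(?<=.)\u090a([{CONS}])$")


def glide_long_u(stem: str, exceptions: frozenset = frozenset()) -> str:
    """Replace ऊ with व (U+0935) in the V1 + ū + C environment."""
    if stem in exceptions:
        return stem
    return LONG_U_BEFORE_FINAL_C.sub(lambda m: "\u0935" + m.group(1), stem)


# Oblique singular = changed stem + ा (U+093E).
for word in ["देऊळ", "पाऊस"]:
    print(word, glide_long_u(word) + "\u093e")
```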

Palatalisation: If stem ends with /s/ and followed by ्या or ी, then s → ɕ

m-a s: पैसा (paisā) + ्या → पैशा- (paiśā-)
पैसा पैसे
पैशा- पैशां-
m-a s: मासा (māsā) + ्या → माशा- (māśā-)
मासा मासे
माशा- माशां-
s: म्हैस (mhais, buffalo) (f-i) + ी → म्हैशी- (mhaiśī-)
म्हैस म्हैशी
म्हैशी- म्हैशीं-
Note: s → ɕ could occur in more cases such as दिवस n + postposition = दिवशी
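
The examples above can be reproduced with a small suffix-attaching helper. This is a Python sketch only; the name attach is hypothetical, and treating ी as the second trigger is an assumption drawn from the म्हैस example. The caller is assumed to have already removed the final ा of words like पैसा.

```python
SA = "\u0938"                      # स
SHA = "\u0936"                     # श
AA = "\u093e"                      # ा
II = "\u0940"                      # ी
YA_SUFFIX = "\u094d\u092f\u093e"   # ्या


def attach(stem: str, suffix: str) -> str:
    """Attach an oblique suffix, palatalising a stem-final स where it applies."""
    if stem.endswith(SA) and suffix in (YA_SUFFIX, II):
        stem = stem[:-1] + SHA
        if suffix == YA_SUFFIX:
            # The examples above show पैशा-, so the य is dropped after श.
            suffix = AA
    return stem + suffix


print(attach("पैस", YA_SUFFIX))   # stem of पैसा
print(attach("म्हैस", II))
print(attach("आंब", YA_SUFFIX))   # no स, so the suffix attaches unchanged
```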

Degemination: If stem ends with stem-C₁C₁VC₂, then C₁V → ∅∅ / stem-C₁__C₂

f-C: चप्पल (cappal) (f-e) → चपले- (caple-)
चप्पल चपला
चपले- चपलां-
f-C: गंमत (gammat)/गम्मत (gammat, fun activity) (f-i) → गमती- (gamtī-)
गम्मत/गंमत गमती
गमती- गमतीं-
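
In spelling terms the rule removes one half of the doubled consonant. A hedged Python sketch follows (the name degeminate is hypothetical); handling the anusvara spelling गंमत as a geminate म is an assumption based on the example above.

```python
import re

CONS = "\u0915-\u0939"  # Devanagari consonants क..ह
VIRAMA = "\u094d"       # ्

# C1 + virama + C1 + C2 at the end of the stem (the vowel is the unwritten schwa).
GEMINATE = re.compile(rf"([{CONS}]){VIRAMA}\1([{CONS}])$")
# The same geminate written with an anusvara before म, as in गंमत (assumption).
ANUSVARA_M = re.compile(rf"\u0902(\u092e[{CONS}])$")


def degeminate(stem: str) -> str:
    """Reduce a doubled consonant before the final consonant to a single one."""
    stem = GEMINATE.sub(r"\1\2", stem)
    return ANUSVARA_M.sub(r"\1", stem)


print(degeminate("चप्पल") + "\u0947")   # f-e oblique singular in े
print(degeminate("गम्मत") + "\u0940")   # f-i oblique singular in ी
```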

Eyelash: stem-रा/री + ्या = stem-ऱ्या

चेहरा (cehrā) + ्या = चेहऱ्या (cehryā)
चेहरा चेहरे
चेहऱ्या- चेहऱ्यां-
नोकरी (nokrī) + ्या = नोकऱ्या (nokryā)
नोकरी नोकऱ्या
नोकरी- नोकऱ्यां-
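
A short Python sketch of the eyelash substitution (the name add_ya is hypothetical; the module itself is Lua). It assumes the final ा or ी is dropped before ्या is added, as in the two examples.

```python
RA = "\u0930"                        # र
EYELASH_RA = "\u0931"                # ऱ, shown as the eyelash form before ्य
YA_SUFFIX = "\u094d\u092f\u093e"     # ्या
FINAL_VOWELS = ("\u093e", "\u0940")  # ा, ी


def add_ya(word: str) -> str:
    """Add ्या to a word, switching a stem-final र to ऱ."""
    stem = word[:-1] if word.endswith(FINAL_VOWELS) else word
    if stem.endswith(RA):
        stem = stem[:-1] + EYELASH_RA
    return stem + YA_SUFFIX


for word in ["चेहरा", "नोकरी", "चिमणी"]:
    print(word, add_ya(word))
```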

Regular Paradigms

direct singular direct plural
oblique singular oblique plural
Stem Final Masculine Feminine Neuter
C
ां
खांब खांब
खांबा- खांबां-
ीं
पाल पाली
पाली- पालीं-
े,ं
ां
घर घरे,घरं
घरा- घरां-
ə
C Cluster
ां
भक्त भक्त
भक्ता- भक्तां-
ीं
रात्र रात्री
रात्री- रात्रीं-
∅,े
ां
पत्र पत्र,पत्रे
पत्रा- पत्रां-
a
्या ्यां
आंबा आंबे
आंब्या- आंब्यां-
शाळा शाळा
शाळे- शाळां-
ī
हत्ती हत्ती
हत्ती- हत्तीं-
्या
्यां
चिमणी चिमण्या
चिमणी- चिमण्यां-
्या ्यां
पाणी पाणी
पाण्या- पाण्यां-
ū
वा वां
विंचू विंचू
विंचवा- विंचवां-
वा
वां
सासू सासवा
सासू- सासवां-
े,ं
ां
लिंबू लिंबे,लिंबं
लिंबा- लिंबां-
e/ə
ए/ं
्या ्यां
केळे,केळं केळी
केळ्या- केळ्यां-
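
One way to hold paradigms like these in code is a lookup table keyed by gender and stem-final. The Python sketch below is illustrative only: the Paradigm type, the PARADIGMS table and the decline function are hypothetical names, only five of the cells above are filled in, and alternative forms such as घरं are left out.

```python
from typing import NamedTuple


class Paradigm(NamedTuple):
    strip: str       # ending removed from the direct singular before suffixing
    direct_pl: str
    oblique_sg: str
    oblique_pl: str


AA, II, E, ANUSVARA = "\u093e", "\u0940", "\u0947", "\u0902"  # ा ी े ं
YA = "\u094d\u092f\u093e"                                      # ्या

PARADIGMS = {
    ("m", "C"): Paradigm("", "", AA, AA + ANUSVARA),    # खांब
    ("f", "C"): Paradigm("", II, II, II + ANUSVARA),    # पाल
    ("n", "C"): Paradigm("", E, AA, AA + ANUSVARA),     # घर
    ("m", "ā"): Paradigm(AA, E, YA, YA + ANUSVARA),     # आंबा
    ("f", "ā"): Paradigm(AA, AA, E, AA + ANUSVARA),     # शाळा
}


def decline(word: str, gender: str, ending: str) -> dict:
    """Build the 2x2 table for a word from its (gender, stem-final) paradigm."""
    p = PARADIGMS[(gender, ending)]
    stem = word[: len(word) - len(p.strip)]
    return {
        "direct singular": word,
        "direct plural": stem + p.direct_pl,
        "oblique singular": stem + p.oblique_sg,
        "oblique plural": stem + p.oblique_pl,
    }


print(decline("आंबा", "m", "ā"))
print(decline("पाल", "f", "C"))
```

The stem-change rules from the earlier sections would run on the stem before the suffixes are attached.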

C-ending feminine exceptional paradigm (f-e)

DSAL dictionaries usually indicate which paradigm a C-ending feminine noun follows.

Stem Final Feminine
C
ां
लाट लाटा
लाटे- लाटां-
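
Because the choice between the regular and the exceptional paradigm cannot be read off the spelling, it has to be passed in, which is what the "f-e" gender code does. A Python sketch of that branch (the function name is hypothetical and this is not the module's own code):

```python
AA, II, E, ANUSVARA = "\u093e", "\u0940", "\u0947", "\u0902"  # ा ी े ं


def decline_c_final_feminine(word: str, g: str = "f") -> tuple:
    """Return (direct sg, direct pl, oblique sg, oblique pl) for a C-final feminine."""
    if g == "f-e":
        # Exceptional paradigm, as for लाट.
        return (word, word + AA, word + E, word + AA + ANUSVARA)
    # Regular paradigm, as for पाल.
    return (word, word + II, word + II, word + II + ANUSVARA)


print(decline_c_final_feminine("लाट", "f-e"))
print(decline_c_final_feminine("पाल"))
```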

u-ending Exceptional Paradigms

Stem Final Masculine Feminine Neuter
ū
काजू काजू
काजू- काजूं-
[3]
साळू साळू
साळू- साळूं-
[4]
वे,वं
वा वां
गळू गळवे,गळवं
गळवा- गळवां-
[5]

Note: Some nouns can be declined using the non-exceptional or the exceptional paradigm. The word साळू (sāḷū, porcupine) is rare, but despite Dhongde & Wali's choice of word the paradigm it represents is valid.

Possible Exceptional Stems

The individual words may not matter unless they're common.

बी बिया
बी- बियां-

Common Nouns

बायको बायका
बायको- बायकां-
बाई बायका
बाई- बायकां-
मुलगा मुलगे
मुला- मुलां-
मुलगी मुली
मुली- मुलीं-

Given Names

Singular Oblique is usually ∅.

Personal Pronouns

Palatalisation: If stem ends with /t͡s/ or /d͡zʱ/ and followed by ्या or ी, then t͡s → t͡ɕ, d͡zʱ → d͡ʑʱ

t͡s: त्याचा (tyācā) + ्या → त्याचा- (tyāċā-)
d͡zʱ: माझा (mājhā) + ्या → माझा- (māj̈hā-)
Note: No spelling changes, only transliteration changes

CFILT Tags

Fundamental Tags

CFILT Tag Gender Ending Paradigm Example Notes
भक्त m C
ां
भक्त भक्त
भक्ता- भक्तां-
Cluster: भक्त
 
Final closed syllable: दगड
माळा m ā
्या ्यां
आंबा आंबे
आंब्या- आंब्यां-
बघ्या m-e ā
तांब्या तांब्या
तांब्या- तांब्यां-
already before in stem
काका m-e ā
राजा राजे
राजा- राजां-
Similar to <बाबा>
गारूडी m ī
्या- ्यां-
माळी माळी
माळ्या- माळ्यां-
गणपती m-e ī
हत्ती हत्ती
हत्ती- हत्तीं-
वाट f-e C
ां
लाट लाटा
लाटे- लाटां-
नात f C
ीं
पाल पाली
पाली- पालीं-
Cluster: रात्र
गंगा f ā
शाळा शाळा
शाळे- शाळां-
मामी f ī
्या
्यां
चिमणी चिमण्या
चिमणी- चिमण्यां-
ग्रंथी f-e ī
माती माती
माती- मातीं-
झाड n C
ां
"
घर घरे
घरा- घरां-
Cluster: पत्र
लोणी n ī
्या ्यां
पाणी पाणी
पाण्या- पाण्यां-
कडे n e
्या ्यां
केळे,केळं केळी
केळ्या- केळ्यां-

ū-ending

CFILT Tag Gender Paradigm Example Previous Example
नातू m
वा वां
नातू नातू
नातवा- नातवां-
विंचू can be in <परशू> too
परशू m
काजू काजू
काजू- काजूं-
काजू
सासू f
वा
वां
सासू सासवा
सासू- सासवां-
सासू
वाळू f
साळू साळू
साळू- साळूं-
साळू
वासरू n
े,ं
ां
लिंबू लिंबे,लिंबं
लिंबा- लिंबां-
लिंबू
तारू n
ां
अळू अळवे
अळवा- अळवां-
गळू
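
Selecting a ū-ending paradigm by tag can be sketched like this. The Python below is illustrative only: decline_long_u is a hypothetical name, only the three tags नातू, परशू and सासू are covered, and the tag is assumed to be supplied by the caller.

```python
UU, VA, AA, ANUSVARA = "\u0942", "\u0935", "\u093e", "\u0902"  # ू व ा ं


def decline_long_u(word: str, tag: str) -> tuple:
    """Return (direct sg, direct pl, oblique sg, oblique pl) for a ū-final noun."""
    if not word.endswith(UU):
        raise ValueError("expected a word ending in ू")
    stem = word[:-1]
    if tag == "नातू":    # masculine with the oblique marker वा
        return (word, word, stem + VA + AA, stem + VA + AA + ANUSVARA)
    if tag == "परशू":   # masculine without an oblique marker
        return (word, word, word, word + ANUSVARA)
    if tag == "सासू":    # feminine
        return (word, stem + VA + AA, word, stem + VA + AA + ANUSVARA)
    raise ValueError("tag not covered by this sketch")


# विंचू can take either masculine tag, giving the two alternatives discussed above.
print(decline_long_u("विंचू", "नातू"))
print(decline_long_u("विंचू", "परशू"))
```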

o-ending

CFILT Tag Gender Paradigm Example Notes
धनको all genders
धनको is masculine
फोटो फोटो
फोटो- फोटों-
Primarily for English loanwords


native o-ending words are rare

Final Diphthong

CFILT Tag Gender Paradigm Example Previous Example
भाऊ m
वा- वां-
भाऊ भाऊ
भावा- भावां-
समई f
या- यां-
मिठाई मिठाई
मिठाया- मिठायां-
माई f
ं-
ताई ताई
ताई- ताईं-
गाय f
ई- ईं-
गाई
गाई- गाईं-

Vowel Weakening

CFILT Tag Subtag of Paradigm Example Previous Example
सूर भक्त
ु…ा ु…ां
बूट बूट
बुटा- बुटां-
बूट
स्त्री मामी
ि…या
ि…यां-
बी बिया
बी- बियां-
बी
फीत नात
ि…ी
ि…ी ि…ीं-
बहीण बहिणी
बहिणी- बहिणीं-
चूल नात
ु…ी
ु…ी,ु…ा- ु…ीं,ु…ां-
धूळ धुळी
धुळी- धुळीं-
चूक नात
ु…ी,ु…ा
ु…ी,ु…ा- ु…ीं,ु…ां-
मिरवणूक मिरवणुकी,मिरवणुका
मिरवणुकी-,मिरवणुका- मिरवणुकीं-,मिरवणुकां-
मिरवणूक
भीक वाट
ि…ा
ि…े ि…ां
जीभ जिभा
जिभे- जिभां-
सून वाट
ु…ा
ु…े ु…ां
सून सुना
सुने- सुनां-
मूल झाड
ु…े
ु…ा ु…ां
मूल मुले
मुला- मुलां-
रीळ झाड
ि…े
ि…ा ि…ां
बक्षीस बक्षिसे
बक्षिसा- बक्षिसां-
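
For the consonant-final examples in this table, the weakening amounts to shortening the last vowel sign before a suffix is added. A minimal Python sketch, assuming that reading (the name weaken is hypothetical, and vowel-final words such as बी are not handled):

```python
import re

CONS = "\u0915-\u0939"                            # Devanagari consonants क..ह
SHORT = {"\u0940": "\u093f", "\u0942": "\u0941"}  # ी → ि, ू → ु

# Long ī or ū vowel sign directly before the final consonant of the stem.
LONG_BEFORE_FINAL_C = re.compile(rf"([\u0940\u0942])([{CONS}])$")


def weaken(stem: str) -> str:
    """Shorten a long ī/ū vowel sign in the final syllable of a C-final stem."""
    return LONG_BEFORE_FINAL_C.sub(
        lambda m: SHORT[m.group(1)] + m.group(2), stem
    )


print(weaken("बूट") + "\u093e")    # oblique singular in ा
print(weaken("बहीण") + "\u0940")   # oblique singular in ी
print(weaken("जीभ") + "\u0947")    # oblique singular in े
```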

Final closed syllable Subtags

CFILT Tag Subtag of Example Previous Example
पाटील भक्त
पाटील पाटील
पाटला- पाटलां-
वडील
कापूस भक्त
बेडूक बेडूक
बेडका- बेडकां-
बेडूक
चिमूट नात
लसूण लसणी
लसणी- लसणीं-
लसूण is also masculine
ढेकूळ झाड
माणूस माणसे,माणसं
माणसा- माणसां-
माणूस

Exceptions

CFILT Tag Subtag of Example Previous Example
वकील भक्त
वकील वकील
वकीला- वकीलां-
शूर भक्त
समूह समूह
समूहा- समूहां-
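
Exception tags like these could override the final-closed-syllable rule through a word-to-tag lookup. The Python sketch below is illustrative only: TAG_OF is a hypothetical stand-in for the CFILT word list holding just the words named on this page, and KEEPS_VOWEL lists the two tags from the table above.

```python
import re

CONS = "\u0915-\u0939"  # Devanagari consonants क..ह
AA = "\u093e"           # ा
FINAL_CLOSED = re.compile(rf"^(.+[{CONS}])[\u0940\u0942]([{CONS}])$")

# Hypothetical word → tag lookup standing in for the CFILT list.
TAG_OF = {"वडील": "पाटील", "पाटील": "पाटील", "वकील": "वकील", "समूह": "शूर"}
# Tags whose words keep the vowel of the final closed syllable.
KEEPS_VOWEL = {"वकील", "शूर"}


def oblique_singular(word: str) -> str:
    """Masculine C-final oblique singular, with tag-based exceptions."""
    stem = word
    if TAG_OF.get(word) not in KEEPS_VOWEL:
        match = FINAL_CLOSED.match(word)
        if match is not None:
            stem = match.group(1) + match.group(2)
    return stem + AA


for word in ["वडील", "वकील", "समूह", "बेडूक"]:
    print(word, oblique_singular(word))
```

Untagged words fall through to the general rule, so the lookup only needs to list the exceptions.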

Long u Subtags

CFILT Tag Subtag of Example Previous Example
पाऊस भक्त
पाऊस पाऊस
पावसा- पावसां-
पाऊस
देऊळ झाड
देऊळ देवळे,देवळं
देवळा- देवळां-
देऊळ

Palatalisation Subtags

CFILT Tag Subtag of Example Previous Example
ससा माळा
पैसा पैसे
पैशा- पैशां-
पैसा, मासा
म्हैस नात
म्हैस म्हैशी
म्हैशी- म्हैशीं-
म्हैस

Degemination

CFILT Tag Subtag of Example Previous Example
अक्कल वाट
चप्पल चपला
चपले- चपलां-
चप्पल
छप्पर झाड
छप्पर छपरे
छपरा- छपरां-

Eyelash

CFILT Tag Subtag of Example Previous Example
मोगरा माळा
चेहरा चेहरे
चेहऱ्या- चेहऱ्यां-
चेहरा
परी मामी
पुरी पुऱ्या
पुरी- पुऱ्यां-
नोकरी

Other

CFILT Tag Paradigm Example Previous Example
रीत
वीर
कूस
जोखीम