Module talk:sh-IPA

From Wiktionary, the free dictionary
Latest comment: 7 months ago by Vorziblix in topic Nasal assimilation and phonemic transcription

The function to generate the syllabic 'r' (/r̩/) fails in many cases. It appears that it doesn't work when the 'r' is not stressed. What the IPA handbook completely fails to mention is that there are also syllabic L, N and Lj sounds (though quite rare, e.g. in bicikl), and the module doesn't seem to deal with them either. Are these cases supposed to be entered manually, as I did here? — Phazd (talk) 00:44, 8 May 2023 (UTC)

@Phazd: Fixed now, I think, although there might be some edge cases I didn’t catch. Let me know if you run into any other cases where the module doesn’t match what you’d expect (or if my fixes caused anything else to go wrong). — Vorziblix (talk · contribs) 02:29, 9 May 2023 (UTC)
@Vorziblix Thanks for getting back to me. For now it still doesn't work consistently, e.g. see brzina. How exactly does the script try to detect the syllabic consonants? I should look into some more detailed analyses to get a guaranteed precise overview, but for now, based on my memory and Hrvatska gramatika (2005, §56), the syllabic consonant has to be surrounded by non-syllabic consonants (excluding 'j') and/or word-breaks to be syllabic, and those syllabic consonants can be: 'r', 'L', 'n', 'Lj' and 'm'. HG also mentions a super rare and unpredictable case where syllabic 'r' is followed by 'L' which is vocalised to 'o', while the 'r' remains syllabic. Gȑlce > gȑoce /gr̩̂.o.t͡se/, and ȕmrl > ȕmro /û.mr̩.o/ (in Vuk's 1818 dictionary, p. LXIX, these are written as гръоце and умъро). But that phenomenon is dialectal and certainly absent from modern standard variants (HG itself had to recycle Vuk's example, groce), so I guess manually adding the r̩'s is the way to go if someone ever decides to try to add such words and word-forms.
More importantly, I've also noticed the module can't represent the modern ijekavian "long yat" -ije- correctly. Before the module was added, Ivan Štambuk (and other users, probably?) simply used the general IPA template and did it correctly, but now other users and bots have replaced it in some words with sh-IPA, and this has resulted in incorrect forms. However, any quick fix could create conflicts with -ije- sequences that aren't yats and have a different pronunciation.
I'll try to find some good resources on these two issues and provide you with more accurate descriptions (if you don't have any yourself?). The first problem seems entirely solvable. Regarding the long yat, I should probably expand the relevant section on Wikipedia first. It could be troublesome, because there are two or even three different ways of pronouncing and writing down the reflex in the different standard variants, and in Croatia there were some polemics regarding their status.
Phazd (talk|contribs) 03:21, 15 May 2023 (UTC)
@Phazd: Thanks for the notes! brzina doesn’t use this template; someone entered the pronunciation manually. If you replace it with this template it should work fine. The template currently tries to detect syllabic consonants similarly to what you described. It looks at whether the consonant is a sonorant /r/, /l/, /ʎ/, /m/, /n/, or /ɲ/ and checks if it’s between two other consonants or a consonant and a word edge — right now /l/, /ʎ/, and /j/ are excluded from the list of consonants that trigger this condition, but I’m not sure that’s right. Accurate descriptions would be very helpful if you could dig them up. Can nj (/ɲ/) ever be syllabic? I can’t think of any examples offhand, but surely if it was ever found in the same phonetic environment it would behave like the other sonorants. (Or would it?)
The case with ‘r’ followed by ‘o’ from vocalized ‘l’ will probably just have to be entered manually if the ‘r’ is not stressed. There can also be a similar problem with other vowels in a few cases, as in zarđati.
Long yat is a messy problem. As far as I know there are regional pronunciation differences even within ijekavian, where in some regions it’s disyllabic [ije] and in others it’s monosyllabic [jeː]. I honestly don’t know what the best way to handle this is (and unfortunately I’m not a native ijekavian speaker, and don’t know some of the details). If we want to settle on [jeː] (as in standard Croatian, if I’m not mistaken), we could mark the long yat in some special way when providing it as input to this template; then I could have the template generate the pronunciation as desired. Or, if some other solution is preferable, I’m open to suggestions. — Vorziblix (talk · contribs) 05:54, 15 May 2023 (UTC)
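For readers following the thread, the detection rule described above (a sonorant counts as syllabic when it sits between two consonants, or between a consonant and a word edge, with /l/, /ʎ/ and /j/ not counting as triggering consonants) can be sketched roughly as follows. This is an illustrative Python sketch, not the module's actual Lua code. The names `mark_syllabic` and `is_trigger`, the pre-split phoneme list as input, and the order in which sonorants are checked (/r/ first, so that words like mrk and crn get only one syllabic consonant) are all assumptions made for the example.

```python
# Hypothetical sketch of the rule discussed in this thread; not the real module.
VOWELS = {"a", "e", "i", "o", "u"}
# Assumed priority order: /r/ is tried first, then the rarer syllabic sonorants.
SONORANTS = ["r", "l", "ʎ", "m", "n", "ɲ"]
# Consonants that, per the description above, do not trigger syllabicity.
NON_TRIGGERS = {"l", "ʎ", "j"}
SYLLABIC = "\u0329"  # combining vertical line below, as in r̩


def is_trigger(neighbour):
    """True for a word edge (None) or a consonant that licenses syllabicity."""
    if neighbour is None:
        return True
    # Vowels and already-syllabic sonorants act as syllable nuclei.
    if neighbour[:1] in VOWELS or neighbour.endswith(SYLLABIC):
        return False
    return neighbour not in NON_TRIGGERS


def mark_syllabic(phonemes):
    """Return a copy of the phoneme list with syllabic sonorants marked."""
    out = list(phonemes)
    for sonorant in SONORANTS:
        for i, phoneme in enumerate(out):
            if phoneme != sonorant:
                continue
            left = out[i - 1] if i > 0 else None
            right = out[i + 1] if i + 1 < len(out) else None
            if left is None and right is None:
                continue  # a lone sonorant has no consonant beside it
            if is_trigger(left) and is_trigger(right):
                out[i] = phoneme + SYLLABIC
    return out


print("".join(mark_syllabic(list("brzina"))))  # br̩zina
print("".join(mark_syllabic(list("mrk"))))     # mr̩k
```

Because /l/, /ʎ/ and /j/ are left out of the triggering set here, a sonorant directly before one of them is not marked; whether that exclusion is correct is exactly the open question raised above. Cases such as ȕmro, where the syllabic 'r' stands next to a vowel, are not caught by this rule and would still need manual entry.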

Nasal assimilation and phonemic transcription

There's no reason to mark place-of-articulation assimilation (n → ŋ / _{k, g, x}) as this module is used in phonemic and not phonetic transcription (despite the string variable being called phonetic, it is wrapped in slashes, traditionally used for phonemic and not phonetic representation). The velar nasal is, after all, not phonemic in Serbo-Croatian. Alternatively, the module can be changed to use square brackets (that is, [ ]); such is the case, for example, in the Azeri module. Kneelian (talk) 02:39, 7 December 2023 (UTC)

@Kneelian: You’re right; fixed. — Vorziblix (talk · contribs) 13:39, 7 December 2023 (UTC)
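The two options raised in this section can be illustrated with a small sketch: either keep the slashes and leave /n/ untouched, or apply n → ŋ before k, g, x and switch to square brackets. This is a hypothetical Python illustration, not the module's code, and the thread does not say which option the fix actually took. The function names `assimilate_nasals` and `render`, the `phonetic` flag, and the accent-free input are assumptions made for the example.

```python
# Hypothetical sketch of phonemic vs. phonetic output; not the real module.
VELARS = {"k", "g", "x"}


def assimilate_nasals(phonemes):
    """Apply n → ŋ before a velar consonant (k, g, x)."""
    out = list(phonemes)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in VELARS:
            out[i] = "ŋ"
    return out


def render(phonemes, phonetic=False):
    """Wrap in slashes for phonemic output, square brackets for phonetic."""
    if phonetic:
        # Allophonic detail such as [ŋ] belongs only in the bracketed form.
        return "[" + "".join(assimilate_nasals(phonemes)) + "]"
    return "/" + "".join(phonemes) + "/"


print(render(list("banka")))                 # /banka/
print(render(list("banka"), phonetic=True))  # [baŋka]
```

Keeping the assimilation step behind the bracketed branch means the slashed output never contains ŋ, which matches the point that the velar nasal is not phonemic in Serbo-Croatian.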