Module talk:tg-translit

A piece of Tajik text: Zaboni tojikī, ki dar Eron: forsī, va dar Afġoniston darī nomida mešavad, zaboni davlatiyi kišvarhoyi Tojikiston, Eron va Afġoniston mebošad. Dar Üzbakiston, agarčī zaboni aqalliyat tojikī mahsub mešavad, vale dar Üzbakiston ziyoda az 15 million nafar ba tojikī guftugü mekunand. In zabon ba xonavodayi zabonhoyi hindu-avrupoyī doxil mešavad. Dar majmüʾ: porsigüyoni asil(forsī, tojikī, darī) ziyoda az 122 mln mardum mebošand. Ammo tamomi porsigüyoni jahon 222 mln hastand. Faqat ba güiši tojikī 44 million nafar gap mezanand. Zaboni točikī, yake az zabonhoyi bostontarini jahon ba šumor meravad. Davrayi navi inkišofi on dar asrhoyi 7-8 sar šudaast. Bo in zabon šoyironu navisandagoni buzurg Rüdakī, Firdavcī, Xayyom, Sino, Jomī,Mavlono, Hofiz, Doniš, Aynī, Lohutī, Tursunzoda va digaron asarho ejod kardaand.

Testing conversion of "е" after vowels and "ъ". AyeOyeUyeEyeYayeYoyeYuyeIyeĪyeEyeʾyeayeoyeuyeeyeyayeyoyeyuyeiyeīyeeyeʾe --Anatoli (обсудить/вклад) 23:37, 17 April 2013 (UTC)

И, и

The letter И, и should be I, i, not Yi, yi. Which part of the code does this?

In the string Ин ин Ин. ин Ин, only the first character is transliterated correctly; see the test: In in In. in In --Anatoli (обсудить/вклад) 04:01, 18 April 2013 (UTC)

Fixed. --Z 04:09, 18 April 2013 (UTC)
Thank you! Does "%A" match the beginning of a line? Which documentation are you using? I'd like to read a bit more. --Anatoli (обсудить/вклад) 04:27, 18 April 2013 (UTC)
NP. %A is "all characters not in %a", and %a represents all ASCII letters, so %A includes whitespace as well, which we don't want. Here is comprehensive documentation. --Z 06:30, 18 April 2013 (UTC)
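A rough Python analogue of the point above, offered only as an illustration: the module itself is not written in Python, and the class `[^A-Za-z]` is an assumption standing in for %A as described in this thread (some pattern libraries treat %a as covering more than ASCII letters). The helper names are hypothetical. The sketch shows that a "not a letter" class matches spaces and punctuation, that it also matches Cyrillic letters, and that it always consumes a character, so it cannot match at the very start of a string.

```python
import re

# Stand-in for %A as described above: any character that is not an ASCII letter.
NOT_ASCII_LETTER = re.compile(r"[^A-Za-z]")


def chars_matched(text):
    """Return the characters of text that the 'not an ASCII letter' class matches."""
    return [ch for ch in text if NOT_ASCII_LETTER.fullmatch(ch)]


def i_after_non_letter(text):
    """True if some 'и' is directly preceded by a character from the class."""
    return re.search(r"[^A-Za-z]и", text) is not None


print(chars_matched("a b."))       # space and full stop are both matched
print(i_after_non_letter("ин"))    # nothing precedes the word-initial и
print(i_after_non_letter(" ин"))   # a space precedes it
print(i_after_non_letter("ни"))    # a Cyrillic letter also counts as "not ASCII"
```

A class like this needs a preceding character to match, so a rule that should also apply at the start of the text needs a separate start anchor next to the class.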
Thank you for the link and explanation. Something happened, though. The test: AyeOyeUyeEyeYayeYoyeYuyeIyeĪyeEyeʾyeayeoyeuyeeyeyayeyoyeyuyeiyeīyeeyeʾe doesn't produce what it should at the moment. "е" after vowels should be "ye". User:Dijan knows the exact details. --Anatoli (обсудить/вклад) 06:36, 18 April 2013 (UTC)
My mistake... fixed now. --Z 06:42, 18 April 2013 (UTC)
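For reference, the two behaviours discussed in this section (и is always "i", and е becomes "ye" after a vowel or ъ) can be sketched as follows. This is a minimal Python illustration, not the module's actual code: the letter table is a hypothetical, deliberately incomplete subset, the function name `translit` is made up, and other positions of е (for example word-initial) are not handled because they are not covered in this thread.

```python
# Hypothetical partial letter table; the real module covers the whole alphabet.
LETTERS = {
    "а": "a", "о": "o", "у": "u", "э": "e", "и": "i", "ӣ": "ī", "ӯ": "ü",
    "я": "ya", "ё": "yo", "ю": "yu", "е": "e", "ъ": "ʾ",
    "н": "n", "к": "k", "р": "r", "д": "d", "м": "m",
}

# Letters after which е is written "ye" (assumption based on the test above).
VOWELS_AND_HARD_SIGN = "аоуэиӣӯяёюеъ"


def translit(text):
    """Transliterate text letter by letter, looking one letter back for е."""
    out = []
    prev = ""
    for ch in text:
        lower = ch.lower()
        if lower == "е" and prev and prev in VOWELS_AND_HARD_SIGN:
            latin = "ye"
        else:
            # Characters missing from the table are passed through unchanged.
            latin = LETTERS.get(lower, ch)
        if ch != lower:
            # Capitalise only the first Latin letter of an uppercase source letter.
            latin = latin[:1].upper() + latin[1:]
        out.append(latin)
        prev = lower
    return "".join(out)


print(translit("Ин ин Ин. ин Ин"))
print(translit("ае"), translit("не"), translit("ъе"))
```

Because the rule only looks at the previous letter, и never picks up a "y", whatever precedes it.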
It occurs to me that the sequence аа should be transliterated "aya" (for verb forms like кардаам, гирифтаанд, etc.) — [Ric Laurent] — 13:38, 18 April 2013 (UTC)
No. The rules of epenthesis are the same for both Farsi and Tajiki for their corresponding vowels. The only ones I'm mentioning here are the ones that aren't obvious in the written form. In the examples of verb forms that you cited, they correspond to Farsi کرده‌ام (karde'am) and گرفته‌اند (gerefte'and). In literary/formal Tajiki, as in literary/formal Farsi, they are pronounced with the glottal stop, while in colloquial/spoken they both omit the glottal stop and only one "a" is pronounced (the "e" being dropped in the case of Farsi and "a" being pronounced instead). --Dijan (talk) 15:50, 18 April 2013 (UTC)
I've seen it written quite specifically that in Tajik it's /aja/, but whatever, I don't know anything. — [Ric Laurent] — 00:41, 19 April 2013 (UTC)
Now I'm curious. As far as I remember, I don't think it is. If it were, it would be written ая. Tajiki compensates for "ya", "yo", and "yu" in spelling. But if you can find where it was written as such, let me know. :) --Dijan (talk) 05:25, 19 April 2013 (UTC)
From Tajiki Reference Grammar for Beginners by Nasrullo Khojayori and Mikael Thompson (which, for the record, uses a wonderfully modest Cyrillic font):
"However, the pronunciation differs from the spelling (which is purely historical)."
"The present perfect is formed by adding the predicate endings to the past participle. Note that (1) the 3rd singular аст is written joined to the participle, and (2) although a й is sometimes added between the participle and the predicate endings, it is not indicated in writing with yoted letters (thus хондаам can be pronounced [хондаям])."
[Ric Laurent] — 13:06, 19 April 2013 (UTC)
I'm not sure what that means exactly. He says "can be pronounced", but not that it actually is in the standard/literary language. It's possible that it is a dialectal feature. He points out various differences in pronunciation between the northern and southern dialects.
This is what Shinji Ido says in "Tajik" (2005) about pronominal clitics on page 26, "The 'buffer' sound /j/ ... is inserted between a vowel other than /a/ and a pronominal clitic that follows it."
An example of this would be хонаам (xona-am, "my house"), which would be pronounced with a glottal stop after the last syllable of хона in the literary language, and without the glottal stop and with only one а in colloquial speech.
John R. Perry in "Tajik Persian Reference Grammar" (2005) says "A euphonic -y- is inserted after a word ending in a vowel other than -a;". --Dijan (talk) 08:51, 20 April 2013 (UTC)
Thanks, ZxxZxxZ. @Dijan: it seems WT:TG TR needs some notes on how transliteration should work. --Anatoli (обсудить/вклад) 00:08, 19 April 2013 (UTC)

Ӯ ӯ

I was recently editing the Wikipedia article on the Tajik alphabet, and it seems that the usual transliteration of the letter ӯ (ü) is ū, but here on Wiktionary it's ü. (It was changed from ū to ü in this edit by @Dijan.) The pronunciation of the letter is /ɵː/ according to Wikipedia (though I doubt the vowel is actually distinctively long, since quality serves to distinguish it from the other vowels). In most languages (for instance, German, Turkish, Hungarian), ü represents /y/ or /ʏ/.

Ū suggests the value /uː/, which is rather far from the real Tajik pronunciation. (It's not even historically correct: I gather from the table in w:Persian phonology#Historical shifts that the vowel descends from Early New Persian /oː/. A historical transcription would be ō.) So it's a very misleading transliteration to use, even though it is a direct representation of the Cyrillic letter ӯ (ü), composed of у (u) plus a macron.

The typical pronunciation of ü in other languages is much closer to the Tajik pronunciation of ӯ (ü): it's front or near-front and rounded. But I think it would make far more phonetic sense to transcribe ӯ (ü) as ö, which has the vowel quality [ø] or [œ] in German, Turkish, Hungarian, and Finnish. These symbols are canonically defined as front, but rounded front vowels are often somewhat centralized: i.e., closer to [ɵ], the Tajik vowel. So, ö would be the best transliteration, if the transliteration is meant to suggest the phonetic value. Otherwise, it would be better to go back to the standard transliteration, ū. — Eru·tuon 07:56, 22 December 2016 (UTC)

I should probably ask: @Dijan, why did you change the transliteration of ӯ (ü) from ū to ü? I think ü is probably more understandable phonetically, but ö would be even better, since the Tajik vowel is mid like ö, not close like ü. — Eru·tuon 08:01, 22 December 2016 (UTC)

Unfortunately, I do not recall exactly why this was changed in 2011. In my opinion, ū is misleading as it suggests just a longer variant of u. I believe it was probably to differentiate from regular u but also keep it aesthetically similar to u for transliteration purposes. However, feel free to change it to whatever you think is more appropriate. Regarding "usual" or standard transliteration vs Wiktionary, as long as I have been here, Wiktionary has always opted for its own transliteration standards. Whether they are based on other systems or not is somewhat irrelevant on Wiktionary. Dijan (talk) 14:08, 22 December 2016 (UTC)
Personally, I prefer ū because of its graphical similarity to the Cyrillic; I don't think we can represent it well phonetically whatever we might do. But I don't speak Tajik. —Μετάknowledge discuss/deeds 17:40, 22 December 2016 (UTC)