Wiktionary talk:Thai romanization

Transliterate help

I don't understand this page. How should I transliterate เค็ม in Appendix:Proto-Sino-Tibetan/g-rjum? It would be "khem" in the RTGS. Wyang (talk) 11:01, 24 March 2013 (UTC)

I think this page is too difficult for most people to use. I (and other editors) have been using the Paiboon system, where เค็ม = kem, สวัสดีครับ = sàwàtdii kráp. I like the tone indications, but I don't think they are critical. I think it's okay to use RTGS if you know that system best. —Stephen (Talk) 11:31, 24 March 2013 (UTC)
Thank you. (Hopefully someday transliteration can be achieved automatically so that all Thai words here are transliterated using the same system.) Wyang (talk) 05:32, 25 March 2013 (UTC)
It's only achievable to a degree. There are too many exceptions, especially in terms of traditional versus modern tones, tone sandhi and silent letters. Lao is more phonetic. The spelling/transliteration rules are also too complicated. Google Translate does a terrible job of transliterating Thai, even for words that follow the rules. I'd prefer tones where possible, but some standard Lao systems also ignore tones. --Anatoli (обсудить/вклад) 05:53, 25 March 2013 (UTC)
The word "เค็ม" is made of "ค (kɔɔ)" (k) with a compound diacritic, one part on the left (เ) and one on top (I can't enter it separately, but combined with the consonant it's ค็), plus ม (mɔɔ) (m). The difficulty here is in identifying the complex vowel that wraps the consonant. The concept with this kind of diacritic is similar to Indic languages, where vowels can be on the left, on the right, on top, on the bottom or wrapping letters from two sides. --Anatoli (обсудить/вклад) 06:06, 25 March 2013 (UTC)
Found that short vowel with a "seat" on w:Thai_alphabet - เ◌็◌ (e). So, "เค็ม" is ค + เ◌็◌ + ม. Our table is incomplete and needs some work, especially tones, vowels and diacritics. --Anatoli (обсудить/вклад) 06:18, 25 March 2013 (UTC)
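
For readers following the decomposition above, here is a minimal Python sketch that lists the code points stored in เค็ม. It relies only on the standard `unicodedata` module; the helper name `describe` is an illustrative choice, not anything used on Wiktionary.

```python
import unicodedata

def describe(word):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in word]

# เค็ม written with escapes so the combining mark stays visible in source
for code, name in describe("\u0e40\u0e04\u0e47\u0e21"):
    print(code, name)
```

The stored order is the preposed vowel sign เ, then ค, then maitaikhu, then ม, which is why the frame เ◌็◌ has to be recognized as a unit wrapped around the consonant.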
If it is to be done automatically, the important thing, I think, would be to find syllable divisions in a Thai text input. The way the vowel signs (including the ones placed in front) work in Thai seems to be analogous to Tibetan/Burmese, which shouldn't be a huge problem. (I wrote transliterators for bo (Wylie), my (MLCTS) and km (UN) on zh.wikt some time ago, if anyone is interested.) The irregularities may be an issue if they occur too frequently, though this doesn't (?) seem to be the case judging from Thai alphabet. Wyang (talk) 06:54, 25 March 2013 (UTC)
The number of exceptions is not huge, but there are some in my Thai textbooks. It's no show-stopper, and exceptions can be overridden manually.
I have two transliteration modules on my to-do list: Korean (Module:ko-translit) and Arabic (Module:ar-translit). User:Ruakh and User:ZxxZxxZ did much or most of the work there, but I provided the formulas and the logic to transliterate Korean and set up the majority of Arabic letters. I'd like to help you with a Thai module if I can, perhaps with testing and with collecting all the reading rules, but I need to finish those two as well.
Why don't you import your Tibetan and Burmese scripts? User:Angr would be able to help/check the Burmese module. --Anatoli (обсудить/вклад) 23:04, 25 March 2013 (UTC)
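
To make the idea of rule-based transliteration with manual overrides concrete, here is a deliberately tiny Python sketch. It handles only the syllable shape เ◌็◌ discussed above, and its tables contain only the values quoted in this thread (Paiboon "kem", RTGS "khem"). The names `INITIALS`, `FINALS`, `OVERRIDES` and `romanize` are hypothetical, tones are ignored, and none of this reflects how any actual Wiktionary module works.

```python
import re

# Toy tables: only the readings quoted in this thread are filled in.
INITIALS = {"\u0e04": {"paiboon": "k", "rtgs": "kh"}}  # ค
FINALS = {"\u0e21": "m"}                                # ม
OVERRIDES = {}  # hypothetical: word -> {scheme: romanization} for exceptions

# Closed short-e frame: เ + consonant + maitaikhu + consonant (เ◌็◌)
SHORT_E = re.compile("\u0e40([\u0e01-\u0e2e])\u0e47([\u0e01-\u0e2e])")

def romanize(word, scheme="paiboon"):
    """Romanize one syllable of the form เ◌็◌; raise ValueError otherwise."""
    if word in OVERRIDES:  # manual exceptions win over the rules
        return OVERRIDES[word][scheme]
    match = SHORT_E.fullmatch(word)
    if match is None:
        raise ValueError(f"unsupported syllable shape: {word!r}")
    initial, final = match.groups()
    if initial not in INITIALS or final not in FINALS:
        raise ValueError(f"letter not in the toy tables: {word!r}")
    return INITIALS[initial][scheme] + "e" + FINALS[final]

print(romanize("\u0e40\u0e04\u0e47\u0e21"))          # kem
print(romanize("\u0e40\u0e04\u0e47\u0e21", "rtgs"))  # khem
```

A real module would first need the syllable segmentation mentioned above and much larger tables; the override dictionary is just one possible place to park irregular spellings.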

Update: {{th-pron}}, Module:th-pron and Module:th-translit. Wyang (talk) 13:19, 29 January 2016 (UTC)

Symbols for suggested new transliterations

@Iudexvivorum I've just copied the symbols for new transliterations: ʉ ɛ ɔ ə á à â ǎ é è ê ě ɛ́ ɛ̀ ɛ̂ ɛ̌ ə́ ə̀ ə̂ ə̌ í ì î ǐ ó ò ô ǒ ɔ́ ɔ̀ ɔ̂ ɔ̌ ú ù û ǔ ʉ́ ʉ̀ ʉ̂ ʉ̌. I still need to make proper tables, but I could use some help because it's time-consuming and I'm not so great with tables. :) --Anatoli T. (обсудить/вклад) 04:32, 11 August 2015 (UTC)

IPA

@Wyang, Octahedron80 Could this page be updated to include IPA? —suzukaze (tc) 06:12, 8 September 2017 (UTC)

I feel that if I were to add the IPA part, I really should rewrite and expand the whole thing... consequently nothing gets done because it will take way too much time. A trait of OCPD? (note: not OCD) Hopefully I will remember to come back to this when a bit freer. Wyang (talk) 11:42, 8 September 2017 (UTC)

Ordering

What is the relevance of the alphabetical ordering of Thai? I think it is irrelevant and should be deleted. The description given is wrong. If it is relevant, someone (me?) should rewrite it along the lines of:

1. Preparation: Swap each preposed vowel (i.e. one of เ แ โ ใ ไ) with the following consonant.
2. Level 1: Perform a lexicographic comparison, with consonants ordering before vowels, and with tone marks, maitaikhu (อ็) and thanthakhat (อ์) ignored.
3. Level 2: Compare on the basis of the leftmost difference, with nothing < maitaikhu < tones (in order) < thanthakhat.

RichardW57 (talk) 22:59, 15 December 2017 (UTC)
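
The three steps proposed above could be sketched in Python roughly as follows. This is only one reading of the proposal: it assumes that plain code point order is an acceptable stand-in for the level 1 order once preposed vowels have been swapped (the Unicode Thai block places consonants before vowel signs), and real dictionary collation has further subtleties. The names `swap_preposed` and `sort_key` are illustrative.

```python
PREPOSED = set("\u0e40\u0e41\u0e42\u0e43\u0e44")  # เ แ โ ใ ไ
# Level 2 weights: maitaikhu < tone marks (in order) < thanthakhat
MARKS = {ch: w for w, ch in
         enumerate("\u0e47\u0e48\u0e49\u0e4a\u0e4b\u0e4c", start=1)}

def is_consonant(ch):
    return "\u0e01" <= ch <= "\u0e2e"  # ก .. ฮ

def swap_preposed(word):
    """Step 1: move each preposed vowel after the consonant it precedes."""
    chars = list(word)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREPOSED and is_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2
        else:
            i += 1
    return "".join(chars)

def sort_key(word):
    """Return (level 1 string, level 2 weights per level 1 character)."""
    primary, secondary = [], []
    for ch in swap_preposed(word):
        if ch in MARKS:
            if secondary:  # attach the mark to the preceding character
                secondary[-1] += (MARKS[ch],)
        else:
            primary.append(ch)
            secondary.append(())
    return "".join(primary), secondary

print(swap_preposed("เขต"))                      # ขเต
print(sorted(["กะได", "กสิ"], key=sort_key))     # ['กสิ', 'กะได']
```

Because the key is a tuple, level 2 is consulted only when the level 1 strings are equal, and comparing the per-character weight lists picks out the leftmost difference.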

@RichardW57: Do you have a particular example you're complaining about? เขต (kèet) is sorted not under เ but under ข in the list of lemmas, as expected; the result is more or less right, if imperfect. I believe this is already happening. Please check with @Octahedron80, Wyang. --Anatoli T. (обсудить/вклад) 00:13, 16 December 2017 (UTC)
Additionally, if you want to sort words under a topic category, use the template {{topics}} rather than raw wiki code. I think that swapping front vowels (1.) is enough to index a word; there is no need for richer sorting (that would create unnecessary work and problems). Ah. Lao, Lü and Pali also use the same logic. --Octahedron80 (talk) 00:53, 16 December 2017 (UTC)
@Octahedron80 Where, if anywhere, are the rules given on the page used? Let us consider two words on the list of Thai lemmas, กสิ and กะได, which are correctly sorted in that order - they're on pp. 86 and 91 respectively of the 1999 edition of the Royal Institute Dictionary. Level 1(a) 'left to right' probably helps, but not immediately. I can't see any application of 1(b) 'end of string < consonant symbol <? Thai numeral'. Level 2(a) 'by consonant, left to right' looks useful. The consonant sequences are กส and กด. They differ in the second consonant, and ด comes before ส in the alphabet, so by the rules given, กะได comes before กสิ. But that is wrong!
I agree that if you just sort by the initial consonant (counting อ as a consonant), that will work fairly well for short lists.
While the vowel swapping is needed for Lao and Lü, thereafter it gets complicated. Firstly, it is the complete vowel symbol, not just one character at a time, that gets compared in the comparison step. For Lao, one has to do a syllable-by-syllable comparison, and the algorithms and tools promoted by the Unicode Consortium (UCA/CLDR and ICU) can't cope with that extra task.

RichardW57 (talk) 22:26, 26 January 2018 (UTC)

Paiboon

This may not relate to the page. I wonder who Paiboon is (from Paiboon Publishing?). However, there are two words that read as Paiboon: ไพบูลย์ and ไพบูรณ์. With this ambiguity, I cannot translate it into Thai. --Octahedron80 (talk) 13:30, 16 October 2018 (UTC)

You've understood 'Paiboon'; see https://slice-of-thai.com/pronunciation-guides/#paiboonplus for confirmation. Perhaps we need a Wiktionary entry for the word - but I don't know how easily the three valid quotations will be found. RichardW57 (talk) 22:09, 16 October 2018 (UTC)

@Octahedron80, RichardW57, Wyang: I've made a stub: ไพบูลย์ (pai-buun). Please enhance/fix and add the gender for the given name. --Anatoli T. (обсудить/вклад) 22:21, 16 October 2018 (UTC)
@Atitarev: I meant an entry for 'Paiboon'! RichardW57 (talk) 22:34, 16 October 2018 (UTC)
@RichardW57: The proper noun may not comply with WT:CFI and could be deleted. The Thai word has multiple senses, though, including the company name. --Anatoli T. (обсудить/вклад) 22:39, 16 October 2018 (UTC)
As I said before, a Thai name could be any word in the dictionary (or even one outside the dictionary, made by modifying some letters). Therefore, including a given name or company name in every entry is useless. I'm going to remove them from ไพบูลย์. I just wanted to know whether Paiboon Publishing has an official Thai name. Now I think they haven't, because it was established in the USA. --Octahedron80 (talk) 01:39, 17 October 2018 (UTC)
Interesting that Paiboon Publishing doesn't follow its own transliteration standard; it should be "Paibuun", which is closer to "pai-buun". --Anatoli T. (обсудить/вклад) 22:49, 16 October 2018 (UTC)

Stress in Paiboon+

According to Syllable Stress in Thai, a "~" was introduced in Paiboon+. Since other conventions, such as long vowels, already follow Paiboon+, I suggest we add the "~" symbol to the Paiboon romanization on Wiktionary. --汩汩银泉 (talk) 13:48, 21 April 2024 (UTC)