Wiktionary:Chagatai entry guidelines

Language

Chagatai was a literary Turkic language spoken in Central Asia. It is the ancestor of the Uzbek and Uyghur languages.

Etymology

Inherited terms

The ancestors of Chagatai are, from most recent to earliest:

  • Khorezmian Turkic zkh
  • Karakhanid xqa
  • Proto-Common Turkic trk-cmn-pro
  • Proto-Turkic trk-pro
Perso-Arabic loans
  • Perso-Arabic loans were largely borrowed from, or by way of, Classical Persian fa-cls, and are derived from Arabic ar, Middle Persian pal, etc.

Lemmatization

Encoding

Every letter in Chagatai has multiple stylistic variants, some of which correspond to different Unicode code points. On the English Wiktionary, Chagatai lemmas always use the same Unicode code points as Persian.

Form Codepoint Isolated Final Medial Initial
Unicode variant U+0643 ك ـك ـكـ كـ
lemma form U+06A9 ک ـک ـکـ کـ
Form Codepoint Isolated Final Medial Initial
Unicode variant U+064A ي ـي ـيـ يـ
Unicode variant U+0649 ى ـى ـىـ ىـ
lemma form U+06CC ی ـی ـیـ یـ
Form Codepoint Isolated Final Medial Initial
lemma form U+0647 ه ـه ـهـ هـ
Unicode variant U+06C1 ہ ـہ ـہـ ہـ
Unicode variant U+06BE ھ ـھ ـھـ ھـ
Unicode variant U+06D5 ە ـە ـەـ ەـ

Where a letter must not connect to the following letter, use a zero-width non-joiner (ZWNJ, U+200C); do not use an isolated variant of the letter.
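The code point tables above can be read as a simple character mapping. The following is a minimal sketch in Python, not an existing Wiktionary tool: the names `LEMMA_FORMS`, `normalize_encoding`, and `presentation_forms` are hypothetical, and treating "isolated variant" as a character from the Arabic Presentation Forms blocks is an assumption made for illustration.

```python
# Variant code points -> lemma code points, as listed in the tables above.
LEMMA_FORMS = {
    0x0643: 0x06A9,  # ARABIC LETTER KAF -> KEHEH
    0x064A: 0x06CC,  # ARABIC LETTER YEH -> FARSI YEH
    0x0649: 0x06CC,  # ARABIC LETTER ALEF MAKSURA -> FARSI YEH
    0x06C1: 0x0647,  # ARABIC LETTER HEH GOAL -> HEH
    0x06BE: 0x0647,  # ARABIC LETTER HEH DOACHASHMEE -> HEH
    0x06D5: 0x0647,  # ARABIC LETTER AE -> HEH
}


def normalize_encoding(text: str) -> str:
    """Replace each variant code point with its lemma form.

    All other characters, including ZWNJ (U+200C), are left untouched.
    """
    return text.translate(LEMMA_FORMS)


def presentation_forms(text: str) -> list[str]:
    """Return characters from the Arabic Presentation Forms blocks.

    Assumption: these are the "isolated variants" the guideline warns
    against. Ranges: Forms-A U+FB50..U+FDFF, Forms-B letters U+FE70..U+FEFC.
    """
    return [
        ch for ch in text
        if "\ufb50" <= ch <= "\ufdff" or "\ufe70" <= ch <= "\ufefc"
    ]
```

A page title could be passed through `normalize_encoding` before an entry is created, and a non-empty result from `presentation_forms` would suggest replacing the flagged character with the plain letter followed by a ZWNJ.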

Spelling Variants
  • گ (g) never became mainstream in Chagatai writing, so spellings with it are not lemmatized and ک (k) is always preferred, even for /ɡ/.
  • When پ (p) is replaced by ب (b) or ف (f), the spellings with ب or ف are given as alternative spellings.
  • ڭ spellings are added as alternative spellings of نک (nk /⁠ñ⁠/).
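As a rough illustration of the first and third rules, the lemma spelling can be derived from an alternative spelling by plain string replacement. This is a sketch under stated assumptions: the function name `lemma_spelling` is hypothetical, and the پ / ب / ف rule is deliberately left out because it cannot be decided from the spelling alone.

```python
GAF = "\u06af"    # گ
KEHEH = "\u06a9"  # ک
NG = "\u06ad"     # ڭ
NOON = "\u0646"   # ن


def lemma_spelling(variant: str) -> str:
    """Map an alternative spelling to the preferred lemma spelling.

    ڭ becomes نک and گ becomes ک. Spellings that already use ک are
    returned unchanged.
    """
    return variant.replace(NG, NOON + KEHEH).replace(GAF, KEHEH)
```

If the result differs from the input, the input would be listed as an alternative spelling of the result.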

Transliteration and Transcription

Because Chagatai is a historical language whose readings are often disputed, Wiktionary uses a dual transliteration/transcription system for it. Transliterations are generated automatically by Module:chg-translit, but transcriptions must be entered manually using the |ts= parameter.

Consonants

No. Letter Transcription IPA
1 ا (ʾ) [1] /ʔ/ or ∅ (see below)
1b آ (ʾā) ā [ʔɑː]
2 ب (b) b, p[2] /b/, /p/
3 پ (p) p /p/
4 ت (t) t /t/
5 ث () s /s/
6 ج (j) j /d͡ʒ/
7 چ (č) č /t͡ʃ/
8 ح () h /h/
9 خ (x) x /x/
10 د (d) d /d/
11 ذ () z /z/
12 ر (r) r /r/
13 ز (z) z /z/
14 ژ (ž) ž /ʒ/
15 س (s) s /s/
16 ش (š) š /ʃ/
17 ص () s /s/
18 ض (ż) z /z/
19 ط () t /t/
20 ظ () z /z/
21 ع (ʿ) ' /ʔ/
22 غ (ġ) ġ /ɣ/
23 ف (f) f, p[2] /f/, /p/
24 ق (q) q /q/
25 ک (k) k, g[3] /k/, /ɡ/
26 گ (g) g[3] /ɡ/
27 ل (l) l /l/
28 م (m) m /m/
29 ن (n) n /n/
30 و (w) w /w/
31 ه (h) h, ∅[4] /h/
32 ی (y) y /j/
0 ء (ʾ), أ (ʾ), إ (ʾ), ؤ (ʾ), ئ (ʾ) ' /ʔ/
  1. ^ When ا (ʾ) is acting as a consonant in the initial position, it is not transcribed.
  2. ^ In some manuscripts, پ (p /p/) was in rare cases represented by ب (b) or ف (f).
  3. ^ While Chagatai manuscripts often do not distinguish k and g, Wiktionary maintains the distinction in transcription.
  4. ^ Final ه (h) is not transcribed when it is not pronounced.
Notes
  • As Chagatai does not distinguish letter case, transcriptions should not artificially add it; uppercase letters should never be used.
  • نگ (ng) / نک (nk), when representing /ŋ/, is written as ñ between vowels and ng otherwise.
    • The rare letter ڭ is transcribed ñ.
  • ZWNJ is transcribed as a hyphen (-).
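The notes above lend themselves to a small checker for manually entered transcriptions. The sketch below is illustrative only: the helper names are hypothetical, the use of ŋ as a placeholder for /ŋ/ in a draft is an assumed working convention rather than part of the guideline, and the vowel inventory is taken from the vowel table on this page.

```python
import unicodedata

# Vowel letters used in transcriptions, per the vowel table on this page.
VOWELS = set(unicodedata.normalize("NFC", "aāäeēiīıoōöuūü"))
ENG = "\u014b"   # ŋ, assumed placeholder for /ŋ/ in a draft
ZWNJ = "\u200c"


def spell_velar_nasal(draft: str) -> str:
    """Write each placeholder as ñ between vowels and as ng otherwise."""
    draft = unicodedata.normalize("NFC", draft)
    out = []
    for i, ch in enumerate(draft):
        if ch != ENG:
            out.append(ch)
            continue
        prev_is_vowel = i > 0 and draft[i - 1] in VOWELS
        next_is_vowel = i + 1 < len(draft) and draft[i + 1] in VOWELS
        out.append("\u00f1" if prev_is_vowel and next_is_vowel else "ng")
    return "".join(out)


def check_transcription(ts: str) -> list[str]:
    """Return a list of problems found in a transcription string."""
    problems = []
    if any(ch.isupper() for ch in ts):
        problems.append("uppercase letters should never be used")
    if ZWNJ in ts:
        problems.append("ZWNJ should be written as a hyphen")
    return problems
```

The strings used to try this out (for example `"a\u014ba"`) are arbitrary test inputs, not attested Chagatai words.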

Vowels

Unlike Ottoman Turkish, Chagatai typically used matres lectionis in words of Turkic origin, and diacritics were less common outside of Perso-Arabic loans. Chagatai orthography clearly distinguishes only whether a vowel was rounded, unrounded, or low, and some Chagatai dictionaries are hyper-specific (i.e. they include dialectal or conditional allophones), so etymology will need to be considered when choosing which vowel to use.

Possible corresponding vowels[1]
Letter | Front | Back
ا (ʾ), آ (ʾā) | ä | a, ā[2]
ی (y) | e, ē,[2] i, ī[2] | e,[3] ı[4]
و (w) | ö, ü[5] | o, ō,[2] u, ū[2]
  1. ^ In Turkic words, all vowels are typically in the same set (i.e. all front or all back) but in Perso-Arabic words both sets of vowels can appear in a single word.
  2. ^ Length distinctions existed only in Perso-Arabic loans and should not be marked otherwise.
  3. ^ e may ignore vowel harmony, even in Turkic words.
  4. ^ ı /ɯ/ later merged into i /i/ in what became Uyghur and Uzbek, but should still be distinguished when possible.
  5. ^ A somewhat uncommon exception is the usage of ـُیـ (-y-) for ü. In this case, it is preferable to have a diacritic in the headword.
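The front/back split in the table can be used to sanity-check a transcription of a Turkic word. The following is a minimal sketch, with the hypothetical function `harmony_class`; it treats e as neutral because the table lists it in both columns, and it only flags mixed words for review since Perso-Arabic loans may legitimately mix both sets.

```python
import unicodedata

# Vowel sets from the table above; e is listed in both columns, so it is
# treated as neutral and belongs to neither set.
FRONT = set(unicodedata.normalize("NFC", "äēiīöü"))
BACK = set(unicodedata.normalize("NFC", "aāıoōuū"))


def harmony_class(ts: str) -> str:
    """Classify a transcription as 'front', 'back', 'mixed', or 'neutral'."""
    ts = unicodedata.normalize("NFC", ts)
    has_front = any(ch in FRONT for ch in ts)
    has_back = any(ch in BACK for ch in ts)
    if has_front and has_back:
        return "mixed"
    if has_front:
        return "front"
    if has_back:
        return "back"
    return "neutral"
```

For the transcriptions irün and irin cited on this page, the function returns `"front"`. A `"mixed"` result is not an error in itself; it is a prompt to confirm that the word is a loan or that the vowels were chosen deliberately.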
Notes
  • There should never be a vowel hiatus in transliteration, and diphthongs are interpreted as a sequence of vowel + semivowel.
  • In Perso-Arabic words, short vowels are always transliterated as ä or a (for ـَ), i or ı (for ـِ), ü or u (for ـُ), ignoring the conditional (i.e. allophonic) lowering of short vowels before glottal consonants. Perso-Arabic short vowels must have been close to these vowels, as evidenced by the neutralization of vowel length in Uzbek, though short e and o can appear in Turkic words.
  • Vowel transcription should reflect spelling variations, e.g. ایرون (irün), ایرین (irin).
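The short-vowel rule for Perso-Arabic words amounts to a two-way lookup per diacritic. The sketch below assumes the editor has already decided whether the front or back value applies, since this page does not state how that choice is made; the names `SHORT_VOWELS` and `short_vowel` are hypothetical.

```python
FATHA = "\u064e"  # ـَ
KASRA = "\u0650"  # ـِ
DAMMA = "\u064f"  # ـُ

# Diacritic -> (front value, back value), as given in the note above.
SHORT_VOWELS = {
    FATHA: ("\u00e4", "a"),  # ä or a
    KASRA: ("i", "\u0131"),  # i or ı
    DAMMA: ("\u00fc", "u"),  # ü or u
}


def short_vowel(diacritic: str, front: bool) -> str:
    """Return the short vowel for a diacritic in a Perso-Arabic word.

    `front` is supplied by the editor. The values e and o are never
    produced, which mirrors the rule that allophonic lowering is ignored.
    """
    front_value, back_value = SHORT_VOWELS[diacritic]
    return front_value if front else back_value
```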