Wiktionary:Chagatai entry guidelines
This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.
Language
Chagatai was a literary Turkic language spoken in Central Asia. It is the ancestor of the Uzbek and Uyghur languages.
Etymology
- Inherited terms
The ancestors of Chagatai are, in the following order:
- Khorezmian Turkic (zkh)
- Karakhanid (xqa)
- Proto-Common Turkic (trk-cmn-pro)
- Proto-Turkic (trk-pro)
- Perso-Arabic loans
- Perso-Arabic loans were largely borrowed from, or by way of, Classical Persian (fa-cls), and derived from Arabic (ar), Middle Persian (pal), etc.
Lemmatization
- Encoding
Every letter in Chagatai has multiple stylistic variants, sometimes corresponding to different Unicode code points. On the English Wiktionary, Chagatai lemmas always use the same Unicode code points as Persian.
| Form | Codepoint | Isolated | Final | Medial | Initial |
|---|---|---|---|---|---|
| Unicode variant | U+0643 | ك | ـك | ـكـ | كـ |
| lemma form | U+06A9 | ک | ـک | ـکـ | کـ |

| Form | Codepoint | Isolated | Final | Medial | Initial |
|---|---|---|---|---|---|
| Unicode variant | U+064A | ي | ـي | ـيـ | يـ |
| Unicode variant | U+0649 | ى | ـى | ـىـ | ىـ |
| lemma form | U+06CC | ی | ـی | ـیـ | یـ |

| Form | Codepoint | Isolated | Final | Medial | Initial |
|---|---|---|---|---|---|
| lemma form | U+0647 | ه | ـه | ـهـ | هـ |
| Unicode variant | U+06C1 | ہ | ـہ | ـہـ | ہـ |
| Unicode variant | U+06BE | ھ | ـھ | ـھـ | ھـ |
| Unicode variant | U+06D5 | ە | ـە | ـەـ | ەـ |
Where a letter must not connect to the following letter, use a ZWNJ (zero-width non-joiner, U+200C) rather than an isolated variant of the letter.
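The code point tables above can be read as a simple substitution map. The following is a minimal sketch of that idea; the names `LEMMA_CODEPOINTS` and `normalize_to_lemma_form` are hypothetical and are not part of any Wiktionary module.

```python
# Hypothetical helper: maps the "Unicode variant" code points from the
# tables above to the corresponding "lemma form" code points.
LEMMA_CODEPOINTS = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA -> ARABIC LETTER FARSI YEH
    "\u06C1": "\u0647",  # ARABIC LETTER HEH GOAL -> ARABIC LETTER HEH
    "\u06BE": "\u0647",  # ARABIC LETTER HEH DOACHASHMEE -> ARABIC LETTER HEH
    "\u06D5": "\u0647",  # ARABIC LETTER AE -> ARABIC LETTER HEH
})


def normalize_to_lemma_form(text: str) -> str:
    """Return text with variant code points replaced by lemma-form ones."""
    return text.translate(LEMMA_CODEPOINTS)


if __name__ == "__main__":
    sample = "\u0643\u064A"  # variant kaf + variant yeh
    print([f"U+{ord(c):04X}" for c in normalize_to_lemma_form(sample)])
```

A blind substitution like this may change how letters join: for example, U+06D5 does not connect to a following letter while U+0647 does, so a ZWNJ may need to be added by hand after such a replacement.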
- Spelling Variants
- گ (g) never became mainstream in Chagatai writing, so spellings with it are not lemmatized and ک (k) is always preferred, even for /ɡ/.
- When پ (p) is replaced by ب (b) or ف (f), the spellings with the latter are given as alternative spellings.
- ڭ spellings are added as alternative spellings of نک (nk /ñ/).
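Two of the rules above are mechanical enough to sketch in code: گ is folded into ک, and ڭ is expanded to نک. The function names below are hypothetical and serve only as an illustration. The پ rule is left out because it cannot be detected from the spelling alone.

```python
GAF = "\u06AF"    # گ, not used in lemma spellings
KEHEH = "\u06A9"  # ک, the preferred letter, even for /ɡ/
NG = "\u06AD"     # ڭ, the rare dedicated letter for the velar nasal
NOON = "\u0646"   # ن


def lemma_spelling(spelling: str) -> str:
    """Return the lemma spelling implied by the gaf and ng rules above."""
    return spelling.replace(GAF, KEHEH).replace(NG, NOON + KEHEH)


def is_alternative_spelling(spelling: str) -> bool:
    """True if the spelling differs from its lemma spelling under these rules."""
    return lemma_spelling(spelling) != spelling
```

Under this sketch, a spelling for which `is_alternative_spelling` returns `True` would be entered as an alternative form that points to the result of `lemma_spelling`.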
Transliteration and Transcription
Because Chagatai is a historical language with often disputed readings, it employs a double transliteration/transcription system.
Transliteration is automatically generated by Module:chg-translit, but transcriptions need to be entered manually using the |ts= parameter.
Consonants
| No. | Letter | Transcription | IPA |
|---|---|---|---|
| 1 | ا (ʾ) | ∅[1] | /ʔ/ or ∅ (see below) |
| 1b | آ (ʾā) | ā | [ʔɑː] |
| 2 | ب (b) | b, p[2] | /b/, /p/ |
| 3 | پ (p) | p | /p/ |
| 4 | ت (t) | t | /t/ |
| 5 | ث (s̱) | s | /s/ |
| 6 | ج (j) | j | /d͡ʒ/ |
| 7 | چ (č) | č | /t͡ʃ/ |
| 8 | ح (ḥ) | h | /h/ |
| 9 | خ (x) | x | /x/ |
| 10 | د (d) | d | /d/ |
| 11 | ذ (ẕ) | z | /z/ |
| 12 | ر (r) | r | /r/ |
| 13 | ز (z) | z | /z/ |
| 14 | ژ (ž) | ž | /ʒ/ |
| 15 | س (s) | s | /s/ |
| 16 | ش (š) | š | /ʃ/ |
| 17 | ص (ṣ) | s | /s/ |
| 18 | ض (ż) | z | /z/ |
| 19 | ط (ṭ) | t | /t/ |
| 20 | ظ (ẓ) | z | /z/ |
| 21 | ع (ʿ) | ' | /ʔ/ |
| 22 | غ (ġ) | ġ | /ɣ/ |
| 23 | ف (f) | f, p[2] | /f/, /p/ |
| 24 | ق (q) | q | /q/ |
| 25 | ک (k) | k, g[3] | /k/, /ɡ/ |
| 26 | گ (g) | g[3] | /ɡ/ |
| 27 | ل (l) | l | /l/ |
| 28 | م (m) | m | /m/ |
| 29 | ن (n) | n | /n/ |
| 30 | و (w) | w | /w/ |
| 31 | ه (h) | h, ∅[4] | /h/ |
| 32 | ی (y) | y | /j/ |
| 0 | ء (ʾ), أ (ʾ), إ (ʾ), ؤ (ʾ), ئ (ʾ) | ' | /ʔ/ |
- [1] When ا (ʾ) acts as a consonant in initial position, it is not transcribed.
- [2] پ (p /p/) was rarely represented by ب (b) or ف (f) in some manuscripts.
- [3] While Chagatai manuscripts often do not distinguish k and g, Wiktionary maintains the distinction in transcription.
- [4] Final ه (h) is not transcribed when it is not pronounced.
- Notes
- As the Chagatai script does not distinguish letter case, transcriptions should not artificially add it, and uppercase letters should never be used.
- نگ (ng) / نک (nk), when representing /ŋ/, is written as ñ between vowels and ng otherwise.
- The rare letter ڭ is transcribed ñ.
- ZWNJ is transcribed as a hyphen (-).
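The nasal and letter-case notes above can be sketched as two small checks. This is only an illustration under stated assumptions: the draft transcription marks /ŋ/ with the placeholder ŋ, which is not part of the scheme itself, and the function names and the vowel inventory (taken from the vowel table on this page) are choices made for this example.

```python
import unicodedata

# Vowel letters used in transcription, normalized to precomposed form.
VOWELS = set(unicodedata.normalize("NFC", "aāäeēiīıoōöuūü"))

PLACEHOLDER = "\u014B"  # ŋ, assumed input marker for the velar nasal


def render_velar_nasal(draft: str) -> str:
    """Write the placeholder as ñ between vowels and as ng elsewhere."""
    draft = unicodedata.normalize("NFC", draft)
    out = []
    for i, ch in enumerate(draft):
        if ch != PLACEHOLDER:
            out.append(ch)
            continue
        prev_is_vowel = i > 0 and draft[i - 1] in VOWELS
        next_is_vowel = i + 1 < len(draft) and draft[i + 1] in VOWELS
        out.append("\u00F1" if prev_is_vowel and next_is_vowel else "ng")
    return "".join(out)


def has_uppercase(transcription: str) -> bool:
    """True if the transcription contains any uppercase letter."""
    return any(ch.isupper() for ch in transcription)
```

For the abstract input `aŋa` the sketch yields `aña`, while `aŋ` and `aŋla` yield `ang` and `angla`, because the nasal is not flanked by vowels on both sides.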
Vowels
Unlike Ottoman Turkish, Chagatai typically used matres lectionis in words of Turkic origin, and it was less common to use diacritics outside of Perso-Arabic loans. Chagatai orthography clearly distinguishes only whether a vowel is rounded, unrounded, or low. In addition, some Chagatai dictionaries are hyper-specific (e.g. they include dialectal or conditional allophones), so etymological considerations need to be taken into account when choosing which vowel to use.
| Letter | Possible front vowels[1] | Possible back vowels[1] |
|---|---|---|
| ا (ʾ), آ (ʾā) | ä | a, ā[2] |
| ی (y) | e, ē,[2] i, ī[2] | e,[3] ı[4] |
| و (w) | ö, ü[5] | o, ō,[2] u, ū[2] |
- [1] In Turkic words, all vowels are typically in the same set (i.e. all front or all back), but in Perso-Arabic words both sets of vowels can appear in a single word.
- [2] Length distinctions only existed in Perso-Arabic loans, and should not be marked otherwise.
- [3] e may ignore vowel harmony, even in Turkic words.
- [4] ı /ɯ/ later merged into i /i/ in what became Uyghur and Uzbek, but should still be distinguished when possible.
- [5] A somewhat uncommon exception is the usage of ـُیـ (-y-) for ü. In this case, it is preferable to have a diacritic in the headword.
- Notes
- There should never be a vowel hiatus in transliteration, and diphthongs are interpreted as a sequence of vowel + semivowel.
- In Perso-Arabic words, short vowels are always transliterated as ä or a (for ـَ), i or ı (for ـِ), ü or u (for ـُ), ignoring the conditional (i.e. allophonic) lowering of short vowels before glottal consonants. Perso-Arabic short vowels must have been near these vowels, as evidenced by the neutralization of vowel length in Uzbek, though a short e and o can appear in Turkic words.
- Vowel transcription should reflect spelling variations, e.g. ایرون (irün), ایرین (irin).
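The front/back sets in the vowel table and the hiatus note lend themselves to a quick sanity check on a transcription. The sketch below is illustrative only: the function names are hypothetical, e is treated as neutral following the footnote on vowel harmony, and a "mixed" result is not an error by itself because Perso-Arabic words may mix both sets.

```python
import unicodedata

# Vowel sets from the table above; e is kept apart because it may ignore harmony.
FRONT = set(unicodedata.normalize("NFC", "äēiīöü"))
BACK = set(unicodedata.normalize("NFC", "aāıoōuū"))
ALL_VOWELS = FRONT | BACK | {"e"}


def harmony_class(transcription: str) -> str:
    """Return 'front', 'back', 'mixed', or 'neutral' for a transcription."""
    text = unicodedata.normalize("NFC", transcription)
    seen = set()
    for ch in text:
        if ch in FRONT:
            seen.add("front")
        elif ch in BACK:
            seen.add("back")
    if len(seen) == 2:
        return "mixed"
    return seen.pop() if seen else "neutral"


def has_hiatus(transcription: str) -> bool:
    """True if two vowel letters stand directly next to each other."""
    text = unicodedata.normalize("NFC", transcription)
    return any(a in ALL_VOWELS and b in ALL_VOWELS for a, b in zip(text, text[1:]))
```

With the two transcriptions given above, `harmony_class("irün")` and `harmony_class("irin")` both return `"front"`, and `has_hiatus` returns `False` for each.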
