Wiktionary talk:Persian transliteration

Is this official, i.e. should I be changing all the "ā"s to "â"s? --Ivan Štambuk 19:07, 2 April 2008 (UTC)

I don't think so. I raised the issue of transliteration when I started using this site, but the discussion was kind of abandoned. However Persian entries still only number around two and a half thousand, so it's not a disaster that it's not yet standardised. I think it's right to say that Dijan uses ā/š/kh/ž/č and I think Stephen G. Brown uses ā/š/x/ž/č. I use â/sh/kh/zh/ch. My own opinion is that sh/kh/zh/ch is the most user-friendly and more common, and that š/kh/ž/č (or š/x/ž/č) is a more academic style. As you know, both systems have advantages and disadvantages. There does need to be a decision between the two. Pistachio 19:24, 2 April 2008 (UTC)
Regarding ā and â, I believe that â is actually marginally more common overall. Pistachio 19:32, 2 April 2008 (UTC)
Yesterday I saw a Persian entry that simultaneously used both 'â' and 'ā' in its transliteration, and that is kind of... very wrong, for obvious reasons. The macron is used almost universally across the world's languages for transliterating/transcribing long vowels; the circumflex is kind of a Persian-only thing.
As for 'sh' vs 'š' and similar, I'd personally prefer the latter, and generally a single letter over a corresponding digraph that is supposed to somehow "approximate" the phonetic value of the transliterated character in English-familiar terms. There's a ===Pronunciation=== section to accommodate those unfamiliar with Persian phonology; we shouldn't be bastardizing the transliteration scheme instead.
I suggest someone knowledgeable put this to a vote at some point. It would also be great if someone could expand this with examples of stress notation and hyphen-separation for prefixes and enclitics in transliterations. --Ivan Štambuk 16:44, 3 April 2008 (UTC)
It is counter-productive to use 'š', a letter which is not familiar to native English speakers. Pistachio 19:25, 3 April 2008 (UTC)
And I might add that it's a bit bloody rude to come in here and accuse me of 'bastardising', no less, the transliteration scheme after all the hundreds of hours of my own time I have spent trying to add entries in Persian. Pistachio 19:36, 3 April 2008 (UTC)
I'm sorry if it felt rude to you, but I didn't accuse you of anything, nor did I try to invalidate your contributions here. Two important points must be made clear:
  • The macron is generally much more prevalent for indicating vowel length.
  • There are numerous transliteration schemes for other languages here on Wiktionary using obscure characters such as 'ṅ', 'ś' or 'ě', which the average English user (for a definition of "average user", browse WT:FEED) will probably never encounter in his life. Confining oneself to the world of 7-bit ASCII or Latin-1 in a Unicode-aware Wiktionary might be exaggerating.
Here is a PDF comparing various transliteration schemes. I suggest this be put to a vote ASAP, because it looks silly that the three people contributing to Persian entries have for 2+ years each been using their own mutually incompatible scheme. --Ivan Štambuk 11:19, 4 April 2008 (UTC)


Does the current version correspond to an established system? Can someone provide a reference?

Discussing the requirement for references at Wiktionary:Beer parlour#Transliteration appendices. —Michael Z. 20:10, 12 April 2008 (UTC)

Revival of discussion

The discussion regarding transliteration schemes and the usage thereof needs to be revived, transliteration being an important part of a dictionary containing entries in an alien script. There are some points I would like to bring forth, each of them representing my personal views:

  • A 1:1 relation between Persian and Latin letters is preferable, for a number of reasons:
    • Gemination is more clearly represented, e.g. čč instead of chch.
    • Ambiguity problems are resolved, e.g. ž can only mean ژ, as opposed to zh, which could mean زه or ژ.
    • It is more consistent and easy to read.
  • Capitalisation of proper names, initial words etc. in transliterated text should always be avoided. Capitalisation is a concept absent in Persian and therefore it should be left out.
  • Homophonic letters should be transliterated differently, i.e. ظ ز ض ذ should all have their own Latin counterparts, which is not the case in many of the transliteration schemes employed on Wiktionary. Transliteration schemes should be as bijective as possible.

Much more can be said on this subject; these are some starting points. ✎ HannesP · talk 21:51, 5 January 2012 (UTC)
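
As a rough illustration of the gemination and ambiguity points above, here is a minimal Python sketch. The two tables are hypothetical, partial letter mappings made up for this example (they are not the actual Wiktionary scheme), and unwritten vowels are ignored. The helper `readings` lists every sequence of Persian letters that could lie behind a given Latin string.

```python
# Hypothetical, partial letter tables for illustration only; vowels are ignored.
DIGRAPH = {"ژ": "zh", "ز": "z", "ه": "h", "چ": "ch"}
SINGLE = {"ژ": "ž", "ز": "z", "ه": "h", "چ": "č"}


def romanize(letters, table):
    """Romanize a string of Persian letters one letter at a time."""
    return "".join(table[ch] for ch in letters)


def readings(latin, table):
    """Return every Persian letter sequence whose romanization equals `latin`."""
    if not latin:
        return [""]
    found = []
    for persian, roman in table.items():
        if latin.startswith(roman):
            for rest in readings(latin[len(roman):], table):
                found.append(persian + rest)
    return found


print(readings("zh", DIGRAPH))   # two candidates: ژ alone, or ز followed by ه
print(readings("ž", SINGLE))     # one candidate: ژ
print(romanize("چچ", DIGRAPH), romanize("چچ", SINGLE))  # chch čč
```

Under these assumed tables the digraph scheme cannot be reversed unambiguously, while the single-letter scheme can; a real scheme would need the full alphabet and a decision about vowels.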

The above suggestions incorporate a mix of transliteration standards, especially diacritics employed by non-English systems such as French and German ones. I vote that we adopt one that conforms to an English standard, such as ALA-LC or IJMES. I am very puzzled over why غ must be rendered as ğ. This is confusing, as it is also the modern Turkish yumuşak ğ. The caron is also primarily employed in non-English systems like the DGH; ALA-LC and IJMES do not use it, ever. Does this clarify much for an English speaker? IMHO, not much. Jemiljan (talk) 16:27, 30 September 2016 (UTC)
I think this issue has pretty much been handled. Except for capitalization, which I use strictly as an aid in reading transliterated text. — [Ric Laurent] — 16:27, 6 January 2012 (UTC)
Further revival, regarding capitalisation, not just Persian: Wiktionary:Beer_parlour/2014/January#Capital_letters_in_transliterations_of_languages_that_do_not_have_capital_letters. --Anatoli (обсудить/вклад) 23:32, 2 February 2014 (UTC)

New Persian Romanization System


The purpose of any Romanisation/transliteration system is to correctly identify and distinguish different letters. These systems were first developed by scholars and library cataloguers in an effort to systematise how non-Latin letters are rendered in Latin script. Furthermore, transliteration systems are not, strictly speaking, phonetic systems. The purpose is to convert written letters, not necessarily to specify their exact sounds (which can change with dialect anyway).

Under the current standard, I observe that four different Persian letters, ز, ذ, ض and ظ, are all transliterated simply as "z", without diacritics to distinguish them. Likewise, س, ص and ث are all rendered simply as "s". These are phonetic conventions that do not in any way distinguish between the original letters. Hence, this Wiktionary "standard" fails to meet the most basic and essential criterion of a transliteration system, which is to distinguish between letters written in a non-Latin script.

For this reason, I vote that additional diacriticals be added that follow an accepted language standard.

Compare with the system developed for Arabic Transliteration. While my own preference is against the use of the caron/hacek for it follows primarily non-English European transliteration standards, I do see that some have argued against the use of digraphs like kh, zh, gh for specific letters. Yet for Arabic, ǧ is used for ج when j is perfectly acceptable, and here we have ğ for غ.

So, some suggestions to clarify and improve the current standard:

س = s

ث = ṯ

ص = ṣ

ز = z

ذ = ẕ

ظ = ż

ض = ẓ Jemiljan (talk) 18:21, 30 September 2016 (UTC)
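
To make the point about distinguishing letters concrete, here is a small Python sketch that checks whether a letter table is one-to-one. `CURRENT` encodes only the collapsing described in this comment (all four letters as "z", all three as "s"), and `PROPOSED` is the list suggested above; the function name `collisions` is invented for this example.

```python
from collections import defaultdict

# The collapsing described above: four letters become "z", three become "s".
CURRENT = {"س": "s", "ث": "s", "ص": "s", "ز": "z", "ذ": "z", "ظ": "z", "ض": "z"}
# The mapping suggested in this comment.
PROPOSED = {"س": "s", "ث": "ṯ", "ص": "ṣ", "ز": "z", "ذ": "ẕ", "ظ": "ż", "ض": "ẓ"}


def collisions(table):
    """Group Persian letters by Latin output and keep only the shared outputs."""
    by_latin = defaultdict(list)
    for persian, latin in table.items():
        by_latin[latin].append(persian)
    return {latin: letters for latin, letters in by_latin.items() if len(letters) > 1}


print(collisions(CURRENT))    # "s" and "z" each stand for several letters
print(collisions(PROPOSED))   # {}
```

An empty result means every Latin symbol maps back to exactly one Persian letter within the table, which is the property being asked for here; it says nothing about which diacritics are preferable.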

1.) The purpose of transliteration/transcription may be to represent spelling or to represent pronunciation. Both are equally justified. Your claim that the former alone is "the most basic and essential criterion" is uncandid. The question is what is useful in a given context. When I use Persian words in a scientific text – for example: "The principle of vilāyat-i faqīh remains a matter of dispute." – then I do need exact reflection of spelling. But here on Wiktionary we don't do that. Instead we always give the original spelling and only add the transcription for the sake of pronunciation. Your dialectal argument is not valid, because none of the letters in question are distinguished in any dialect of Persian. (You might be able to make a case concerning vowels, however.)
2.) We do use "j" for Arabic ج. I don't think this is a recent change either, but maybe it was. In fact, I prefer using ǧ, because the letter is pronounced /ɡ/ or /ɟ/ in some accents, and is etymologically g as well.
3.) In your proposed transcription you may want to use underlined s instead of underlined t for ث for the sake of consistency. And switch the representations of ض and ظ. That would leave you with the official DMG transcription for Persian. Again, however, I don't think we need it here. Kolmiel (talk) 23:42, 13 February 2017 (UTC)


@ZxxZxxZ, Irman, Dijan: Hi. User:Kaixinguo~enwiktionary insists on using "ow" for the diphthong [ou] instead of "ou". Do you agree with this? (Please ping any active Persian editor if I missed). --Anatoli T. (обсудить/вклад) 10:31, 4 January 2018 (UTC)

I agree, because Persian-speakers normally use "ou" for long /u/, so it's ambiguous. --Z 13:22, 4 January 2018 (UTC)
@ZxxZxxZ, Kaixinguo~enwiktionary, Irman, Dijan: Thanks. I have updated the policy as the preferred one and added some other symbols used occasionally in diff. Please follow the transliteration policy you have endorsed! :) --Anatoli T. (обсудить/вклад) 22:06, 4 January 2018 (UTC)
@ZxxZxxZ, Kaixinguo~enwiktionary, Irman, Dijan For the record, I didn't 'insist' on this at all, I made one edit to a word where he had put an erroneous transliteration and I used 'ow' because it was my belief that that was the recent consensus. User:Atitarev's decision to edit the policy is premature. And there is no need to tell User:ZxxZxxZ to follow the policy, when he has always followed every policy anyway. It comes across in a bad way. In fact, it was User:Atitarev who decided unilaterally to change our standard of capitalising proper noun transliterations and then altered only a few of the entries, leaving half in one way and half the other. Kaixinguo~enwiktionary (talk) 11:42, 15 January 2018 (UTC)
@Kaixinguo~enwiktionary: The erroneous transliteration you are talking about in خودرو‎ wasn't my edit, I only corrected "kh" to "x". I don't know why you are trying to put me in a negative light. Converting to lower case wasn't my decision either. The "ow" rule could never be followed if it was never part of the policy. It was always "ou"; I changed it since everyone seemed to agree "ow" was better than "ou". --Anatoli T. (обсудить/вклад) 11:57, 15 January 2018 (UTC)
I'm sorry for the late reply. No, not all of us are in agreement. I disagree with the change from "ou" to "ow". Following Z's logic of ambiguity, should we also use "oo" instead of "u" and "eh" instead of every "e"? --Dijan (talk) 16:41, 17 January 2018 (UTC)
That's a different case, a one-letter transcription is better than a two-letter one. But there's no advantage in using "ou" instead of "ow". --Z 12:36, 22 January 2018 (UTC)