User talk:DerekWinters

WT:HI TR

I had to correct the transliteration of your Hindi entries according to the link above. --Lo Ximiendo (talk) 17:22, 17 November 2012 (UTC)

Sorry for the trouble, I will try and do them properly from now on. Thanks for doing so though. DerekWinters (talk) 17:23, 17 November 2012 (UTC)
Please also see WT:TE TR, etc. Thanks —Μετάknowledgediscuss/deeds 17:26, 17 November 2012 (UTC)

रूपक

Derek, when you add a new L-2 entry, make sure you put a line between the entries with four hyphens (----). Compare what I did. Thank you for reading. --Lo Ximiendo (talk) 22:48, 18 November 2012 (UTC)

Oh
Good point, thanks for letting me know. DerekWinters (talk) 00:57, 19 November 2012 (UTC)

Derived terms in JA entries

Heya, I see you've added a number of ====Derived terms==== sections to JA entries; thank you for that.

A couple minor points:

  • Make sure that the ====Derived terms==== header comes at the end of the relevant POS or etymology section. Over on the entry, you added it before the POS header ([1]), but then I also see on the entry that you added it in the correct place.  :)
  • The {{l}} template doesn't need the sc parameter for JA. Compare:
    • (ひ, hi) -- using {{l|ja|日|tr=ひ, ''hi''}}
    • (ひ, hi) -- using {{l|ja|sc=Jpan|日|tr=ひ, ''hi''}}
Identical results, at least on my machine with FF 18 running on fully-patched Win 7.

Otherwise, looking good! Thank you for expanding the entries!

Cheers, -- Eiríkr Útlendi │ Tala við mig 17:38, 1 February 2013 (UTC)

Thank you and I shall be sure to place the entries in the correct location, the easier way now :). DerekWinters (talk) 22:38, 1 February 2013 (UTC)

photon and hydrogen in Sanskrit

Hi! Regarding the entries such as उदजन (udajana) and प्रकाशाणु (prakāśāṇu) - we can only add words that are actually attested in the written corpus, not made-up words that nobody uses/has used. Modern words coined/borrowed into extinct/ancient languages through some kind of "revival" efforts can only be added if there is evidence for them. E.g. we already have some modern Latin terms that can be backed by quotations from Vatican publications. So unless there is actual attestation for these Sanskrit terms, they should be removed. --Ivan Štambuk (talk) 10:29, 24 March 2013 (UTC)

I understand. In all Indic languages, these words are used (with slight alterations in two or three) and so I assumed that, since they derived from Sanskrit, I should simply add them under Sanskrit too. And I doubt I'll be able to find any scientific articles written in Sanskrit, as they are primarily done in Hindi or English in India. DerekWinters (talk) 03:46, 26 March 2013 (UTC)

Hungarian words containing tan (science)

Hi, the Hungarian words containing tan (science) are compound words (tan is not a suffix). What was your source? Can you please go back and correct all of them? Thanks. --Panda10 (talk) 19:26, 27 December 2013 (UTC)

I corrected all of them and deleted the category. --Panda10 (talk) 20:23, 27 December 2013 (UTC)

Module:te-translit

I will try later to make it work. --Anatoli (обсудить/вклад) 04:17, 30 January 2014 (UTC)

Thanks. I think that all that needs to be done is to include Module:te-translit in Module:languages/data2 under Telugu. DerekWinters (talk) 04:20, 30 January 2014 (UTC)
Yes, that's the right place but need to make it work first. Something is not right with diacritics: Module:te-translit/testcases --Anatoli (обсудить/вклад) 04:33, 30 January 2014 (UTC)
It looks better now, I used literal diacritics, not UTF codes, which were for another script, anyway. Please check: Module talk:te-translit. Do you actually know Telugu? --Anatoli (обсудить/вклад) 05:07, 30 January 2014 (UTC)
Thank you. No, I don't know how to speak Telugu, but I do know how to write it. The Indic scripts are very easy to learn once one is learned, because they are all so similar. DerekWinters (talk) 05:13, 30 January 2014 (UTC)
So, there's no dropping of inherent "a"? Telugu is now partially enabled. To make it mandatory in headword templates, the templates need to change or if manual transliteration is removed, the automatic will work, e.g. see అంకపాళి (aṃkapāḷi) (the first noun I've come across). You can try Tamil, Kannada, etc. based on the Telugu module. If you want to edit here, please consider adding Babel to your user page. --Anatoli (обсудить/вклад) 05:41, 30 January 2014 (UTC)
Yes, the Southern Indian writing systems do not leave off the inherent "a" as the Northern Indian scripts do out of "laziness". Thank you very much. I just made a Kannada one Module:kn-translit, but the test cases are having a problem Module:kn-translit/testcases. Could you take a look? DerekWinters (talk) 05:44, 30 January 2014 (UTC)
I don't know what the problem is there but I will look into it, when I'm at my desktop. --Anatoli (обсудить/вклад) 07:43, 30 January 2014 (UTC)
It seems to work now but please check if all letters are transliterated correctly and nothing's missing. --Anatoli (обсудить/вклад) 11:44, 30 January 2014 (UTC)
Yes, the Kannada one is good, no issues. Thank you for all the help so far. However, as I tried to make a Tamil one Module:ta-translit, there were complications. There are 3 digraphs, {ஃப-f, ஃஜ-z, ஃஸ-x}, but I am unsure how to deal with them properly. If you could help with that as well? DerekWinters (talk) 02:22, 31 January 2014 (UTC)
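For what it's worth, digraphs like these are usually handled by trying the two-character sequences before any single-character rule. Below is a minimal Python sketch of that idea. It is not the code of Module:ta-translit: the tables are a tiny hypothetical subset, the function name `translit_consonants` is made up, ḵ for a bare ஃ is an assumed value, and vowels and the inherent "a" are ignored entirely.

```python
# Hypothetical, heavily reduced tables; a real module would cover the whole script.
DIGRAPHS = {"ஃப": "f", "ஃஜ": "z", "ஃஸ": "x"}
SINGLES = {"ஃ": "ḵ", "ப": "p", "ஜ": "j", "ஸ": "s"}  # ḵ for bare ஃ is an assumption


def translit_consonants(text):
    """Map consonant letters only, preferring a digraph over its two parts."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            # The two-character sequence wins over the single-letter rules.
            out.append(DIGRAPHS[pair])
            i += 2
        else:
            # Unknown characters are passed through unchanged.
            out.append(SINGLES.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

With these tables, ஃப comes out as "f" while a plain ப still comes out as "p", which is the behaviour the digraph rule needs.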
I didn't have luck with it, sorry. I have asked a question here: Wiktionary:Grease_pit/2014/January#Module:ta-translit_-_another_transliteration_module_.28Tamil.29. Somebody might help later. --Anatoli (обсудить/вклад) 05:46, 31 January 2014 (UTC)
I made a Malayalam one Module:ml-translit and it seems to be working perfectly. However, the last of the testcases returns an error, even though the transliteration matches the expected perfectly. Either way, I still believe it to be fully functional. DerekWinters (talk) 17:26, 1 February 2014 (UTC)
I also just made a Dhivehi one Module:dv-translit and it works perfectly. DerekWinters (talk) 18:25, 1 February 2014 (UTC)
Thanks. There is some problem with Malayalam, see Module talk:ml-translit. --Anatoli (обсудить/вклад) 11:53, 2 February 2014 (UTC)
The Tamil one has been fixed too now. DerekWinters (talk) 21:28, 3 February 2014 (UTC)
I just finished up someone else's Inuktitut syllabics one Module:iu-translit and it works fine too. DerekWinters (talk) 21:50, 3 February 2014 (UTC)
Very good. You can try other languages but you will have to ask for assistance yourself. Those who can help are not very friendly. Good luck! --Anatoli (обсудить/вклад) 21:54, 3 February 2014 (UTC)
Well, maybe he's not the nicest. And if you're telling me you won't help any longer, I shall be very sad indeed. But for the Tamil and the Inuktitut and the Cherokee I just completed Module:Cher-translit, I was mainly hoping you could simply put them into use. DerekWinters (talk) 23:13, 3 February 2014 (UTC)
If you're happy with testing, you can do it yourself, you already know where to add translit modules - Module:languages/data2 (for languages with the two letter code). For languages, which share a module, you just need to repeat the same line. Of course, you can ask me questions but my Lua knowledge is limited. --Anatoli (обсудить/вклад) 23:37, 3 February 2014 (UTC)
My only issue is that I cannot actually add to it. I don't have editing rights. DerekWinters (talk) 23:40, 3 February 2014 (UTC)
I see. Could you make a list of languages/modules to add, so that I can edit easily, e.g. like this:
Malayalam: translit_module = "ml-translit". --Anatoli (обсудить/вклад) 23:47, 3 February 2014 (UTC)
Tamil: translit_module = "ta-translit"
Inuktitut: translit_module = "iu-translit"
Cherokee: translit_module = "Cher-translit" DerekWinters (talk) 23:51, 3 February 2014 (UTC)
They have been added already: Module:languages/data3/c has Cherokee. --Anatoli (обсудить/вклад) 23:58, 3 February 2014 (UTC)
Oh, thanks. Is it possible I could get editing rights for that page? DerekWinters (talk) 00:12, 4 February 2014 (UTC)

Mass translation-adding

You need to be more careful when you add blocks of translations: your attempt to add a translation to the computer entry using the non-existent language code "eml" failed with a big fat "Module error: Module error", which you might have noticed if you had checked your edit. FYI, "eml" is a fake code they made up in order to have one for the Emiliano-Romagnol Wikipedia. It's tempting to crib translations from other Wikipedias, but contributors in smaller Wikipedias have a strong tendency to make things up/guess when they don't know a word in their language for something- even when there's a name for it in the language already. Chuck Entz (talk) 08:34, 11 February 2014 (UTC)

がん

Please don't forget the language section separator: [2]. By the way what's Yoron? Can you add an English entry for it? JamesjiaoTC 21:02, 3 March 2014 (UTC)

Whoops. Yoron is a Ryukyuan language, like Okinawan and Miyako. I'll add one. DerekWinters (talk) 02:22, 4 March 2014 (UTC)

Module:yi-translit

Hi,

It seems Yiddish can't be transliterated accurately without vowel points, like Hebrew or Arabic, cf. manual עזה פאס(ezza pas), עזה־שטרײַף(ezza-shtrayf) and automatic עזה פאס(ezh fas), עזה־שטרײַף(ezh-shtrayf) - translations of Gaza Strip. --Anatoli (обсудить/вклад) 05:17, 28 March 2014 (UTC)

Oh I see, I had been under the impression that Yiddish was generally written in a fully pointed way, but it seems that it can vary and is often only partially pointed. Sorry if it caused any errors. DerekWinters (talk) 17:08, 29 March 2014 (UTC)

škl

Are you sure that word is attested in the Inscriptional Pahlavi script? Please see this discussion. --Vahag (talk) 08:55, 15 June 2014 (UTC)

I must say that I don't think it is attested in that form. Also, I was not aware of this discussion, so I apologize for my mistake. I can change it back, or would you rather revert the change I made? DerekWinters (talk) 19:52, 15 June 2014 (UTC)
I moved it back. --Vahag (talk) 20:37, 15 June 2014 (UTC)

Speedy deletion

Hi,

Please do not speedy delete entries, especially not 5 from one page without an explanation. If you wish to challenge a word, use WT:RFV (and read the page intro of that page to see what qualifies and what doesn't). Thank you, Renard Migrant (talk) 20:10, 19 August 2014 (UTC)

Also, don't delete anything that's in actual use- even erroneously: we're a descriptive, not a prescriptive dictionary. And don't delete terms in scripts that are used by native speakers because they're not the "right" scripts for their languages. Remember, as well, that we aren't limited to any one time or place: if a script was used briefly, then abandoned, we'll want to have entries in the abandoned script for those terms that were known to be written in that script. You can tag incorrect forms as obsolete, proscribed, nonstandard, etc., and you can explain in usage notes why they shouldn't be used. If, on the other hand, you don't think they were ever used with that spelling/script, that's when you would take it to WT:RFV. Chuck Entz (talk) 02:05, 20 August 2014 (UTC)

Okinawan kana entries linking to kanji entries

I noticed you added some more Okinawan content, thank you for that. One minor change to make going forward, please use {{ryu-def}} to link from Okinawan kana entries to their corresponding kanji spellings. I found that you'd used {{ja-def}}, which links to the corresponding Japanese kanji spelling instead of the Okinawan entry.  :) ‑‑ Eiríkr Útlendi │ Tala við mig 07:02, 6 November 2014 (UTC)

Oh whoops. I had copied the code from Japanese entries. I'll be more careful from here on out. DerekWinters (talk) 01:19, 7 November 2014 (UTC)
  • No worries, easy fix.  :) I also noticed that we don't have very many templates for Okinawan. The Japanese templates' code can probably be copied over and tweaked to create new Okinawan templates where required. ‑‑ Eiríkr Útlendi │ Tala við mig 05:51, 7 November 2014 (UTC)

所#Okinawan

Heya, I'm not fully up on Okinawan, but I do notice that the reading given here matches the mainland on'yomi for 場#Japanese instead. Mainland o shifts to うちなーぐち u, much as the o in okinawa becomes the u in uchinā, so the expected Okinawan on'yomi for 所 would be シュ and for 場 would be ジュー. I checked http://hougen.ajima.jp/hougen.php?q=%E6%89%80 to see what data that might give, and while that list is not exhaustive, it doesn't include any ジョー readings for 所.

FWIW, I also see the listed kun'yomi is tukuru, following the same o > u shift.

Could you have another look at the 所#Okinawan entry? ‑‑ Eiríkr Útlendi │ Tala við mig 06:47, 9 November 2014 (UTC)

Thank you very much for pointing this out, as I had simply run with this. Other sources do point out that 所 can have both シュ or ジュ (in 御所: うんじゅ) for its onyomi reading, although the second one is probably just because of rendaku. However, http://www.jlect.com/entry/350/unju/ notes that 御所 can be うんじょー when topicalized, and perhaps this is what User:Viskonsas saw in some Okinawan text? Should I change them to じゅ?
Side question, what are the Okinawan names for onyomi and kunyomi? DerekWinters (talk) 01:52, 10 November 2014 (UTC)

また、[あなたは]というのは、「うんじょー」になります。これは「うんじゅ」+主格「や」[は]が融合した形になっています。 (Roughly: "Also, [あなたは] becomes うんじょー. This is a fused form of うんじゅ plus the nominative や [は].")

So if we include any entry for Okinawan うんじょー (or should it be in katakana?), the entry should probably describe this form as a contraction. The じょー part, at any rate, does not appear to be any standard Okinawan reading for 所.
As far as Viskonsas's edits, yes, those should probably be changed to シュ・ジュ as appropriate. They (he? she?) self-describe as ja-1 with no mention of ryu anything.
  • Re: Okinawan for on'yomi or kun'yomi, I assume that such terms exist in Okinawan, as the phenomenon of both Chinese-derived and native-derived readings for a single character does seem to happen in Okinawan as well, but I don't know what these terms would be. My brief searching so far has also failed to find anything. ‑‑ Eiríkr Útlendi │ Tala við mig 06:24, 10 November 2014 (UTC)
This is wonderful. Thanks! I'll make the changes to 所, but I think I'll hold off on うんじゅ and うんじょー for now until I better understand them. Most sources tend to treat Okinawan on the same level as Japanese, using hiragana and kanji for "native" terms and katakana for modern borrowings. I also believe that historically, after the Japanese developed hiragana, it was imported to Okinawa and used there as well. So, overall, I think we should stick to the standard rules we know for Japanese writing for the Ryukyuan languages. DerekWinters (talk) 04:16, 11 November 2014 (UTC)

Bengali transliteration module

Hi,

Are you still interested in Indic languages? Do you think you can work on Module:bn-translit and Module:gu-translit? I will try to address dropping inherent vowels later for Hindi et al, Bengali, Gujarati. Amharic/Tigrinya have a similar problem with dropping vowels. There's no reason we can't make these languages transliterated 100% or nearly 100% automatically, they are much easier than Korean or Arabic. Just need to get some help from Lua gurus. @Dick Laurent, Dijan your help on Bengali transliteration would be much appreciated. --Anatoli T. (обсудить/вклад) 01:05, 6 January 2015 (UTC)

Hi. I'll definitely be able to create a module for Gujarati, but as you noted, the schwa-dropping exists as well in Gujarati (sources say it's different to the schwa-dropping of Hindi, but I've never noticed a difference). Some words will have to be hard-transliterated because Gujarati lacks proper transcription for 2 less-used vowel phonemes (ɛ and ɔ), but that shouldn't be too hard.
Bengali on the other hand is a little more complicated (less transparent) and I never have truly learned the script. I'll make a basic Bengali module, but it most definitely won't be ready to use until someone with expertise makes some changes.
Also, I noticed a defect with the Tamil module. "Plosives are unvoiced if they occur word-initially or doubled. Elsewhere they are voiced." I, too, have noticed this, but I'm unable to code Lua with such skill. DerekWinters (talk) 12:24, 6 January 2015 (UTC)
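The quoted voicing rule can be sketched in a few lines. This is only an illustration in Python working on the Latin transliteration of a single word, not the logic of Module:ta-translit. The function name `voice_plosives` and the voiced counterparts in `VOICED` (in particular c becoming j) are assumptions made for the example, and the rule is applied literally, with no special cases.

```python
# Assumed voiced counterparts for the example; c -> j in particular is a guess.
VOICED = {"k": "g", "c": "j", "ṭ": "ḍ", "t": "d", "p": "b"}


def voice_plosives(word):
    """Voice plosives except word-initially or when doubled (single word only)."""
    out = []
    for i, ch in enumerate(word):
        # A plosive counts as doubled if either neighbour is the same letter.
        doubled = (i > 0 and word[i - 1] == ch) or (
            i + 1 < len(word) and word[i + 1] == ch
        )
        if ch in VOICED and i > 0 and not doubled:
            out.append(VOICED[ch])
        else:
            out.append(ch)
    return "".join(out)
```

Under these assumptions a doubled kk stays unvoiced and a word-initial plosive is left alone, while a single medial plosive is voiced.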
So I've made a Gujarati module, but the testcases show what the main issues are. 1st is the schwa-dropping. 2nd is the uṃ sequence word-finally. It is always to be transliterated ũ, but I am unsure how to code that. 3rd is the issue of ṃ in front of a consonant.
ṃ in front of a velar (k, kh, g, gh) is ṅ. In front of a palatal letter (c, ch, j, jh) it is ñ. In front of a retroflex (ṭ, ṭh, ḍ, ḍh) it is ṇ. In front of a labial (p, ph, b, bh, m) it is m. In front of a dental (t, th, d, dh) and all remaining consonants (y, r, l, v, ḷ, ś, ṣ, s, h) it is simply n. I also don't know how to code this. Also, this last issue I noted is common to all Indic languages except in a few cases where words will have to be hard-transliterated.
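The second and third issues can be handled as a post-processing pass over the raw transliteration. The following is a rough Python sketch under stated assumptions, not the code of Module:gu-translit: the function name `resolve_anusvara` is hypothetical, the velar and retroflex nasals are written ṅ and ṇ following the usual IAST convention, and words that need hard transliteration are not handled.

```python
import re
import unicodedata

# Nasal chosen by the first letter of the following consonant;
# aspirates (kh, gh, ...) start with the same letter as their plain form.
HOMORGANIC = {
    "k": "ṅ", "g": "ṅ",            # velars
    "c": "ñ", "j": "ñ",            # palatals
    "ṭ": "ṇ", "ḍ": "ṇ",            # retroflexes
    "p": "m", "b": "m", "m": "m",  # labials
}


def resolve_anusvara(translit):
    """Rewrite ṃ according to its position in a raw transliteration."""
    text = unicodedata.normalize("NFC", translit)
    # Word-final uṃ is always ũ.
    text = re.sub(r"uṃ\b", "ũ", text)
    # Before a consonant, ṃ becomes the matching nasal; dentals and all
    # remaining consonants get a plain n.
    return re.sub(
        r"ṃ([kgcjṭḍtdpbmyrlvḷśṣsh])",
        lambda m: HOMORGANIC.get(m.group(1), "n") + m.group(1),
        text,
    )
```

Running the word-final rule first keeps a final uṃ from being touched by the consonant rule, and a ṃ that is followed by neither a listed consonant nor a word end is left as it is.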
I don't know what we should do about Bengali transliteration. I can see the merits of sticking with a more scholarly system, given the differences in pronunciation between Indian and Bengali dialects, but it shouldn't be identical to the systems used for other Indic languages. For example, where others use a short vowel "a," in Bengali this vowel is pronounced "o," and of course as you guys mention, that vowel is often dropped. — [Ric Laurent] — 16:51, 6 January 2015 (UTC)
Thanks very much. I will address it in due course. Ric, we can choose one system and stick to it. I think you meant "ô", not "o" (the short vowel). There is no 100% consistency in transliterating dropped vowels, so if we come up with a working logic, we could use it for many languages like Hindi, Gujarati, Bengali and (surprisingly) Amharic/Tigrinya (short vowel "ə"), e.g. ዩክሬን (yukren) should be "yukren". Amharic et al (Module:Ethi-translit) also have a gemination issue, which is not expressed graphically. Native speakers don't have any problem with it and it seems that some transliterations ignore it altogether. --Anatoli T. (обсудить/вклад) 22:33, 6 January 2015 (UTC)
I also made an Oriya transliteration module. Its testcases show the same problem of schwa dropping (or in this case ô-dropping). Module:or-translit. DerekWinters (talk) 18:09, 7 January 2015 (UTC)
Do you think you can write a short paragraph describing the rules when, e.g. in Hindi, the inherent vowel "a" is dropped: e.g. ("C" is any consonant, "V" is any vowel, apart from "a") CaCaCa = CaCaC, CeCaCāCaCī = CeCCāCCī (devanāgarī = devnāgrī), etc.? Does it matter, which consonants are involved, e.g. in consonant clusters? CaCCCa = CaCCC or CaCCCa? --Anatoli T. (обсудить/вклад) 23:27, 7 January 2015 (UTC)
The idea behind vowel dropping lies with syllabification. A schwa at the end of a syllable is always dropped.
करन - क|रन (ka|ran) (the 'na' becomes 'n')
करना - कर|ना (kar|nā) (the 'ra' becomes 'r')
One major exception is if the schwa is part of a consonant cluster involving a "special" consonant (y, r, l, v, h, ṇ, n, and m) word-finally. The schwa here is not dropped. The word syllabifies by the first member of the cluster becoming part of the previous syllable, and the rest of the cluster becoming its own syllable.
वस्त्र - वस्|त्र (vas|tra) (the 'tra' remains because 'r' is a special consonant)
भस्म - भस्|म (bhas|ma) (the 'ma' remains because 'm' is a special consonant)
Another is if the schwa is part of any consonant cluster (or gemination) word-medially. The schwa here is not dropped. Syllabification happens just as above.
अस्पताल - अस्|प|ताल (as|pa|tāl) (the 'pa' remains because it is part of the cluster)
उत्तम - उत्|तम (ut|tam) (the 'ta' remains because part of cluster) (the 'ma' becomes 'm' because end of syllable)
I laid out a list of some CVC formations.
S - Special Consonants (y, r, l, v, h, ṇ, n, and m)
R - Regular Consonants (all other consonants)
X - any vowel ('a' and all the others)
T - any consonant combination (C or CCC, etc.)
An 'a' on an initial consonant is never dropped.
An 'a' in independent form is never dropped. (अ)
XCa = XC (ila = il)
XCRa = XCR (opsa = ops)
XCSa = XCSa (ustra = us‧tra)
TXCa = TXC (skela = skel)
TXCRa = TXCR (drupta = drupt)
TXCSa = TXCSa (blisva = blis‧va)
XTXCa = XTXC (ertopa = er‧top) (ertapa = er‧tap)
XTXCRa = XTXCR (ertopsa = er‧tops)
XTXCSa = XTXCSa (ertopya = er‧top‧ya)
XCaCV = XCCV (utasi = ut‧si)
XCaTV = XCaTV (utasmi = u‧tas‧mi)
XTaTV = XTaTV (ektammo = ek‧tam‧mo) (ektalo = ek‧ta‧lo)
DerekWinters (talk) 09:58, 8 January 2015 (UTC)
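The rules and the CVC list above can be turned into a small right-to-left pass. What follows is a hedged Python sketch, not the logic of Module:hi-translit. It assumes the input is a transliteration with every inherent "a" still written out, the name `drop_schwas` and the tokenisation are made up for the example, and nothing beyond the listed rules (anusvara, loanword exceptions and so on) is attempted.

```python
import re
import unicodedata

# "Special" consonants from the list above: they protect a word-final 'a'
# when they are the last member of a cluster.
SPECIAL = set("yrlvhṇnm")
VOWELS = set("aāiīuūeo") | {"ai", "au"}
# Aspirated stops count as one consonant; everything else is one character.
TOKEN = re.compile(r"ai|au|[kgcjṭḍtdpb]h|.")


def drop_schwas(word):
    """Drop inherent 'a' vowels, scanning from the end of the word."""
    toks = TOKEN.findall(unicodedata.normalize("NFC", word))
    i = len(toks) - 1
    while i >= 0:
        if toks[i] == "a":
            # Consonant run immediately before this 'a'.
            start = i
            while start > 0 and toks[start - 1] not in VOWELS:
                start -= 1
            before = toks[start:i]
            # Consonant run immediately after it, and whether a vowel follows.
            end = i + 1
            while end < len(toks) and toks[end] not in VOWELS:
                end += 1
            after = toks[i + 1:end]
            vowel_follows = end < len(toks)
            if not before or start == 0:
                drop = False  # independent 'a', or 'a' on the initial consonant
            elif not after and not vowel_follows:
                # Word-final: kept only after a cluster ending in a special consonant.
                drop = len(before) == 1 or before[-1] not in SPECIAL
            else:
                # Medial: dropped only between single consonants with a vowel after.
                drop = len(before) == 1 and len(after) == 1 and vowel_follows
            if drop:
                del toks[i]
        i -= 1
    return "".join(toks)
```

Scanning from the right matters: in karana the final "a" goes first, so the medial "a" is no longer followed by a vowel-bearing consonant and survives, giving karan rather than karn.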

@DerekWinters, could you please check if the actual cases in Module:hi-translit/testcases conform. User:Wyang has kindly added and fixed most of them. --Anatoli T. (обсудить/вклад) 02:12, 9 January 2015 (UTC)

Could you check if अंगरेज़ (as opposed to अंग्रेज़) should be "aṁgrez", not "aṁgarez". The latter has a virama (्) after ग, so no problem there but the former hasn't got it. Also pinging @Wyang. --Anatoli T. (обсудить/вклад) 04:55, 9 January 2015 (UTC)
I am super dooper impressed. Very, very, very impressed. The transliteration aṁgrez (actually angrez) is correct for both. अंगरेज़ and अंग्रेज़ both get split the same way: अंग्‧रेज़ / अंग‧रेज़. DerekWinters (talk) 11:13, 9 January 2015 (UTC)
I believe this system will work for Gujarati, Marathi, Sindhi, Kutchi, Rajasthani, Marwari, Bhojpuri, Konkani, Saurashtra. Beyond that, I'm not sure if any other languages would work. Gujarati and Kutchi share the same script (Gujarati). Saurashtra has its own script (Saurashtra). All the others use Devanagari. Do you know what we should do about Bengali, Oriya, etc.? @Atitarev DerekWinters (talk) 07:24, 10 January 2015 (UTC)
Yes, you're right. I don't feel comfortable with Lua, though. Are you able to make a basic Bengali module based on WT:BN TR, perhaps? Then we can ask Wyang to do his magic tricks, also for Gujarati and Oriya by copying the logic? --Anatoli T. (обсудить/вклад) 23:59, 11 January 2015 (UTC)
I'll make a Bengali module, but we'll have to have someone verify it before we even try to work with it. DerekWinters (talk) 14:14, 12 January 2015 (UTC)
Sorry for being a lazy poo. I made some edits to the bn-translit module, but after visiting the wiki article and other sources on the Bengali alphabet, I realized why I'm so terrified of it. It's a lot, and we need an expert to help us here. DerekWinters (talk) 08:33, 13 January 2015 (UTC)

હરિકેન

Hey. Could you check that entry? I'm pretty sure that the transliteration is wrong and I'm not sure if the definition is correct. --Dijan (talk) 02:50, 3 February 2015 (UTC)

Oops, sorry about that. I fixed the definition. I don't believe the transliteration is wrong though: હ(ha)રિ(ri)કેન(ken). DerekWinters (talk) 03:50, 3 February 2015 (UTC)
Sorry, I did fix it before you did :) --Anatoli T. (обсудить/вклад) 03:52, 3 February 2015 (UTC)
Oops, I must have just completely forgotten. Whoops. Forgive me. DerekWinters (talk) 04:53, 3 February 2015 (UTC)

鏡#Yonaguni

I'm curious, did you mean to list the definition as water, or was that a copypaste error? ‑‑ Eiríkr Útlendi │ Tala við mig 07:20, 8 February 2015 (UTC)

Oh god. Do forgive me. I keep making these copy-paste errors. I absolutely hate writing an entry from scratch so I copy-paste and sometimes I forget key things. DerekWinters (talk) 08:15, 8 February 2015 (UTC)
  • No worries, I've done that too, with similar erroneous results sometimes.  :) FWIW, you might find the edittools JavaScript useful: [[User talk:Conrad.Irwin/edittools.js]]. This allows you to define your own one-click insertion items. I've found this extremely helpful over the years. Cheers, ‑‑ Eiríkr Útlendi │ Tala við mig 08:56, 8 February 2015 (UTC)

Citability

Hi. Some of the recent English terms you've added don't seem to be citable per WT:ATTEST, and I've sent a couple to WT:RFV. If these can in fact be cited, can you please help do so, and if not, can you please refrain from adding such entries? Thank you! —Μετάknowledgediscuss/deeds 02:51, 11 May 2015 (UTC)

ᐅᓪᓗᕆᐊᖅ

Hi. How wrong is my new entry ᐅᓪᓗᕆᐊᖅ? --Type56op9 (talk) 13:40, 22 May 2015 (UTC)

@Type56op9 Hey, no need to be so pessimistic. It's actually a rather solid entry. Nothing at all wrong with it. If you wish to make it stronger however, you could add the categories it would fall under (just check the star to see which ones), an etymology, a pronunciation, and even sample sentences or citations. Keep up the good work. DerekWinters (talk) 19:34, 2 June 2015 (UTC)
No need to be so optimistic. I can't do anything like that. --Type56op9 (talk) 22:25, 2 June 2015 (UTC)
@Type56op9 :) Even just adding new terms like this is extremely helpful to the project. DerekWinters (talk) 22:43, 2 June 2015 (UTC)
Yeah, that's the kind of message you wanna send me. --Type56op9 (talk) 22:47, 2 June 2015 (UTC)

Admin?

Hey DW. Fancy being burdened with administrative tools? --Type56op9 (talk) 22:48, 2 June 2015 (UTC)

I mean, would you accept the great honour of becoming a systems operator? --Type56op9 (talk) 22:49, 2 June 2015 (UTC)
@Type56op9 What would this entail? DerekWinters (talk) 00:56, 3 June 2015 (UTC)
Thankless tasks mostly: Being the guy who cleans up after vandals. Deleting pages, protecting pages, changing the Main Page and other protected pages, blocking users. --Type56op9 (talk) 08:01, 3 June 2015 (UTC)
I mean, it is a true honour. You will be able to do lots of cool things! You can see the content of deleted pages, many of which include personal information of our users; you can get rid of users you disagree with, your opinion will be worth more in our forums, and you'll have loads of fun! --Type56op9 (talk) 08:03, 3 June 2015 (UTC)
@Type56op9 I've done some research myself and I've decided that I'll take you up on the offer. DerekWinters (talk) 22:32, 3 June 2015 (UTC)
Sweet. Please accept here --Type56op9 (talk) 09:53, 4 June 2015 (UTC)
@Type56op9 Thanks for supporting me. And I must say, your style of writing (speech) is very refreshing here on wiktionary. DerekWinters (talk) 15:21, 4 June 2015 (UTC)
@Type56op9 Just remove the vote, it seems futile. DerekWinters (talk) 21:25, 8 June 2015 (UTC)

{{gu-noun}}

After your edits, there are 35 entries with the redlinked Category:Gujarati terms needing gender. It looks like the {{head}} template is adding Category:Gujarati terms with incomplete gender to the same entries, and there are no other categories in the format "[language] terms needing gender". There's also an error in at least one entry due to setting cat2 twice. Is there a reason you have it set up this way? Chuck Entz (talk) 20:08, 12 June 2015 (UTC)

@Chuck Entz I seem to have messed up there. I was certain that I'd seen a category somewhere that handled missing gender, but I couldn't find the exact name for it, so I modeled it after the other category that was already set up as Category:Gujarati terms needing transliteration. I've gone ahead and removed the cat2. I hope that's fixed the problems at hand. Sorry for any inconveniences. DerekWinters (talk) 16:42, 13 June 2015 (UTC)

Telugu module

Derek, in ఫలింౘు (phaliṃtsu), the letter ౘ (tsa, u+0c58) is not being transliterated. Also ౙంకు (dzaṃku), ౙ (dza, u+0c59). Can you have a look? —Stephen (Talk) 12:18, 15 July 2015 (UTC)

@Stephen G. Brown Thank you for notifying me. I'd never even heard of those letters. I've added the letters, checked them, and they work now :) DerekWinters (talk) 03:10, 16 July 2015 (UTC)
Thanks, looks good. Yes, there are a few special, classical and rare characters: (tsa), (dza), (), (l̥̄), () (w:avagraha, or apostrophe ’, referring to the Sanskrit letter (), u+0c3d), (0⁄4) (fraction sign 04, u+0c78), (¼) (fraction sign 14, u+0c79), (2⁄4) (fraction sign 24, u+0c7a), (¾) (fraction sign 34, u+0c7b), (0⁄16) (fraction sign 016, u+0c66), (1⁄16) (fraction sign 116, u+0c7c), (2⁄16) (fraction sign 216, u+0c7d), (3⁄16) (fraction sign 316, u+0c7e), ౿ (౿) (tuumu sign, an antiquated measuring unit for grains, u+0c7f)). —Stephen (Talk) 04:30, 16 July 2015 (UTC)
@Stephen G. Brown I added the avagraha and the extra numerals too, thanks. I think that almost everything from the Unicode chart can be transliterated now. Also, would you happen to know the way (if any) that Telugu accommodates Arabic/Persian/Urdu and English loanwords that use z, f, x, q, ɣ, ʒ, etc.? I was thinking that Telugu might use some variation of the nukta like Hindi does, or perhaps even something similar to the Tamil āytam. DerekWinters (talk) 16:00, 17 July 2015 (UTC)
No, I don’t know what Telugu does with those languages. However, you can ask User:Rajasekhar1961. User:Rajasekhar1961 is native Telugu, and educated (he’s a doctor of medicine), and he’s very interested in the Telugu entries, both here and on the Telugu Wiktionary. —Stephen (Talk) 17:46, 17 July 2015 (UTC)
@Stephen G. Brown Thank you very much. I'll be sure to ask him. DerekWinters (talk) 00:04, 18 July 2015 (UTC)

Proto-Kartvelian

Most sources on Proto-Kartvelian use the Latin script for reconstructions. Please do not change them to the Georgian script. --Vahag (talk) 19:31, 4 October 2015 (UTC)

Just a reminder to be careful when editing

[3]Μετάknowledgediscuss/deeds 21:27, 10 October 2015 (UTC)

Broken {{borrowing}}

Your change to Module:etymology/templates caused a lot of breakage. Can you fix it? Benwing2 (talk) 04:58, 3 November 2015 (UTC)

Yeah sorry... DerekWinters (talk) 04:58, 3 November 2015 (UTC)
Thanks. Benwing2 (talk) 05:03, 3 November 2015 (UTC)

inter-wiki help need

When we saw this module, we felt very happy. I have one request on behalf of the others. Is it possible to do the transliteration in reverse, that is, English to Tamil? If so, we will apply the same technique to our Indian languages.--த*உழவன்(info-farmer) (talk) 11:26, 18 November 2015 (UTC)

@Info-farmer I'll definitely try. I'll let you know soon. DerekWinters (talk) 16:28, 18 November 2015 (UTC)

Middle Persian

Hello. Book Pahlavi is not in Unicode. You should not replace the Romanizations with Inscriptional Pahlavi, a different script. Also, no Manichaean fonts exist, even though it is now in Unicode. It is the consensus among the Middle Iranian editors on Wiktionary to use Romanizations for lemmas of Middle Iranian languages, except for the words attested in Inscriptional Pahlavi and Inscriptional Parthian. See also Wiktionary:Votes/pl-2011-09/Romanization of languages in ancient scripts 2. --Vahag (talk) 10:10, 5 December 2015 (UTC)

छठवां

The module error seems to be due to your changes to the data module. Chuck Entz (talk) 13:36, 22 December 2015 (UTC)

{{borrowing}}

Why are you replacing etymologies of Hindi and Gujarati words from Sanskrit with {{borrowing}}? They are all Indo-Aryan languages, so shouldn't {{bor}} be limited to Persian and English loanwords? I'm just curious. —Aryamanarora (मुझसे बात करो) 00:25, 20 January 2016 (UTC)

I don't know about any specific cases, but, in general, it's entirely possible for terms from a literary language to be both inherited and borrowed. When inherited, it stays in continuous use as the parent language evolves into the daughter language, reflecting any sound changes that happen along the way. When borrowed, someone reads it (or, in this case, perhaps hears it recited) centuries later, and adopts it into their language directly. Think about all of the religious terminology in Hindi that's pure Sanskrit. Those terms may contain basic vocabulary that has made its way separately down to Hindi by inheritance, but the precise combination that has religious meaning is intentionally kept as close to the Sanskrit original as possible.
When you're doing etymology, you have to look at the history of the individual terms. The whole language may have been indirectly inherited from Sanskrit (actually Old Indic, of which Sanskrit is a very artificial subset), but individual terms may have been borrowed directly from Sanskrit, or indirectly via another language that got it from Sanskrit. Chuck Entz (talk) 02:16, 20 January 2016 (UTC)
Exactly what Chuck said. For example, the Hindi word काम is inherited from कर्म -> कम्म (through assimilation) -> काम (simplification and compensatory lengthening). Thus I would say that काम is inherited, but कर्म is borrowed. These words have been borrowed as opposed to having been inherited. However, it gets really murky along the lines, as every new stage of Indic languages tried to sound like its former stage in an effort to sound erudite and intelligent. Thus, the Prakrits, while originally celebrating their separateness from Sanskrit, began borrowing heavily from it in learned speech. The Apabhramshas, again, celebrated their distinctness from their Prakrit forebears, but then began borrowing lexically and morphologically from them and lexically from Sanskrit. And the same has happened with the new Indo-Aryan languages. A good example of different forms of the same language is Shadhu-bhasha and Cholitobhasha for Bangla. DerekWinters (talk) 18:32, 20 January 2016 (UTC)
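The chain कर्म -> कम्म -> काम in the comment above can be replayed mechanically. What follows is only a toy sketch in Python on romanised forms: the two rewrite rules, the function names, and the restriction to the cluster "rm" and the vowel "a" are assumptions made for illustration, not a general account of Middle Indo-Aryan sound change.

```python
import re

LONG_A = "\u0101"  # the romanised long vowel, a with macron

def assimilate(form):
    """Cluster assimilation, restricted here to rm -> mm (illustrative only)."""
    return form.replace("rm", "mm")

def simplify_with_lengthening(form):
    """Geminate simplification with compensatory lengthening: aCC -> (long a)C."""
    return re.sub(r"a([^aeiou])\1", LONG_A + r"\1", form)

def inherit(form):
    """Return every stage of the toy derivation, oldest first."""
    stages = [form]
    for rule in (assimilate, simplify_with_lengthening):
        stages.append(rule(stages[-1]))
    return stages

# Prints the three stages of the inherited word, starting from "karma".
print(" -> ".join(inherit("karma")))
```

A learned borrowing such as कर्म never passes through these rules at all, which is why it still looks like the Sanskrit form.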
That makes complete sense! कम्म is actually attested as Pali kamma, so that was a very good example. And diglossia in Indic languages is quite common - in Hindi there's शुद्ध हिंदी (śuddh hindī) and the spoken version हिंदुस्तानी (hindustānī). According to my Odia textbook, the same applies for that language. Thank you for the explanation! —Aryamanarora (मुझसे बात करो) 18:43, 7 February 2016 (UTC)
I'm glad it helped :). It's fairly common in all modern (and most historical) Indian languages to have very high levels of Sanskritic loanwords. And diglossia is when two different languages are actually spoken concurrently by the population. Hindi doesn't really have that though. Shuddh Hindi and Hindustani differ only in vocabulary, not morphology. Gujarati also isn't diglossic (except for maybe Hindi and some English in today's Gujarat?), but you can even find Old Gujarati forms borrowed into the language for use in bhajans and kirtans, etc. to give an old or rustic feel to them. DerekWinters (talk) 21:05, 7 February 2016 (UTC)
Oh, I seem to have misunderstood the meaning of diglossia - I thought it meant when there were two registers of the same language, one with higher prestige. Again, thanks for the knowledge! —Aryamanarora (मुझसे बात करो) 23:42, 17 February 2016 (UTC)
I don't think you're so far off. In diglossic situations, there are two different dialects or languages, with different prestige. Usually, the languages are fairly closely related; it seems a bit odd to me to refer to English/Gujarati as diglossia, but technically I think it's correct. I'm not sure whether two registers that differ largely or only in vocabulary would count as diglossia, although Wikipedia does indicate both Hindi and Urdu as diglossic. Among Indian languages, Tamil definitely has diglossia. Benwing2 (talk) 01:54, 18 February 2016 (UTC)
BTW I agree with Chuck and Derek that you can have borrowings from an earlier form of the language. The Romance languages, for example, have tons of borrowings from Latin. Benwing2 (talk) 01:58, 18 February 2016 (UTC)
Diglossia is with another language or a higher form of the same language. It's just where there are two languages with differing levels of prestige and usage within a community. So Shadhubhasha and Cholitobhasha are (were) the diglossic forms of Bengali, but I would also say that high levels of Arabic proficiency in Nigeria, within a Fulani community, would be classified as diglossia. DerekWinters (talk) 02:11, 18 February 2016 (UTC)

Ngoko, Krama, & Krama Inggil

I think that's a brilliant idea! However, I'm not good at creating templates. Cahyo Ramadhani (talk) 00:06, 8 February 2016 (UTC)

उछव

Is there a distinction between "afterbears" and the more usual "descendants"? DTLHS (talk) 01:12, 27 March 2016 (UTC)

Oh no not really. I'll switch it. DerekWinters (talk) 01:27, 27 March 2016 (UTC)

Konkani

{{kok-pos}} exists. —Aryamanarora (मुझसे बात करो) 13:14, 13 April 2016 (UTC)

Just curious

Hey I'm just curious about the title of the source where you got the Tagalog words like balnidinagipik and balngawsukatan. Thanks.

@Mar vin kaiser Hello. Sorry for the delay, just got back from vacation. Oh my. Those were so long ago, to be honest, my zeal was such at that time, I may have simply seen it somewhere online and found that to be worthwhile enough to add it. If you don't see any valid reason to keep them, feel free to remove them. DerekWinters (talk) 04:24, 20 July 2016 (UTC)
If they don't meet the requirements of WT:ATTEST, they should be deleted. Derek, if you know now that you created them in error, please put {{delete}} on each of the entries like that. —Μετάknowledgediscuss/deeds 05:37, 20 July 2016 (UTC)
@Metaknowledge Well, see, I'm not sure at this point if they are valid terms or not. If Mar vin kaiser can confirm one way or another, I'll take the appropriate steps. DerekWinters (talk) 05:44, 20 July 2016 (UTC)
balngawsukatan was probably a scanning or typing error. It probably should be balangaw sukatan (literally, rainbow metrics). The entry balnidinagipik at least needs another "a", balani dinagipik, and I think "dinagipik" is also not quite correct.
Your terms seem to come from here. The correct spellings should be in a scientific dictionary named Maugnaying Talasalitaang Pang-agham Ingles-Pilipino, by Gonsalo del Rosario. However, it is out of print and no copies are available for sale. It can be found in many major libraries. Someone has photocopied the dictionary (jpeg), a page at a time, and it is available for free download here. It is awkward to use, since it consists of something like 300 individual jpegs (not searchable). —Stephen (Talk) 10:20, 11 September 2016 (UTC)
It won't make them searchable, but maybe merging them into a single PDF might help. Crom daba (talk) 11:45, 11 September 2016 (UTC)

Old Uyghur

Hey, I've noticed you made Old Uyghur ᠨᠤᠮ (nom) a year or so ago. I'm guessing it's not the same encoding as Mongol script, judging by the fact that it's a different page from Mongolian ᠨᠣᠮ (nom), so how did you get O. Uyghur characters? Wikipedia only has images of the letters. Crom daba (talk) 22:25, 9 September 2016 (UTC)

@Crom daba From my memory, I had seen Old Uyghur ᠨᠤᠮ (nom) as a redlink on here, with its pronunciation as 'nom'. I can, very slowly, read some Mongolian, and cross-referenced it with some source, and just added it from the redlink. Doing some research now, it seems Unicode has both an "o" and an "u", which both look the exact same (actually though). I'm not sure what to do. DerekWinters (talk) 01:16, 10 September 2016 (UTC)
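For anyone wanting to check which code points two look-alike titles actually contain, here is a minimal Python sketch using the standard `unicodedata` module. The two strings are written with escapes and are an assumption about the page titles discussed above (NA + U + MA for the Old Uyghur entry, NA + O + MA for the Mongolian one); the helper name `differing_codepoints` is made up for this example.

```python
import unicodedata

def differing_codepoints(a, b):
    """List (index, description_a, description_b) wherever two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")

    def describe(ch):
        # Code point in U+XXXX form plus the official Unicode character name.
        return "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))

    return [(i, describe(x), describe(y))
            for i, (x, y) in enumerate(zip(a, b)) if x != y]

old_uyghur_title = "\u1828\u1824\u182e"  # assumed spelling: NA + U + MA
mongolian_title = "\u1828\u1823\u182e"   # assumed spelling: NA + O + MA

for row in differing_codepoints(old_uyghur_title, mongolian_title):
    print(row)
```

If the titles really are spelled this way, the only difference is the middle letter, which would explain why the two pages are distinct even though the glyphs render identically in many fonts.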

ⲥⲁϫⲓ

If it's of any assistance, Walter Till gives śḏd as the etymology. Lingo Bingo Dingo (talk) 10:45, 5 October 2016 (UTC)

@Lingo Bingo Dingo You're right. I shouldn't be dumb. There's a clear ⲥ right there. Thanks :) DerekWinters (talk) 01:52, 6 October 2016 (UTC)
Nothing dumb about it, as I guess the roots are anyways related. If you'd like I could give you a pointer to a few references on Coptic etymology. Lingo Bingo Dingo (talk) 12:18, 6 October 2016 (UTC)
@Lingo Bingo Dingo Thanks. Yeah that would be great if you could. DerekWinters (talk) 14:24, 6 October 2016 (UTC)
A quick and easy reference is Westendorf, Koptisches Handwörterbuch (German). More extensive are Černý, Coptic Etymological Dictionary (English), and Vycichl, Kasser, Dictionnaire étymologique de la langue copte (French). Probably present at larger libraries. Lingo Bingo Dingo (talk) 13:47, 8 October 2016 (UTC)
Thank you very much! DerekWinters (talk) 20:17, 8 October 2016 (UTC)

Assamese

I see that we lack any infrastructure for Assamese here; at the very least we should have a standard transliteration scheme documented at WT:Assamese transliteration and a module Module:as-translit to support it. Is it wholly predictable, or are there any complications? @Atitarev, Wyang —Μετάknowledgediscuss/deeds 02:08, 13 March 2017 (UTC)

I don't work on these modules, they are too complicated for me but I could add some test cases. Module:bn-translit could serve as a base or maybe just a tweak is required. It's not working perfectly, Module:hi-translit does a much better job for regular terms. Perhaps the only challenge in all these north Indian modules is to get the shwa-dropping rules right. @DerekWinters, Kc kennylau, Aryamanarora, FYI. --Anatoli T. (обсудить/вклад) 02:22, 13 March 2017 (UTC)
Just the same as Bengali, Assamese has no real predictability with schwa-dropping, and words must be learned individually. Otherwise, Assamese is not bad for the standard transliteration. I can create the transliteration page. As an aside, I have noticed many many words incorrectly transliterated (in regards to the schwa-dropping) for Gujarati, but honestly I don't know how feasible it would be to try and fix it. DerekWinters (talk) 02:25, 13 March 2017 (UTC)
If a shwa rule represents a common behaviour, it can and should be implemented. The real irregular readings are not that numerous. The test cases should only represent what the module must do, not the exceptions. There are also cases where shwa is light or optional. We can just decide what the rule should be and leave the phonetics to the pronunciation sections. A light shwa can follow the normal rules for transliteration purposes. In short, humans don't find the shwa-dropping rules overly complicated. It's just a matter of combining that knowledge with the programming knowledge. As for Gujarati or Bengali - the modules are just far from complete. --Anatoli T. (обсудить/вклад) 02:43, 13 March 2017 (UTC)
As an example, Hindi डायनासोर (ḍāyanāsor) and Gujarati ડાયનાસોર (ḍāyanāsor) should be transliterated as "ḍāynāsor", not "ḍāyanāsor". The modules drop the final "a", which is good, but not the one between "y" and "n". The rule for dropping the vowel there is straightforward but it hasn't been implemented. Also, the Bengali module drops the final shwa in মানচিত্র (mancitr) but it shouldn't; the rule is simple as well (for humans) but the module doesn't know about it. --Anatoli T. (обсудить/вклад) 03:07, 13 March 2017 (UTC)
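To make the डायनासोर case concrete, here is a rough Python sketch of the kind of rule being described: drop the word-final inherent "a", then drop a medial inherent "a" when the syllables on both sides still carry a vowel. This is not the code of Module:hi-translit; the tiny character tables, the function names, and the exact formulation of the rule are assumptions for illustration only, and real Hindi has further conditions and exceptions.

```python
# Tiny illustrative tables; a real module would cover the whole script.
CONSONANTS = {"ड": "ḍ", "य": "y", "न": "n", "स": "s", "र": "r",
              "क": "k", "म": "m", "ल": "l"}
MATRAS = {"\u093e": "ā", "\u094b": "o"}  # vowel signs AA and O
VIRAMA = "\u094d"

def parse(word):
    """Split a word into [consonant, vowel, vowel_is_inherent] units."""
    units = []
    for ch in word:
        if ch in CONSONANTS:
            units.append([CONSONANTS[ch], "a", True])
        elif ch in MATRAS and units:
            units[-1][1], units[-1][2] = MATRAS[ch], False
        elif ch == VIRAMA and units:
            units[-1][1], units[-1][2] = "", False
        else:
            raise ValueError("character not covered by this sketch: %r" % ch)
    return units

def transliterate(word):
    """Romanise with a simplified schwa-dropping rule, scanning right to left."""
    units = parse(word)
    last = len(units) - 1
    for i in range(last, -1, -1):
        if not units[i][2]:
            continue  # only inherent vowels can be dropped
        if i == last and last > 0:
            units[i][1] = ""  # word-final inherent vowel
        elif 0 < i < last and units[i - 1][1] and units[i + 1][1]:
            units[i][1] = ""  # medial: vowel-bearing syllables on both sides
    return "".join(cons + vowel for cons, vowel, _ in units)

print(transliterate("डायनासोर"))
print(transliterate("कमल"))
```

Scanning from the end matters here: in कमल the final vowel is dropped first, so the middle "a" no longer has a vowel-bearing syllable after it and is kept.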
For the ones where the inherent vowel is indeed a schwa, you're right it really isn't an issue, and yeah the module for Gujarati really isn't complete yet. But for Bengali, Assamese, Oriya, etc. where the inherent vowel is like /ɔ/ or something related, it very much matters. And I do have to stress that for Bengali it's not a matter of exceptions when it comes to word-final schwa-dropping, it's quite unpredictable. E.g. তাল (tal), ডাল (Dal), ভাল (bhalo), গাল (galo), লাল (lal); হর (horo), নর (noro), ঘর (ghor), বর (bor), and more and more. Word-medial dropping is much more regular though. I haven't learned enough about Assamese yet to make any such claim. DerekWinters (talk) 03:19, 13 March 2017 (UTC)
If there is a mess with shwa-dropping in Bengali, then the module shouldn't drop "ô" by default. The module might as well show the inherent vowels, and a method to drop them when required could be used for phonetic respellings in the pronunciation section.--Anatoli T. (обсудить/вклад) 03:48, 13 March 2017 (UTC)
Hmmm, do you have an example? DerekWinters (talk) 04:17, 13 March 2017 (UTC)
I mean, in case the final "ô" is definitely unpredictable, transliterate both cases নর (nôr) and ঘর (ghôr) with inherent "ô", i.e. "nôrô" and "ghôrô", but mark the entry ঘর (ghôr) to show that it's actually pronounced ঘর্ (ghôr) "ghôr", with "ঘর্" as a phonetic respelling employing the virama or হসন্ত (hôsônt) symbol (্) to suppress inherent vowels. --Anatoli T. (обсудить/вклад) 05:07, 13 March 2017 (UTC)
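A small Python sketch of that proposal, under stated assumptions: the transliterator always writes the inherent "ô", and an optional respelling containing the hasanta is what suppresses it. The three-letter table and the names `translit_all_inherent` and `romanise` are hypothetical and do not reflect Module:bn-translit.

```python
BN_CONSONANTS = {"ঘ": "gh", "ন": "n", "র": "r"}  # illustrative subset only
HASANTA = "\u09cd"  # Bengali virama
INHERENT = "ô"

def translit_all_inherent(text):
    """Romanise, writing the inherent vowel after every bare consonant."""
    out = []
    for ch in text:
        if ch in BN_CONSONANTS:
            out.append(BN_CONSONANTS[ch] + INHERENT)
        elif ch == HASANTA and out and out[-1].endswith(INHERENT):
            # The hasanta explicitly removes the preceding inherent vowel.
            out[-1] = out[-1][:-len(INHERENT)]
        else:
            raise ValueError("character not covered by this sketch: %r" % ch)
    return "".join(out)

def romanise(headword, respelling=None):
    """Use the phonetic respelling when an entry supplies one."""
    return translit_all_inherent(respelling or headword)

print(romanise("নর"))                  # no respelling: both vowels are written
print(romanise("ঘর", "ঘর" + HASANTA))  # respelling drops the final vowel
```

The design choice is that the default never guesses: a silent final vowel has to be stated once, in the entry, rather than inferred by the module.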
@Atitarev That's perfect! We should do that. DerekWinters (talk) 23:14, 13 March 2017 (UTC)
Well, one still needs to assess, as User:Wyang said, the percentage of unpredictable shwa-droppings. What is more common, cases like নর or ঘর. If cases like ঘর are much more typical, then they should still be used as the default behaviour. For new modules, e.g. Assamese or Oriya, you can probably ignore the shwa-dropping rules altogether until they are understood and described. Some online dictionaries display all inherent vowels even if they are silent. --Anatoli T. (обсудить/вклад) 05:50, 14 March 2017 (UTC)
I don't know how to get an accurate percentage, but I would wager quite a bit that it's a significant part of the vocab. Several verb conjugation endings have undropped schwas in Bengali, and a whole host of random words have it, like কেন (keno), মত (moto), বর্ষ (borṣo), etc. DerekWinters (talk) 04:14, 15 March 2017 (UTC)
@Metaknowledge An issue I am coming up against is the plurality of letters that sound the same now. Should I transliterate them uniformly (as I see on Wikipedia and elsewhere too), or should I transliterate them according to our IAST standards? For example, all the Ts and Ds are alveolar and the whole sibilant set is now an unvoiced velar fricative. DerekWinters (talk) 03:38, 13 March 2017 (UTC)
@Metaknowledge What do you think of this: WT:Assamese transliteration? DerekWinters (talk) 03:44, 13 March 2017 (UTC)
Looks good to me, but I don't know anything about Assamese. —Μετάknowledgediscuss/deeds 04:03, 13 March 2017 (UTC)
@Metaknowledge Do we want a one-to-one (or as close as possible) correspondence, or would we want to match it phonetically, losing the one-to-one correspondence by a large margin? DerekWinters (talk) 04:17, 13 March 2017 (UTC)
Really, it's up to editors for each language to decide. Different languages have different standards; there's no need to be bijective, but if that's the standard, we can cleave to it. —Μετάknowledgediscuss/deeds 04:33, 13 March 2017 (UTC)
I don't know Assamese so I'm only speaking from the experience from doing the Hindi and Nepali romanisation modules. The approach will have to depend on how irregular the schwa dropping and other unpredictabilities are. If more than 20% of all Assamese words in a comprehensive dictionary are unpredictable, a sensible approach may be to use a phonetic respelling in the main entry (relying on a pronunciation module) to cover the romanisation, and have all Assamese links refer to the Assamese articles themselves to extract the respelling, instead of applying an automatic algorithm which relies on external transcription assistance any time a word is romanised anyway. This is the approach used by the Thai-editing community here. Many Indic languages, for which a close-to-perfect transliteration algorithm is impossible, may benefit this way. Wyang (talk) 06:47, 13 March 2017 (UTC)
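As a sketch of how the "more than 20%" test could be applied, the Python below measures how often a candidate rule disagrees with attested romanisations. The sample is just the nine Bengali words quoted earlier in this thread, which were hand-picked to show irregularity and are in no way a representative dictionary sample; the function names and the "always drop the final vowel" rule are assumptions for illustration.

```python
# (reading without the final inherent vowel, romanisation quoted in this thread)
SAMPLE = [("tal", "tal"), ("Dal", "Dal"), ("bhal", "bhalo"), ("gal", "galo"),
          ("lal", "lal"), ("hor", "horo"), ("nor", "noro"), ("ghor", "ghor"),
          ("bor", "bor")]
THRESHOLD = 0.20  # the "more than 20%" figure suggested above

def always_drop_final(skeleton):
    """Hypothetical rule: the word-final inherent vowel is always silent."""
    return skeleton

def unpredictable_share(sample, rule):
    """Fraction of words whose attested form differs from the rule's output."""
    misses = sum(1 for skeleton, attested in sample if rule(skeleton) != attested)
    return misses / len(sample)

def recommend(share, threshold=THRESHOLD):
    """Pick an approach based on the share of unpredictable words."""
    return "phonetic respelling" if share > threshold else "automatic rule"

share = unpredictable_share(SAMPLE, always_drop_final)
print(round(share, 2), recommend(share))
```

Run over a proper word list rather than these few examples, the same measurement would give the percentage asked for elsewhere in this thread.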
I don't know Assamese and Bengali either. From what I know, like most Northern Indic languages, they also feature shwa-dropping. Dropping the inherent vowel shwa (transliterated as either "a" or "ô") must be common with Bengali as well in the final position after a consonant which follows a vowel. There are apparently exceptions; see DerekWinters's post above. If they are not typical, as I hope, then there can still be a rule to drop shwas in such positions.
For Bengali definitely it's safer to just have the phonetic respelling because there are a lot of exceptions. After I do more research on Assamese I'll come back to this. DerekWinters (talk) 23:29, 13 March 2017 (UTC)
@DerekWinters, thanks for adding the Assamese transliteration page. Another remaining major Indian language is Oriya/Odia. --Anatoli T. (обсудить/вклад) 07:38, 13 March 2017 (UTC)
Thanks. I know next to nothing about Oriya, but if I do get into it I'll add that one. DerekWinters (talk) 23:32, 13 March 2017 (UTC)
I think it would be a good idea to include the light schwa that Anatoli mentioned above. I know Wikipedia uses ǎ, but I think maybe using ə wouldn't be a bad idea. DerekWinters (talk) 23:40, 13 March 2017 (UTC)
The recent Oxford Hindi-English dictionary has the info on the light shwa. This is the best Hindi dictionary for foreigners so far (It has genders and usexes as well). --Anatoli T. (обсудить/вклад) 05:50, 14 March 2017 (UTC)
Oh interesting, what symbol do they use? DerekWinters (talk) 04:14, 15 March 2017 (UTC)
Just ə if I remember right. --Anatoli T. (обсудить/вклад) 04:25, 15 March 2017 (UTC)

Indic Languages

I've been meaning to ask, how do you know so many Indic languages? —Aryamanarora (मुझसे बात करो) 20:59, 14 April 2017 (UTC)

@Aryamanarora I only know three, Gujarati, Hindi, and as of late Bengali. Oh, and a little Cochin Konkani. Bengali and Cochin Konkani I learned from friends, and then supplemented my information from online. But I love to dabble in the others, picking up little bits of Marathi, Punjabi, Assamese, etc. (as you can see my focus is very strongly on North Indian languages). I'll do that from online resources, or, as is often the case, from music in those languages. That is especially how I got into Assamese recently. However, I do love learning how to write all and any manner of script out there, so that helps a lot with my Indic focus as well. But other than that I really don't know that many, nothing compared to those who live in India. DerekWinters (talk) 00:52, 15 April 2017 (UTC)
Oh sometimes I wish I still lived in India. My town only has Gujarati immigrants who I can't practice in Hindi with... And you'd be surprised at how many Hindi speakers don't know any other languages (besides English). I didn't know any others until I came on Wiktionary and got interested in them. Anyway, it's cool to see Indian languages becoming more important outside of India. —Aryamanarora (मुझसे बात करो) 22:22, 15 April 2017 (UTC)
True we gujjus do move everywhere. And you can probably practice some Hindi with them, although from experience some really suck at Hindi. And yeah I've noticed lol. DerekWinters (talk) 01:17, 16 April 2017 (UTC)

Caribbean Hindustani

Do you have any info about this language? I was able to find [4], but it seems to be focused on the dialect spoken in Suriname. Thanks. —Aryamanarora (मुझसे बात करो) 00:04, 20 May 2017 (UTC)

I know the least if anything about Caribbean Hindustani. Although, just googling Guyanese Hindi gives me a wealth of stuff if you want to look into that. DerekWinters (talk) 20:38, 20 May 2017 (UTC)

pitzote

It's not in any Spanish dictionaries I can find, hence the revert. --Celui qui crée ébauches de football anglais (talk) 11:03, 24 May 2017 (UTC)

@Celui qui crée ébauches de football anglais However, it fits Mexican Spanish phonology rules for borrowings from Nahuatl. Regardless, instead of just wiping it, bring it up in the Etymology Scriptorium. DerekWinters (talk) 18:31, 24 May 2017 (UTC)
@DW. I don't care too much about it to do that. --Celui qui crée ébauches de football anglais (talk) 07:00, 25 May 2017 (UTC)

Apabhramsa

So, I found this comprehensive grammar [5] but it seems to treat Apabhramsa as a continuum instead of separate languages. The author divides it into 4 dialects, Northern (attested only in 1 work), Southern, Eastern, and Western. Which one of these is Gurjar and which one is Sauraseni do you think? —Aryamanarora (मुझसे बात करो) 03:25, 28 May 2017 (UTC)

On page 15 he says Western Apabhramsa is the ancestor of Punjabi, Hindi, and Gujarati. Are Gurjar and Sauraseni Apabhramsa that similar? —Aryamanarora (मुझसे बात करो) 03:42, 28 May 2017 (UTC)
@Aryamanarora: He also mentions a lot of confusions, so I wouldn't take anything all that seriously in terms of classification. I think our best course is to work up for now, add Old Gujarati, Hindi, Marathi, Punjabi, Oriya, Bengali, etc. etc. Then we can properly pick at the next stage. That or complete the Prakrits on here first and then move down. Those are the best classified of the sources remaining and that should help a lot. DerekWinters (talk) 03:51, 28 May 2017 (UTC)
@DerekWinters: Sometimes it's disappointing how poorly recorded New Indo-Aryan languages are on Wiktionary. It seems though that we're getting a lot more editors as of late. I'll start bulking up Punjabi soon. —Aryamanarora (मुझसे बात करो) 04:00, 28 May 2017 (UTC)
@Aryamanarora Indeed. And good, Punjabi needs a lot. But what I meant was Old Hindi, Old Punjabi, Old Marathi, etc. etc. But regardless any language we add to is good. DerekWinters (talk) 04:04, 28 May 2017 (UTC)
Is there even an Old Punjabi? —Aryamanarora (मुझसे बात करो) 04:09, 28 May 2017 (UTC)
Punjabi has been attested since the 12 or 13 hundreds (I think), so I would assume that to mean there was an Old Punjabi. DerekWinters (talk) 04:10, 28 May 2017 (UTC)
@Aryamanarora Pinged. DerekWinters (talk) 04:11, 28 May 2017 (UTC)
@Aryamanarora "By 500 AD these Middle Indo-Aryan dialects had been developed many local features and lost many inflectional morphemes. Literary form of these dialects is known as Apabhramsha (ਅਪਭ੍ਰੰਸ਼,اپبھرنش). Principle Apabhramshas are Takka Apabhramsha in Central Punjab and Vrachada Apabhramsha in Southern Punjab. By 1200 AD these Apabhramshas or 'corrupt dialects' had few inflectional morphemes left. During Middle Ages Takka Apabhramsha developed into Lahori dialect and Vrachada (व्राचड/व्राचड़) Apabhramsha developed into Multani dialect." I found this on the "History of the Punjabi language" wiki page. I have never even heard of those Apabhramshas so maybe you know something. DerekWinters (talk) 06:20, 28 May 2017 (UTC)
@Aryamanarora [6] Here's this. DerekWinters (talk) 06:46, 28 May 2017 (UTC)
And this [7] page 1469
"The immediate predecessor of Sindhi was an Apabhramsha Prakrit named Vrachada. Arab and Persian travellers, specifically Abu-Rayhan Biruni in his book 'Tahqiq ma lil-Hind', had declared that even before the advent of Islam in Sindh (711 A.D.), the language was prevalent in the region. It was not only widely spoken but written in three different scripts – Ardhanagari, Saindhu and Malwari. Biruni has described many Sindhi words leading to the conclusion that the Sindhi language was widely spoken and rich in vocabulary in his time." This is from an older version of the Sindhi page on wikipedia, but due to lack of sources was removed.
[8] This is all I have for now. DerekWinters (talk) 06:46, 28 May 2017 (UTC)
Thanks, that's really interesting! We have some languages to add I guess. —Aryamanarora (मुझसे बात करो) 12:19, 28 May 2017 (UTC)

entry request. . .

Hi, are you still working on Old Gujarati? I was going thru *saĵʰásram and found Old Gujarati सहस listed as a descendant, just wondered if you could create it. - madhavpandit (talk) 15:04, 19 June 2017 (UTC)

@माधवपंडित Hey! Did it. DerekWinters (talk) 01:27, 21 June 2017 (UTC)

Thanks a lot!!!! - madhavpandit (talk) 07:16, 21 June 2017 (UTC)

Newari

There is a Newari Bible that you may find useful. I also found a Classical Newari dictionary online (at least I think it's Classical Newari). —Aryaman (मुझसे बात करो) 17:09, 10 July 2017 (UTC)

@Aryamanarora Thanks! I'll check them out. DerekWinters (talk) 22:05, 10 July 2017 (UTC)

Gujarati-English Dictionary resource

This is a 1925 Gujarati-English dictionary (1600+ pages) published by Baroda State. It is now in the public domain, so it can be added to Wiktionary. It was scanned by the Digital Library of India and then mirrored on the Internet Archive.

Hope this helps. :) --Nizil Shah (talk) 13:37, 26 July 2017 (UTC)

@Nizil Shah Thank you very much. Please add Gujarati words here as well; there are still very few. DerekWinters (talk) 14:32, 26 July 2017 (UTC)
@Nizil Shah But in particular, do you know words for technology and other things of the modern world (and Gujarati slang)? If you do, please add them or write them down here. DerekWinters (talk) 14:35, 26 July 2017 (UTC)
Many members of Gujarati Wikipedia are eager to add to the Gujarati Wiktionary and make it large, but we are waiting for the launch of a Wiktionary that supports structured data the way Wikidata does. After that, there is also a plan to put online the Bhagavadgomandal, the largest encyclopedic dictionary of the Gujarati language (more than 9,000 large pages; 281,377 words; in the public domain since 2016). Along with that we will add modern words too. At the moment I am more active on Wikidata and Wikipedia, where a lot of work is also pending. Even so, I will keep adding words here from time to time. There is plenty of slang (in the sense of swear words and profanity, or in the sense of everyday usage?) and many modern words worth adding, so we will add those too. There is a huge difference between the Gujarati spoken twenty years ago and the Gujarati spoken now. Nowadays Gujarati is hardly ever spoken without English words, and their pure Gujarati equivalents are found only in dictionaries. For example, the pure Gujarati for television/TV would be દુરદર્શન, but nobody uses it. Besides this, I would have to learn Wiktionary myself and also get a better grip on Gujarati grammar and lexicography. We will see and we will do it, but slowly. Thanks. --Nizil Shah (talk) 19:17, 26 July 2017 (UTC)
Thank you very much again. I know that hardly anyone speaks pure Gujarati these days, but even so, rather than adding English words, it would be far more useful to add distinctively Gujarati words (such as rural speech, swear words, profanity, etc.). And yes, we don't want Bhadrambhadra-style words like દુરદર્શન, દુરવાણી, અગ્નિરથ, but words that are genuinely used. As for slang, whatever you think appropriate. DerekWinters (talk) 21:33, 26 July 2017 (UTC)
I have added a few translations and words. Please have a look and let me know if you have any suggestions.--Nizil Shah (talk) 14:29, 1 August 2017 (UTC)
Is there any gadget or tool for adding new words without the headache of understanding templates? If there is, please let me know. Is there a tool with which the words of the dictionary mentioned above could be added quickly and easily?--Nizil Shah (talk) 14:36, 1 August 2017 (UTC)
@Nizil Shah I assume there is some tool, but I make a new page by copying an existing page. DerekWinters (talk) 19:56, 2 August 2017 (UTC)
@Nizil Shah I found a document of many basic Gujarati words (mostly nouns) that we are missing. Search for "Basic Vocabulary of Gujarati Babu Suthar" online, it's a 20-ish page PDF. —Aryaman (मुझसे बात करो) 07:04, 9 August 2017 (UTC)
@Aryamanarora, thanks. It is helpful. These basic words should be added here. I will email both of you a book on "A Grammar of Gujarati" explained in English.--Nizil Shah (talk) 11:22, 9 August 2017 (UTC)

Regional Hindi

So far I'm aware of Delhi, Mumbai, Hyderabad, and Indore-specific Hindi. Does Gujarat have any special Hindi dialects? Also, what exactly is Mumbai Gujarati like? —Aryaman (मुझसे बात करो) 20:22, 13 August 2017 (UTC)

I am not sure whether it is considered a dialect, but Gujaratis do speak Hindi a bit differently. They tend to add Gujarati words to fill out their sentences when they do not know a specific Hindi word. Apart from that, some common Gujarati idioms and informal words are also used frequently. Gujarati Hindi is heavily influenced by Hindi TV shows and films. Mumbai Gujarati is different in two ways. First, almost all second-generation Gujaratis living in Bombay (Mumbai, as they call it now) are educated in English-medium schools, surrounded by friends who speak Hindi and other regional languages. So their Gujarati is mostly limited, and they have a hard time understanding Gujarati words that are not in daily use. Their Gujarati is sometimes flavoured with Hindi and English words to fill in the gaps where they do not know the Gujarati words. Second, their tone differs from family to family, as the families have roots in different regions of Gujarat. For example, a Kutchi Gujarati family may speak Kutchi at home, and their child may speak Gujarati with a Kutchi flavour. I do not know whether these aspects of the Gujarati language have been studied and documented by scholars.--Nizil Shah (talk) 05:38, 24 August 2017 (UTC)
@Nizil Shah: That's interesting. I don't think Indian-language scholars pay much attention to dialects (yet), I found very little so I decided to ask. I can make the Swadesh list btw. —Aryaman (मुझसे बात करो) 17:55, 24 August 2017 (UTC)
Appendix:Gujarati Swadesh list —Aryaman (मुझसे बात करो) 17:58, 24 August 2017 (UTC)
Aryaman, thank you for creating the Swadesh list. I will fill it in when I am free. And sorry, DerekWinters, for using your talk page as a discussion page. Do we have a discussion page for Gujarati/Indian language related things? Ideally we should talk there, so that interested parties can join in.-Nizil Shah (talk) 18:40, 24 August 2017 (UTC)
@Nizil Shah: Since it's just you and DerekWinters that actually know Gujarati, here is fine. Although Wiktionary talk:About Gujarati could be made. WT:BP is the common discussion place, that can be used as well. —Aryaman (मुझसे बात करो) 18:43, 24 August 2017 (UTC)
We do have some active Gujarati Wikipedians, but none of them is active on either the Gujarati or the English Wiktionary. I only recently started editing here, though I have been active on Wikipedia and Wikidata for a very long time. Gujarati Wikipedians also want a large Gujarati Wiktionary and had plans to add the largest Gujarati dictionary (9000 pages), which came into the public domain in 2016, but postponed the plan because of the upcoming Wikidata-style structured dictionary. We may have more active Gujarati editors once it becomes operational (possibly with a few bots).-Nizil Shah (talk) 18:52, 24 August 2017 (UTC)

Gujarati Swadesh list

Can we have Gujarati Swadesh List (most basic 200 words) like Appendix:Hindi Swadesh list?--Nizil Shah (talk) 05:38, 24 August 2017 (UTC)

I have completed Appendix:Gujarati Swadesh list finally. Have a look and create words which are not here.--Nizil Shah (talk) 12:49, 5 October 2017 (UTC)
@Aryamanarora, you too. Thanks for your help too.--Nizil Shah (talk) 12:50, 5 October 2017 (UTC)

કેમ છો? (How are you?)

બહુ સમયમાં તમને નહીં દેખા છે... I probably messed that up bad. —Aryaman (मुझसे बात करो) 21:41, 18 September 2017 (UTC)

@Aryamanarora Hello! I'm back, but only for a while. You would use જોવું instead: "બહુ સમયથી તમને જોયો નથી". By the way, can you read Gujarati well yet? DerekWinters (talk) 20:43, 24 September 2017 (UTC)
I can read it pretty well now, I'm trying to get to gu-2, then I'll probably slow down... it's amazing how easy it is to learn just by knowing Hindi. Good to see you back (if only for a while)! Be sure to look through WT:BP. —Aryaman (मुझसे बात करो) 20:46, 24 September 2017 (UTC)
@Aryamanarora. If you know Hindi, it is very easy to learn to read Gujarati. Both languages follow the same alphabetical order (Ka, Kha, Ga, etc.). Only the script is a bit different. Just memorise the Hindi equivalents of the Gujarati letters and you are good to read. @DerekWinters, "બહુ સમયથી તમને જોયો નથી" has a minor issue. With "તમને", "જોયા" is used instead of "જોયો" = "બહુ સમયથી તમને જોયા નથી". To close acquaintances, friends and so on, one says "બહુ સમયથી તને જોયો નથી", where "જોયો" goes with "તને". That said, in everyday conversation people now say "બહુ ટાઈમથી તને જોયો નથી". ટાઈમ is now heard more often than સમય. :)--Nizil Shah (talk) 07:14, 4 October 2017 (UTC)
@Nizil Shah: Thanks for the help! —Aryaman (मुझसे बात करो) 11:02, 5 October 2017 (UTC)
@Nizil Shah Thank you! Yes, I realized the mistake as soon as I said it. My biggest problem in Gujarati is gender. I just don't understand where and how it is used. DerekWinters (talk) 16:39, 4 October 2017 (UTC)
તને જોયો નથી is the casual way to say it, while તમને જોયા નથી is the polite way. It is somewhat difficult to explain how to decide on and use gender in Gujarati, but I will try to explain later (after some research/reading).--Nizil Shah (talk) 11:52, 5 October 2017 (UTC)
Look here: https://en.wikibooks.org/wiki/Gujarati/Gender--Nizil Shah (talk) 12:03, 5 October 2017 (UTC)
@Nizil Shah Thank you! DerekWinters (talk) 20:40, 7 October 2017 (UTC)
Hmm. That URL got mixed up with your signature --. Try this: Gujarati/Gender. —Stephen (Talk) 13:31, 8 October 2017 (UTC)

Braj

I've unmerged Braj (and Haryanvi, but there was only one Haryanvi lemma) from Hindi, see CAT:Braj lemmas. I think the best way to deal with the Hindi "dialects" is something like {{zh-dial}}, which could also have rows for Shuddh Hindi and Colloquial Hindustani, as well as Persianized Urdu. That kind of system could also be useful in Punjabi (if you don't know, Punjabi has three standard dialects and a bunch of lesser ones), and I'd imagine other Indo-Aryan languages that have clear dialectal variation, like Konkani. —Aryaman (मुझसे बात करो) 22:02, 10 October 2017 (UTC)

@Aryamanarora If we do plan on remerging any of the "dialects" again, then we should really do so in the way English on wikt. handles them, with a {{lb|hi|Braj}} marker or something similar. If we want, Hindustani wouldn't be bad for a merger idea (something I would strongly support). The way Chinese here does it is by merging what are very separate languages, or at least much more separate than Braj and Haryanvi are from Khariboli. And to really take Shuddh Hindi, Baazari Hindi, and Urdu as separate registers really just plays into the hands of the politicians in India and Pakistan. I think it would be best to treat the terms that are more Urdu as {{lb|hi|Urdu}} (or whatever the new code would be), the ones that are more Hindi as {{lb|hi|Hindi}}, and those that are extremely arcane Persian, Arabic, or Sanskrit (or other) learned borrowings as {{lb|hi|rare}} or something similar. DerekWinters (talk) 00:42, 11 October 2017 (UTC)
Also, since what is called "Hindi" isn't even monophyletic, let's keep the "Eastern Hindi" languages out of this. If we were to make an entry in Hindi that inherited a term from Ardhamagadhi Prakrit, that would be a bit problematic. DerekWinters (talk) 00:58, 11 October 2017 (UTC)
Well that was the original idea, see e.g. कौ before I unmerged. It has {{lb|hi|Braj}} and it was categorized into CAT:Braj Bhāṣā, which was under CAT:Regional Hindi. As for merging Hindustani, I would definitely be for that, it's just there's no real benefit in terms of duplication of content since you won't have Hindi and Urdu headers on the same page (unlike Serbo-Croatian or Chinese, where several "languages" would be on the same page; in the case of Chinese, the logogrammic script is the real problem). Also I'd rather do {{lb|hi|India}} and {{lb|hi|Pakistan}} (but that also alienates Urdu speakers in India, e.g. the prestige dialect of Lucknow), but I wouldn't be averse to your idea.
With a template like {{zh-dial}} we could keep the "languages" in separate headers and still integrate them tightly without having to deal with merging (not to mention the neat maps the template generates). Yes, you're right that Hindi isn't monophyletic, but I think some sort of convergent evolution has occurred here between Eastern and Western Hindi, where they are pretty much mutually intelligible nowadays. We can't merge Eastern Hindi lects into Hindi obviously, but a dialectal synonyms template could still be used to link them together. —Aryaman (मुझसे बात करो) 01:16, 11 October 2017 (UTC)
Do explain what you have in mind for the dialectical synonyms template. I'm quite interested. DerekWinters (talk) 22:54, 11 October 2017 (UTC)
@Aryamanarora DerekWinters (talk) 22:54, 11 October 2017 (UTC)

Basically something like this, but with data stored in a backend:

Dialectal synonyms of भाषा (bhāṣā) / زبان (zabān, language)

Variety | Location | Words
Hindi | (literary) | भाषा
Urdu | (literary) | زبان / ज़बान, لسان / लिसान
Western | Hindustani | बोली / بولی
 | Braj | भाखा, बोली
Eastern | Awadhi | बोली
 | Bhojpuri | बोली

Obviously more varieties are there, and for simple words like मैं it would be much more comprehensive. I also suppressed transliteration because it's visually distracting IMO. —Aryaman (मुझसे बात करो) 00:29, 12 October 2017 (UTC)

@Aryamanarora This is interesting. What purpose would this serve? Do you plan on removing the Braj, Haryanvi headers and replacing it with this? Also, though this need not be the case, certain Saaf Urdu terms derived from Arabic may be far too rare to include. Essentially the complexities of the two standards and the various sociolects I think are being simplified a bit too much in something like this. And unlike some of the "dialects" of Hindi, Braj and Awadhi have long had independent literary traditions, and Bhojpuri speakers today are quite adamant about keeping their language separate. I'm not happy about the situation with Chinese, because I think it hides the complexities within each form, and I think this would do the same here. DerekWinters (talk) 00:58, 12 October 2017 (UTC)
I know I'm not being the most coherent, but I'm just afraid that the mergers won't be the most helpful here. DerekWinters (talk) 00:59, 12 October 2017 (UTC)
No way! I'm not arguing for a merger here at all. I'm not a lackey of the Indian government or a Hindi partisan! The template would be placed in a synonyms section at Hindi भाषा (bhāṣā), Urdu زبان, Braj भाखा (bhākhā), and Bhojpuri बोली (bolī), and whatever other words mean "language" in the Hindi family languages. Chinese at least has the unified script going for it (things like Braj भाखा (bhākhā) are rare), so IMO it's easier to do a merger. I agree Bhojpuri and Braj etc. have independent evolution from Hindi (Khadiboli has a short literary tradition, and Manak Hindi is an artificial recent creation). Basically, this would allow us to have links between these sister (or in the case of Eastern Hindi, cousin sister) lects which are necessary if we continue to expand content. Plus, data would be kept in a database that we could edit and have it affect all entries involved, which really cuts down on maintenance.
I did originally think that a merger would work, but learning about different "varieties" of Hindi has shown me how daunting such a task would be. Not to mention it wouldn't make sense to merge Hindi and Urdu while keeping Braj, Awadhi etc. separate, and also vice versa. —Aryaman (मुझसे बात करो) 01:15, 12 October 2017 (UTC)
@Aryamanarora Interesting, it's not bad actually. What forms will you choose for the "dialects"? Because to use भाखा or बोली would reduce the language to almost exhibition/display status: showing off its differences when भाषा would be just as appropriate (if not more so) in any of these languages. Like, does बोली now mean a dialect in them (because it does in Gujju)? This is also a bad example, because काला wouldn't have this issue. DerekWinters (talk) 02:22, 12 October 2017 (UTC)
True, it was just something I whipped up to show what it would look like. Of course, they would all use भाषा in writing as well, but I wonder whether they still would in colloquial speech? I feel like the enthusiastic use of Sanskrit borrowings is only a recent invention in the history of these languages. Braj and Awadhi subsisted just fine on Desi words and Perso-Arabic borrowings for 500 years. As for बोली, it means "dialect" in standard Hindi, but I noticed on Bhojpuri Wikipedia some articles say भोजपुरी बोली? Do they really mean dialect in that case? —Aryaman (मुझसे बात करो) 10:24, 12 October 2017 (UTC)
I'm not the most certain either to be honest. But regardless, this isn't a bad idea if you wish to pursue it. DerekWinters (talk) 18:15, 12 October 2017 (UTC)
I'll try to whip something up tomorrow (or later today if I can). I also am thinking about a Hindi declension module since our current templates are quite primitive. —Aryaman (मुझसे बात करो) 18:59, 12 October 2017 (UTC)
@Aryamanarora If you have the time, ability, and information to implement such a system it would look fantastic. The Deccan language also appears to be a variety of Hindi-Urdu even though it is farther away from the 'Hindi belt'.
Perhaps Punjabi and Marathi-Konkani could use this system too as you mentioned. According to w:Maharashtrian Konkani, 'there is a continuum between standard Marathi and Goan Konkani', and according to w:Marathi-Konkani languages ‘several of the Marathi-Konkani languages have been variously claimed to be dialects of both Marathi and Konkani’. If there is information about words in these dialects and how they are related, then such a template would be useful. Kutchkutch (talk) 01:33, 12 October 2017 (UTC)
@Kutchkutch: It won't be that difficult, it would just be a slightly edited form of Module:zh-dial-syn for the backend code. I do agree that there is a severe lack of information about regional dialects, and I imagine much of it isn't even in English and so is harder to obtain. Specifically for Hindi belt languages though, there is plenty of information about Hindi and Urdu, and pretty comprehensive coverage of Braj and Awadhi (McGregor's Hindi dictionary has both), as well as local vocabulary for Mumbai and Hyderabad (CAT:Hyderabadi Hindi, but it is usually considered closer to Urdu AFAIK; it's a form of Dakkhini like you said). Madhavpandit knows quite a bit about Konkani dialects it seems, and Punjabi has some scattered resources online. It's about time we modernized the entries for Indic languages. —Aryaman (मुझसे बात करो) 01:39, 12 October 2017 (UTC)

I'm holding off on this until we have more coverage of Braj and Awadhi... anyway, you might like [9], it's a great introduction to Braj. —Aryaman (मुझसे बात करो) 20:57, 28 October 2017 (UTC)

@Aryamanarora: The link is broken :( DerekWinters (talk) 20:58, 28 October 2017 (UTC)
Huh, weird. The rest of the site is down for me but that one link works... it's in my cache probably. Here's Wayback Machine. —Aryaman (मुझसे बात करो) 21:04, 28 October 2017 (UTC)
@Aryamanarora: Thanks!! This is very cool. Can't wait to use it to add Braj words. Lol always liked Braj/etc. poetry for how freely they use their tadbhavs. Hindi's forced Sanskritization has honestly been so bad because now everything sounds forced, and it's unfortunately seeping into everything, be it Braj, Awadhi, or even Gujarati. The tadbhavs are seen as village/uneducated speech and the Sanskritization is horribly artificial, so English and Farsi step in. Like damn, there goes half the expressivity of the language. But regardless, thanks again for this! DerekWinters (talk) 23:42, 28 October 2017 (UTC)
No problem! Rupert Snell has written some really cool stuff. A lot of cool words like मीत (mīt, friend), हिया (hiyā) and साद (sād, word) have been replaced by tatsamas sadly. But to be fair, the Sanskritization is necessary if Hindi ever wants to be used in technical contexts. I suppose using tadbhavs would be possible, but words like *पयासानू (*payāsānū, photon) just aren't suited for that kind of task. English does the same, borrowing from Latin instead of Sanskrit. But yeah, it's a shame the cool (and archaic) village dialects are subsumed by high-minded Hindi due to this kind of thing. Anyways, there's a lot of great Braj prose too. Lalluram's Rajniti is one I can think of. —Aryaman (मुझसे बात करो) 14:33, 29 October 2017 (UTC)
@Aryamanarora: I do have to disagree with you here. As words are inherently a set of sounds strung together, why not make them the most meaningful set of sounds for a speaker? For a native Braj speaker, something like *पयासकन would be infinitely more understandable and useful than प्रकाशाणु, but inherently neither word is more suited to the concept. They both are, but one works for Braj speakers and the other for native Sanskrit speakers. English also could use something like *lightbit or *lightpart, and honestly, I would have preferred such a term in science class, as it's infinitely more meaningful to me than *photon, which, without having an understanding of Greek (and as an English-language student, why should I have to), is essentially the same as any other meaningless string of sounds. DerekWinters (talk) 15:57, 29 October 2017 (UTC)
Well, *पयास (*payās) would actually be a borrowing from Prakrit lol, so it wouldn't be all that transparent. And at this point, Sanskrit borrowings have become far too entrenched in Hindi (and all other Indian languages) to be purged. Old Sanskrit morphemes have become productive again, and we end up with words like केंद्रक (kendrak, nucleus) which never actually existed in Sanskrit but are native coinages (much like English photon is a coinage from Greek components). And it's convenient that every Indian language has a word like kendrak meaning "nucleus". I guess we'll have to agree to disagree. —Aryaman (मुझसे बात करो) 20:06, 29 October 2017 (UTC)
Out of curiosity, what are tadbhavs and tatsamas? In any case, this discussion reminds me of this wikipedia article. I find the idea of "transparency" fascinating and worth aiming at, and I've always thought calques are much more interesting words than simple borrowings, which ofttimes simply look/sound hideous to me (χάμπουργκερ, I'm thinking of you). --Barytonesis (talk) 21:35, 29 October 2017 (UTC)
@Barytonesis: In the ancient Sanskrit grammatical tradition (and so in modern Indian-language linguistics), a तद्भव (tadbhava, literally coming/arising from that) is a word that is inherited into an Indo-Aryan (or Dravidian) language from Sanskrit by way of the Prakrits. The Prakrits (प्राकृत (prākṛta, literally natural)) were the vernacular languages of India c. 300 BCE (the composition of the Ashokan Edicts in an early Prakrit) to I'd venture about 900 CE. Meanwhile Sanskrit (संस्कृत (saṃskṛta, literally put together, well formed, perfect)) was by this time a literary language. The relation between Sanskrit and Prakrit is analogous to that between Latin and Vulgar Latin, except Prakrit flourished as a literary language for a long time and we have strong corpuses from three literary dialects of Prakrit.
A तत्सम (tatsama, literally same as that) is a word that is borrowed from Sanskrit into an Indo-Aryan (or Dravidian) language. These words are very important to the use of modern Indian languages in technical fields. However, these words can be intransparent and often cumbersome, and are very rare in spoken language, where they are replaced by English. Often, these words are used to calque English compounds and even whole expressions (e.g. Hindi एक शब्द धन्यवाद का (ek śabd dhanyavād kā), calque of a word of thanks).
Often, tatsama and tadbhava doublets are both in use in different semantic fields. One of my favorites is खेत (khet, a field for farming)/क्षेत्र (kṣetra, a region; field of study). There's also बाँस (bā̃s, bamboo)/वंश (vanś, lineage; dynasty), सब (sab, all)/सर्व (sarva, universal (in compounds)) etc. etc. —Aryaman (मुझसे बात करो) 17:29, 30 October 2017 (UTC)
Although, on the matter of being important in technical fields, that is simply a matter of preference for fancy, high-sounding Sanskrit words due to the perception that common words are inherently not meaningful enough, similar to how English treats its technical/scientific vocabulary as well. DerekWinters (talk) 18:28, 30 October 2017 (UTC)
Yes, Sanskrit is now way too entrenched in technical language to really get rid of, much like no one "in the field" pays much heed to English linguistic purism and instead uses the Greek and Latin words that were borrowed or coined a long time ago. Sadly, Hindi purism focuses too much on purging the Perso-Arabic element (which I think is essential to the language) and not on old Sanskrit borrowings that are "native". —Aryaman (मुझसे बात करो) 19:01, 30 October 2017 (UTC)
Fair, I guess we'll just have to agree to disagree. DerekWinters (talk) 19:11, 30 October 2017 (UTC)

BP discussion[edit]

Hello, would you mind sharing your thoughts at the discussion of standards for Coptic? There have so far only been three participants. Lingo Bingo Dingo (talk) 14:55, 6 November 2017 (UTC)

@Lingo Bingo Dingo: Hi, sorry about not responding sooner. I am not as knowledgeable about Coptic as I'd like, but I gave my input on those matters that I have knowledge of. I'm glad to see an increase in coverage of Coptic (and Demotic!) on wikt! DerekWinters (talk) 20:54, 6 November 2017 (UTC)

Ashokan Prakrit[edit]

Do you reckon this deserves a code? Turner lists it as a separate language, and it's pretty well documented AFAICT. Also @माधवपंडित, Kutchkutch, Sagir Ahmed Msa. —Aryaman (मुझसे बात करो) 03:04, 11 November 2017 (UTC)

@Aryamanarora: Wow! This is not Pali? If it's distinguished enough from other languages and also well documented, this certainly needs to be added! I love the name. -- mādhavpaṇḍit (talk) 03:19, 11 November 2017 (UTC)
@माधवपंडित: It's the language of the Ashokan Rock and Pillar Edicts. It's the coolest thing, because it was apparently intelligible by people all around India. I think it was pre-Pali. —Aryaman (मुझसे बात करो) 03:22, 11 November 2017 (UTC)
@Aryamanarora: I fully support this!! -- mādhavpaṇḍit (talk) 03:25, 11 November 2017 (UTC)
@माधवपंडित, Aryamanarora Many sources just refer to this as Pali. But since Pali was so widespread, making a separate code would be better for keeping this separate from Pali. So 'Piyadasi' would be 𑀧𑀺𑀬𑀤𑀲𑀺 (piyadasi), 'Kalinga' would be 𑀓𑀮𑀺𑀦𑁆𑀕 (kalinga) and 'Dhamma' would be 𑀥𑀫𑁆𑀫 (dhamma) even though धम्म already exists? Some inscriptions are in the Greek script, Kharosthi script, and Aramaic with Hebrew script. The Greek inscriptions use Πιοδασσης (Piodassēs) for 'Piyadasi' and εὐσέβεια (eusébeia) for 'Dhamma'. Kutchkutch (talk) 05:01, 11 November 2017 (UTC)
@Aryamanarora The Ashokan Prakrits were just the local languages of the places where each of his edicts was written. Therefore there is no one language. DerekWinters (talk) 06:00, 11 November 2017 (UTC)
In that case, maybe we can figure out how many Ashokan Prakrits there were and create a code for each one of them if they don't already exist. Kutchkutch (talk) 07:42, 11 November 2017 (UTC)
I think that the languages used in each region represent things like Shauraseni, Maharashtri, etc. etc. Not different languages. DerekWinters (talk) 14:11, 11 November 2017 (UTC)
The Dramatic Prakrits ((Jain) Sauraseni, (Jain) Maharastri, (Ardha)magadhi, etc.) were only written down in about the 3rd century CE, while these inscriptions are from the 3rd century BCE, 500 years before. The Indologist Amulyachandra Sen on page 8 says: "The language of the Aśokan inscriptions is Prakrit. But it is not quite the same as any of the other literary forms known of Prakrit, it has been called Aśokan Prakrit or Prakrit of the Aśoka inscriptions. It has affinities with Māgadhī Prakrit [I think the document means Ardhamāgadhī]. The language of the Girnār version of the REs [Rock Edicts] is close to Pali." It's the oldest Prakrit by far; I think it ought to have a code. —Aryaman (मुझसे बात करो) 20:28, 11 November 2017 (UTC)
Interesting! If that's so, we should consider it. Are you sure that it was unified though? Because I remember reading that it was quite different in different parts of India. DerekWinters (talk) 01:17, 12 November 2017 (UTC)
Apparently, there were some spelling differences in some of the edicts (many were just copying errors, since they were "translated" from a master tablet). AFAICT they all seem to be very similar, and no doubt mutually intelligible. We could always use dialect tags, like Turner does. —Aryaman (मुझसे बात करो) 17:07, 12 November 2017 (UTC)

I just started looking through the edicts and translations provided by Hultzsch in 1925 for quotes and stuff. Apparently some Greek names are attested; they would make for some cool WT:FW nominations. —AryamanA (मुझसे बात करेंयोगदान) 02:23, 17 November 2017 (UTC)

@Aryamanarora Oh true! There are so many interesting names in the inscriptions. I wonder how many other cool terms will pop up. DerekWinters (talk) 06:01, 17 November 2017 (UTC)

More Hindi dialects[edit]

It seems the Indian government is working on something huge: [10]. I was able to find the project description online: "Under this project dictionaries of 48 dialects of Hindi are to be developed. At the initial level Unicode based trilingual digital dictionaries of Bhojpuri, Brijbhasha, Rajashthani, Chhattisgarhi, Bundeli, Awadhi, and Malvi, Kangari, Gadhwali, Magahi, and Hariyanavi dialects are being prepared." We should keep a watch on it. —Aryaman (मुझसे बात करो) 17:09, 12 November 2017 (UTC)

@Aryamanarora: Oh wow this is super interesting! I hope it does come to something. DerekWinters (talk) 18:30, 12 November 2017 (UTC)

Bolding in Indic scripts[edit]

It seems that some of the Indic scripts (including Devanagari for me) don't actually become bold when using '''? I used a hack in my personal css a while ago to force it to appear bold, should we add it to global css? I also made {{hi-x}} make the bolded text bigger and highlighted, if the user's font doesn't support bolding.

यह एक बोल्ड शब्द है।
yah ek bolḍ śabd hai.
This is a bold word.

Also @माधवपंडित, Kutchkutch. —AryamanA (मुझसे बात करेंयोगदान) 04:09, 25 November 2017 (UTC)

(Sorry for cluttering your user page, I have no idea where to put this stuff) —AryamanA (मुझसे बात करेंयोगदान) 04:10, 25 November 2017 (UTC)
@AryamanA: The lack of bolding happens for me as well. I had always assumed it was an issue on my end rather than a system-wide issue. Kutchkutch (talk) 04:16, 25 November 2017 (UTC)
@AryamanA: The bolding works for me on mobile but does not work (Devanagari too) in the desktop mode. -- माधवपंडित (talk) 04:19, 25 November 2017 (UTC)
@AryamanA: Please do, I was thinking about this issue just today. And no worries, that's what talk pages are for haha. DerekWinters (talk) 04:20, 25 November 2017 (UTC)
@Kutchkutch, माधवपंडित: Okay, I've figured out the problem I think. It only affects Devanagari on desktop mode, the global CSS has:
/* Devanagari */

.Deva {
        font-family: Devanagari Sangam MN, Devanagari MT, Mangal, Raghu, Gargi, JanaSanskrit, JanaHindi, Arial Unicode MS, Code2000, Bitstream Cyberbit, Bitstream CyberBase, Siddhanta, sans-serif;
        font-size: 125%;
}

.Deva, .Deva * {
        font-style: normal;
        font-weight: normal;
}
I think font-weight: normal; suppresses bolding, and font-style: normal; suppresses italicizing. Also, since we're on the topic, I think the Devanagari fonts look horrible. I use Noto Serif Devanagari on my personal CSS, but most people don't have that font. Do you guys have any suggestions for some fonts? —AryamanA (मुझसे बात करेंयोगदान) 04:28, 25 November 2017 (UTC)
lol, I fixed it, but now all uses of {{m}} and {{ux}} are italicized. It looks ugly imo. —AryamanA (मुझसे बात करेंयोगदान) 04:31, 25 November 2017 (UTC)
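For illustration only, here is a minimal sketch of a narrower override than dropping the whole rule. It is not the change that was actually made to the global CSS; it assumes that bold markup (''') renders as b or strong elements and that Devanagari text carries the .Deva class quoted above. Because .Deva b is more specific than .Deva *, bold should come back without also re-enabling the italics applied by {{m}} and {{ux}}.

```css
/* Sketch, not the real site CSS: restore bold inside Devanagari text.
   Assumes ''' renders as <b> (or <strong>) within an element of class "Deva". */
.Deva b,
.Deva strong {
        font-weight: bold;
}

/* Assumed case where the bold element wraps the Devanagari span instead. */
b .Deva,
strong .Deva {
        font-weight: bold;
}
```

Nested markup inside the bold span would still be caught by the .Deva * rule, so this covers only the simple cases.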
Adobe Devanagari isn't bad. DerekWinters (talk) 04:33, 25 November 2017 (UTC)
@AryamanA Wait Utsaah is much cleaner, unless you prefer Adobe Devanagari. DerekWinters (talk) 04:55, 25 November 2017 (UTC)
I added both to the stack, Adobe Devanagari first. —AryamanA (मुझसे बात करेंयोगदान) 14:41, 25 November 2017 (UTC)
@AryamanA I don't have several of the fonts mentioned such as Adobe Devanagari or Utsaah. Is there a way to embed the consensus font into the system so that the viewer doesn't need to worry about fonts? I've heard it's possible to do this on websites so that the viewer can see the custom font, but perhaps that's only possible for private websites. Kutchkutch (talk) 01:08, 26 November 2017 (UTC)
If copyrights and terms of use for fonts are an issue, Google Fonts appear to indicate their licenses, but they may not be as good as the fonts already included with operating systems by default. Kutchkutch (talk) 02:58, 26 November 2017 (UTC)
@Kutchkutch: We can render the fonts by loading them into the browser, but that increases page load times and uses memory, which may bother people who never look at Devanagari stuff. I personally use Google's Noto Fonts (that I downloaded on my system) by adding CSS rules to my personal User:AryamanA/common.css. So you can see that I use a serif font for Devanagari and Nastaliq style for Urdu. You could add the fonts that you like at User:Kutchkutch/common.css and it will override the default. btw, Utsaah and Adobe Devanagari are preinstalled on Windows I think. —AryamanA (मुझसे बात करेंयोगदान) 17:35, 26 November 2017 (UTC)
@AryamanA: Well, that explains why loading fonts into the browser hasn't been done especially for the new scripts in Unicode. That's similar to the reasoning about why templates could be faster and more efficient than modules for doing simple things.
With this edit, you removed 'Devanagari Sangam MN', and perhaps that's better.
Thanks for the suggestion about making User:Kutchkutch/common.css. I've been reluctant to experiment with it because the customisations may not apply when logged out. Kutchkutch (talk) 01:26, 27 November 2017 (UTC)
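As a rough sketch of the two options discussed in this thread, a personal common.css page might contain something like the following. The font names are the ones mentioned above. The web-font family name and URL are purely hypothetical placeholders, and whether a given font's license permits embedding would need to be checked separately.

```css
/* Option 1 (assumed personal common.css): prefer locally installed fonts.
   Only fonts already present on the reader's system are used; nothing is downloaded. */
.Deva {
        font-family: "Noto Serif Devanagari", "Adobe Devanagari", Utsaah, serif;
}

/* Option 2 (hypothetical): embed a web font so the reader needs nothing installed.
   "Example Devanagari Web" and the URL below are placeholders, not real resources.
   This costs an extra download on every page that uses the font. */
@font-face {
        font-family: "Example Devanagari Web";
        src: url("https://example.org/fonts/example-devanagari.woff2") format("woff2");
        font-display: swap;
}
```

The first rule has the same specificity as the global .Deva rule, so it relies on personal CSS being loaded after the site CSS; if it does not take effect, a more specific selector would be needed. As noted above, these customisations apply only while logged in.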