Template talk:ja-pron


Initial discussion

Discussion moved from Talk:精神分裂病.

Having done Module:ko-pron, I'd like to start the work on Module:ja-pron now. Wyang (talk) 03:19, 16 April 2014 (UTC)
@Eirikr. Might as well move this discussion to a more visible place. ;) --Anatoli (обсудить/вклад) 03:23, 16 April 2014 (UTC)
  • @Atitarev Dunno where that is, but please go ahead and move it. I'm bogged down in meatspace work and don't have time for WT for the next week or two.  :(
  • @Wyang please do! I don't have time, as much as I'd love to dig in and get my hands dirty. ‑‑ Eiríkr Útlendi │ Tala við mig 06:07, 16 April 2014 (UTC)
  • @Eirikr OK. I'll think of a different location. If you have a way to contact Haplology, please let him know we need him here! Take it easy and come back when you can.
  • @Wyang. I'm happy to do the testing and using the future module but I'd need to brush up my IPA for Japanese. We'll have to rely on your skills and knowledge again. :) がんばってね!--Anatoli (обсудить/вклад) 06:17, 16 April 2014 (UTC)

Thanks. I've done a crude version at Template:ja-pron and Module:ja-pron. Currently,

{{ja-pron|せいしん ぶんれつ びょう|acc=h|y=on}}


  • On’yomi
    • (Tokyo) せーしんぶんれつびょー [sèéshíń búńrétsú byóó] (Heiban – [0])
    • IPA(key): [se̞ːɕĩm bɯ̟̃ᵝnɾe̞t͡sɨᵝ bʲo̞ː]

Any suggestions? Wyang (talk) 07:52, 16 April 2014 (UTC)

Well done! No suggestions yet but some documentation would be helpful, specifically on parameters and types of accents. It's for 標準語, isn't it or for any variety? Should the template link to/mention the variety name? --Anatoli (обсудить/вклад) 08:01, 16 April 2014 (UTC)
Can we also have Accent: 0, Accent: 1, etc. next to accent names? --Anatoli (обсудить/вклад) 08:14, 16 April 2014 (UTC)
It would refer to standard Japanese, although I don't know where that information should be placed. How do the accent numbers correspond to accent types 'h,o,a,n'? Japanese pitch accent isn't very helpful. Do they refer to the accented morae? (in which case, nakadaka could get more than one number, correct?) Wyang (talk) 13:10, 16 April 2014 (UTC)
Letters must be the best way, then. Since we don't know if a given transcription is for the standard Japanese, I'll drop this request as well. They can be marked in brackets for any variety, potentially. Do you think we should have a default IPA info, when the pitch is unknown? Unfortunately, I don't have enough sources for the Japanese accents, nothing online, only some old Japanese-Russian dictionaries with accents. --Anatoli (обсудить/вклад) 22:02, 16 April 2014 (UTC)

  • Looks good. I tweaked the module to replace ɽ with ɾ̠. The former is a retroflex tap, not used in Japanese, while the latter is more generally accepted as the /r/ tap preceding an /a/ sound.
@Atitarev you're talking about the numbers used in some dictionaries to indicate the number of the syllable right after which the downstep in pitch occurs, yes? Or something else? If you mean the downstep syllable, calling it accent isn't quite correct. If you mean something more like dialect, maybe some other less ambiguous term could be used.
@Atitarev, Wyang {{ja-accent-common}} refers to 標準語 as defined by Tokyo standard pronunciation and the NHK pronunciation guidelines for broadcasters. I haven't seen any resources that give pitch information for any other dialects, but I would be quite happy to include those, provided we can find such resources.
With that in mind, is there any easy way to make the module, um, modular :), to allow for pluggable pitch sub-modules or functions? I haven't gone through the code at all really, I just made that one change to swap out ɽ.
Also, @Wyang the pitch drop on long vowels looks plug-ugly at カリフラワー, at least on my machines -- the grave accent (`) that's supposed to show the lower pitch shows up too far to the right, so that it's not even over the vowel at all, appearing instead as a stray mark over the closing square bracket.
Lastly, combos like 'çʲj' don't quite work -- this should just be 'çj' instead, with just the main palatal glide of /j/.
Cheers folks, thank you for your help with this! ‑‑ Eiríkr Útlendi │ Tala við mig 22:03, 16 April 2014 (UTC)
  • Oh, and to clarify, when I say I haven't seen any resources that give pitch information for any other dialects, I mean on a word-by-word basis. Shibatani and others do discuss the broad trends of pitch patterns, but the only lexicographical information about pitch that I've seen in actual dictionaries and the like has been for Tokyo dialect. ‑‑ Eiríkr Útlendi │ Tala við mig 22:05, 16 April 2014 (UTC)
  • Oo, also, じ should be rendered as d͡ʑi, not (d͡)ʑʲi (not really clear what the parens are doing there, and no need for the small "j"). C.f. 御御籤.
For compatibility (and legibility) purposes, {{ja-pron}} should support yomi as a synonym param for y.
And, how / where do we put the reference footnote? If I put it right after the call to {{ja-pron}}, the footnote shows up on the IPA line -- which isn't correct, since I'm using the reference for the pitch accent, not the IPA. See 御御籤 again for an example.
Thank you again! ‑‑ Eiríkr Útlendi │ Tala við mig 22:27, 16 April 2014 (UTC)
@Wyang Re: "nakadaka could get more than one number, correct". Does that mean that "n" may not be sufficient? I'll dig up my old dictionaries/textbooks and check if there is a straightforward mapping between numbers and letters (pitch accent names). --Anatoli (обсудить/вклад) 22:32, 16 April 2014 (UTC)
Pitch and names and numbers:
  • Heiban (平板, “flat”): Pitch rises after first syllable, falls gradually thereafter. Pitch number 0 -- no downstep.
  • Atamadaka (頭高, “high head”): First syllable takes high pitch, downstep immediately thereafter. Pitch number 1 -- downstep after first mora.
  • Nakadaka (中高, “high middle”): Pitch rises after first syllable, downstep after some number of morae. Can only apply to terms with at least 3 morae. Pitch number varies, must be at least 2, and less than the total number of morae in the term -- downstep after mora indicated by number.
  • Odaka (尾高, “high tail”): Pitch rises after first syllable, downstep after last mora. Pitch number varies, must be the number of the last mora in the term -- downstep after mora indicated by number. For odaka terms, the downstep is actually heard on the following particle.
Hope that clarifies! ‑‑ Eiríkr Útlendi │ Tala við mig 23:45, 16 April 2014 (UTC)
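The mapping between pitch numbers and accent names listed above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this thread, not the code of Module:ja-pron; the function names `classify_pitch` and `pitch_pattern` are made up for the example, and the mora count is assumed to be supplied by the caller.

```python
def classify_pitch(downstep: int, mora_count: int) -> str:
    """Name the accent type for a downstep number and a mora count.

    Hypothetical helper: 0 means no downstep, N means a downstep
    after the N-th mora.
    """
    if mora_count < 1 or not 0 <= downstep <= mora_count:
        raise ValueError("downstep must lie between 0 and the mora count")
    if downstep == 0:
        return "heiban"
    if downstep == 1:
        return "atamadaka"
    if downstep == mora_count:
        return "odaka"
    return "nakadaka"


def pitch_pattern(downstep: int, mora_count: int) -> str:
    """Return one letter per mora: 'L' for low pitch, 'H' for high pitch."""
    classify_pitch(downstep, mora_count)  # reuse the range check
    if downstep == 1:
        # Atamadaka: high first mora, low afterwards.
        return "H" + "L" * (mora_count - 1)
    # Otherwise the first mora is low and the pitch stays high up to the
    # downstep (or to the end of the term when there is no downstep).
    last_high = mora_count if downstep == 0 else downstep
    return "L" + "H" * (last_high - 1) + "L" * (mora_count - last_high)
```

For instance, `pitch_pattern(2, 3)` gives `"LHL"`, and a two-mora term with pitch number 2 is classified as odaka because the number equals the mora count.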
Thanks. You have just confirmed that for Nakadaka there are variants. That means that just using "n" won't produce 100% correct pitch accent. Users may want to know when the downstep starts on long words, right? Please clarify. --Anatoli (обсудить/вклад) 00:08, 17 April 2014 (UTC)


  1. 'replace ɽ with ɾ̠' - great
  2. 'the accent fall on long vowels looks plug-ugly'
    Um... not sure how to solve this. Unicode only has combined grave-macron for e (ḕ) and o (ṑ). It is caused by the font formatting <tt>kàrífúráwā̀</tt> (kàrífúráwā̀). cf. normal unformatted 'kàrífúráwā̀'. We could decompose it into single vowels, eg. kàrífúráwàà, or use either no formatting or some other font (which I don't know). For now I have decomposed it into single vowels and it now looks like Navajo.
  3. Palatalisation. It is currently consistently marked. Questions are:
    1. Should it be consistently marked? eg. mi -> mʲi, ki -> kʲi, çj -> çʲj. I have removed this now.
    2. Should it be marked by default after ɕ, t͡ɕ and d͡ʑ, or when those are followed by non-'ij', or not at all? This version of 統合失調症 used the second option. I have changed it to match that.
  4. About '(d͡)ʑ'... Japanese phonology says d͡ʑ ~ ʑ and d͡z ~ z are in free variation for romaji 'j' and 'z', respectively. Hence the notation there... Should these be written as 'd͡ʑ' and 'd͡z' regardless of the environments they are in? I have converted them to 'd͡ʑ' and 'd͡z' for now.
  5. I have added |yomi, |accent, |accent2, |acc_loc, |accent_loc (these are 'Tokyo' by default), |acc_ref, |accent_ref ('DJR' by default), |acc2_loc, |accent2_loc ('Tokyo' by default), |acc2_ref, |accent2_ref ('DJR' if |acc2 exists).

Anatoli: The single-letter accent types in that template mainly match {{ja-accent-common}}, except 'nakadaka' which needs further specifying. Thus |acc=o (o), |acc=a (a), |acc=h (h), |acc=2,3 (n), |acc=2,2 (n). I'm not sure how the accent numbers correspond to this. Maybe they are positions of accented morae? Wyang (talk) 00:20, 17 April 2014 (UTC)

  • @Wyang re: single-letter accent types, see above about Pitch and names and numbers.  :) Theoretically, it should be possible to specify h, a, or o without needing any number. Only nakadaka would require a number to be able to figure out where the downstep happens. As such, one should ideally be able to specify nakadaka using nX, where X is the number, or by using X alone.
For that matter, it might make sense to allow accent types to also be specified by number alone, where 0 or 1 would be heiban or atamadaka respectively, and any greater value would wind up as odaka or nakadaka, depending on how many morae are in the term.
  • Re: d͡ʑ ~ ʑ and d͡z ~ z, d͡z happens, but is rarer. Likewise, ʑ happens, but is rarer. This is mostly an issue of geographical variations in dialect. For NHK purposes (i.e. one of the closest things to a standard pronunciation), my understanding is that romaji "j" == [d͡ʑ], and romaji "z" == [z]. This gets complicated, but it might make sense in the longer term to add a param to allow for specifying this variation, since I think it might sometimes be contrastive and / or emphasized in certain careful speech.
Similarly, whether or not certain /i/ or /u/ sounds are unvoiced should also be specifiable. is つき in hiragana, and is usually [t͡sɯ̥ᵝki] as I hear it. Meanwhile, 付き as in about, regarding is also つき in hiragana, and is usually [t͡sɯᵝki] as I hear it. So it's not really possible to tell just from the kana spelling whether a given /i/ or /u/ is unvoiced.
  • Re: where to put references, there's also the question of where to put qualifiers. Sometimes, albeit rarely, certain pitch patterns for a single term are specific to certain senses. See デッキ for one such example.
  • Thanks again! ‑‑ Eiríkr Útlendi │ Tala við mig 01:00, 17 April 2014 (UTC)
How is non-initial /g/ handled? Are both g/ŋ produced? Please demonstrate on ありがとう.
To me, it seems most Japanese who start learning foreign languages late, have difficulty pronouncing /ʑ/, even if they make an effort. :) --Anatoli (обсудить/вклад) 01:11, 17 April 2014 (UTC)


  • For arigatou:
  • IPA(key): [a̠ɾʲiɡa̠to̞ː]
  • d͡z: changed to 'z'.
  • For vowel devoicing: There was a rule in the module, which devoices vowels between voiceless consonants, and then only keeps the first when two devoiced vowels occur in adjacent morae. I have removed that rule and added a |dev= parameter. Please see Template:ja-pron/documentation.
  • I have added |acc_note, |accent_note, |acc2_note, |accent2_note, which are placed at the end of the accent line.
  • Accent types '0' and '1' treated as 'h' and 'a'. If not single characters 'hao01', then remove 'o'. If resulting string is equal to the length of text, then 'o'. If not, then 'n'. eg. |acc=0 (h), |acc=1 (a), |acc=h (h), |acc=2 for 2-morae word (o), |acc=3 for 3-morae word (n3), |acc=o for 5-morae word (o).
  • I have added an accent reference template so that |acc_ref=NHK etc. can now call the reference template. (Template:ja-pron/documentation)

How about now? Wyang (talk) 02:10, 17 April 2014 (UTC)

More thoughts :) --
  • There are sometimes more than just two pitch accent patterns. The most I can recall running into is three, but I suppose it's possible that a handful of terms might even have four.
  • I'm changing the description for the dev param -- the number should really be described as the mora number, as some syllabic analyses would give incorrect results. For instance, かんした could be analyzed as having two syllables (sounding like /kan.ɕta/ in casual speech), but four morae, and the devoiced mora is the third one.
  • For references, I think it's best to have the default be nothing. There are terms where Daijirin doesn't include any pitch accent, and I've misplaced my NHK pronunciation dictionary (probably in a box in storage), but I work with native speakers and sometimes crib from them. In these cases, I deliberately don't list any reference, since there isn't really any -- but I think the pitch information is important enough to include, until such time as I can find a real reference to add.
Also, by You can also do it the conventional way, |acc_ref=[1] -- do you mean that it's possible to add the call to {{R:Daijisen}}, etc., directly as the acc_ref param value?
Thanks again, again!  :D ‑‑ Eiríkr Útlendi │ Tala við mig 19:23, 17 April 2014 (UTC)

  • I have added |acc3, |acc4 and |acc5, since there might be occasions where non-Tokyo accent patterns would like to be specified too.
  • 'dev': I think I might have described it inaccurately... As the module analyses it, the 'dev' parameter is the position of the devoiced syllable in the kana string. eg. hyakushou should have |dev=3 (not 2), and だいこん やくしゃ should have |dev=6 (spaces are not counted). This is inconsistent with the format of the accent parameter, but I think it is easier to specify and easier for the module to handle.
  • 'ref': Oops, I forgot to put nowiki tags around it. It should read |acc_ref=<ref name="NHK">{{R:NHK Hatsuon}}</ref>.
  • 'default ref': I have removed the default reference, so that there is no reference listed when the parameter is unspecified.
  • 'dehijacking the talkpage': I agree... Hence it is here now.

Thanks. Wyang (talk) 22:01, 17 April 2014 (UTC)
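To make the difference between the two counting schemes concrete, here is a small Python sketch that converts a |dev= value (position in the kana string, spaces not counted) into a mora position of the kind |acc= uses. It is an assumption-laden illustration rather than the module's logic: `SMALL_KANA` and `dev_char_to_mora` are hypothetical names, and the only non-moraic characters considered are the small vowel and y-row kana.

```python
# Small kana that attach to the preceding kana instead of forming a mora.
# The small っ and the long-vowel mark ー are moraic, so they are not listed.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def dev_char_to_mora(kana: str, dev: int) -> int:
    """Convert a 1-based kana position (spaces ignored) to a 1-based mora position."""
    position = 0  # kana characters seen so far
    mora = 0      # morae seen so far
    for char in kana:
        if char.isspace():
            continue
        position += 1
        if not (char in SMALL_KANA and mora > 0):
            mora += 1  # a full-size kana starts a new mora
        if position == dev:
            return mora
    raise ValueError("dev points past the end of the kana string")
```

With this sketch, ひゃくしょう with |dev=3 lands on mora 2, while だいこん やくしゃ with |dev=6 lands on mora 6, because no small kana precedes the devoiced く there.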


Long vowel oddity spotted at パーソナルコンピューター

Just created this entry, and noticed that the ピュー got romanized oddly in the romaji-with-tone-marks bit in the pronunciation section:


I'm about to log off for the night. If you have time, could someone look at the module and see what's going on there?

Cheers, ‑‑ Eiríkr Útlendi │ Tala við mig 04:53, 20 April 2014 (UTC)

Sorry for the delay. Fixed now. Wyang (talk) 08:02, 22 April 2014 (UTC)

Pitch on moraic /n/ > ん

I was reformatting 日本 and noticed that the downstep that occurs on the final ん isn't being indicated in the romanized version with tone marks. For instance, {{ja-pron}} is giving [nìhón] and [nìppón], when it should be outputting [nìhóǹ] and [nìppóǹ] instead.

For that matter, even if there were no downstep, the template should still show tone marks for moraic /n/. Could that be fixed? ‑‑ Eiríkr Útlendi │ Tala við mig 05:59, 21 April 2014 (UTC)

Thanks, I think it's fixed now. Wyang (talk) 08:02, 22 April 2014 (UTC)
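For reference, the expected behaviour can be sketched as follows in Python. This is not how Module:ja-pron does it; it assumes the romaji has already been split into morae and that a pitch letter (L or H) is known for each one, and the names `mark_mora` and `tone_marked` are invented for the example. Long vowels are passed in decomposed form, as in kàrífúráwàà.

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent, used here for low pitch
ACUTE = "\u0301"  # combining acute accent, used here for high pitch
VOWELS = "aiueo"


def mark_mora(mora: str, pitch: str) -> str:
    """Attach a tone mark to the vowel of a romaji mora, or to moraic n."""
    mark = ACUTE if pitch == "H" else GRAVE
    for index, char in enumerate(mora):
        if char in VOWELS:
            return mora[: index + 1] + mark + mora[index + 1 :]
    if mora == "n":
        return mora + mark  # moraic n carries its own tone mark
    return mora  # e.g. the first half of a geminate stays unmarked


def tone_marked(morae: list[str], pattern: str) -> str:
    """Join morae with tone marks, one pitch letter ('L' or 'H') per mora."""
    if len(morae) != len(pattern):
        raise ValueError("need exactly one pitch letter per mora")
    marked = "".join(mark_mora(m, p) for m, p in zip(morae, pattern))
    # NFC folds letter plus combining mark into precomposed characters where they exist.
    return unicodedata.normalize("NFC", marked)
```

Calling `tone_marked(["ni", "ho", "n"], "LHL")` yields nìhóǹ, with the final n marked low.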

Displaying numbers next to pitch accent names

@Wyang It would be good to add numbers next to pitch accent names, similar to how some paper and online dictionaries mark accents, e.g. 現在 on Weblio. It would make it easier to cross-reference Wiktionary accent names to those numbers. User:Eirikr seems to agree. Do you think it's a good idea? --Anatoli (обсудить/вклад) 01:45, 20 June 2014 (UTC)

Added now. Wyang (talk) 02:22, 20 June 2014 (UTC)
Thank you. I've added [ ]. Is that OK? From what I've seen so far, either a superscript number is used or a number in square brackets. --Anatoli (обсудить/вклад) 03:01, 20 June 2014 (UTC)
Yes, please prettify anything. :) Wyang (talk) 03:58, 20 June 2014 (UTC)


@Wyang, @Eirikr

It didn't work on 宿題 (しゅくだい), the しゅ part. --Anatoli (обсудить/вклад) 02:19, 2 July 2014 (UTC)

Why is |dev=1? I thought vowel devoicing only occurs interconsonantally. Wyang (talk) 06:48, 2 July 2014 (UTC)
It's between two devoiced consonants ɕ and k. Same with しかし (working fine) and 少し (adding now). NHK even uses a similar notation to ours for devoiced vowels. --Anatoli (обсудить/вклад) 07:20, 2 July 2014 (UTC)
It should be |dev=2 instead. |dev= is the position of kana with devoiced vowel in the input kana string. Wyang (talk) 07:31, 2 July 2014 (UTC)
Oh, thank you. Silly me. :) --Anatoli (обсудить/вклад) 07:37, 2 July 2014 (UTC)
  • Just saw this thread again. Wyang, when kana compounds like しゅ are devoiced, the whole thing should be marked as devoiced, like しゅ. Marking it as し makes it look like [ɕiɯ̥ᵝ] or some such oddness, when what we want to indicate instead is [ɕɯ̥ᵝ] or [ɕʲɯ̥ᵝ]. ‑‑ Eiríkr Útlendi │ Tala við mig 01:40, 5 March 2015 (UTC)
@Eirikr I didn't notice this, thanks. You're right. I'm also inviting you to join Wiktionary:Beer_parlour/2015/February#Simplification_of_topic_categories_adding, which may affect Japanese categories, hopefully for the better, if implemented. --Anatoli T. (обсудить/вклад) 02:00, 5 March 2015 (UTC)

dev2, dev3?

@Wyang Frank, can there be more dev's, please, as in 蛋白質 to get [tã̠mpa̠kɯ̥ᵝɕit͡sɯ̥ᵝ], e.g. ...|dev=4|dev2=6...? --Anatoli T. (обсудить/вклад) 03:22, 28 January 2015 (UTC)

OK, second devil added. I want to rewrite its code... so that the devils can be written as たんぱ(く)し(つ), avoiding the need for |dev11=. Keep it like this for now, I will change the format if I ever get around to doing that... Wyang (talk) 03:43, 28 January 2015 (UTC)
Ah, thanks. I guess all uses will need to be updated? --Anatoli T. (обсудить/вклад) 03:50, 28 January 2015 (UTC)
That's not a big problem when done semi-automatically. We've managed to do all the Chinese format changes... Wyang (talk) 03:53, 28 January 2015 (UTC)
You're a genius. :) --Anatoli T. (обсудить/вклад) 04:18, 28 January 2015 (UTC)
@Wyang Hi Frank, I'm back in Melbourne after three weeks in France (also a bit of Belgium). I'm eager to see the change, as 少し and しかし also need to be fixed. :) --Anatoli T. (обсудить/вклад) 01:20, 5 March 2015 (UTC)

Bug with sutegana at the beginning of a term

{{ja-pron|ふぁふぃとぅふぇふぉふぁ|acc=1}}
{{ja-pron|ふぃとぅふぇふぉふぃ|acc=1}}
{{ja-pron|とぅふぇふぉふぃ|acc=1}}

"fúァ" and "fúィ" probably aren't desirable —umbreon126 07:37, 7 March 2015 (UTC)

'tis okay now —umbreon126 05:11, 27 March 2015 (UTC)
Thanks, User:Wyang! --Anatoli T. (обсудить/вклад) 05:19, 27 March 2015 (UTC)
No worries. :) Wyang (talk) 20:58, 27 March 2015 (UTC)

Delimiting vowels

For example, on 女王 (applies to the readings じょおう [2] and にょおう [2]), there is a need to delimit the vowels for the IPA to render properly, but when this is done using . as is done in ja-noun, the . is printed in the kana and it also messes up the accent because the dot gets counted as a kana. Nibiko (talk) 08:25, 4 June 2015 (UTC)

Fixed by Kc kennylau! Thank you so much, Kc kennylau! <3 Nibiko (talk) 04:36, 25 April 2016 (UTC)

Twofold long vowels

Different to my above-mentioned concern, I noticed that on 蓊鬱 おううつ is represented as òóótsú just before the section where it says Heiban. I would expect it to be òóútsú. Nibiko (talk) 03:12, 24 August 2015 (UTC)


It gives the pronunciation [çiβa̠kɯ̥ᵝɕʲa̠] for 被爆者. I don’t know where this [β] comes from. The intervocalic /ɡ/ is often realized as a fricative but the phoneme /b/ doesn’t change at least in my pronunciation. In addition, [ɕʲ] is redundant because [ɕ] is already palatalized. — TAKASUGI Shinji (talk) 12:27, 9 September 2015 (UTC)

@Wyang. --Anatoli T. (обсудить/вклад) 00:03, 14 July 2016 (UTC)
Thanks. [β] is from Japanese phonology#Weakening - should the rule be removed? Removed palatalisation of [ɕʑ]. Wyang (talk) 00:11, 14 July 2016 (UTC)
I think we should remove the rule of β. — TAKASUGI Shinji (talk) 06:11, 14 July 2016 (UTC)
Ok no problem. Removed. Thanks! Wyang (talk) 09:01, 14 July 2016 (UTC)
Thank YOU! — TAKASUGI Shinji (talk) 00:53, 15 July 2016 (UTC)


Can a sort parameter be added to the template? —britannic124 (talk) 16:28, 13 July 2016 (UTC)

  • Since this template already uses a kana-ized string as its primary input, a sort key shouldn't be necessary.
I do see that the underlying module is not changing katakana to hiragana for sorting purposes, but this should be fixed in the module itself, so that the sort key is correctly and automatically derived from the data that the module is already using. @Wyang, is that something you could do? If not, could you ping someone who could? ‑‑ Eiríkr Útlendi │Tala við mig 18:14, 13 July 2016 (UTC)
Sure, I added sortkeys to the IPA and audio categories. It's a bit of an ugly hack though, since those templates do not seem to support |sort=. Wyang (talk) 22:20, 13 July 2016 (UTC)
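As a rough picture of what deriving the sort key from the kana input involves, the katakana-to-hiragana step can be sketched like this in Python. This is an assumption about the general approach, not the module's actual sort-key code, and `kata_to_hira` is a made-up name. It relies on the Unicode layout in which katakana ァ (U+30A1) through ヶ (U+30F6) sit exactly 0x60 code points above their hiragana counterparts.

```python
def kata_to_hira(text: str) -> str:
    """Map katakana to hiragana, leaving every other character unchanged."""
    return "".join(
        chr(ord(char) - 0x60) if "\u30a1" <= char <= "\u30f6" else char
        for char in text
    )
```

The long-vowel mark ー lies outside that range and passes through unchanged, so カリフラワー becomes かりふらわー.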

ん (‘n’) before approximants

Shouldn’t “n” before “w” be represented as [ɰ̃ᵝ], like in “denwa” [dẽ̞ɰ̃ᵝɰᵝa̠]? (Or at least [dẽ̞ɴɰᵝa̠]?) And “n” before “y” as [j̃], like in “shin’ya” [ɕĩj̃ja̠]? —britannic124 (talk) 18:02, 2 September 2016 (UTC)

Recent change to Module:ja-pron

Hi @Eirikr! Just letting you know that there was a change to Module:ja-pron recently by User:Nardog. I'm not qualified to comment on the IPA changes, but I know you are definitely. :) Wyang (talk) 09:17, 17 May 2017 (UTC)


Since the Japanese attention category has no organisation, I'm leaving my concern here. It would be good if you could override the value of the dev parameter for a certain accent, as this would allow to express exceptions in a single use of the template. See 増幅器 and 屹度. Nibiko (talk) 13:02, 19 June 2017 (UTC)

Co-occurring pitch accents

ja-pron does not currently support co-occurring pitch accents. If a term has multiple pitch accents divided across words, then see 因果応報 and 一期一会 for the current way to format them. Nibiko (talk) 02:29, 29 June 2017 (UTC)

Error in display of devoiced vowels

{{ja-pron|だいこん やくしゃ|acc=5|acc_ref=DJR,NHK|dev=6|y=o}}

currently yields (refs removed)

[dàíkóń yáꜜkùshà] should be [dàíkóń yáꜜkùshà], to match the hiragana (and because apparently only u and i can be devoiced). Something must be making the function think there's an extra vowel or mora somewhere before the second word. — Eru·tuon 18:42, 20 August 2017 (UTC)

(I think that part of {{ja-pron}} is somehow generally problematic. (麻婆豆腐 [màbóódóꜜòfù], 野馬 [nóꜜòmà]) —suzukaze (tc) 09:14, 21 August 2017 (UTC))

Distinguishing [oɯ] from [oː] and [ei] from [eː]

Though relatively rare, [oɯ] vs. [oː] and [ei] vs. [eː] do contrast in Japanese, as in ō 王 ('king') vs. ou 追う ('to pursue'), mei-sha 名車 ('great car') vs. me-i-sha 目医者 ('eye doctor'). As far as the IPA is concerned this isn't much of a problem since e.g. {{ja-pron|o.u}} yields [o̞ɯ̟ᵝ], but this workaround stops working as soon as |acc= is introduced. They should be supported in some way or another. Nardog (talk) 08:48, 28 August 2017 (UTC)

Or rather, maybe all instances of [oː] and [eː] should be represented by おー and えー instead of おう and えい, so that おう and えい would always stand for [oɯ] and [ei], as in せーしん ぶんれつ びょー instead of せいしん ぶんれつ びょう. This would make it clearer too because おう and えい being restricted to [oː] and [eː] is inherently ambiguous since, orthographically, they could always represent either [oː]/[eː] or [oɯ]/[ei]. Nardog (talk) 08:54, 28 August 2017 (UTC)
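The contrast under discussion, and the role of the dot delimiter, can be illustrated with a toy Python function working on romaji. It is only a sketch under simplifying assumptions (plain romaji input, "." as the only delimiter, no accent handling); `long_vowels` is a hypothetical name and does not correspond to anything in Module:ja-pron.

```python
LONG = "\u02d0"  # IPA length mark


def long_vowels(romaji: str) -> str:
    """Merge ou/ei into long vowels unless a dot separates the two vowels."""
    merged = romaji.replace("ou", "o" + LONG).replace("ei", "e" + LONG)
    # The dots have done their job of blocking the merge; drop them from the output.
    return merged.replace(".", "")
```

So "ou" comes out with a long vowel, whereas "o.u" and "mado.u" keep two separate vowels.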

は/へ vs. わ/え

I also noticed the particles は and へ would still be shown as は and へ in the kana representation even though they are pronounced [ɰa] and [e], not [ha] and [he]. But I believe, since kana are in and of themselves phonetic symbols, in pronunciation illustrations they should be the phonetic わ and え instead. (By extension one could argue を should be お too, but since there's no pronunciation variation in を and some speakers do preserve [ɰo] for を, I don't support changing it as strongly as for the other two.) Nardog (talk) 11:31, 28 August 2017 (UTC)

IPA module

Hey @Wyang, is there a reason this template doesn't call format_IPA_full in Module:IPA? I ask primarily because this module is adding non-entry pages to Category:Japanese terms with IPA pronunciation. Thanks! —JohnC5

@JohnC5 If I remember correctly, it was because the IPA module does not allow sortkey categorisation. Wyang (talk) 05:26, 28 September 2017 (UTC)
@Wyang: How about now? ;P —JohnC5
@JohnC5 All right! Changed. Wyang (talk) 05:45, 28 September 2017 (UTC)
@Wyang: Thanks! That fixed it. —JohnC5 06:01, 28 September 2017 (UTC)


Discussion moved from User talk:Wyang#Japanese pronunciation oddity.

Hello Wyang, long time no write.  :)

I was cleaning up the 日向 entry, and found that the pronunciation given at 日向#Etymology_3 is a little weird. It's showing up as [çɨᵝːɡa̠], when it should be something more like [çjɯːɡa̠]. I don't suppose you could have a look at the module code? ‑‑ Eiríkr Útlendi │Tala við mig 05:47, 15 September 2017 (UTC)

(@Nardog —suzukaze (tc) 05:48, 15 September 2017 (UTC))
@Eirikr: It's perfectly accurate. See w:Japanese phonology. /u/ [ɯ̟ᵝ] becomes centralized to [ɨᵝ] after /j/, and /hj/ is [ç]. Vance (2008) had [çj], but this was criticized by Akamatsu. Nardog (talk) 05:55, 15 September 2017 (UTC)
Hello Eirikr! Long time no write. I will hand the mic now to ... Wyang (talk) 06:08, 15 September 2017 (UTC)
(after edit conflict... :) )
@Nardog:, there is a definite glide in the ひゅ sound as pronounced by speakers, which is not represented anywhere in [çɨᵝ]. The IPA is thus misleading.
Akamatsu seems to make the argument that there is no /j/ glide anywhere after /ç/, which is frankly baffling to me, as this does not match my experience at all. It leads me to wonder if he's describing a dialect, or if his home lect might be biasing his interpretation.
FWIW, I'm more interested in descriptively representing Japanese sounds, rather than hewing to any particular academic theory. ‑‑ Eiríkr Útlendi │Tala við mig 06:15, 15 September 2017 (UTC)
@Eirikr: [ç] is a palatal fricative, i.e. [j̝̊], so it is only natural a [j]-like sound is heard during the transition from [ç] to [ɨᵝ] (that is why they're called "glides" in the first place). Hence [çj] is inherently redundant, unless some language somewhere contrasted [çV] and [çjV]. Nardog (talk) 06:40, 15 September 2017 (UTC)
@Nardog: I'm familiar with palatal fricatives, but it seems I've spent too much time working on /phonemics/ and not [phonetics]. I'm happy to concede the point. ‑‑ Eiríkr Útlendi │Tala við mig 07:19, 15 September 2017 (UTC)
FWIW, it would also be very inconsistent if it were [çj]. All consonants are palatalized before /i, j/ either phonetically ([kʲ], [ɡʲ], [mʲ]...) or phonologically ([ɕ], [tɕ], [(d)ʑ]...). So if /h/ became [çj] before /i, j/, or, even more capriciously, /hi/ became [çi] but /hj/ became [çj], that would be quite an exception. Nardog (talk) 06:37, 29 September 2017 (UTC)


Discussion moved from User talk:Wyang#More JA pronunciation questions.

I noticed that ざ・ず・ぜ・ぞ are now being rendered by the module with initial consonant [d͡z], indicating a harder onset than I hear around me. I also note that some dialects of Japanese distinguish between づ and ず, which would ostensibly be [d͡zʉ͍] and [zʉ͍].

Do you have any insight? ‑‑ Eiríkr Útlendi │Tala við mig 04:57, 28 September 2017 (UTC)

I felt we discussed at Template talk:ja-pron before when designing the template. Pinging @Nardog for his or her opinion. Wyang (talk) 05:20, 28 September 2017 (UTC)
It's in free variation, so we can't say for certain it's one or the other. The compromise the template currently adopts is to treat /z/, /zi–di/, /zu–du/, /zj–dj/ as affricates when word-initial or after /N/ and as sole fricatives when intervocalic.
Do the speakers around you at least pronounce it with the tip of the tongue at first in contact with the roof of the mouth? If so it's without a doubt an affricate, although it might not be as striking as cards in English. In fact most speakers of Standard Japanese can't (and don't realize they can't) pronounce English zoo or French genre properly without training, or grasp the difference between cars and cards.
Not only is the number of speakers who still make the distinction between /zu/ and /du/ very small (see the map at w:Yotsugana), but the distinction is not represented in orthography in many cases since the spelling reform of 1946. So we can't possibly integrate the pronunciation for speakers without the neutralization into the template. (Also note that even in words still spelled with ぢ or づ, speakers who have /zu, du/ and /zi, di/ neutralized might still pronounce them as [z]/[ʑ].) We can of course manually add non-neutralized pronunciations on the entries for relevant words, though.
See Labrune (2012:64–66) for more. Nardog (talk) 07:40, 28 September 2017 (UTC)
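The compromise described above (affricate word-initially or after /N/, plain fricative between vowels) can be sketched in Python for the simple /z/ case. This is an illustration only: the input is assumed to be an ASCII phonemic string with N for the moraic nasal, the palatalised /zi, zj/ cases and long vowels are ignored, and `realise_z` is an invented name rather than a function of the module.

```python
VOWELS = "aiueo"
AFFRICATE = "d\u0361z"  # d and z joined by a tie bar


def realise_z(phonemes: str) -> str:
    """Render /z/ as a fricative after a vowel and as an affricate elsewhere."""
    output = []
    for index, phoneme in enumerate(phonemes):
        if phoneme == "z":
            after_vowel = index > 0 and phonemes[index - 1] in VOWELS
            output.append("z" if after_vowel else AFFRICATE)
        else:
            output.append(phoneme)
    return "".join(output)
```

Under these assumptions /zu/ and /kaNzeN/ get the affricate, while the /z/ of /kaze/ stays a fricative.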
@Nardog: Re: yotsugana, thank you for the link, I couldn't think of the term earlier this evening. FWIW, one of my teachers years ago was from Kyushu and made the four-way distinction. Later, I lived in the Tōhoku, but in Morioka, which appears to be the pocket of yellow on the map; later on, I was in Tochigi, and later still in Tokyo. FWIW, I recall students in Tochigi deliberately overpronouncing づ in names to clarify the spelling, so at least in that rarified context, even Kantō speakers may make some distinction.
I'll keep my ears open at work over the next several days, and see if I can tease out the specifics of articulation by the native speakers around me (mostly Tokyo-ites, with some folks from Kyoto and elsewhere in Kansai). ‑‑ Eiríkr Útlendi │Tala við mig 16:19, 28 September 2017 (UTC)
Here's a relevant quote from Vance (2008:85–86):

Typically, though not consistently, [dz] occurs at the beginning of a word or in the middle of a word immediately following a syllable-final consonant [i.e. /N/ or /Q/], and [z] occurs in the middle of a word immediately following a vowel. In short, [dz] and [z] are allophones of this /z/ phoneme. Most native speakers of Japanese are quite surprised to discover that there's actually a phonetic difference to worry about, but you'll hear it if you listen carefully to pronunciations of zu [dzɯ] 図 'diagram' and chizu [cɕizɯ] 地図 'map'.

He then goes on to cite the minimal pairs (traditionally spelled くづ) vs. (traditionally くず) and 記事 (traditionally きじ) vs. 生地 (traditionally きぢ), which used to be pronounced differently "until about 400 years ago" but are now both spelled with ず/じ and not distinguished by Tokyo Japanese speakers.
Interestingly, he diverges a little bit from Labrune in saying that, in "careful pronunciation", modern Tokyo Japanese speakers always realize j (じ, じゃ, じゅ, じょ) as [dʑ], as opposed to the "typical, though not consistent," production of z (ざ, ず, ぜ, ぞ) as [z] intervocalically and as [dz] otherwise.
If this account is supported by several other scholars, I'd be willing to change the template's current realization of じ, じゃ, じゅ, じょ, i.e. [ʑ] intervocalically and [dʑ] otherwise, to always [dʑ]. Nardog (talk) 06:16, 29 September 2017 (UTC)
I support your proposition just as a native speaker. For me, /dʑ/ is the base phoneme and [ʑ] is a casual intervocalic allophone, just like the intervocalic allophone [ɣ] for the phoneme /ɡ/. — TAKASUGI Shinji (talk) 00:48, 10 March 2018 (UTC)

Verb ending with "ou"

I know that verbs ending with "ou" such as 競う, 囲う and 惑う are pronounced not /o:/ but /ou/. Naggy Nagumo (talk) 08:10, 8 December 2017 (UTC)

Sorry, I can distinguish them by writing like "まど.う". I am sorry for making noise. Naggy Nagumo (talk) 08:15, 8 December 2017 (UTC)

Katakana for IPA

Something's going wrong at 首長国: there's a katakana in the IPA transcription. —Mahāgaja (formerly Angr) · talk 14:52, 9 March 2018 (UTC)

And at ニュースキャスター too. —Mahāgaja (formerly Angr) · talk 14:55, 9 March 2018 (UTC)
ニュースキャスター has a wrong parameter; I'll fix it. 首長国: When a yōon like "しゅ" is devoiced, it seems to go wrong. --Naggy Nagumo (talk) 23:18, 9 March 2018 (UTC)
(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Suzukaze-c, Dine2016, Poketalker, Cnilep, Britannic124, Fumiko Take, Dine2016): Anatoli T. (обсудить/вклад) 00:19, 10 March 2018 (UTC)
I think the entire module needs to be overhauled (;・∀・) —suzukaze (tc) 00:33, 10 March 2018 (UTC)
Just curious, why do you think so? It's mostly working fine, isn't it? --Anatoli T. (обсудить/вклад) 00:51, 10 March 2018 (UTC)
It's working, but it seems fragile... —suzukaze (tc) 01:37, 10 March 2018 (UTC)
Re dev: I think it would be much easier if the dev parameters are incorporated into the kana string (e.g. つ'くよみ). Wyang (talk) 00:54, 10 March 2018 (UTC)
@Wyang, @Naggy Nagumo -- my sneaking hunch is that the handling of the devoicing parameter is screwy from the get-go. For reasons unknown to me (I haven't gone through the module codebase), dev uses a different count than acc. While acc is based on the actual mora count, dev appears to be based on the character count -- which will always diverge from the mora count for any term with yōon or other small non-moraic vowel kana (such as ファン or シェル, where the small ァ and ェ are not technically yōon as I've understood it). Since the string processing for dev is based on character count, it seems like the module can incorrectly split up phonographemes like ファ or しゅ, leaving the small kana dangling and unprocessed -- where it then appears in the final output string. ‑‑ Eiríkr Útlendi │Tala við mig 00:59, 10 March 2018 (UTC)
@Eirikr There's the |devm= parameter that counts by mora, added by Kenny before, which is one bug (out of two) less buggy. I've fixed 首長国. Wyang (talk) 02:39, 10 March 2018 (UTC)
@Wyang I think your changes have introduced a Lua error on 少し. Could you please take a look? —Internoob 06:41, 10 March 2018 (UTC)
@Internoob Thank you, it has been fixed. Wyang (talk) 06:50, 10 March 2018 (UTC)
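The difference between counting by character and counting by mora, raised above for dev versus devm, can be seen in a minimal sketch. It assumes a simplified rule in which small vowel and y-kana attach to the preceding kana; the names `SMALL_KANA` and `split_morae` are hypothetical and do not come from the module.

```python
# Small kana that do not form a mora on their own (small っ/ッ is deliberately excluded)
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")

def split_morae(kana):
    """Group a kana string into morae by attaching small kana to the previous kana."""
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae

word = "しゅちょうこく"
morae = split_morae(word)
print(len(word), len(morae))  # character count vs. mora count
print(word[1], morae[1])      # second character vs. second mora
```

For しゅちょうこく the second character is the small ゅ, whereas the second mora is ちょ, so an index meant as a mora position but applied to characters can split しゅ and leave the small kana stranded.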
By the way, is there a need to have the y/yomi parameter? It is already present in {{ja-kanjitab}}, and in a few cases it may depend on which spelling you choose as the main entry. --Dine2016 (talk) 14:10, 11 March 2018 (UTC)
{{ja-kanjitab}} describes the spelling, in which case which yomi is in use is potentially useful information. {{ja-pron}} describes the pronunciation, in which case, again, which yomi is in use is potentially useful information. Given the current infrastructure, if we want to have yomi in both places, we need to add the value in both places -- so far as I understand it, the scope is limited such that one template invocation on a page cannot reference any of the parameters given to another template invocation.
Not sure what you mean by "in a few cases it may depend on which spelling you choose as the main entry". The yomi in either {{ja-kanjitab}} or {{ja-pron}} should match the headword for the relevant etymology section. Any given spelling with multiple readings should have a separate etymology section for each reading. ‑‑ Eiríkr Útlendi │Tala við mig 21:13, 13 March 2018 (UTC)
"It may depend" might be referring to cases like 気まぐれ / 気紛れ, in which there is on'yomi/on'yomi+kun'yomi. —suzukaze (tc) 21:17, 13 March 2018 (UTC)
For cases like that, which don't cleanly fit even into 湯桶読み or 重箱読み categories, and for cases of longer mixed-reading compounds, I find myself coming back to the need to revamp {{ja-kanjitab}} (at a minimum) to allow editors to specify yomi for each kanji, not just for the whole term. (I mean, allow specifying for the whole term where that fits, but also allow per-kanji yomi values where whole-term reading categories won't fit.) In fact, thinking it through now, I'd prefer to have *detailed* yomi information in {{ja-kanjitab}}, and leave it out of {{ja-pron}}.
Is this idea sensible? Would that work for others? ‑‑ Eiríkr Útlendi │Tala við mig 21:26, 13 March 2018 (UTC)
There is Template_talk:ja-kanjitab#Feature_request:_jukujikun_readings and Module:User:Suzukaze-c/Hani-tab (although we are diverging from the original topic). —suzukaze (tc) 21:29, 13 March 2018 (UTC)

@Naggy Nagumo, Atitarev, Suzukaze-c, Wyang, Eirikr, Internoob, Dine2016: Now at the kanji itself is appearing in the IPA transcription, even though the template specifies the hiragana as the first positional parameter. —Mahāgaja (formerly Angr) · talk 14:48, 21 March 2018 (UTC)

I suspect that was caused by the second instance of {{ja-pron}}, {{ja-pron|a=もり.wav}}, which didn't specify any parameter except a= for the audio file. I've merged the two, and now things are displaying correctly. ‑‑ Eiríkr Útlendi │Tala við mig 19:05, 21 March 2018 (UTC)

Atamadaka notation not accounting for long vowels in first syllable

In creating the page for 聖句 (せいく), I gave {{ja-pron}} the |acc=1| parameter, since it uses an atamadaka-gata pitch accent. However, it is showing séꜜèkù for the pitch, when it should be sééꜜkù because of the long vowel in the first syllable. Should a rule be added in the module where it would place the pitch fall differently between, say, せー and せい in the first parameter? (This exception would also occur when the first syllable ends with ん.) BlueCaper (talk) 17:05, 22 June 2018 (UTC)

Each of せい, せー and せん is a single syllable but two morae. It should be séꜜèkù. --Naggy Nagumo (talk) 15:03, 6 July 2018 (UTC)
Yes, as Naggy stated. Atamadaka means high pitch on the first mora, not the first syllable, so séꜜèkù is correct. ‑‑ Eiríkr Útlendi │Tala við mig 20:09, 6 July 2018 (UTC)
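The mora-based reading of the accent number can be sketched as follows. This is an illustrative simplification of the usual Tokyo pattern (0 for heiban, 1 for atamadaka, n for a fall after the n-th mora), not the module's implementation; the function name `pitch_pattern` and the H/L labels are assumptions.

```python
def pitch_pattern(morae, accent):
    """Return one "H" or "L" label per mora for a given accent number."""
    n = len(morae)
    if accent < 0 or accent > n:
        raise ValueError("accent must be between 0 and the number of morae")
    if accent == 0:
        # heiban: low first mora, then high with no fall inside the word
        return ["L"] + ["H"] * (n - 1)
    if accent == 1:
        # atamadaka: high on the first mora only, fall immediately after it
        return ["H"] + ["L"] * (n - 1)
    # otherwise: low first mora, high up to the accented mora, low after it
    return ["L"] + ["H"] * (accent - 1) + ["L"] * (n - accent)

print(pitch_pattern(["せ", "い", "く"], 1))
```

Because せいく is three morae (せ, い, く), accent 1 puts the high pitch on せ alone and the fall comes before い, which matches séꜜèkù.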

Verb 囲う "kako.u" incorrect IPA

On the entry for 囲う, using "かこ.う" the kana correctly renders as "かこう" but the IPA says "kàkóó". The second "o" should be a "u". —This unsigned comment was added by Aogaeru4 (talk • contribs).

Agreed. (Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Suzukaze-c, Dine2016, Poketalker, Cnilep, Britannic124, Fumiko Take, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone): . --Anatoli T. (обсудить/вклад) 03:11, 10 July 2018 (UTC)
Note that the verb おもう (omou) with a similar ending is working fine. --Anatoli T. (обсудить/вклад) 03:16, 10 July 2018 (UTC)
It works correctly only when acc=2 is specified. — TAKASUGI Shinji (talk) 03:27, 10 July 2018 (UTC)
I think the significant feature is that there is no problem when the accent falls on the kana before the "u". (Aogaeru4 (talk) 03:30, 10 July 2018 (UTC))
It should be fixed now. Wyang (talk) 03:43, 10 July 2018 (UTC)
Yay! Thanks, Frank. --Anatoli T. (обсудить/вклад) 03:46, 10 July 2018 (UTC)