Module talk:ko-pron


@Wyang For some reason it gets into Category:Language code missing/IPA.

What's the meaning of all parameters? --Anatoli (обсудить/вклад) 04:09, 16 April 2014 (UTC)[reply]

Fixed. 1st param = text to be transcribed, 2nd parameter (rr,mr,yr,rrr,ph,ipa) = type of transcription/transliteration. Wyang (talk) 04:14, 16 April 2014 (UTC)[reply]

Long vowel on 길다

@Wyang Hi, should the syllable length be marked as /ˈkiːɭda̠/ rather than /ˈkiɭːda̠/? --Anatoli T. (обсудить/вклад) 13:58, 5 August 2014 (UTC)[reply]

Thanks, fixed. Wyang (talk) 23:09, 5 August 2014 (UTC)[reply]

IPA for ᄏ

@Wyang Hi Frank,

Shouldn't ᄏ be pronounced /kʰ/? As in 크림 (keurim): /kʰɯɾim/, not /kxɯɾim/? Correct me if I'm wrong.

Also, could you please consider adding alternative forms and verb/adjective conjugation to {{ko-new}}? Thanks! --Anatoli T. (обсудить/вклад) 22:00, 22 March 2015 (UTC)[reply]

I might be wrong about "ᄏ", judging by Korean_phonology --Anatoli T. (обсудить/вклад) 22:04, 22 March 2015 (UTC)[reply]
Yeah, I followed the usage at Korean phonology#Vowel assimilation. Maybe /kʰɯ/ would be better in this case? I will try to have ko-new Lua-cised soon; for the moment I've been trying to make {{ko-l}} better. :) Wyang (talk) 00:16, 23 March 2015 (UTC)[reply]
Thanks, mate. Sorry, I'm not a guru in Korean phonology and I don't know or hear the difference between /kʰ/ and /kx/. Letting you decide. No rush with conjugations but it would be good to have. Perhaps 하다 verbs/adjectives could use automatic etymologies too? Like: "From {{ko-l|BLAH||}} + 하다 (hada)"? --Anatoli T. (обсудить/вклад) 00:23, 23 March 2015 (UTC)[reply]
Yes, that's my plan: integrating all the ko-etym templates and creating {{ko-etym}}, so that just "Template:ko-etym" would be enough at those -hada, etc. entries. Wyang (talk) 00:28, 23 March 2015 (UTC)

No geminate

@Atitarev, Wyang, KoreanQuoter: I have noticed some mismatches between official phonetic hangeul and actual pronunciation, and it may be a good idea to modify this module to reflect the latter. The current version gives the following results:

Contrast      Spelling              Phonetic hangeul      IPA
ㅆ vs. ㄷㅆ    있음, 있습(니다)       이씀, 읻씀니다         /is͈ɯm/, /is͈ɯmnida̠/
ㄸ vs. ㄷㄸ    이따가, 있다가         이따가, 읻따가         /it͈a̠ɡa̠/, /it͈a̠ɡa̠/
ㄲ vs. ㄱㄲ    아끼다, 악기다         아끼다, 악끼다         /a̠k͈ida̠/, /a̠k͈ida̠/
ㅉ vs. ㄷㅉ    어찌, 얻지             어찌, 얻찌             /ʌ̹t͡ɕ͈i/, /ʌ̹t͡ɕ͈i/
ㅃ vs. ㅂㅃ    가빠, 갑빠             가빠, 갑빠             /ka̠p͈a̠/, /ka̠p͈a̠/

They are actually homophones, and the IPA characters in bold should be deleted. I don’t know whether it is a recent pronunciation change or not. As far as I know, 있습니다 is never pronounced with /t̚/. — TAKASUGI Shinji (talk) 17:55, 21 November 2016 (UTC)[reply]

The National Institute of the Korean Language once tweeted that phonetic Hangul for 있습니다 is 읻씀니다. Perhaps it is only reduced in the colloquial pronunciation, in which case it would be better to show /t̚/ in parentheses? #발음 Wyang (talk) 19:40, 21 November 2016 (UTC)[reply]
I've got the Learner's Korean dictionary from Tuttle. It uses an even more phonetic version of RR than we do. The dictionary uses e.g. "isseumnida" and geminates consonants even when that doesn't match the spelling (we show this in the phonetic hangeul). It fails to render long vowels in the transliteration, though (like Wiktionary). I think Shinji is right. In any case, this is the pronunciation I learned from all the training videos and audio so far, but I don't have the confidence to say that it's standard. --Anatoli T. (обсудить/вклад) 20:53, 21 November 2016 (UTC)
I can confirm that 있습니다 is never pronounced with /t̚/. 잇습니다, on the other hand, would be pronounced with /t̚/, based on how I say it as well as how the people around me do. --KoreanQuoter (talk) 23:30, 21 November 2016 (UTC)

The standard pronunciation described by the standard dictionary published by the National Institute of the Korean Language (표준국어대사전) does distinguish the pronunciation pairs above. In practice, they are only distinguished in artificially slow and careful pronunciation, because tense consonants usually sound like geminates between vowels. So in theory, 있다가 is pronounced [읻따가] and 이따가 is pronounced [이따가], but [이따가] is usually indistinguishable from [읻따가]. I would write [이따가] as [ittaɡa] or [i(t)taɡa] to show that it is optionally geminate and [읻따가] as [ittaɡa]. Meanwhile, ㄷㅅ /dz/ or ㄷㅆ /ds/ assimilates to [ss] in Korean instead of producing an affricate. If you are listening for [t], you will never hear it. So 있습니다 [읻씀니다] is [issɯmnida], at least in standard pronunciation. Again, it is usually indistinguishable from [이씀니다] [issɯmnida] or [i(s)sɯmnida]. --Iceager (talk) 14:20, 21 December 2016 (UTC)[reply]
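The homophony claimed in this thread can be checked mechanically on the phonetic hangeul. The following is only a minimal Python sketch, not the module's code: it assumes precomposed (NFC) syllables, and the function name `colloquial_merge` and the table `ABSORBED_BY` are made up for illustration. It drops a coda stop when the next syllable begins with the matching tense consonant, which is the colloquial merger described above.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3

# Coda index -> onset indices of the tense consonants that absorb it.
# Codas (Unicode jongseong order): 1 = ㄱ, 7 = ㄷ, 17 = ㅂ.
# Onsets (Unicode choseong order): 1 = ㄲ, 4 = ㄸ, 8 = ㅃ, 10 = ㅆ, 13 = ㅉ.
ABSORBED_BY = {1: {1}, 7: {4, 10, 13}, 17: {8}}


def is_syllable(ch):
    """True for a precomposed Hangul syllable (U+AC00..U+D7A3)."""
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def colloquial_merge(phonetic):
    """Drop a coda stop that is directly followed by a matching tense onset."""
    chars = list(phonetic)
    for i in range(len(chars) - 1):
        cur, nxt = chars[i], chars[i + 1]
        if not (is_syllable(cur) and is_syllable(nxt)):
            continue
        coda = (ord(cur) - HANGUL_BASE) % 28
        onset = (ord(nxt) - HANGUL_BASE) // 588
        if onset in ABSORBED_BY.get(coda, ()):
            # Subtracting the coda index yields the same syllable with no coda.
            chars[i] = chr(ord(cur) - coda)
    return "".join(chars)


print(colloquial_merge("읻따가"))
print(colloquial_merge("악끼다"))
```

With this rule each prescribed form in the table collapses onto its partner, for example 읻따가 onto 이따가.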

McCune–Reischauer romanization error in certain verbs/adjectives

I haven't looked systematically at the romanizations automatically generated for Korean lemmas in Wiktionary, but I noticed that the McCune–Reischauer romanization for 넓다 was given as nŏlda when it should instead be nŏlta because the ㄷ is voiceless. I seem to find the same error whenever a verb or adjective stem ending in ㄴ/ㄵ/ㄻ/ㄼ/ㄾ/ㅁ is followed by -다, as in 안다 anda 'to embrace', which should be anta; 앉다 anda 'to sit', which should also be anta; 젊다 chŏmda 'to be young', which should be chŏmta; 얇다 yalda 'to be thin', which should be yalta; 핥다 halda 'to lick', which should be halta; and 감다 kamda 'to close (one's eyes)', which should be kamta. The automatically generated IPA pronunciation correctly shows the fortition of ㄷ in each of these cases, so I'm sure this can be easily fixed, but I'm not sure how to edit the module. --Iceager (talk) 13:40, 21 December 2016 (UTC)
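The condition described in this report can be sketched as a small check. This is a hedged Python illustration, not the module's Lua: the function name `da_should_be_ta` is hypothetical, the stem-final set is taken from the six example words above (안다, 앉다, 젊다, 얇다, 핥다, 감다), and the input is assumed to be a verb or adjective lemma, since the check cannot tell a verb from a noun with the same spelling.

```python
import unicodedata

HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3

# Final consonants in Unicode jongseong order; index 0 means "no final".
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ", "ㄽ", "ㄾ",
          "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

# Stem finals of the example words in the report (assumption: verb/adjective stems only).
TENSING_STEM_FINALS = {"ㄴ", "ㄵ", "ㄻ", "ㄼ", "ㄾ", "ㅁ"}


def final_of(syllable):
    """Return the final consonant of a precomposed syllable, or None if not Hangul."""
    cp = ord(syllable)
    if not HANGUL_BASE <= cp <= HANGUL_LAST:
        return None
    return FINALS[(cp - HANGUL_BASE) % 28]


def da_should_be_ta(lemma):
    """True if the ending -다 follows one of the stem finals listed above."""
    lemma = unicodedata.normalize("NFC", lemma)
    if len(lemma) < 2 or lemma[-1] != "다":
        return False
    return final_of(lemma[-2]) in TENSING_STEM_FINALS


for word in ["넓다", "안다", "가다"]:
    print(word, da_should_be_ta(word))
```

A romanizer could consult such a check before choosing between "da" and "ta" for the ending; stems ending in obstruents such as 먹다 are deliberately left out here because the report does not list them.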

Module errors

@Wyang: There are a few errors from this module in CAT:E now, for instance in (n). Could you see if you can fix it? — Eru·tuon 01:28, 17 September 2017 (UTC)[reply]

I will fix them. Wyang (talk) 01:40, 17 September 2017 (UTC)[reply]
@Wyang: I've made Module:ko-pron and Module:ko-translit return nil if there is no boundary data, as the error messages are not helpful to editors. Go to the tracking ko-pron/no boundary data to find the pages on which there was previously an error message (though it will take a while to update). You can see a message regarding the missing data in the Lua console. Examples: be, shall, let's. — Eru·tuon 10:45, 20 September 2017 (UTC)[reply]
I think it is more beneficial to display them as module errors. Sorry I didn't fix the ones in the translation tables. Please just let me know and I will fix them as soon as I can. The tracking links are easily forgotten in the future, and the inconsistencies in the content could remain for a long time, whereas Cat:E would allow early attention and fixing of these errors (as long as there's someone taking charge). Wyang (talk) 10:54, 20 September 2017 (UTC)[reply]
Well, I don't really understand what was triggering the error (as in why there isn't boundary data for the particular characters that didn't have boundary data) or how you fixed it. If the problem is related to the template input, there should be an error message that tells how to fix the problem, as there is if you try to enter {{l|word}} without a language code. It's bad practice to have an uninformative and vaguely troubling "attempt to index field '?'" message. (If the problem relates to the modules, then a more specific category would be more helpful than an error message. It wasn't clear that the errors related to Korean, because they were in English entries, and they stayed there for a few days, hiding more easily fixed errors.) In any case, look at the tracking category ko-pron/no boundary data to see the stuff that needs fixing. — Eru·tuon 05:39, 21 September 2017 (UTC)
I've fixed those now. Wyang (talk) 06:02, 21 September 2017 (UTC)[reply]

use of Cambria

it stands out like a sore thumb to me _(:3 」∠ )_ —suzukaze (tc) 10:05, 20 November 2017 (UTC)[reply]

@Suzukaze-c Oh really? It wasn't so obvious on my browser. Anyway, I changed it now to match the text below. Got a bit sick looking at the previous design for months - I hope this new format is more aesthetic. Wyang (talk) 14:26, 20 November 2017 (UTC)[reply]
I don't know anything about Korean, but the box looked nice imo. —AryamanA (मुझसे बात करेंयोगदान)
@AryamanA 不明覺厲不明觉厉 (bùmíngjuélì). Wyang (talk) 11:35, 22 November 2017 (UTC)[reply]
@Wyang: lol, did {{ko-IPA}} not have a box around it before? Like {{zh-pron}} does now? —AryamanA (मुझसे बात करेंयोगदान) 12:05, 22 November 2017 (UTC)[reply]
@AryamanA It did have a box around it before, like {{zh-pron}}, but in hindsight the layout was ugly as. Wyang (talk) 12:09, 22 November 2017 (UTC)[reply]
@Wyang: Fair enough. Aesthetics is definitely not Wiktionary's strong point... —AryamanA (मुझसे बात करेंयोगदान) 12:23, 22 November 2017 (UTC)[reply]
@AryamanA Not mine either... लोल्! Wyang (talk) 12:25, 22 November 2017 (UTC)[reply]
@Wyang: No need for the virama, 哈哈 —AryamanA (मुझसे बात करेंयोगदान) 12:32, 22 November 2017 (UTC)[reply]
@AryamanA Just so that it's unambiguously what I intended it to be... Wyang (talk) 12:34, 22 November 2017 (UTC)[reply]

Non-standard pronunciation of 쉐

As you know, 쇠, 쇄 and 쉐 have the same pronunciation in modern Korean. A weird thing is that 쉐 has recently come to be pronounced differently from 쇠 and 쇄: their vowels are the same, but the consonant of 쉐 is palatalized just like that of 쉬, while those of 쇠 and 쇄 are not.

{{ko-IPA}} romanization output and popular pronunciation:

Word      Revised Romanization   McCune–Reischauer   Yale Romanization   Popular pronunciation
쇠고기    soegogi                soegogi             soykoki             [sʰwe̞ɡo̞ɡi]
쇄골      swaegol                swaegol             swaykol             [sʰwe̞ɡo̞ɭ]
쉐이크    sweikeu                sweik'ŭ             sweyi.khu           [ʃʰwe̞ikxɯ]
쉬다      swida                  shwida              swīta               [ʃʰɥida̠]

For every word, the Revised Romanization (translit.) value is the same as the Revised Romanization. The {{ko-IPA}} IPA line shown for 쉬다 is (SK Standard/Seoul) [ˈʃʰɥi(ː)da̠] ~ [ˈʃʰy(ː)da̠], with the note that, though still prescribed in Standard Korean, most speakers in both Koreas no longer distinguish vowel length.

The new pronunciation of 쉐 is criticized by prescriptivists, but Wiktionary is descriptive, not proscriptive. What do you think about changing our Korean module? I have already applied it to a Korean module on French Wiktionary (fr:Discussion module:ko-hangeul#Prononciation populaire de 쉐). — TAKASUGI Shinji (talk) 04:15, 19 March 2018 (UTC)

Thanks, Shinji. Looks interesting and I don't have objections. Pinging the group subscribed to Korean (notifying Wyang, TAKASUGI Shinji, HappyMidnight). --Anatoli T. (обсудить/вклад) 04:35, 19 March 2018 (UTC)
It can be /ɕwe/ and /swe/ as well, or even /ʃe/ and /ɕje/, and which pronunciation is dominant can be word-dependent. I'm not sure we should change it at this stage; /ʃwe/ is commonly perceived by Korean speakers as wrong, and the spelling 쉐이크 is rejected in favour of 셰이크. Takasugi-san, do you know of studies showing how common this pronunciation is by any chance? Wyang (talk) 07:31, 19 March 2018 (UTC)[reply]
I don’t know whether there are theses on this. The use of 쉐 is already an established transcription of the English [ʃeɪ] and [ʃɛ] in proper nouns: 쉐보레, 쉐이크쉑 and 로열 더치 쉘. The established transcription of the foreign [swe] and similar sounds is rather 스웨: 스웨덴 and 스웨터, and I can safely say that 쉐 is, regardless of its real pronunciation, used only for the foreign [ʃe] and similar sounds. Namuwiki (non-academic wiki-based dictionary) has an article for 쉐, but it cites no sources. Some questioners on a Q&A site of the National Institute of Korean Language say that 쉐 is pronounced as 쉬에, 수예 or 슈에: [1], [2]. — TAKASUGI Shinji (talk) 09:34, 19 March 2018 (UTC)[reply]
(1) Here I have attached the URLs of the National Institute of Korean Language's dictionary for your reference. You can also listen to the pronunciations there: 쇠고기 / 쇄골 / 쉬다. (2) "쉐이크" is not standard spell. "셰이크" is standard. This is the Korean version of that dictionary: [3] [4]. (3) "쉐" and other words that begin with "쉐" ("쉐...") are all dialectal, so we can't fix an exact pronunciation for these words. HappyMidnight (talk) 10:21, 19 March 2018 (UTC)
“쉐이크 is not standard spell. 셰이크 is standard” seems to me a proscriptive statement. We must be descriptive because Wiktionary is not a national dictionary, which is good. We can just write {{lb|ko|nonstandard}} for 쉐이크. The interchangeable use of 쉐 and 셰 rather indicates that their pronunciations are close. — TAKASUGI Shinji (talk) 10:34, 19 March 2018 (UTC)
Dear TAKASUGI Shinji, I am sorry you feel proscriptive. I didn't mean that. You may ignore "national" dictionary's attitude. HappyMidnight (talk) 11:18, 19 March 2018 (UTC)[reply]
@HappyMidnight: You don’t have to be sorry at all. Thank you for your information. — TAKASUGI Shinji (talk) 12:43, 19 March 2018 (UTC)[reply]

North Korean pronunciation

Moved from Template talk:ko-IPA.

Hi there, I found this webpage: 南北の言語の違い ("Differences between the languages of North and South"). It says that some words are pronounced differently in South and North Korea, e.g. 심리[심니/심리](心理), 독립[동닙/동립](独立), 생산량 [생산냥/생산량](生産量). --Hahahaha哈 (talk) 00:05, 31 May 2018 (UTC)

(Notifying Wyang, TAKASUGI Shinji, HappyMidnight): Just notifying. @Hahahaha哈: Thanks. Word-initial differences are well known and are often reflected in a different spelling, but I didn't know about mid-word differences. --Anatoli T. (обсудить/вклад) 00:33, 31 May 2018 (UTC)
The difference is well known but who can confirm real North Korean pronunciation here? The professor who wrote the page you mention is a Japanese resident of North Korean nationality. There is probably no North Korean resident among us. — TAKASUGI Shinji (talk) 16:24, 31 May 2018 (UTC)[reply]
Maybe we can use this North Korean dictionary 《조선말대사전》, 독립[獨立]발음: [동-] --Hahahaha哈 (talk) 16:45, 31 May 2018 (UTC)[reply]
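The mid-word difference quoted in this thread can be illustrated with a small Python sketch. It is not taken from the module and makes several assumptions: the first reading in each quoted pair is treated as the South Korean one and the second as the North Korean one, only the ㅁ + ㄹ and ㄱ + ㄹ sequences attested above (심리, 독립) are handled, and the function name `phonetic_hangul` and its `north` flag are hypothetical. The ㄴ + ㄹ case of 생산량 is left out because it depends on word structure.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3
ONSET_N, ONSET_L = 2, 5                # initial ㄴ, initial ㄹ
CODA_G, CODA_M, CODA_NG = 1, 16, 21    # final ㄱ, final ㅁ, final ㅇ


def phonetic_hangul(word, north=False):
    """Respell coda + ㄹ sequences; only ㅁ+ㄹ and ㄱ+ㄹ are covered in this sketch."""
    chars = list(word)
    for i in range(len(chars) - 1):
        a, b = ord(chars[i]), ord(chars[i + 1])
        if not (HANGUL_BASE <= a <= HANGUL_LAST and HANGUL_BASE <= b <= HANGUL_LAST):
            continue
        coda = (a - HANGUL_BASE) % 28
        onset = (b - HANGUL_BASE) // 588
        if onset != ONSET_L or coda not in (CODA_G, CODA_M):
            continue
        if coda == CODA_G:
            # The coda ㄱ is nasalized to ㅇ in both readings quoted above.
            chars[i] = chr(a - CODA_G + CODA_NG)
        if not north:
            # South: the following ㄹ becomes ㄴ; North (as quoted): ㄹ is kept.
            chars[i + 1] = chr(b + (ONSET_N - ONSET_L) * 588)
    return "".join(chars)


print(phonetic_hangul("독립"), phonetic_hangul("독립", north=True))
```

Whether a second, North Korean output line is worth adding to the template depends on the sourcing question raised above; the sketch only shows that the two quoted readings differ in a single rule.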

Font

Why was the font displaying the phrase IPA(key) set to DejaVu Sans? I changed it. --Mahmudmasri (talk) 09:19, 25 January 2020 (UTC)

<ㅔ> ~ <ㅐ> merger

@Suzukaze-c, could you possibly reformat this so that it gives [e̞] as an alternate pronunciation of all forms with <ㅐ>, as is common in the actual speech of most South Koreans who have merged the two vowels? So the IPA for 각개 (gakgae) should show up as ka̠k̚k͈ɛ~ka̠k̚k͈e̞.

The phonetic Hangul should also represent this, and for all <ㅐ> forms write e.g. 각깨/각께 as is already done in <ㅚ> forms like 회의 (hoe'ui).

I would do this myself but I'm hesitant to make a mess of the Lua code.--Karaeng Matoaya (talk) 13:14, 7 November 2020 (UTC)[reply]

@Karaeng Matoaya: Done. —Suzukaze-c (talk) 05:58, 8 November 2020 (UTC)
@Suzukaze-c: Thank you! Could you check 왜 (wae) and 걔 (gyae) as well? I don't see where you find the codepoints for <ㅙ> and <ㅖ>...--Karaeng Matoaya (talk) 08:50, 8 November 2020 (UTC)
As an addendum, in the case of 걔 (gyae) it should be [kjɛ] ~ [kje̞] ~ [ke̞], with the same yod-dropping already correctly marked in 계 (gye). 걔, 계, 개, 게 are not distinguished at all by most younger people.--Karaeng Matoaya (talk) 09:15, 8 November 2020 (UTC)
Done: 왜 (wae) and 걔 (gyae).
Not done: 걔 (gyae) = [kjɛ] ~ [kje̞] ~ [ke̞]
Suzukaze-c (talk) 23:53, 8 November 2020 (UTC)
@Erutuon, Benwing2 Could I request help? I can't figure out how to realize this, and you two are much more advanced programmers than me. —Suzukaze-c (talk) 22:02, 9 November 2020 (UTC)[reply]
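The variants requested in this thread, including the three-way 걔 case marked as not done, can be generated by shifting the vowel index inside each syllable's code point. The following Python sketch is one possible approach rather than the module's implementation: the names `ALTERNATES`, `syllable_variants` and `word_variants` are hypothetical, and yod-dropping is applied to every ㅒ syllable regardless of its initial consonant, which a real rule would probably restrict.

```python
from itertools import product

HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3
AE, YAE, E, YE = 1, 3, 5, 7   # vowel indices of ㅐ, ㅒ, ㅔ, ㅖ (Unicode jungseong order)

# Vowel -> alternate vowels, in the order they should be listed.
ALTERNATES = {AE: [E], YAE: [YE, E]}


def syllable_variants(ch):
    """Return the syllable followed by its merged-vowel alternates, if any."""
    cp = ord(ch)
    if not HANGUL_BASE <= cp <= HANGUL_LAST:
        return [ch]
    vowel = ((cp - HANGUL_BASE) % 588) // 28
    # Each vowel step is 28 code points, so (alt - vowel) * 28 swaps only the vowel.
    return [ch] + [chr(cp + (alt - vowel) * 28) for alt in ALTERNATES.get(vowel, [])]


def word_variants(phonetic):
    """All combinations of per-syllable variants, original spelling first."""
    options = [syllable_variants(c) for c in phonetic]
    return ["".join(combo) for combo in product(*options)]


print(word_variants("각깨"))
print(word_variants("걔"))
```

Building a fresh list with `itertools.product` avoids inserting into a table while iterating over it, at the cost of producing every combination when a word contains several affected vowels.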

Bug?: In phonetic hangul, long vowel mark is not shown for single syllables without batchim

  • {{ko-IPA|게|l=y}} → Phonetic hangeul: [게]
  • {{ko-IPA|경|l=y}} → Phonetic hangeul: [경ː]
  • {{ko-IPA|예시|l=y}} → Phonetic hangeul: [예ː시]

Suzukaze-c (talk) 23:37, 8 November 2020 (UTC),[reply]

Suzukaze-c (talk) 17:03, 17 November 2020 (UTC)[reply]

Fixed (?) —Suzukaze-c (talk) 17:08, 17 November 2020 (UTC)[reply]

Vowel length

@Suzukaze-c, sorry to bother you again, but would it be possible to add a small italicized statement under the table, "In both Koreas, most speakers no longer distinguish vowel length.", whenever a long vowel is marked ({{ko-IPA|l=y}})?--Karaeng Matoaya (talk) 09:29, 7 December 2020 (UTC)

Done, but under the Phonetic hangul bullet point, for consistency with Template:ko-hanja-pron. —Suzukaze-c (talk) 08:21, 2 March 2021 (UTC)[reply]

Assimilation (n->l) and others over a space

@Suzukaze-c, Karaeng Matoaya: Hi. The assimilation across a space between words is not working. The module doesn't have test cases but IMO:

  1. {{ko-IPA|일 녠}} should produce [iʎ ʎje̞n] ~ [iɭ ɭe̞n]
  2. {{ko-IPA|서울 날씨}} should produce [sʰʌ̹uɭ ɭa̠ɭɕ͈i]. --Anatoli T. (обсудить/вклад) 03:48, 28 January 2021 (UTC)[reply]
Added to Module:ko-pron/testcases. —Suzukaze-c (talk) 09:06, 2 March 2021 (UTC)[reply]
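The two expected results above amount to letting the ㄹ + ㄴ rule look past a space. The following Python sketch shows that idea on the phonetic hangul level only; it is not the module's code, the function name `lateralize` is hypothetical, and it covers just the ㄹ + ㄴ case named in the heading, not the other assimilations.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3
CODA_L = 8                 # final ㄹ
ONSET_N, ONSET_L = 2, 5    # initial ㄴ, initial ㄹ


def is_syllable(ch):
    return HANGUL_BASE <= ord(ch) <= HANGUL_LAST


def lateralize(text):
    """Turn an initial ㄴ into ㄹ after a final ㄹ, also across a space."""
    chars = list(text)
    prev = None  # index of the most recent Hangul syllable
    for i, ch in enumerate(chars):
        if ch == " ":
            continue  # a space does not block the assimilation
        if not is_syllable(ch):
            prev = None  # any other character does block it
            continue
        if prev is not None:
            coda = (ord(chars[prev]) - HANGUL_BASE) % 28
            onset = (ord(ch) - HANGUL_BASE) // 588
            if coda == CODA_L and onset == ONSET_N:
                # Each initial consonant spans 588 code points.
                chars[i] = chr(ord(ch) + (ONSET_L - ONSET_N) * 588)
        prev = i
    return "".join(chars)


print(lateralize("서울 날씨"))
print(lateralize("일 녠"))
```

Treating the space as transparent only inside this one rule keeps the word boundary visible in the output, which matches the spaced IPA given in the two examples.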

q in Yale romanization

@Suzukaze-c Just wanted to note that the letter q in Yale romanization is used for non-obvious tensing, e.g., in loanwords or compounds, and not for something as a result of assimilation (as far as I know, at least, but I pinged the Korean group in case you guys know something I don't). For example, you would expect 산보 sānqpo, but 학교 hak.kyo. (Notifying TAKASUGI Shinji, Atitarev, HappyMidnight, LoutK, Karaeng Matoaya, B2V22BHARAT, Quadmix77): LogStar100 (talk) 22:27, 23 February 2021 (UTC)[reply]

Added to Module:ko-pron/testcases. —Suzukaze-c (talk) 09:06, 2 March 2021 (UTC)[reply]
@LogStar100: The final-initial combination "k-k" is "kqk" according to Module:ko-pron/data. I'm not familiar enough with Yale or the intricacies of this module, so I'm not sure if changing it there is acceptable or not, but I encourage you to add more Yale examples to Module:ko-pron/testcases and then preview edits to Module:ko-pron/data using the "Preview page with this template" box at the bottom to see how Module:ko-pron/testcases would be affected. —Suzukaze-c (talk) 09:09, 2 March 2021 (UTC)[reply]
@LogStar100, Suzukaze-c: Martin's reference grammar confirms that <q> is supposed to be used only for unpredictable tensing (i.e. what would have to be manually inputted with the {{ko-IPA|com=}} parameter). Furthermore, tensing in verbal conjugation should not use <q> either: Martin has <kam.ta> for 감다 (gamda) and not <kamqta>.--Tibidibi (talk) 09:23, 2 March 2021 (UTC)[reply]

dubeolsik

  1. {{Han char}} Cangjie is restricted to single characters, just like {{character info}} Dubeolsik. Not sure why you think this is a good argument?
  2. Wiktionary is not for teaching people how to type.
  3. "People who know Chinese or Japanese can easily expect that Korean input methods also use a romanization-based input" is a strawman, and not our problem.
    1. Taiwan has Zhuyin keyboards, and Japan has the (unpopular) JIS layout.

Suzukaze-c (talk) 03:13, 30 November 2021 (UTC)[reply]

Agreed on all points.--Tibidibi (talk) 03:19, 30 November 2021 (UTC)[reply]
Agreed here as well. AG202 (talk) 03:37, 30 November 2021 (UTC)[reply]
Let me clarify the first one: Every Han character gets its own entry, but a hangul syllable does not always get one. For example, there is no way to get the Dubeolsik–QWERTY keystrokes for 컫, even though it is actually used in 일컫다, because 컫 does not get an entry. So providing Dubeolsik–QWERTY keystrokes in each syllable entry is actually not sufficient.
Also, I am not against providing Zhuyin keystrokes for each Chinese entry. --74.134.27.22 03:39, 30 November 2021 (UTC)[reply]
- If you click on the red link, you will be met with {{character info}} hovering above the text box. "Dubeolsik input: z-j-e".
- If you can't read the key labels on your new Hangul keyboard or print out an image of one, that's your problem.
Suzukaze-c (talk) 03:44, 30 November 2021 (UTC)[reply]
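For reference, the "z-j-e" value quoted above follows directly from decomposing the syllable and looking up each jamo on the standard Dubeolsik layout. The following Python sketch shows that derivation; it is not the code behind {{character info}}, the function name `dubeolsik_keys` is hypothetical, and the hyphenated output for compound vowels and finals (one key per hyphen) is an assumption about the display format.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3

# QWERTY keys on the Dubeolsik layout, listed in Unicode jamo order.
INITIAL_KEYS = ["r", "R", "s", "e", "E", "f", "a", "q", "Q", "t",
                "T", "d", "w", "W", "c", "z", "x", "v", "g"]
VOWEL_KEYS = ["k", "o", "i", "O", "j", "p", "u", "P", "h", "hk", "ho",
              "hl", "y", "n", "nj", "np", "nl", "b", "m", "ml", "l"]
FINAL_KEYS = ["", "r", "R", "rt", "s", "sw", "sg", "e", "f", "fr", "fa",
              "fq", "ft", "fx", "fv", "fg", "a", "q", "qt", "t", "T",
              "d", "w", "c", "z", "x", "v", "g"]


def dubeolsik_keys(syllable):
    """Return the hyphen-separated QWERTY keystrokes for one Hangul syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset <= HANGUL_LAST - HANGUL_BASE:
        raise ValueError("not a precomposed Hangul syllable: %r" % syllable)
    initial, rest = divmod(offset, 588)
    vowel, final = divmod(rest, 28)
    keys = INITIAL_KEYS[initial] + VOWEL_KEYS[vowel] + FINAL_KEYS[final]
    return "-".join(keys)


print(dubeolsik_keys("컫"))
print([dubeolsik_keys(s) for s in "일컫다"])
```

Because the mapping is purely arithmetic, it gives the same answer for any of the 11,172 precomposed syllables whether or not an entry exists for them.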

McCune–Reischauer: ㄴㄱ with tensification should be nk instead of n'k

{{ko-IPA|산골|com=1}}--Mrhso2014 (talk) 14:45, 8 January 2022 (UTC)[reply]

Fixed. Mrhso2014 (talk) 12:15, 9 January 2022 (UTC)[reply]

곧이어 should be [고디어] but there doesn't seem to be a way to denote this pronunciation

This results in a wrong (or at the very least non-prescriptive) pronunciation [고지어] in https://ko.wiktionary.org/wiki/%EA%B3%A7%EC%9D%B4%EC%96%B4 (Looking at the Lua code, I believe a new parameter must be created to handle this) Hsjoihs (talk) 16:24, 16 January 2022 (UTC)[reply]

The bcred parameter can also block palatalization. I just created 곧이어. --2607:FB91:30B:D215:AC4F:A6A8:73C5:FDF1 22:39, 15 July 2022 (UTC)[reply]

@Fish bowl, Tibidibi: Hi. Terms are added to this category even if the long vowel is not on the first syllable, e.g. 원피스 (wonpiseu).

투피스 (tupiseu) has TWO long vowels - on the 1st and 2nd syllables. Anatoli T. (обсудить/вклад) 06:13, 9 February 2023 (UTC)[reply]

Vowel length

I would like to wikify the text on vowel length to w:Korean_phonology#Loss_of_vowel_length_contrast, but I can't find the relevant text when I click "edit". Where is it stored? Apokrif (talk) 22:47, 5 August 2023 (UTC)[reply]

Bug in vowel variation code that causes duplicates

This is visible on pages like 소비에트 사회주의 공화국 연방 or 남아프리카의 외침.

The first two phonetic hangul or IPA results are identical, and a variation that should have been generated is omitted.

The bug is present here in the code starting at line 235. When the vowel id is 11, we insert at i, but we're modifying the table while iterating over the items in it.

for i = 1, pre_length do  
   local item = mw.text.split(word_set[i], "")  
   for num, it in ipairs(item) do  
       if math.floor(((codepoint(it) - 0xAC00) % 588) / 28) == vowel_id then  
           item[num] = char(codepoint(it) + vowel_variation_increment)
       end
   end
   if vowel_id == 11 then
       table.insert(word_set, i, table.concat(item))
   else
       table.insert(word_set, table.concat(item))
   end
end

I don't really use Lua, but I recreated this code in Python and that's how I found the bug. I'm not sure of the ideal way to change it in Lua, but I believe the following would fix it: keep a separate index variable j, read word_set[j] instead of word_set[i], and increment j on every iteration, plus one additional time whenever we insert into the table at j.

local j = 1
for i = 1, pre_length do
   local item = mw.text.split(word_set[j], "")
   for num, it in ipairs(item) do
       if math.floor(((codepoint(it) - 0xAC00) % 588) / 28) == vowel_id then
           item[num] = char(codepoint(it) + vowel_variation_increment)
       end
   end
   if vowel_id == 11 then
       -- Insert the variant before its source, then step past the inserted entry.
       table.insert(word_set, j, table.concat(item))
       j = j + 1
   else
       table.insert(word_set, table.concat(item))
   end
   -- Advance to the next unprocessed original entry.
   j = j + 1
end

Profesor Caos (talk) 03:49, 19 February 2024 (UTC)[reply]
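The duplicate can be reproduced outside MediaWiki. The following Python sketch is a hedged port of the loop above, not the module itself: the function names are hypothetical, and the increment 112 used in the example (ㅚ to ㅞ, four vowel steps of 28) is an assumption about the value of vowel_variation_increment. The fixed version builds a new list instead of tracking a second index; for vowel id 11 it yields the same order as the j-based proposal.

```python
HANGUL_BASE, HANGUL_LAST = 0xAC00, 0xD7A3


def swap_vowel(word, vowel_id, increment):
    """Shift the code point of every syllable whose vowel index is vowel_id."""
    out = []
    for ch in word:
        cp = ord(ch)
        if HANGUL_BASE <= cp <= HANGUL_LAST and ((cp - HANGUL_BASE) % 588) // 28 == vowel_id:
            ch = chr(cp + increment)
        out.append(ch)
    return "".join(out)


def add_variants_buggy(word_set, vowel_id, increment):
    """Port of the original loop: inserts into the list it is still reading."""
    word_set = list(word_set)
    for i in range(len(word_set)):  # bounds fixed up front, like pre_length
        variant = swap_vowel(word_set[i], vowel_id, increment)
        if vowel_id == 11:
            word_set.insert(i, variant)  # shifts the unread items one slot right
        else:
            word_set.append(variant)
    return word_set


def add_variants_fixed(word_set, vowel_id, increment):
    """Read only the original items and build the result separately."""
    result, appended = [], []
    for word in word_set:
        variant = swap_vowel(word, vowel_id, increment)
        if vowel_id == 11:
            result += [variant, word]  # variant goes right before its source
        else:
            result.append(word)
            appended.append(variant)   # variants follow all originals
    return result + appended


words = ["외래", "외레"]
print(add_variants_buggy(words, 11, 112))
print(add_variants_fixed(words, 11, 112))
```

In the buggy version the second iteration re-reads the first original, because the insertion pushed it into the slot that was about to be visited, so its variant appears twice and the second original never gets one.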