User talk:Erutuon

Archives: 2009, 2010, 2011, 2015, 2016

Phonemic symbols for RP

Hi. Do you have any idea how we could tackle the issue of the hopelessly outdated set of phonemic symbols for RP? The issue is relevant because it makes me refrain from putting Australian IPA in entries: the outdated RP set would make it look like there are two different sounds here (e.g. lot would have to be transcribed /lɒt/ in RP, which is phonetically correct only for traditional RP, spoken e.g. by Richard Dawkins, whereas the Australian IPA would have the correct /lɔt/). Any thoughts? I know that I kind of argued otherwise on Wikipedia... Mr KEBAB (talk) 03:35, 16 October 2016 (UTC)

I have thought a little about this. I recently added some of the newer, more accurate, symbols for RP to the English pronunciation appendix. I think switching to the newer symbols in entries would require (1) consensus and (2) a lot of monotonous find-and-replace, or bot work.
The correct place to get a feel for consensus would be the beer parlour. Nobody has objected to my changes to the Appendix, but that page isn't heavily watched, and who knows if everyone would be happy with a lot of edits imposing the newer symbols on entries.
It seems like there needs to be some way to mark the transcription system in entries, whether they are based on this or that system from this or that author. That way, both /ɒ/ and /ɔ/ for the lot vowel could be used in entries. The labels for transcription system would have to be added to Module:a/data. But probably that idea would require consensus too (and I'm not sure how it could be displayed). — Eru·tuon 03:52, 16 October 2016 (UTC)
Thanks. It seems to me that we need a template in which symbols are converted depending on which system you want to see. When you enter the RP transcription, you should do so using traditional symbols (so /lɒt/); then you could choose whether you want to see /lɒt/ or /lɔt/.
"Nobody has objected to my changes to the Appendix, but that page isn't heavily watched" - yep, that's... basically the reason :) I can assure you that that debate will not be an easy one to win, but we don't have much to lose here, just time.
Can you write at the beer parlour? You'll do a better job than me at explaining things (my English isn't the best), just make sure you mention the RP vs. General Australian thing. Mr KEBAB (talk) 04:31, 16 October 2016 (UTC)
@Mr KEBAB: I will eventually. I have to think what to say and I'm a bit overwhelmed at the moment. — Eru·tuon 03:53, 18 October 2016 (UTC)
Ok, no problem. Mr KEBAB (talk) 06:57, 18 October 2016 (UTC)
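The converting template suggested in this thread could work roughly as in the minimal Python sketch below. The function name `render_rp`, the system labels and the mapping table are hypothetical; only the /ɒ/ to /ɔ/ pair for the lot vowel comes from the discussion, and a real table would need many more entries and consensus on their values.

```python
# Hypothetical table from traditional RP symbols to newer ones.
# Only the LOT vowel pair is taken from the discussion above.
TRADITIONAL_TO_NEWER = {"ɒ": "ɔ"}


def render_rp(transcription: str, system: str = "traditional") -> str:
    """Return a traditionally transcribed RP string in the requested symbol set."""
    if system == "traditional":
        # Transcriptions are entered with traditional symbols, so nothing to do.
        return transcription
    if system == "newer":
        # Replace each traditional symbol with its newer counterpart.
        return transcription.translate(str.maketrans(TRADITIONAL_TO_NEWER))
    raise ValueError(f"unknown transcription system: {system!r}")


print(render_rp("/lɒt/"))
print(render_rp("/lɒt/", "newer"))
```

Storing the traditional form and converting only at display time means that entries would not have to be edited again if the preferred symbol set changed.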


You may have thought you had time to tinker with this, but Module:Alternative forms assumes any module with the correct name is a functioning module and invokes it if it has a parameter to check. It turns out that there were dozens of English entries with text in the dialect parameter, and all of them had module errors due to your leaving your module without the return statement at the end.

Please be more careful, and check Cat:E at least once after you've been doing module work. Thanks! Chuck Entz (talk) 06:44, 16 October 2016 (UTC)

Thanks for the note. I'll make sure to check for module errors in the future. — Eru·tuon 18:19, 16 October 2016 (UTC)

Curly apostrophes

I noticed that you are already converting Ancient Greek links (such as here) to use curly apostrophes, while the modules have not been updated to convert link targets to plain apostrophes. Did we decide that the entries are going to themselves be located at the curly apostrophe spelling? --WikiTiki89 21:50, 1 November 2016 (UTC)

I've come to the conclusion that you're right that modules should not automatically convert plain apostrophes to curly (or at least that it's a waste of effort to try to make it happen), and at the same time it seems to be generally agreed on the WT:AGRC talk page that entries should have plain apostrophe and displayed text should have curly apostrophe. @Angr and I have made edits to institute this practice. I suppose the plain apostrophe is simply being used because it seems to already be the practice here on the English Wiktionary to use it in entry names. Module:languages/data2 should probably convert curly apostrophe (as well as spacing coronis and smooth breathing) to plain apostrophe for entry names, but nobody has made the edit yet. It's not horribly urgent, because there are in many cases redirects from the curly-apostrophe entries to the plain-apostrophe ones. I'm unable to do it myself, since I'm not a template editor. I was going to ask an admin to make the edit, but just haven't yet. — Eru·tuon 22:53, 1 November 2016 (UTC)
Ok, I think you should have made sure that that was done before proceeding to edit links. --WikiTiki89 22:54, 1 November 2016 (UTC)
I can do this for you if you tell me what exactly you want to do with the spacing coronis and smooth breathing, and whether you want this for both Ancient and Modern Greek. --WikiTiki89 22:57, 1 November 2016 (UTC)
Okay, that would be great. I'm not sure what should be done for Modern Greek, since I am not very involved in entries for that language (I realize I should've referred to the data3/g module, which contains grc, not the data2 module, which contains el), but I would very much appreciate it if you added "["..u(0x2019)..u(0x1FBD)..u(0x1FBF).."]" to the "from" array within the "entry name" array for m["grc"] and "'" to the corresponding "to" field. The characters referred to here are, in that order, the right single quotation mark, the spacing coronis (koronis) and the spacing smooth breathing (psili), all of which are sometimes correctly or incorrectly used as apostrophes in Ancient Greek texts. They all look almost identical, so are better referred to using their Unicode numbers. — Eru·tuon 23:11, 1 November 2016 (UTC)
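For illustration, the requested entry-name replacement can be sketched in Python as follows. This is not the Lua data module itself; the function name `make_entry_name` is hypothetical, and only the three code points and the plain-apostrophe target come from the request above.

```python
import re

# Apostrophe look-alikes named in the request:
# U+2019 right single quotation mark, U+1FBD Greek koronis, U+1FBF Greek psili.
APOSTROPHE_LIKE = re.compile("[\u2019\u1FBD\u1FBF]")


def make_entry_name(display_text: str) -> str:
    """Map displayed text to an entry name that uses the plain apostrophe."""
    return APOSTROPHE_LIKE.sub("'", display_text)


# The displayed form keeps the curly apostrophe; the link target does not.
print(make_entry_name("\u03b4\u2019"))
```

Doing the replacement only when the link target is computed lets the displayed text keep the curly apostrophe while the entry stays at the plain-apostrophe title.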


@JohnC5, CodeCat, I'm so meta even this acronym: where does heiulor come from? --kc_kennylau (talk) 03:02, 6 November 2016 (UTC)

@Kc kennylau: It appears to be an extension of (h)ei. The suffix is mysterious to me, perhaps similar to ululō? —JohnC5 04:16, 6 November 2016 (UTC)
@JohnC5: More importantly, is the "i" geminated? --kc_kennylau (talk) 04:23, 6 November 2016 (UTC)
I don't know of an etymological reason why it would be geminated. —JohnC5 04:27, 6 November 2016 (UTC)
@kc kennylau, JohnC5: The etymology of “ēiulō” on page 596 of the Oxford Latin Dictionary (1st ed., 1968–82) reads “ei + -i- + -ulo; for suffix cf. iubilo”, which strongly suggests to me that the -i- is geminated. The etymology of “iūbilō” on page 977/2 of the Oxford Latin Dictionary (1st ed., 1968–82) reads “cf. io¹; for term. cf. sibilo”, the one for “ĭō¹” on page 963/3 of the Oxford Latin Dictionary (1st ed., 1968–82) is simply “Gk. ἰώ”, and “sībilō” on page 1,753/1 of the Oxford Latin Dictionary (1st ed., 1968–82) has “next + -o³”, where “next” is “sībilus¹” on page 1,753/1 of the Oxford Latin Dictionary (1st ed., 1968–82), whose etymology reads “onomat., cf. Gk. σίζω, ψίθυρος”. I’m sure y’all can look up the Greek yourselves (!)… — I.S.M.E.T.A. 19:20, 6 November 2016 (UTC)

Flood flag

Could you please get a flood flag if you're going to be making a lot of automated edits. DTLHS (talk) 18:52, 30 December 2016 (UTC)

I don't know what that means. (I found the Requests for flood flag page though.) I have only 19 entries left to go through at the moment. — Eru·tuon 18:56, 30 December 2016 (UTC)
Oh, ok, don't worry about it then. DTLHS (talk) 18:59, 30 December 2016 (UTC)
I didn't pay attention to how many pages were in my AWB list, but if there is some number at which I should look for a flood flag, I can put in a request next time. — Eru·tuon 19:01, 30 December 2016 (UTC)

X-SAMPA template

This was deleted for a reason. Please don't recreate it. The main problem is that old discussions have uses of the old template that end up using the new template, and we get module errors and other strange results. If you're going to create an X-SAMPA template, don't use the same name. Also, one would expect a template called "X-SAMPA" to display X-SAMPA, not IPA.

To be honest, I don't see the point of even having a dedicated X-SAMPA template- if you want to convert X-SAMPA to IPA, you're better off adding a named parameter to one or more of the IPA templates that takes X-SAMPA as its input. Chuck Entz (talk) 04:33, 13 January 2017 (UTC)

({{x2i}} already converts X-SAMPA to IPA. —suzukaze (tc) 04:36, 13 January 2017 (UTC))
Ahh, I wasn't aware there was already a template for this. My mistake. It would help if it were linked somewhere. Anyway, the template I created doesn't work. Please delete it again... — Eru·tuon 05:32, 13 January 2017 (UTC)

Arktos, bear, Ursa Major, North, Arctic

Hi, there are several problems here, probably aggravated by Wiktionary's lack of citations; and there are several places where the etymology might be relevant. Greek arktos of course just means "bear", and only indirectly is the name of the constellation: feel free to indicate that however you wish. Arktos is thought by some linguists to be cognate with L. Ursa, but not by others; those who think so may derive "arctic" from arktos, while others derive it from the Proto-Indo-European root *Rtko that appears in Sanskrit rksas, "North": in which case the Greek for bear comes from the constellation, and not the other way around. (Becker, Carl J. (2004). A Modern Theory of Language Evolution. iUniverse. pp. 228–229. ISBN 978-0-595-32710-2) I have no opinion on the matter, nor any desire to edit Wiktionary, but just note that the matter does not appear to be settled. All the best, Chiswick Chap (talk) 09:15, 13 January 2017 (UTC)

@Chiswick Chap I do not see any uncertainty in this issue, after looking up the Sanskrit word you mention. As you say, the Greek word ἄρκτος(árktos) is cognate with Latin ursus and Sanskrit ऋक्ष(ṛkṣa), derived from Proto-Indo-European *h₂ŕ̥tḱos. But according to the entries on the Sanskrit and Latin words, the words only have the meaning "bear", not "north". The meaning "north" only arose in the Greek word, presumably because of the constellation. I guess Sanskrit must call the constellation Ursa Major something other than "bear". According to the Translations table in the entry for the noun north, the Sanskrit word for "north" is उत्तर(uttara) (though there's no entry on that word yet). — Eru·tuon 09:35, 13 January 2017 (UTC)
Thanks. Then Becker must be a flaky source: very possible. Chiswick Chap (talk) 09:39, 13 January 2017 (UTC)
"Flaky" is a polite way to express it. I've quickly previewed the book in Google Books and let's just say, Becker is clearly not a linguist. --Florian Blaschke (talk) 02:11, 17 January 2017 (UTC)


The clipping occurred in either of the languages. If the clipping occurred in Old East Slavic, then the word can be said to inherit from the OES clipped form that was so formed. —CodeCat 00:18, 14 January 2017 (UTC)

"Truth" and "Most likely the truth"

@Erutuon Thank you for your message. You neither know me, nor I you; but you would probably be aware that only a very small proportion of etymologies traced back as far as they do are 100% accurate, except those of classical origin; but most etymologies are most likely to be true, including the reconstructed roots. That is the difference between those and what you have presented on the Talk page of cat that is true to every sober thinking mind; therefore my reason for leaving your paragraph to stand alone without any qualification. I may have left that ambiguity open, so will correct it shortly. My part is to try to eliminate the "art" aspects of etymologies where they are not true, so to retain the scientific ones that are as true as can be established. In the case of the creatures' names that you presented the Celtic dialects have their own words for them; whereas, as far as I understand - subject to correction - those dialects have no other words for "cat"; and therefore, it cannot be conjectured that those forms were borrowed from Late Latin, unless it be proved that they did not exist during the Iron Age period. There are a few such lexemes that have been retained in Old English due to their analogous forms in the Germanic dialects and therefore part of Anglo-Saxon. Although, I resent the word "evolution" due to all the Darwinian myths, there are a few lexemes in English - I can only think of care at the moment - whose meanings have evolved semantically from a hybrid of that in Brittonic Celtic and from Germanic whence they derive. In the aforesaid example, its Germanic meaning is "sorrow, distress, cark" et cetera; whereas in Welsh and Cornish its analogous forms mean "love" and in Latin carus is "dear". Most of the Old English words are certainly Germanic as is Old English with all its grammar, of course: but the simplest etymologies are usually the best and most accurate! Andrew H. Gray 11:01, 21 January 2017 (UTC)Andrew talk


See w:Fragment identifier. —CodeCat 02:24, 28 January 2017 (UTC)

@CodeCat: Ahh! I suppose technically anchor refers to the item on the page, not the text used to identify it. I'll revert myself. — Eru·tuon 02:27, 28 January 2017 (UTC)
As far as I know, it refers to the HTML tag that creates the link target. It's <a>, which is short for anchor. —CodeCat 02:37, 28 January 2017 (UTC)
@CodeCat: Anchor also has the other sense that seems to be equivalent to id, and strangely that's the only sense mentioned in the entry on anchor. Not sure how that semantic change happened. — Eru·tuon 02:48, 28 January 2017 (UTC)
id= is the HTML attribute that sets the link target on an element. In the past, you'd do <a name="target">...</a> but that doesn't work in HTML 5 anymore, you use the id= attribute instead. If you look at the HTML for anchor for example you see that every header has <span id="(header title)">(header title)</span>. —CodeCat 02:55, 28 January 2017 (UTC)
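To make the distinction in this thread concrete, here is a small Python sketch that collects the possible fragment targets from a piece of HTML: any `id` attribute, plus the older `name` attribute on `<a>`. The class and function names are hypothetical, and the sample markup is modelled on the header structure described above rather than copied from a real page.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect values that a URL fragment (#...) could point to."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "id" in attributes:
            # Current practice: any element can be a target through id=.
            self.targets.append(attributes["id"])
        elif tag == "a" and "name" in attributes:
            # Older practice: <a name="..."> as the anchor.
            self.targets.append(attributes["name"])


def fragment_targets(html: str) -> list:
    collector = AnchorCollector()
    collector.feed(html)
    return collector.targets


sample = '<h2><span id="Etymology">Etymology</span></h2><a name="target">...</a>'
print(fragment_targets(sample))
```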

Why did I delete the request in the etymology of ψιλός(psilós)?? "Learn...! "

Reason of why I deleted the request in the etymology of ψιλός(psilós): That etymology is true, some kind of guy typed like "the lambda and the iota do not correspond to the PIE "s" and "o" of the root term *bhosós..." while in English bare the "r" does correspond to the "s" of root *bhosós (and also the "a" of bare does correspond to the first "o" of PIE root *bhosós). So that's why I had to delete "the lambda and the iota do not correspond to the PIE "s" and "o" of the root term *bhosós...". Just simple like that! Helolo1 (talk) 00:06, 5 February 2017 (UTC)

@Helolo1: Okay, well, I was the "some kind of guy" who wrote the note, and I see your reasoning. But it's not correct; the English r is explained by the sound change of rhotacism and the change of o to a by a Proto-Germanic vowel shift, while the Greek iota and lambda are not explained by any sound change, and an explanation is needed. — Eru·tuon 00:12, 5 February 2017 (UTC)
@Erutuon: How cute, you still have to delete that request, I already messaged you my reason, I don't wanna talk about it again. PLEASE change for the reader's sake! ("Listen: iicanhhackthiswtpg") —This unsigned comment was added by Helolo1 (talk • contribs) at 00:31, 5 February 2017 (UTC).
@Helolo1: I don't have to delete the request. The etymology needs more information. And it was @CodeCat who reverted you this time. — Eru·tuon 00:37, 5 February 2017 (UTC)
Where did you get that etymology? The argument is presumably a zero-grade *bʰs- with a suffix -īlós, but this suffix doesn't exist anywhere else, and all of the given cognates have o-grade (which leads me to believe that this wasn't even ablauting and should probably just be reconstructed as *bʰasos, unless there are other cognates.) Beekes says there's no etymology unless it's with ψῆν, and while the second part is laughable I'm inclined to agree with the first. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 21:28, 5 February 2017 (UTC)


Sorry for deleting one of the modules that you made. I didn't know that you were working on it. Pkbwcgs (talk) 17:27, 6 February 2017 (UTC)

@Pkbwcgs: The module is the page with programming code. What you deleted was an example on the documentation page. It's fine as long as you don't do it again. Steer clear of module documentation pages unless you know what's going on. — Eru·tuon 17:40, 6 February 2017 (UTC)
@Erutuon: How can I deal with module documentation page? Pkbwcgs (talk) 17:43, 6 February 2017 (UTC)
@Pkbwcgs: I don't understand. Why would you need to do anything on module documentation pages? — Eru·tuon 17:47, 6 February 2017 (UTC)
That sounded sort of contemptuous. What I mean is, unless you're writing module code, why would you be editing the documentation pages? — Eru·tuon

Diaereses in Ancient Greek

If you look at the pages in Cat:E, they pretty much all have the same thing in common: at least one iota with a diaeresis. Apparently your code is unable to deal with them. You need to find a solution for this, or get help from someone who can. Thanks! Chuck Entz (talk) 07:58, 7 February 2017 (UTC)

@Chuck Entz: Hmm, I'm skeptical that it's my function (diacritic reordering). The errors from mw.ustring do not tell the module or line, so the match that's returning the error could be anywhere. I added a diaeresis-acute testcase to Module:typing-aids/testcases, which uses the diacritic reordering function, and it appears to work fine. Similarly, Module:grc-translit transliterates diaeresis-acute just fine too, so it's not the tokenize function. I suspect the error is somewhere else, perhaps in Module:grc-decl or Module:grc-accent. @ObsequiousNewt, what do you think? — Eru·tuon 10:33, 7 February 2017 (UTC)
@Erutuon, Chuck Entz: The error is definitely in the tokenize function of Module:grc-utilities. For instance, the errors "Lua error in Module:grc-decl at line 1301: can't find a conjtype ἡρω ἡρω" from ἡρώϊος(hērṓïos) and "Lua error: bad argument #1 to 'match' (string expected, got nil)" from Ὀϊζῡ́ς(Oïzū́s) tell me that the tokenize function is inserting a nil before the vowel with the diaeresis. In Module:grc-decl, which calls this function through strip_tone in Module:grc-accent which is then called by several match statements, this is resulting in nil-terminated string truncation in some cases and attempts to perform string operations on a nil in others. The distribution is not yet clear to me. —JohnC5 15:54, 7 February 2017 (UTC)
@JohnC5, Chuck Entz: I've moved my version of the tokenize function from Module:grc-translit/sandbox, as it fixes the module errors resulting from the previous way of handling diaereses. — Eru·tuon 19:22, 7 February 2017 (UTC)
All but two AG errors have disappeared from CAT:E, and those seem to pertain to something else. Could you fix them? —JohnC5 19:52, 7 February 2017 (UTC)
@JohnC5: I don't think I can help with those. They seem to relate to one of the other modules. — Eru·tuon 20:59, 7 February 2017 (UTC)
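The kind of failure discussed in this thread (an empty slot appearing before a vowel with a diaeresis) can be avoided by a tokenizer that never emits an empty token. The Python sketch below is a simplified assumption, not the logic of Module:grc-utilities: it decomposes the word and attaches each combining mark to the preceding base letter, and it does not group diphthongs.

```python
import unicodedata


def tokenize(word: str) -> list:
    """Split a word into base letters, each carrying its combining diacritics."""
    tokens = []
    for char in unicodedata.normalize("NFD", word):
        if unicodedata.combining(char) and tokens:
            # A diacritic (breathing, accent, diaeresis) joins the previous letter.
            tokens[-1] += char
        else:
            # A base letter always starts a new, non-empty token.
            tokens.append(char)
    return tokens


# One of the words from the error messages quoted above.
for token in tokenize("ἡρώϊος"):
    print([unicodedata.name(char) for char in token])
```

Because every token starts with a real character, later string operations never receive an empty or missing value, whichever diacritics a vowel carries.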

IPAchar and superscript H and X in Middle Chinese pronunciations

I noticed that you've been active in editing {{IPAchar}} lately. I was just updating a Japanese entry with a derivation from Middle Chinese, when I stumbled across a change in template behavior. Middle Chinese readings often include superscript H or X, as at or . I have been copying those when adding tr= values to {{der}} calls in JA etymologies, applying superscript as <sup>H</sup>. This previously worked just fine, but recently, it now produces a rather obnoxious inline error message, invalid IPA characters (<>H</><>H</>).


  • {{IPAchar|/d͡ziɪ<sup>H</sup> ʔʉi<sup>H</sup>/}}

Current result:

  • /d͡ziɪH ʔʉiH/ invalid IPA characters (HH)

Would you kindly undo the change that now rejects <sup>H</sup>?

Looping in @Wyang as one of the more active ZH editors.

‑‑ Eiríkr Útlendi │Tala við mig 21:09, 10 February 2017 (UTC)

It still works fine. It only produces the error message in previews. DTLHS (talk) 21:18, 10 February 2017 (UTC)
@Eirikr: Actually, the module rejected superscript capital H before; the only difference is that before it didn't show a message, and simply placed entries in Category:IPA pronunciations with invalid IPA characters. Perhaps matched HTML tags should be removed before searching for invalid symbols. — Eru·tuon 21:24, 10 February 2017 (UTC)
Now the error message ignores the HTML tags, only listing H as an invalid character. — Eru·tuon 01:10, 11 February 2017 (UTC)
  • Aha! Thank you both for the explanations (DTLHS and Erutuon), and thank you Erutuon for the change.
Given the use of H and X (and possibly other superscript capital Latin letters) to mark pronunciations in Middle Chinese, are any of you aware of a way of including these in a phonetic transcription, that does not fall afoul of the invalid character categorization? {{IPAchar|/d͡ziɪ}}<sup>H</sup>{{IPAchar|/}} seems rather inelegant...
As a side note, I thought that the angle brackets were part of IPA notation, to show graphemes as opposed to phonemes, but these also produce warnings, as with {{IPAchar|⟨a⟩}}. Did I get the wrong end of the stick on that one? Are angle brackets disallowed in IPA? ‑‑ Eiríkr Útlendi │Tala við mig 23:59, 13 February 2017 (UTC)
I just noticed that too. I've added angle brackets as marking a graphemic representation to Module:IPA (alongside slashes and square brackets). As to H and X, the best solution would be to add them to the list in Module:IPA/data/symbols (and any other capitals commonly used in IPA transcriptions), but you should propose that in the beer parlour to see what others think. — Eru·tuon 00:13, 14 February 2017 (UTC)
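The change described in this thread, removing HTML tags before looking for invalid symbols, might look like the following Python sketch. The set of valid characters here is a tiny stand-in chosen for the example (an assumption, not the list in Module:IPA/data/symbols), and the function name is hypothetical.

```python
import re

# Matches an opening or closing HTML tag such as <sup> or </sup>.
TAG = re.compile(r"</?[A-Za-z][^>]*>")

# Tiny illustrative set of accepted characters; "\u0361" is the combining tie bar.
VALID = set("/[] dziɪʔʉ" + "\u0361")


def invalid_characters(text: str, valid=VALID) -> list:
    """List the characters that are not accepted, ignoring HTML tags."""
    stripped = TAG.sub("", text)
    return [char for char in stripped if char not in valid]


transcription = "/d\u0361ziɪ<sup>H</sup> ʔʉi<sup>H</sup>/"
print(invalid_characters(transcription))
# Accepting superscript H, as proposed for Middle Chinese, clears the list.
print(invalid_characters(transcription, VALID | {"H"}))
```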

Throw cold water on

Any idea why throw cold water on is showing up as a six-syllable term? — SMUconlaw (talk) 17:32, 12 February 2017 (UTC)

Ahh! It's because it uses /ɔʊ/, which was not in the regular expression for English diphthongs ending in ⟨ʊ⟩. I've added the diphthong, so now it is only counted as 5-syllable. — Eru·tuon 17:41, 12 February 2017 (UTC)
Great! Thanks. — SMUconlaw (talk) 17:43, 12 February 2017 (UTC)
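The miscount in this thread follows from how syllables are counted: each vowel is one nucleus unless it is part of a listed diphthong. A minimal Python sketch of that idea is given below; the vowel inventory and diphthong list are assumptions for the example, not the ones used by the actual module, and /kɔʊld/ is used only as a short illustrative string.

```python
import re

# Illustrative vowel inventory (an assumption, deliberately incomplete).
VOWELS = "aeiouɑɒæɔəɛɜɪʊʌ"
# Diphthongs ending in ʊ; ɔʊ is the one that was missing.
DIPHTHONGS_IN_U = ["aʊ", "əʊ", "oʊ", "ɔʊ"]


def count_syllables(ipa: str, diphthongs=DIPHTHONGS_IN_U) -> int:
    """Count vowel nuclei, treating each listed diphthong as a single nucleus."""
    alternation = "|".join(re.escape(d) for d in diphthongs)
    # Diphthongs come first so that they win over single vowels.
    nucleus = re.compile(f"(?:{alternation})|[{VOWELS}]")
    return len(nucleus.findall(ipa))


print(count_syllables("kɔʊld"))                       # ɔʊ listed
print(count_syllables("kɔʊld", ["aʊ", "əʊ", "oʊ"]))   # ɔʊ not listed
```

With ɔʊ missing from the list, ɔ and ʊ are counted as two separate nuclei, which is how a five-syllable phrase can be reported as six.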

Edits to -eius

Hi! Thanks for your question (why unlinking?). So, I hope I'll be able to explain my motivation clearly.

I am extracting etymological relationships from Wiktionary using automated code. It is working pretty well. To test it go to [1]. To create this tool I created a database of etymological relationships. Thanks to this tool, I am finding incorrect entries, or etymologies that are written in a slightly incorrect way. The entry -eius showed up while I was exploring the data not because it is incorrect but because it is written in a slightly informal way. Let me explain what I mean.

Etymology Sections are written in a pretty standard way which actually allows automatic data extraction. Usually they have the following structure - for an arbitrary ENTRY:

   From Proto-Indo-European *WORD, from Middle English ANOTHER WORD etc. Cognate to English COGNATE. 

Alternatively they have the following structure:

   From Proto-Indo-European *WORD, from Middle English ANOTHER WORD etc. Cognate to COGNATE.

This regular structure implies I can use an algorithm to extract etymological relationships:

   ENTRY etymologically derives from WORD
   WORD etymologically derives from ANOTHER WORD.

This automatic extraction breaks when there are additional words in the etymology section that are embedded in links or templates but are not relevant for inferring etymological relationships. I have to say this does not happen very often. In this case it happened. I thought removing the link wouldn't reduce comprehensibility, so I did it. But I am open to discussion.

Actually, I was thinking a way around this would be to have some kind of new template or HTML code to signal when a word is not relevant to the etymological definition, some kind of "qualifier". Words like genitive, ablative etc. can be signaled by linking to the glossary, e.g.

   From French *WORD, from genitive ANOTHER WORD etc. Cognate to English COGNATE. 

Not sure about your case, as of now. Epantaleo (talk) 22:12, 13 February 2017 (UTC)

@Epantaleo: Huh. As far as I know, words involved in the etymological descent of the term are already tagged differently from words simply being linked: they use {{etyl}} plus {{m}}, or {{der}}, or {{inh}}, or {{bor}}, or {{transl}} (and so on). So your code should not be thinking (to anthropomorphize) that the template {{m}} on its own contains a word involved in the etymological descent of the term. That template is only used for linking, not for indicating descent. — Eru·tuon 22:29, 13 February 2017 (UTC)
I see your point. It makes perfect sense. And in fact the majority of Etymology Sections use those templates. However many Etymology Sections do use links (i.e. [[]]) instead of templates for words that should have the templates instead. Maybe we should aim at fixing them? See for example refuge, expiate.
Epantaleo (talk) 23:11, 13 February 2017 (UTC)
@Epantaleo: I run across such cases occasionally, and usually correct them. It would be nice if a bot could be configured to do it, though. — Eru·tuon 23:14, 13 February 2017 (UTC)
Good idea. We can have a bot to find them. I think there will be many false positives, though, and I don't think it will be possible to replace them automatically. Epantaleo (talk) 23:22, 13 February 2017 (UTC)
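The point made in this thread, that descent is marked by templates such as {{der}}, {{inh}} and {{bor}} while {{m}} only links, suggests an extraction rule like the Python sketch below. It is a rough illustration, not Epantaleo's tool: the function name is hypothetical, the assumed parameter order is entry language, source language, term, and the sample wikitext uses placeholders in the style of the examples above.

```python
import re

# Templates that indicate etymological descent; {{m}} is left out on purpose.
DESCENT_TEMPLATES = {"der", "inh", "bor"}

# A template without nested templates: {{name|param|param|...}}
TEMPLATE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def descent_links(wikitext: str) -> list:
    """Return (template, source language, term) for each descent template."""
    links = []
    for name, params in TEMPLATE.findall(wikitext):
        # Keep positional parameters only; named ones (tr=, t=, ...) are skipped.
        positional = [p for p in params.split("|") if "=" not in p]
        if name.strip() in DESCENT_TEMPLATES and len(positional) >= 3:
            links.append((name.strip(), positional[1], positional[2]))
    return links


sample = "From {{inh|en|enm|WORD}}, from {{der|en|la|ANOTHER WORD}}. Compare {{m|la|COGNATE}}."
print(descent_links(sample))
```

Etymologies that use bare [[links]] instead of these templates would yield nothing under this rule, which is one way a bot could flag them for manual review.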