User talk:CodeCat



Thread title | Replies | Last modified
Autocannons are not machine guns | 13 | 06:17, 1 November 2014
l docs | 4 | 18:22, 31 October 2014
Another Day, Another Odd Error | 1 | 02:38, 31 October 2014
Module:compound/templates: derivsee | 14 | 02:13, 31 October 2014
Remove Category:Italian verb forms when {{temp|head|it|verb form}} is present. | 7 | 16:23, 30 October 2014
Hebrew Suffix Category Weirdness | 4 | 11:35, 30 October 2014
Module:el-translit | 2 | 18:33, 29 October 2014
Just leave | 8 | 05:30, 27 October 2014
hu-suffix | 7 | 23:49, 26 October 2014
Latin pronunciation | 2 | 00:17, 26 October 2014
Removed Category:Proto-Slavic words suffixed with *-... | 0 | 18:36, 22 October 2014
Moved/broken rhyme links | 2 | 05:44, 21 October 2014
Your bot | 0 | 10:07, 15 October 2014
Releasing, 15 October 2014
Script-related issue in the templates | 4 | 05:24, 15 October 2014
Remove duplicates when there are multiple heads | 3 | 00:41, 15 October 2014
Rename {{temp|ar-numeral}} to {{temp|ar-cardinal}}? | 7 | 15:14, 11 October 2014
Could you add a poscatboiler category for "definite nouns"? | 2 | 07:26, 11 October 2014
Can you add a poscatboiler category for "letter forms"? | 0 | 07:22, 11 October 2014
Template:list:Arabic script letters/ar | 0 | 13:57, 10 October 2014

Autocannons are not machine guns

By definition, an autocannon and a machine gun are two different things. That's the point I was making on the Cannon definition.

2602:47:239E:3D00:2930:9CA0:7714:8E57 02:28, 30 October 2014

But the point of the entry is to show what people might mean when using the word "cannon". One of those things is what is more accurately described as an autocannon.

CodeCat02:48, 30 October 2014

That's exactly what I said, that "cannon" is often short for "autocannon" when discussing aircraft. So why remove it?

2602:47:239E:3D00:2930:9CA0:7714:8E57 03:31, 30 October 2014

I thought you removed it, not me?

CodeCat14:12, 30 October 2014

No, I added it and then you removed it.

2602:4B:AC4E:6800:388B:977B:6489:E4CA 01:23, 31 October 2014

I'll go ahead and put it back then since it seems to have been a misunderstanding.

2602:4B:AC4E:6800:388B:977B:6489:E4CA 01:26, 31 October 2014

Again you removed the definition "A large-bore machine gun". So yes, you did remove it.

CodeCat01:36, 31 October 2014

l docs

This part should be updated as you recently changed the behaviour of script detection: "and if it fails to detect the script, the language's default script (the first listed in Module:languages) will be used"

Z08:37, 31 October 2014

Ok, done. We should probably unify the documentation of all the linking templates anyway, as they all work the same way.

CodeCat16:46, 31 October 2014

Thanks. We need to have some sort of a common sub-doc for the module that would be transcluded in the middle of doc page of these templates.

Z18:02, 31 October 2014

We could just redirect them all, like we did with {{t}}.

CodeCat18:05, 31 October 2014

I see what you mean now. Yes, we can put everything (description, shortcuts) about l, m, etc. in one page, good idea.

Z18:22, 31 October 2014

Another Day, Another Odd Error

I bet you don't remember typing what's displayed here...

Chuck Entz (talk)02:36, 31 October 2014

No, not really. But that template is barely used anywhere so orphaning it would be good enough. If we want to use it again we should make a Lua version.

CodeCat02:38, 31 October 2014

Module:compound/templates: derivsee

Why don't we have {{suffixsee}} et al. infer the script from the language?

WikiTiki8911:50, 30 October 2014

I've added script detection now.

CodeCat14:16, 30 October 2014

Thanks! The problem now is that the line that says "Xish words suffixed with -Y" is also in that script, even though it is in English. If possible, it should separate the "Xish words suffixed with" from "-Y" and use Latn for the former and script detection for the latter. Otherwise, it would be better if the whole line used Latn. If you need an example, see ־עניו (-enyu).

WikiTiki8915:44, 30 October 2014

That's harder to do. It's not the module that displays that text, but rather Template:deriv. I suppose I could integrate that template into the module more directly. Should I?

CodeCat15:54, 30 October 2014

Actually, it seems that even the template doesn't show the text, but it's directly built into the #categorytree parser function. Do you know if there's a way to override the displayed text on that?

CodeCat15:56, 30 October 2014

I think all the settings are hardcoded, or only editable through LocalSettings.php.

DTLHS (talk)16:08, 30 October 2014

I don't know. The documentation is here, but it doesn't seem to cover that. Maybe MediaWiki developers would know.

WikiTiki8916:10, 30 October 2014

I guess you could try replacing it with some global CSS.

DTLHS (talk)16:14, 30 October 2014

Or local CSS immediately surrounding the parser function.

WikiTiki8916:31, 30 October 2014

Remove Category:Italian verb forms when {{temp|head|it|verb form}} is present.

That's it, really. Do you fancy it?

Renard Migrant (talk)15:20, 30 October 2014

Why would you specify "verb form" if you don't want it to be categorised under "verb form"? That's just contradictory.

CodeCat15:41, 30 October 2014

I mean literally remove the text [[Category:Italian verb forms]] and multiple consecutive line breaks when {{head|it|verb form}} is present, as the former is redundant to the latter.

Renard Migrant (talk)15:52, 30 October 2014
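
A minimal Lua sketch of the text cleanup described above, treating a page's wikitext as a plain string. The function name `remove_redundant_category` and the exact template spelling it searches for are assumptions made for illustration; a real bot run would also have to cope with spacing variants and sort keys.

```lua
-- Hypothetical helper: drop a redundant category link when the headword
-- template on the same page already adds that category.
local HEAD = "{{head|it|verb form"   -- prefix only, so extra parameters still match
local CATEGORY = "[[Category:Italian verb forms]]"

local function remove_redundant_category(wikitext)
  -- Plain find (4th argument true) stops "[" from being read as pattern syntax.
  if not wikitext:find(HEAD, 1, true) then
    return wikitext                  -- no headword template: leave the page alone
  end
  local from, to = wikitext:find(CATEGORY, 1, true)
  while from do
    wikitext = wikitext:sub(1, from - 1) .. wikitext:sub(to + 1)
    from, to = wikitext:find(CATEGORY, 1, true)
  end
  -- Collapse runs of three or more line breaks left behind by the removal.
  wikitext = wikitext:gsub("\n\n\n+", "\n\n")
  return wikitext
end
```

Pages without the headword template are returned unchanged, which is the condition the request hinges on.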

Oh like that. It's possible but it's not easy to do. The problem is knowing which entries have it and which don't. The simplest way I know of would be rather invasive at least temporarily: edit Module:headword so that it does not add "Italian verb forms" and then remove the category from any that remain in Category:Italian verb forms, until it's empty, then restore Module:headword again.

CodeCat15:58, 30 October 2014

It's easy to do with AWB but the list is so big it keeps crashing. And it would take about 5 days continuously to do the whole lot.

Renard Migrant (talk)16:19, 30 October 2014

How would AWB know which entries to edit?

CodeCat16:21, 30 October 2014

Hebrew Suffix Category Weirdness

In case you haven't noticed, the members of Category:Categories with incorrect name are all recent MewBot creations. As far as I can tell, something has changed about the way Hebrew suffixes are handled, so that there are a slew of new suffix categories without the Hebrew hyphenation character. The fact that these are all new suggests that the hyphenation character used to be added automatically. The correctly-named category exists, but at the moment seems to be solely populated by the {{he-adj-i}} template. {{poscatboiler}} then converts the unhyphenated spelling in the category to the hyphenated spelling- which doesn't match the actual category name, and causes an ugly error.

I'm not sure what needs to be changed, but the status quo looks really bad.

Chuck Entz (talk)02:40, 30 October 2014

It's odd; the entry בדיוק does display correctly, but it links to the wrong entry and adds the wrong category. I'll look into it.

CodeCat02:50, 30 October 2014

I tracked it down to this edit by User:Wikitiki89. The new version specified a range of Unicode codepoints to strip from names to produce the entry name. But this range also contains U+05BE - HEBREW PUNCTUATION MAQAF, which is the hyphen used in Hebrew. So this edit is causing it to be stripped both from all links to affixes, and from affix category names. I think the fact that the categories as generated by {{prefixcat}} and {{suffixcat}} do not have characters stripped is the actual bug.

CodeCat02:59, 30 October 2014
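
To make the failure mode concrete, here is a small sketch in plain Lua (no `mw.ustring`). The range U+0591 to U+05C7 is an assumption chosen for illustration and is not necessarily the range used in the edit; the point is only that a contiguous range covering the Hebrew points also covers U+05BE unless that codepoint is explicitly excluded. The helper names are hypothetical.

```lua
-- Matches one UTF-8 encoded character (lead byte plus continuation bytes).
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Decode a single UTF-8 character into its codepoint.
local function codepoint(ch)
  local b1 = ch:byte(1)
  if b1 < 128 then return b1 end
  local n, cp
  if b1 >= 240 then n, cp = 4, b1 - 240
  elseif b1 >= 224 then n, cp = 3, b1 - 224
  else n, cp = 2, b1 - 192 end
  for i = 2, n do
    cp = cp * 64 + (ch:byte(i) - 128)
  end
  return cp
end

-- Remove every character whose codepoint lies in [first, last],
-- except those listed in `keep` (a set keyed by codepoint).
local function strip_range(text, first, last, keep)
  keep = keep or {}
  return (text:gsub(UTF8_CHAR, function(ch)
    local cp = codepoint(ch)
    if cp >= first and cp <= last and not keep[cp] then
      return ""
    end
    -- returning nothing leaves the character unchanged
  end))
end

local MAQAF = "\214\190"                    -- U+05BE
local suffix = MAQAF .. "\215\149\215\170"  -- maqaf + vav (U+05D5) + tav (U+05EA)

local naive = strip_range(suffix, 0x0591, 0x05C7)                       -- maqaf is lost
local fixed = strip_range(suffix, 0x0591, 0x05C7, { [0x05BE] = true })  -- maqaf survives
```

With the exclusion in place, the suffix keeps its leading maqaf, so links and category names built from it stay hyphenated.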

I've now fixed the bug I described above, so diacritic stripping is now applied to affix category names, and these categories are no longer indicated as having an incorrect name. The actual name is still wrong of course, but that's because of Wikitiki's edit.

CodeCat03:06, 30 October 2014

Whoops sorry, I didn't mean to include the maqaf in that range (in fact I thought that I had checked whether it was in that range, but I guess I missed it). I will fix it momentarily. Side note: pings don't work in liquid threads, since you don't actually sign the edit.

WikiTiki8911:35, 30 October 2014

Module:el-translit


I'm reluctant to edit this myself, please can you help again. el-translit allows for letter combinations at the beginning of an entry - but not at the beginning of a word preceded by a space (illustrated by: μπανάνα μπανάνα, banána banána). μπ and ντ are affected. Many thanks!

Saltmarshαπάντηση06:30, 29 October 2014

I think it should be fixed. It still won't work after anything other than a space though.

CodeCat13:40, 29 October 2014
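
A rough sketch of the behaviour described here, not the actual Module:el-translit code. The letter table covers only the example word, and the values used for the digraphs (b/d word-initially, letter-by-letter elsewhere) are assumptions for illustration.

```lua
-- Matches one UTF-8 encoded character.
local UTF8_CHAR = "[\1-\127\194-\244][\128-\191]*"

-- Only the letters needed for the example; a real table covers the whole alphabet.
local letters = { ["μ"] = "m", ["π"] = "p", ["ν"] = "n", ["τ"] = "t", ["α"] = "a", ["ά"] = "á" }

-- Word-initial digraphs and their assumed transliterations.
local initial = { ["μπ"] = "b", ["ντ"] = "d" }

local function translit(text)
  for digraph, latin in pairs(initial) do
    text = text:gsub("^" .. digraph, latin)         -- start of the whole string
    text = text:gsub(" " .. digraph, " " .. latin)  -- start of a word after a space
  end
  -- Everything else is transliterated letter by letter;
  -- characters missing from the table are left as they are.
  return (text:gsub(UTF8_CHAR, function(ch) return letters[ch] end))
end

print(translit("μπανάνα μπανάνα"))  -- both words get the initial "b"
print(translit("(μπανάνα)"))        -- after "(" the digraph is not treated as initial
```

The second call shows the remaining limitation: only a preceding space counts as a word boundary in this sketch.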

Thanks - as usual :)

Saltmarshαπάντηση18:33, 29 October 2014

Just leave

Please do me a favor. Leave the English Wiktionary. Forever. Your substantive contribution is not all that impressive and your rude dictatorial behavior is well over the top.

Dan Polansky (talk)23:01, 24 October 2014

Seeing as Dan has been blocked for the above (first indef and now for 3 months), I'll solidarize with him and repeat what he said. It would be best for all of us if you looked for a different hobby. I bet there are things you're good at - but Wiktionary definitely isn't it.

Liliana 16:32, 25 October 2014

CodeCat, like Dan, is a valuable part of the Wiktionary community, in spite of some abuses. Dan at least had an edit war as an excuse- you're just recycling the same old bile from previous grudge matches.

Chuck Entz (talk)19:19, 25 October 2014

CodeCat, like Dan, is a valuable part of the Wiktionary community - lol. Do you have any substantial proof to back that up? Has anyone ever been happy with one of his changes? All I've seen so far is endless drama and disputes and the like, which has already led to our first real contributors quitting. I'd consider that a much greater loss than the marginal (if at all?) gain we get from his presence. And just looking at that edit war shows he has no interest at all in adhering to rules or even *gasp* collaborating with the community which is god fucking what Wiktionary has been about from day one!

Liliana 09:30, 26 October 2014

I don't know why you thought edit-warring was a good idea, particularly in the manner which you did: first two edits without an edit summary, then an edit with a wholly inappropriate one here. At Wikipedia, someone who habitually edit warred in the manner you do (and this is hardly your first rodeo, not even your first rodeo acting in this manner) would either face significant blockage or be forced into 1RR in some or all topic areas. While we're not Wikipedia, I think it would be beneficial for sanctions on that order to be imposed on you. At the very least, it's ridiculous that Dan gets a three-month block and you get nothing.

Purplebackpack8907:50, 26 October 2014

Edit wars are too rare a sight here to have policies such as 3RR/1RR in place. We also don't have "topics" that cover typical user activity where conflicts arise. I wish there were a mute option while blocking, where the user would only be forbidden to edit some or all discussion pages, with other edits unhindered. Plus, most of the established users here are or become admins sooner or later, so we'd have to elect an ArbCom first whose decisions everyone would be bound to respect. But it's all probably too much trouble, with questionable net benefit.

Ivan Štambuk (talk)21:42, 26 October 2014

In regard to your comments:

  1. There's probably been enough edit-warring by CodeCat to justify 3RR
  2. There's a term for the mute button. It's called topic or interaction ban
  3. I've been here for years and I don't have a mop. Why?

Purplebackpack8902:23, 27 October 2014

Adminship isn't something like a free trip that you can receive as soon as you amass enough frequent-flyer miles. You have to have a good understanding of Wiktionary's rules & practices, and the right temperament- of which you have neither.

Chuck Entz (talk)05:20, 27 October 2014

"Mute" is doable. As for forming an ArbCom, I doubt it could even be possible (given that most regulars either get deeply involved in heated disputes, or avoid all drama entirely; which leaves no "neutral" party willing and able to arbitrate), let alone help much. ArbCom is kind of a joke even on TOW.

But I am afraid that this thread is not the best place to discuss it.

Keφr05:30, 27 October 2014

hu-suffix

The changes you made in {{hu-suffix}} created an unwanted side-effect. In some cases, we need to use multiple suffixes in a single etymology, but we want to categorize the entry only by the last suffix. With the new change, even the intermediate suffixes produce a category which is not correct. See rugalmas. Can you add this functionality to {{suffix}}? If not, please restore the original hu-suffix. I don't understand why it was necessary to replace the Hungarian-specific template. Not to mention that this was done without discussing it with the Hungarian editors. Is there a template policy that says FL entries must all use the general templates?

Panda10 (talk)23:05, 25 October 2014

There's no policy, but it seemed to make more sense to have it written in terms of the generic template, if at all possible. I wasn't aware of that specific case though so I guess it will need to be undone.

But I do wonder how it works with rugalmas. If you just attach the first suffix, is that not a valid word?

CodeCat23:52, 25 October 2014

The word rugalom is a valid archaic noun, rugalmas is an adjective. The Hungarian suffix categories contain the PoS in their name because some of the suffixes have multiple purposes, so we want to separate the derived terms by PoS. The {{suffix}} template allows only one PoS parameter and would put the headword that has many suffixes into different categories marked with the same PoS. Another consideration is that sometimes the ety has to indicate linking vowels which are not suffixes, so categorization would make no sense. See nézeget as an example. Hungarian words can have a number of suffixes appended after each other. E.g.: szabálytalanság = szab + -ály + -talan + -ság. If I use {{suffix}}, the noun headword would appear not only in Hungarian nouns suffixed with -ság, but also in the categories Hungarian nouns suffixed with -ály and Hungarian adjectives suffixed with -talan. Too much clutter, the result would not be clean.

Panda10 (talk)16:48, 26 October 2014
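
The difference between categorising by every suffix and by the last one only can be sketched as follows. This is an illustration, not the code of {{suffix}} or {{hu-suffix}}; the function name, the `only_last` flag and the naive "add an s" pluralisation are assumptions.

```lua
-- A part counts as a suffix if it starts with a hyphen.
local function is_suffix(part)
  return part:sub(1, 1) == "-"
end

-- Build suffix categories for one entry. With only_last set, only the
-- final part of the etymology produces a category.
local function suffix_categories(langname, pos, parts, only_last)
  local result = {}
  for i, part in ipairs(parts) do
    if is_suffix(part) and (not only_last or i == #parts) then
      table.insert(result, langname .. " " .. pos .. "s suffixed with " .. part)
    end
  end
  return result
end

local parts = { "szab", "-ály", "-talan", "-ság" }
local every = suffix_categories("Hungarian", "noun", parts, false)  -- three categories, all "nouns"
local last  = suffix_categories("Hungarian", "noun", parts, true)   -- one category, for -ság
```

Because a single PoS is applied to every suffix, the first call files the entry under "nouns suffixed with -talan" as well, which is the clutter described above.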

But we don't normally show the complete morphological/etymological breakdown of words. Instead we only show the "outermost" derivation, the one that was applied last. So for rugalmas it would just be rugalom +‎ -as and for szabálytalanság szabálytalan +‎ -ság. Compare this for example to unevenness, which is not un- +‎ even +‎ -ness but just uneven +‎ -ness.

CodeCat17:00, 26 October 2014

In the case of szabálytalanság, you're right. I would not divide it up. I used this word only to illustrate the number of suffixes. In the case of rugalmas, I wanted to show only the modern stem and not the archaic word.

Panda10 (talk)17:10, 26 October 2014

But is that the real etymology? I suspect that when the word was first created, it was from rugalom when it was still in normal use.

CodeCat17:35, 26 October 2014

Latin pronunciation

@Kephir: I am aware that CodeCat has been editing Module:la-pronunc to make the vowel system exactly the same as that of German.

kc_kennylau (talk)02:08, 25 October 2014

Forgetting for a while that I failed to get the ping (apparently it does not work on LQT pages), why did you write this message?

Keφr20:53, 25 October 2014

I am not sure if Latin follows German's vowel system, hence the message.

kc_kennylau (talk)00:17, 26 October 2014

Removed Category:Proto-Slavic words suffixed with *-...

I'm not going to recreate ~68 categories again. It's a waste of my time because I don't memorize template names. I wanted to simplify the work, but you have complicated it.

Useigor (talk)18:36, 22 October 2014

Moved/broken rhyme links


I see that your bot MewBot has moved the English rhymes pages from "Rhymes:English:Stressed on /xxx/" to "Rhymes:English:xx-". Unfortunately, it didn't update the pages that linked to those pages or add a redirect, so now all the links at Rhymes:English are broken. Can you explain the reason for the move (I don't think there was anything wrong with the previous format), and fix the broken links? Thanks.

Paul G (talk)17:36, 19 October 2014

Ok I updated the links.

CodeCat17:58, 19 October 2014

Thank you. There are still a lot of broken links in many of the other rhymes pages, though. See the Notes section on this page, for example. Could your bot have a look through the pages it moved to see what links to the old pages?

Paul G (talk)06:30, 20 October 2014


I used in the bot I created (User:WingerBot), with a few changes (mainly, I extracted calls to pywikibot.output() to a function and changed it to directly output UTF-8 encoded text to stdout/stderr, because pywikibot.output() behaves strangely when stdout is redirected to a file, automatically transliterating Arabic text). I would like to make the source code for the bot available. Are you willing to allow to be released? If so, what license do you want on it (e.g. GPL or MIT)?

Benwing (talk)23:43, 14 October 2014

GPL is ok.

CodeCat23:57, 14 October 2014

Thanks. Code is now on github, at [1].

Benwing (talk)08:20, 15 October 2014

Script-related issue in the templates

I noticed a script-related issue in our templates. I don't know exactly which module is responsible, as I've completely forgotten which module does what because I wasn't here for a while, but it must be related to language utilities or script utilities or their related modules, so I'm bringing it up here. The problem is that the script is chosen solely based on the script detection function, and if that fails, the "None" class is used instead of the first script in m.lang.scripts.


{{head|ccp|noun|head=𑄚𑄳𑄟}}: 𑄚𑄳𑄟 (transliteration needed)
{{l|ccp|𑄚𑄳𑄟}}: 𑄚𑄳𑄟

Related data in Module:languages/data3/c, note "scripts":

m["ccp"] = {
        names = {"Chakma"},
        type = "regular",
        scripts = {"Cakm"},
        family = "inc"}

Related data in Module:scripts/data, note the lack of "characters":

m["Cakm"] = {
        names = { "Chakma" },
}

Z22:24, 19 July 2014

I suppose that if there is only one script listed, we could use it as fallback if detection fails.

CodeCat22:26, 19 July 2014
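
The proposed fallback can be sketched like this. The tables are simplified stand-ins for the data modules quoted above, and `choose_script`, the Latin `characters` pattern, the English entry and the two-script test language are assumptions for illustration; this is not the real detection code.

```lua
-- Simplified stand-ins for the script and language data.
local scripts = {
  Latn = { names = { "Latin" }, characters = "[A-Za-z]" },
  Cakm = { names = { "Chakma" } },  -- no "characters", so detection can never match it
}
local languages = {
  ccp = { names = { "Chakma" }, scripts = { "Cakm" } },
  en  = { names = { "English" }, scripts = { "Latn" } },
}

-- Return the first of the language's scripts whose character pattern matches the text.
local function detect_script(text, lang)
  for _, code in ipairs(lang.scripts) do
    local chars = scripts[code].characters
    if chars and text:find(chars) then
      return code
    end
  end
  return nil
end

local function choose_script(text, lang)
  local detected = detect_script(text, lang)
  if detected then
    return detected
  end
  if #lang.scripts == 1 then
    return lang.scripts[1]  -- proposed fallback: the only listed script
  end
  return "None"             -- several candidates and no match: stay undetermined
end
```

Under these assumptions `choose_script("𑄚𑄳𑄟", languages.ccp)` yields "Cakm" even though the script has no character data, while a language with several scripts and no match still yields "None".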

But as far as I recall we have always treated the first script in the list as the default one, and I think it's a good practice. Anyway, in this case we have only one script listed, but our templates mistakenly use "None" instead.

Z22:34, 19 July 2014

That was before we had Lua. Now, all scripts are treated as equal, with none given priority. This is still useful because there are cases where detection fails because the text actually isn't in any of the scripts. But in this case it fails because it's just not able to detect it at all. So that's a different case, and we could look at that.

That said, why can't the characters just be added to the script data instead? That would solve it.

CodeCat22:38, 19 July 2014

So that was intentional? Ok, but we should have added the characters first; it has broken older entries and has caused confusion for users.[1]

By the way, the functionality of the detect_script is not perfect.

Z23:00, 19 July 2014

Remove duplicates when there are multiple heads

The entry for أنتليجنسيا has one Arabic vocalization with two possible transliterations, representing two different pronunciations. This is expressed using 1= and head2=, but for this to work properly, the code in Module:headword should check for and remove duplicate heads. (I already do this in various places in Module:ar-verb; you could reuse e.g. the contains() and insert_if_not() functions from there.) Thanks.

Benwing (talk)00:15, 15 October 2014

This doesn't work because Module:headword expects each transliteration to match up with its corresponding headword. If we're going to allow multiple headwords and multiple transliterations per headword, it'll get really messy. Furthermore, no other template on Wiktionary supports multiple transliterations for a single term.

CodeCat00:17, 15 October 2014

What I'm asking you to do is to modify line 287 so that you copy 'heads' to a new array with duplicates removed before concatenating. This is easy to do. I would do this myself if I had permission to modify this file. Take a look at أنتليجنسيا and you'll see what I'm talking about.

Benwing (talk)00:32, 15 October 2014
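
For reference, the requested change amounts to something like the sketch below. `contains()` and `insert_if_not()` are the names mentioned above, but the bodies here are a guess at what such helpers look like, not the code from Module:ar-verb, and `unique_heads` is a hypothetical name.

```lua
-- True if `item` is already present in the array `list`.
local function contains(list, item)
  for _, value in ipairs(list) do
    if value == item then
      return true
    end
  end
  return false
end

-- Append `item` only if it is not in the list yet.
local function insert_if_not(list, item)
  if not contains(list, item) then
    table.insert(list, item)
  end
end

-- Copy `heads` into a new array with duplicates removed, keeping first-seen order.
local function unique_heads(heads)
  local result = {}
  for _, head in ipairs(heads) do
    insert_if_not(result, head)
  end
  return result
end

local line = table.concat(unique_heads({ "A", "A", "B" }), " or ")
```

The original `heads` array is left untouched, so any code that pairs heads with transliterations by position still sees the full list.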

You have to consider the implications of such a change though. Let's say that you have these parameters: head=A|head2=A|head3=B|tr=X|tr2=Y|tr3=Z. If your change is implemented, that ends up looking like this:

A or B (X or Y or Z)

It's now no longer obvious which transliteration belongs to which headword.

Concerning this specific case, though, what you're doing seems to be something other than transliteration. You're really adding additional pronunciation details into the transliteration field. Those should really go in the pronunciation section. Transliteration should not be used as a substitute for such distinctions.

CodeCat00:41, 15 October 2014
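
The loss of alignment can be shown with a short sketch. Both function names are hypothetical, and the paired format is included only as a contrast, not as a proposal from this thread.

```lua
-- Remove duplicates from a list, keeping first-seen order.
local function unique(list)
  local seen, result = {}, {}
  for _, item in ipairs(list) do
    if not seen[item] then
      seen[item] = true
      table.insert(result, item)
    end
  end
  return result
end

-- Deduplicate heads and transliterations independently, then join them.
local function flattened_line(heads, translits)
  return table.concat(unique(heads), " or ")
    .. " (" .. table.concat(unique(translits), " or ") .. ")"
end

-- Keep each head next to its own transliteration (position i pairs with position i).
local function paired_line(heads, translits)
  local parts = {}
  for i, head in ipairs(heads) do
    table.insert(parts, head .. " (" .. translits[i] .. ")")
  end
  return table.concat(parts, " or ")
end

local heads, translits = { "A", "A", "B" }, { "X", "Y", "Z" }
print(flattened_line(heads, translits))  -- A or B (X or Y or Z)
print(paired_line(heads, translits))     -- A (X) or A (Y) or B (Z)
```

In the flattened form nothing indicates that Z belongs to B rather than to A, which is the ambiguity described above.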

Rename {{temp|ar-numeral}} to {{temp|ar-cardinal}}?

The template {{ar-numeral}} is being used exclusively for cardinal numerals, and I modified it to reflect this, so it now puts things into Category:Arabic cardinal numbers as well as Category:Arabic numerals. Since you're quick with bot operations, can you rename the template and calls to it to {{ar-cardinal}} or similar? Thanks.

Benwing (talk)10:00, 11 October 2014

"Cardinal number" is not a part of speech though. That's why it's categorised differently.

CodeCat12:49, 11 October 2014

I don't understand. What's the purpose of Category:Arabic cardinal numbers if not to put cardinal numbers in it? Maybe pedantically it should be Category:Arabic cardinal numerals but so be it.

Benwing (talk)14:28, 11 October 2014

Headword-line templates are meant to reflect parts of speech. But "cardinal number" is not necessarily a part of speech; as we discussed before, some cardinal numbers belong to other parts of speech (Dutch miljoen).

CodeCat14:30, 11 October 2014

Well, you'd have the same objection with just "numeral", I suppose. The thing I was trying to avoid was people thinking that {{ar-numeral}} is useful for e.g. Abjad numerals like ب or Eastern Arabic numerals like ٢.

Benwing (talk)14:40, 11 October 2014

Things with numbers and numerals are kind of confused, and they were a rather contentious issue for a long time. No consensus could be reached and so things were left in an indeterminate state with multiple conflicting uses of terms and categories. For example, there were "numeral", "number", "cardinal numeral", "cardinal number", "ordinal number" and "ordinal numeral" categories all containing similar entries, with different languages using different names.

Now, we've settled on this situation:

  • "Numeral" is considered a part of speech, and contains all (cardinal) number terms that are not clearly members of another part of speech. These entries receive the ===Numeral=== header.
  • All cardinal number terms, regardless of part of speech, go in the "cardinal numbers" category.
  • All ordinal number terms, regardless of part of speech (although they are generally adjectives), go in the "ordinal numbers" category.
  • Symbols for numbers, such as 1, 2, 10, 12 go in the "numeral symbols" category. These entries receive the ===Symbol=== header. (For languages like Chinese, there's not a clear distinction here because all symbols stand for concepts. There's little difference in Chinese between the word for the number 1, and the symbol for it.)

I hope that clears things up some.

CodeCat15:00, 11 October 2014

Could you add a poscatboiler category for "definite nouns"?

This is used for Category:Arabic definite nouns. Ideally this should automatically get triggered when using {{definite of}}, or something of that sort (maybe a different {{definite noun of}} would be needed?)

Benwing (talk)10:27, 8 October 2014

If it's really a category for noun forms, then the entries should probably go in Category:Arabic noun forms instead.

CodeCat12:30, 8 October 2014

It's not. The entries that go in it are lemma forms that have a definite article in them.

Benwing (talk)07:26, 11 October 2014

Can you add a poscatboiler category for "letter forms"?

Category:Arabic letter forms is non-empty but not yet created because there isn't an obvious way to insert a category boiler. In this case, a "letter form" is one of the forms of an Arabic letter, which may have up to four separate glyphs (initial, medial, final, isolated).

Benwing (talk)07:22, 11 October 2014

Template:list:Arabic script letters/ar

This template looked strange, and had the word حرف sitting in place of the letter for b. I commented out the hypernym=حرف param and this seems to have fixed it, along with adding a translit for ا. I'm not sure what the purpose of the hypernym arg is.

Benwing (talk)13:57, 10 October 2014