User talk:CodeCat


Contents

Thread title · Replies · Last modified
Lemma List · 14 · 19:38, 31 July 2014
wrong reverts · 2 · 17:36, 31 July 2014
MewBot got the language wrong · 1 · 16:23, 31 July 2014
"Is it really that hard to check entries before/after saving?" · 0 · 22:45, 30 July 2014
Why mentions from you do not work · 0 · 08:07, 29 July 2014
Is there a Lua sandbox or equivalent on Wiktionary? · 2 · 19:52, 28 July 2014
Wiktionary:Beer parlour/2013/October#Multiple headwords and transliterations · 2 · 16:43, 28 July 2014
Edits by 94.12.225.64 · 3 · 13:28, 28 July 2014
Qualifiers in pt-noun · 3 · 17:44, 26 July 2014
sami- · 1 · 10:01, 25 July 2014
Are you a sysop (to get added to the AWB list)? · 2 · 05:33, 24 July 2014
Thank you! · 0 · 10:41, 22 July 2014
Macedonian Lemma List · 3 · 07:54, 22 July 2014
Script-related issue in the templates · 4 · 23:00, 19 July 2014
Apoteozo · 5 · 21:58, 19 July 2014
Vandalizm words in Turkish. · 1 · 19:41, 19 July 2014
Japanese Template misuse by MewBot · 2 · 18:15, 19 July 2014
change to template lv-adj · 4 · 16:00, 18 July 2014
"Loan Oversetting" · 1 · 18:50, 17 July 2014
Script list · 5 · 19:46, 16 July 2014

Lemma List

I have noticed that not all lemma entries are automatically going to a lemma list for their respective language. Is this intentional or are there some issues? For example, Dutch seems to have only around 700 lemma terms in the lemma list, whereas many more lemma terms have been entered for it all in all. I thought this odd. I also noticed that for some Serbo-Croatian nouns, such as ogrlica, there is no link to the lemma list at the bottom of the page, although the Serbo-Croatian lemma list has quite a few entries, almost 20,000.

Martin123xyz (talk)19:22, 16 July 2014

Right now, only {{head}}, and by extension any templates that use it, have been modified in this way. But there are still several headword templates that don't use it and use something else instead. There are also some that use a special module instead of going through {{head}}. I do intend to fix those eventually, but it will take a while. Right now I'm trying to make sure that all pages that use {{head}} also specify a part of speech. There are still 3000 to fix...

CodeCat19:33, 16 July 2014

I'm looking forward to the English lemma list (also very empty at the moment) being fully populated. It's something I would use relatively often (as a reader rather than an editor), so thanks! Maybe we should destroy or redirect the out-of-date Index:English at that future stage.

Equinox 19:30, 23 July 2014

If you want to help out, something like a list of all headword templates that use neither {{head}} nor (possibly indirectly) Module:headword would be very useful. Those are the ones that would be the hardest to track down and update.

CodeCat19:32, 23 July 2014
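
One way to approach such a list is to treat it as a reachability problem: record which templates and modules each headword template uses, then keep the ones that never lead to {{head}} or Module:headword, directly or indirectly. The Python sketch below assumes the "uses" mapping has already been collected somehow (for example from a dump); the page names in the sample data are hypothetical, and this is only an illustration, not an existing tool.

```python
from typing import Dict, List

# Pages that count as "already going through the shared headword code".
TARGETS = {"Template:head", "Module:headword"}


def uses_headword_code(page: str, uses: Dict[str, List[str]]) -> bool:
    """Return True if `page` reaches one of TARGETS, directly or indirectly."""
    stack, seen = [page], set()
    while stack:
        current = stack.pop()
        if current in TARGETS:
            return True
        if current in seen:
            continue  # already explored; also guards against cycles
        seen.add(current)
        stack.extend(uses.get(current, []))
    return False


def templates_to_update(headword_templates: List[str],
                        uses: Dict[str, List[str]]) -> List[str]:
    """List the headword templates that never reach the shared code."""
    return sorted(t for t in headword_templates
                  if not uses_headword_code(t, uses))


# Hypothetical sample data: which pages each page transcludes or invokes.
uses = {
    "Template:xx-noun": ["Template:head"],
    "Template:zz-adj": ["Module:zz-headword"],
    "Module:zz-headword": ["Module:headword"],
    "Template:yy-verb": ["Template:yy-verb/helper"],
    "Template:yy-verb/helper": [],
}
missing = templates_to_update(
    ["Template:xx-noun", "Template:zz-adj", "Template:yy-verb"], uses)
print(missing)
```

The hard part in practice is building the mapping, since a module can pull in another module from its Lua source; the search itself stays the same once that data exists.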

Not really sure how I'd approach that, as I haven't learned any Lua (maybe some day, hrmm). However, I was scanning the lemma list again today and it's looking much better, actually better than the old Index:English. I really get a kick out of browsing the weird words in the J,Q,X,Z lists for some reason.

Equinox 22:51, 30 July 2014

That's because it's not just {{head}} that adds the category, but Module:headword itself now. So it also works for all the modules that use it, like Module:en-headword which all the English ones use. But there may still be some that don't use the module or the template yet, so we need to find them. Not easy, but still...

CodeCat22:55, 30 July 2014
 
 
 
 
 

wrong reverts

Hi. This revert and this one are wrong. I'm an admin at tr.wiktionary, and those meanings are not Turkish. Thanks...

Sabri76'talk 06:14, 31 July 2014

You should submit entries that you believe are wrong to WT:RFV. In the first edit, you removed the RFV notice, that's why I reverted. In the second, you used the {{delete}} template. That template is only for completely obvious things like if someone typed the name of the entry wrong or if there is clear nonsense in the entry that nobody would dispute. That is not the case here, as it's not clear to me that it's wrong or nonsense, so RFV is where you should go.

CodeCat10:58, 31 July 2014

I've made listings on WT:RFV for both entries now. If no evidence of the words' existence is found over the next month, they'll be deleted. (Someone might even delete them sooner because the user who added them is known to make up words.) Cheers, all.

- -sche (discuss)17:36, 31 July 2014
 
 

MewBot got the language wrong

Here. In the Turkish section, MewBot tagged it Irish (the first language on the page). I fixed it, but maybe you should make sure it isn't doing that a lot.

Aɴɢʀ (talk)16:13, 31 July 2014

Thank you for letting me know. I'm not quite sure why it happens, but I think it may have something to do with replacing parts of a list while iterating over that same list. Python doesn't like that. I've changed it now so that whenever it does make a replacement, it restarts the iteration from the beginning, and it repeats this until there are no more changes.

CodeCat16:23, 31 July 2014
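
A minimal Python sketch of the restart-after-replacement approach described above. This is not MewBot's actual code: `replace_until_stable`, the `fix` callback and `split_combined` are hypothetical names used only to illustrate the idea of never continuing a scan over a list that has just been changed.

```python
from typing import Callable, List, Optional


def replace_until_stable(
    items: List[str],
    fix: Callable[[str], Optional[List[str]]],
) -> List[str]:
    """Apply `fix` to the items, restarting the scan after every replacement.

    `fix` returns a list of replacement items, or None to leave the item alone.
    """
    items = list(items)  # work on a copy, leave the caller's list untouched
    changed = True
    while changed:
        changed = False
        for index, item in enumerate(items):
            replacement = fix(item)
            if replacement is not None:
                # Splice in the replacement, then abandon this scan and
                # start again from the top with the updated list.
                items[index:index + 1] = replacement
                changed = True
                break
    return items


def split_combined(item: str) -> Optional[List[str]]:
    # Hypothetical fix: split "a+b" into two separate items.
    if "+" in item:
        return item.split("+", 1)
    return None


result = replace_until_stable(["x", "a+b+c", "y"], split_combined)
print(result)
```

The loop only terminates if `fix` eventually returns None for every item, so a fix that keeps producing fixable output would run forever.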
 

"Is it really that hard to check entries before/after saving?"

No need to be so snotty.

Victar (talk)22:45, 30 July 2014

Why mentions from you do not work

I recently discovered why I never get mention notifications from you: because your signature uses a template. Echo uses regular expressions on unparsed diffs to detect signed comments. If a timestamped signature is not found, notifications are not generated. Read it yourself.

And I recall reading somewhere that templates in signatures put an unacceptable amount of strain on servers. TOW forbids them, for what it may be worth. So you should probably change it anyway.

Keφr08:07, 29 July 2014
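
To illustrate why a templated signature can go undetected, here is a rough Python sketch of signature detection on raw wikitext. The pattern is an assumption made for illustration and is not Echo's actual regular expression; it only looks for a user link followed by a timestamp in the "10:41, 22 July 2014 (UTC)" style, and the user name "Example" is made up.

```python
import re

# Illustrative pattern only: a link to a user or user talk page, followed
# later on the same line by a timestamp such as "10:41, 22 July 2014 (UTC)".
SIGNATURE = re.compile(
    r"\[\[User(?: talk)?:[^\]|]+(?:\|[^\]]*)?\]\]"  # [[User:Name|label]]
    r".*?"
    r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)"
)


def looks_signed(wikitext_line: str) -> bool:
    """Return True if the raw (unparsed) line has a user link and a timestamp."""
    return SIGNATURE.search(wikitext_line) is not None


plain = ("Thanks. [[User:Example|Example]] "
         "([[User talk:Example|talk]]) 10:41, 22 July 2014 (UTC)")
templated = "Thanks. {{User:Example/signature}} 10:41, 22 July 2014 (UTC)"

print(looks_signed(plain))
print(looks_signed(templated))
```

Because the check runs on unparsed text, the user link hidden inside the template is never seen, so the second line is not recognised as signed even though it renders the same way.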

Is there a Lua sandbox or equivalent on Wiktionary?

There's Module:Sandbox on WP to allow you to try out Lua modules before putting them in the main namespace. Is there any equivalent in Wiktionary? I created Module:fro-verb and so far I've been editing and testing it in-place, which necessarily means that I sometimes commit bugs. I'm thinking of expanding it a lot and I'd rather do the testing elsewhere.

Benwing (talk)05:33, 28 July 2014

Module:User:Benwing and everything below.

Keφr09:17, 28 July 2014

Thanks.

Benwing (talk)19:52, 28 July 2014
 
 

Wiktionary:Beer parlour/2013/October#Multiple headwords and transliterations

That discussion didn't really go anywhere, but it seems that there was 2-1 support for the second option, and it is too small of a detail to be controversial. Can you implement it?

WikiTiki8915:21, 28 July 2014

I'm working on the headword module anyway so I will probably include this along with it.

CodeCat16:18, 28 July 2014

Alright, thanks!

WikiTiki8916:43, 28 July 2014
 
 

Edits by 94.12.225.64

This is the IP who geolocates to Sky Broadband or Easynet in London (occasionally elsewhere in the UK), and who's been making a real mess of Japanese entries and everything in English in the realm of the supernatural. They tend to come up with terms in languages they don't know by a combination of guessing, Bing Translate, and various fan and amateur sites, but even their English edits show a very poor understanding of what they're working with. Now they've apparently branched out into Dutch. I reverted some edits that introduced Dutch and German terms not in any of the online sources I've checked, but I'd appreciate it if you could take a look to weed out any less straightforward mistakes, and verify that I've been correct in the things I've done so far. It would also help to know their level of (in)competence in this new area so I have a better idea how to deal with them.

Chuck Entz (talk)05:07, 28 July 2014

What they created was mostly ok, except the formatting was a bit off (two headword lines in one section). I removed the usage note at blad because it gave the impression that people normally say "blaren" in speech which they certainly don't, it's very rare.

CodeCat11:58, 28 July 2014

What about the edits I reverted at imp and eviscerate?

Chuck Entz (talk)13:13, 28 July 2014

The edit at eviscerate was ok, but I've never heard of imp in Dutch so that would need RFV.

CodeCat13:28, 28 July 2014
 
 
 

Qualifiers in pt-noun

Why did you remove them?

Ungoliant (falai)17:32, 26 July 2014

Because there is no such feature currently in Module:headword, and there is no established practice for such qualifiers.

CodeCat17:34, 26 July 2014

So add it to Module:headword or don’t use it. The template was working well before. And there is established practice for qualifiers in headword lines: read, sit, eat.

Ungoliant (falai)17:43, 26 July 2014

Ok, I'll try to add it into the module then. It won't be too hard, but please be patient and don't revert it again. Your revert broke every Portuguese noun entry on Wiktionary.

CodeCat17:44, 26 July 2014
 
 
 

sami-

Is [[sami-]] salvageable? I think you know better than me.

Keφr09:48, 25 July 2014

I fixed it.

CodeCat10:01, 25 July 2014
 

Are you a sysop (to get added to the AWB list)?

Hello. Do you have sysop privileges? I'm looking to do a bunch of regex-type changes to the conjugations of Old French verbs and it's too painful to do this without something like AWB.

Benwing (talk)00:34, 24 July 2014

I am but I have no idea how I would go about that. It might be better to ask in WT:GP?

CodeCat00:42, 24 July 2014

I've asked on WT:BP, which is what the AWB page recommends.

Benwing (talk)05:33, 24 July 2014
 
 

Thank you!

Thank you for deleting Modulis:it-head. It was my mistake: I was trying to create it on the Latvian Wiktionary and created it here by mistake. --Čumbavamba (talk) 10:41, 22 July 2014 (UTC)


Macedonian Lemma List

I've noticed that the number of Macedonian lemmas in the list is decreasing. Why is this so? Is someone deleting entries? Are you removing entries from the lemma list because they were not actually lemmas and had gotten misplaced? I can't really know what's happening because there are over 5,000 entries in the list, such that I can't notice when something has changed.

Martin123xyz (talk)14:19, 21 July 2014

I have no idea to be honest. Can you point me to an entry that was there before, but isn't anymore?

CodeCat14:22, 21 July 2014

I'm afraid I cannot because I haven't noticed anything specific. I just know that yesterday there were about 5440 entries in the list, whereas there are 5320 now.

Martin123xyz (talk)17:38, 21 July 2014

Now, the list suddenly has about 5760 terms - how odd.

Martin123xyz (talk)07:54, 22 July 2014
 
 
 

Script-related issue in the templates

I noticed a script-related issue in our templates. I don't know exactly which module is responsible, as I've forgotten which module does what after being away for a while, but it must be related to the language utilities or script utilities or their related modules, so I'm bringing it up here. The problem is that the script is chosen solely based on the script detection function, and if that fails, the "None" class is used instead of the first script in m.lang.scripts.

Examples:

{{head|ccp|noun|head=𑄚𑄳𑄟}}: 𑄚𑄳𑄟
{{l|ccp|𑄚𑄳𑄟}}: 𑄚𑄳𑄟

Related data in Module:languages/data3/c, note "scripts":

m["ccp"] = {
        names = {"Chakma"},
        type = "regular",
        scripts = {"Cakm"},
        family = "inc"}

Related data in Module:scripts/data, note the lack of "characters":

m["Cakm"] = {
        names = { "Chakma" },
}

Z22:24, 19 July 2014

I suppose that if there is only one script listed, we could use it as fallback if detection fails.

CodeCat22:26, 19 July 2014
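
The fallback suggested here could look roughly like the following Python sketch. It is only an illustration of the logic, not the Lua code in the modules: `SCRIPT_RANGES`, `count_matches`, `detect_script` and `choose_script` are hypothetical names, the Latn and Cyrl ranges are simplified, and Cakm is deliberately left without character data to mirror the missing "characters" field.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: each script code maps to (first, last) code point ranges.
SCRIPT_RANGES: Dict[str, List[Tuple[int, int]]] = {
    "Latn": [(0x0041, 0x005A), (0x0061, 0x007A)],
    "Cyrl": [(0x0400, 0x04FF)],
    "Cakm": [],  # no character data, so detection can never succeed
}


def count_matches(text: str, code: str) -> int:
    """Count the characters of `text` that fall in the ranges of script `code`."""
    ranges = SCRIPT_RANGES.get(code, [])
    return sum(1 for ch in text if any(lo <= ord(ch) <= hi for lo, hi in ranges))


def detect_script(text: str, scripts: List[str]) -> Optional[str]:
    """Pick the listed script matching the most characters, or None if none match."""
    best = max(scripts, key=lambda code: count_matches(text, code), default=None)
    if best is None or count_matches(text, best) == 0:
        return None
    return best


def choose_script(text: str, scripts: List[str]) -> str:
    """Use detection first; fall back to the only listed script; else "None"."""
    detected = detect_script(text, scripts)
    if detected is not None:
        return detected
    if len(scripts) == 1:
        return scripts[0]  # the fallback discussed above
    return "None"


chakma_word = "𑄚𑄳𑄟"
print(choose_script(chakma_word, ["Cakm"]))
print(choose_script("abc", ["Cyrl", "Latn"]))
print(choose_script("123", ["Cyrl", "Latn"]))
```

With more than one script listed, a failed detection still yields "None", which keeps the case where the text really is in none of the language's scripts.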

But as far as I recall, we have always treated the first script in the list as the default one, and I think that's good practice. Anyway, in this case we have only one script listed, but our templates mistakenly use "None" instead.

Z22:34, 19 July 2014

That was before we had Lua. Now, all scripts are treated as equal, with none given priority. This is still useful because there are cases where detection fails because the text actually isn't in any of the scripts. But in this case it fails because it's just not able to detect it at all. So that's a different case, and we could look at that.

That said, why can't the characters just be added to the script data instead? That would solve it.

CodeCat22:38, 19 July 2014

So that was intentional? Ok, but we should have added the characters first; the change has broken older entries and has caused confusion for users.[1]

By the way, the detect_script function is not perfect.

Z23:00, 19 July 2014
 
 
 
 

Apoteozo

Hello CodeCat,

Thanks for your additions to apoteozo. Though it could certainly use a bit of expansion, in what way does the entry need clean up? I'm afraid I don't quite follow. The entry is new, less than an hour old.

Wikijeff (talk)19:20, 19 July 2014

The definition "to become a god" is worded as a verb, so it suggests that the word is a verb too. But the word is a noun so this is confusing.

CodeCat19:20, 19 July 2014

Okay, that makes sense. I have changed the definition to read "Apotheosis, to become a god." The first clause (before the comma) is in keeping with the definition of apoteozo in ESPDIC. The second clause (after the comma) in the sentence is intended simply for clarity. I'm going to remove the clean up tag now, unless you object.

Wikijeff (talk)21:37, 19 July 2014

Maybe something like "(the act of) becoming a god"?

CodeCat21:41, 19 July 2014

How about "Apotheosis: the fact or action of becoming or making into a god." This is the English language definition of the term almost verbatim. I've only slightly modified the punctuation and dropped the reference to "deify" because that, in Esperanto, would be something like apoteozita.

Wikijeff (talk)21:54, 19 July 2014

That's ok.

CodeCat21:58, 19 July 2014
 
 
 
 
 

Vandalizm words in Turkish.

Hi, the IP 88.XXX.XX.XXX isn't adding real words. Words like "emes", "karamazdan", "yağday"... aren't Turkish. You can check at http://tdk.gov.tr/index.php?option=com_bts&arama=kelime&guid=TDK.GTS.53ca6718384e41.16143698 for example.

123snake45 (talk)13:16, 19 July 2014

Düşerge doesn't mean "camp" in Turkish. It means "pay, miras payı". "Camp" is düşərgə in Azeri. The IP 88.XXX.XXX.XX made that word up.

123snake45 (talk)19:41, 19 July 2014
 

Japanese Template misuse by MewBot

There are three types of Japanese kanji categories that normally show up in Wanted Categories:

  1. [[Category:Japanese terms spelled with <kanji> read as <hiragana>]]. The category takes {{ja-readingcat}} with three numbered parameters:
    1. The kanji (required)
    2. The hiragana (required)
    3. The type of reading- normally either "on" or "kun". It can be left blank, but it gets added to a cleanup category.
  2. [[Category:Japanese kanji read as <hiragana>]]. The category takes {{ja-readascat}}, with only one parameter:
    1. The hiragana (required)
  3. [[Category:Japanese terms spelled with <kanji>]]. This isn't a Japanese-specific type of category- I've always just used {{charactercat}}, which takes two parameters for these entries:
    1. The language code (ja)
    2. The kanji

The first category is added by a template to the entries themselves. {{ja-readingcat}} in the first category adds the other two categories to that category. All three category types should be very easy to automate: all of the information needed to populate the template parameters is included in the category name in a very consistent pattern. The third parameter for {{ja-readingcat}} is an exception, since it's completely unpredictable, but it's optional.

Haplology never got around to putting much error-checking in these, so bad input creates categories that look deceptively normal. In the 28 categories of the second type that I just fixed, {{ja-readingcat}} is the wrong template for such categories and "kanji" is the wrong first parameter, but it sort of works. For example, in the original version of Category:Japanese terms spelled with kanji read as ゆみ, the first line reads "This category lists Japanese terms spelled with kanji read as ゆみ." It's only when you check the linked word that you find that it's linking to "kanji#Japanese" as if kanji were the entry for a CJKV character. All three categories at the bottom of the page are bogus, but look real. The first should be something like Category:Japanese terms spelled with 弓, but is instead [[Category:Japanese terms spelled with kanji]]. The second is a self-reference, and the third shouldn't be there at all.

Before I started using the Japanese templates, I spent a good bit of time looking at existing categories and the entries that referenced them to see what the current practice was, and how everything worked. You should have, too. Please be more careful.

Chuck Entz (talk)07:31, 19 July 2014

I suppose that rather than considering this an error, we could adjust the template so it handles the case where the first parameter is "kanji" properly?

CodeCat10:59, 19 July 2014

Why? {{ja-readascat}} and {{ja-readingcat}} do different things. If you want a template to do both, you don't need a parameter to tell which one- it's all there in the category name:

  1. "Japanese terms spelled with " + <character>:
    1. Nothing following: {{charactercat|ja|<character>}}
    2. Followed by " read as " + <hiragana>: {{ja-readingcat|<character>|<hiragana>}}
  2. "Japanese kanji read as " + <hiragana>: {{ja-readascat|<hiragana>}}

It would be nice to have something with a name like {{ja-readcat}}, having one optional parameter for the reading type, which would only be used in cases now covered by {{ja-readingcat}}. This would make it extremely easy for humans to use and require no decisions by bots beyond recognizing the categories that use it. Using the new name rather than either of the old ones would mean not having to allow for the old parameters.

The simplest thing would be to just have the new template behave as above, acting as a front end to call the correct template with the correct parameter(s). If you want to merge the templates, then incorporate the logic of the two ja templates in the new template, but don't mess with the charactercat part of it, for compatibility with all the other languages.

Chuck Entz (talk)18:14, 19 July 2014
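
The name-based dispatch outlined above can be sketched in Python as follows. This is an illustration of the decision logic only, not an existing bot or template: `template_for_category` is a hypothetical helper, and it expects the category name without the "Category:" prefix.

```python
import re
from typing import Optional

# Patterns for the three category types, checked from most to least specific.
READING = re.compile(r"^Japanese terms spelled with (.+) read as (.+)$")
SPELLED = re.compile(r"^Japanese terms spelled with (.+)$")
READ_AS = re.compile(r"^Japanese kanji read as (.+)$")


def template_for_category(name: str, reading_type: str = "") -> Optional[str]:
    """Return the template call for a category name, or None if unrecognised.

    `reading_type` is the optional third parameter of ja-readingcat; it cannot
    be derived from the name, so it has to be supplied by the caller.
    """
    match = READING.match(name)
    if match:
        kanji, hiragana = match.groups()
        call = "{{ja-readingcat|" + kanji + "|" + hiragana
        if reading_type:
            call += "|" + reading_type
        return call + "}}"
    match = SPELLED.match(name)
    if match:
        return "{{charactercat|ja|" + match.group(1) + "}}"
    match = READ_AS.match(name)
    if match:
        return "{{ja-readascat|" + match.group(1) + "}}"
    return None


full = template_for_category("Japanese terms spelled with 弓 read as ゆみ", "kun")
spelled = template_for_category("Japanese terms spelled with 弓")
read_as = template_for_category("Japanese kanji read as ゆみ")
print(full, spelled, read_as)
```

Since the choice of template comes from the shape of the name alone, a bot using this kind of check would never pass "kanji" as a character to {{ja-readingcat}} for a category of the second type.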
 
 

change to template lv-adj

Hi! I undid your changes to {{lv-adj}}, since they caused a problem with indeclinable adjectives like rozā or aizkuņģa: these were being presented as if they had a definite form, when in fact they don't. This is probably because of some little glitch in the conditional structure after you introduced all those spaces in the template. Personally, I prefer templates without spaces (strangely enough, I find them much easier to read without all the extra spaces and '<--'s and '-->'s), but I have nothing against you introducing them in templates if you want to -- just make sure you let me know when you make a change like that to a Latvian template, so that I can check that nothing has changed in the output (I will know places to look that you probably won't think of); if something has changed, I can let you know and you can fix it without me undoing your changes. No big deal, just a little thing. --Pereru (talk) 05:16, 21 June 2014 (UTC)

Pereru (talk)05:16, 21 June 2014

Could you tell me what needs to be changed then? I find your original code completely unreadable so I have no idea what is wrong.

CodeCat11:50, 21 June 2014

I also find your version unreadable, so it's difficult for me to find out exactly what was wrong with it; besides, I'm not exactly swimming in free time these days, as you may have noticed. If you're interested, you can try to do the detective work yourself -- just run a page like rozā or starpzvaigžņu, which describe indeclinable adjectives, through one of the versions you created (which incorrectly labels them as having a definite form), and compare it to my version (which doesn't). If you're not that much into it, then just leave the template as it is. As they say, don't fix what ain't broke.

Pereru (talk)06:02, 22 June 2014

That's why I fixed it in the first place! Can you describe under what conditions there is no declined form?

CodeCat10:29, 22 June 2014

You fixed it because it wasn't broken? I don't understand your answer.

Usually, borrowed words (like rozā), or 'genitival adjectives' (really old genitive forms) like starpzvaigžņu. I indicate these with a hyphen as the first argument in {{lv-adj}} (the first #ifeq statement handles that). Is this what you wanted to know?

Pereru (talk)15:58, 18 July 2014
 
 
 
 

"Loan Oversetting"

Edited by author.
Last edit: 18:50, 17 July 2014

(NOTE: When I say "cleansed/ceunde/pure/native English", I refer to the use of English words that are of Germanic origin. They can come through any route: Spanish gave English "ranch", which ultimately derives from Proto-Germanic *hringaz; French gave English "seize", which ultimately derives from Proto-Germanic *sakjaną; Japanese gave English "skinship"; and English also inherits many native words from Old English. When I say "cleansed/ceunde/pure/native English", I also refer to words that have been present in English since Old English times, even if they aren't of Germanic origin.)

Hey.

I just wished to ask you, as a Dutch speaker, if Dutch has any particular method of coining words from old roots or forming metaphors that English might lack.

See, I'm a linguistic purist, and I also write poetry in cleansed/ceunde English. Sometimes, I find it difficult to describe something in just pure English. At those times, I look to metaphors.

In this respect, various pidgin languages have come in handy. But sometimes, I like to look to other, fellow Germanic languages for answers. Most oftentimes, I will look to Icelandic, as it is a particularly conservative Germanic language. Other times, I prefer to look to Dutch for answers instead, among other things due to its long trade history with English. Years ago, my delvings into Dutch led to the word "unforstandy" (loan translation of "onverstandig") permanently entering the vocabulary of myself, my family and my friends (albeit with the slightly semantically shifted meaning of "foolish").

Since you are a native speaker of Dutch, I thought to ask you for advice on this matter.

Tharthan (talk)18:04, 17 July 2014

It's probably best to look at the structure of the word first. onverstandig comes from on- (un-) + verstandig (sensible, wise). The latter in turn derives from verstand (reason, mind, understanding), which finally is closely tied to verstaan (to understand). So you would need to follow this structure in English too.

However, the first hurdle is already that English uses a slightly different root word, understand. The second is that English does not have a noun paired with this word in the same way that Dutch has, unless you use understanding. But this doesn't allow an adjective to be derived from it in the same way, something meaning "of or related to understanding"... understandingy just doesn't really cut it.

Another approach is to look for synonyms of any of the intermediate steps. Starting from the end, you might translate verstand with mind, and following the process then gives mind(l)y and unmind(l)y. But you can also translate verstandig directly, giving wise, and then of course you simply end up with unwise which is a perfectly good translation of the Dutch onverstandig. :)

CodeCat18:13, 17 July 2014
 

Script list

Hi CodeCat, For the translations, there does seem to be a (more) automatic way of adding the word "script" (or indeed any other convenient sign) to script names used in the translations. However, I do not seem to be able to enter them. You, on the other hand, seem to have been able to alter at least part of the page. How can I help clarify the script indications that are supposedly hidden on that page?

Redav (talk)01:24, 15 July 2014

Which page are you referring to exactly?

CodeCat01:38, 15 July 2014

https://en.wiktionary.org/wiki/Wiktionary:List_of_scripts

Redav (talk)19:17, 16 July 2014

The information on that page is automatically generated, based on the data in Module:scripts/data.

CodeCat19:20, 16 July 2014

And the page (https://en.wiktionary.org/wiki/Module:scripts/data) it points to is (also) protected against editing by me. I would be prepared to add the word "script" wherever appropriate in that script, if it is easy to find out how to do it in a correct and helpful way.

Redav (talk)19:45, 16 July 2014

That would not work, because firstly, this page does not control the text in the pages you are editing. Rather, it's used by templates. Secondly, those templates also need to be able to use the name without "script" attached to it.

CodeCat19:46, 16 July 2014
 
 
 
 
 