User talk:Tbot

Hi

I update {{t}} to t- or t+ as needed, and add some section header optimizations. I may be doing some related things in translations sections soon as well. Tbot 05:51, 4 September 2007 (UTC)[reply]

Um,

WTF? Isn't this still being discussed in WT:GP? Why have you started it prematurely? --Connel MacKenzie 18:26, 4 September 2007 (UTC)[reply]

Um, ... the SOP is to run some so that people can see what is going on and have it properly tested, and discuss it, and then have a bot vote. Or are people supposed to vote on hand-waving vaporware now? Robert Ullmann 22:41, 4 September 2007 (UTC)[reply]

It looked like more than trivial volume. But anyhow, yes, you are supposed to propose it before writing it. --Connel MacKenzie 06:55, 5 September 2007 (UTC)[reply]

Huh? See WT:BOTS Process: try it under your account, create a bot account, Then, do a test run (under the bot account) on some 10-100 entries until you're certain everything goes well, THEN post a request for bot status on the BP.

Besides which, this has been discussed extensively since April. And was tested and run on all the (then existing) uses of {t} then. Robert Ullmann 07:21, 5 September 2007 (UTC)[reply]

Please. Yes, I know the general idea has been thrown about, but the sub-template thing (AFAIK) does not have support. That doesn't mean you should announce you wrote something and start running it! The opposite is supposed to happen: when everyone (well...some apparent consensus anyhow) thinks an approach is plausible, then you start writing it and testing it, no? --Connel MacKenzie 21:27, 5 September 2007 (UTC)[reply]

Oh come on Connel: you are the only person with the slightest issue about the sub-templates; and your complaint appears to be about the user interface, which is irrelevant to sub-templates: they are sub-templates, not part of the user I/F. And they are a performance optimization, not a penalty. So your issue makes little sense. The other feedback is all positive.

I was going to reply on WT:GP but I can't; the page is now way too big to load except in the middle of the night. When are we ever going to fix the archiving? I understand Werdna is only 16, but leaving without giving anyone the code was extraordinarily irresponsible! (If we had subpages like I tried to set up on BP, I could just go to GP/2007/September ;-) Robert Ullmann 09:52, 6 September 2007 (UTC)[reply]

Testing progress

I'd like to explain this on BP or GP, but I can only do something with those pages at some times of day, most of the time they won't load completely. (Way too big ;-)

Tbot was run on the entire wikt in April, and again on 30 August. Since then I've been making improvements so that it will really be production status; I'm running small tests.

See the user page for documentation. Robert Ullmann 13:00, 10 September 2007 (UTC)

Purpose

Is this meant to do the {{trans-top}}/{{trans-mid}}/{{trans-bottom}} corrections as well?

While I disagree entirely with {{t-}} and {{t+}}, I was under the impression a tool like this would convert all existing translation lines of an entry to use {{t}}. But it doesn't?

--Connel MacKenzie 15:03, 10 September 2007 (UTC)

Introducing trans-top: if there is a gloss, in the usual format, either ; or triple ' or as a parameter to {{top}}, then AF already does that. If there isn't a gloss, there isn't anything a bot can do. (AF was converting it to trans-top with no parameter, but that was flooding a cleanup cat that DAVilla was trying to deal with ;-)

As to converting lines to use t, there are a number of issues (see last section of the user page). It can't deal with text in ()'s because it can't know if it is a transliteration or not, and so on. And automated conversion hasn't been discussed much at all. I'm getting the first version finished, dealing with +/-, fixing codes that are not the FL wikt codes (nb->no, ido->io, etc). Robert Ullmann 15:21, 10 September 2007 (UTC)[reply]
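The code fix-up mentioned above (nb->no, ido->io) could look roughly like the following. This is only a sketch, not Tbot's actual code: the mapping table, the regular expression and the function name fix_codes are all hypothetical, and only the two code pairs named in the comment are taken from this discussion.

```python
import re

# Hypothetical mapping; only nb->no and ido->io are named in the discussion.
CODE_FIXES = {"nb": "no", "ido": "io"}

# Matches the start of a {{t}}, {{t+}} or {{t-}} call and captures the code.
T_CALL = re.compile(r"\{\{(t[+-]?)\|([^|}]+)\|")

def fix_codes(line):
    """Replace non-FL-wikt language codes in translation template calls."""
    def repl(match):
        name, code = match.group(1), match.group(2)
        return "{{%s|%s|" % (name, CODE_FIXES.get(code, code))
    return T_CALL.sub(repl, line)
```

Codes that are not in the table are left alone, so a line such as `* French: {{t|fr|chien}}` passes through unchanged.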

Request for additional text

I have found a couple of bad (or not quite correct) translations via your system. So, could you add something to the text in the generated terms such as "If you find that the translation of <x> was incomplete or in error, please correct it and (optionally) request this term to be deleted". Cheers SemperBlotto 15:51, 10 November 2007 (UTC)[reply]

Did you follow on to the (more information) link? Or is that not clear enough? Hmmm. Robert Ullmann 15:56, 10 November 2007 (UTC)[reply]

Oops. I have read it now. (p.s. Has it done enough tests yet to determine its usefulness (and find any problems)?) SemperBlotto 17:18, 12 November 2007 (UTC)

It probably has done about enough in some languages (note that it automatically limits the number of as-yet-unchecked entries per language), there are some amusing translations; but except for translation table entries that are flat wrong (which we haven't seen yet, but undoubtedly will), the entries are mostly useful even when they haven't been checked. Eight or ten editors have checked, sometimes expanded, and removed the tag in as many languages. Even Esperanto. And it's only been doing this for a few days. Robert Ullmann 20:21, 12 November 2007 (UTC)[reply]

I dislike this bot already

I think creating articles should be left to the human editors. This bot makes a lot of mistakes and leaves out a lot of information, at least in the Romanian articles I've seen. limba afgană isn't a proper noun, -ul, while it translates as an article, is a suffix, etc. It also sometimes leaves out gender and plural information, and in cases like moartă the basic form of the word. All this might not bother me so much, if there wasn't so much to go back and fix now. I don't really think it would be possible to fix these problems completely, especially if they're based on what's in translation tables in English entries. ~~— [ ric | opiaterein ] — 21:03, 19 November 2007 (UTC)~~ It also has created a couple of things like a controla which should be seen at controla. Humans are more apt to recognize typos and things like that. This bot is messy. — [ ric | opiaterein ] — 21:08, 19 November 2007 (UTC)[reply]

While I disagree with the header of this section ("I dislike this bot already"), I do wish to add a suggestion based on opiaterein's comments. There seem to be a few new entries by tbot of phrases, and a number of these seem to be mistaken, presumably added because someone mistakenly put the entire phrase in square brackets in a translation table somewhere even though it is not entry-worthy. (While it's good to know where these are, so that we can go back and change the translation tables, obviously adding deletable entries is not the way to call attention to them.) Perhaps tbot should be rigged to add only single-word entries? (I'm not thinking here of CJKV entries: I don't know enough about those languages to know whether my question makes sense for them. I'm thinking specifically of Hebrew, which I worked on, but also Romanian, per opiaterein's comments, and other alphabetic-script entries.)—msh210℠ 21:13, 19 November 2007 (UTC)[reply]

The translation tables still need to be fixed -- hiding the problem by turning off the bot doesn't solve anything. It's not the bot that's messy, it's the translation tables. Cynewulf 21:23, 19 November 2007 (UTC)[reply]

Correct, but calling attention to the fact that they need to be fixed by adding bad entries is not the way. Perhaps instead tbot should flag all multi-word square-bracketed translations as {{ttbc}} or {{tbot-ttbc-multi-word}} until such time as someone removes that tag?—msh210℠ 21:26, 19 November 2007 (UTC)[reply]

The entries are bad (or if not bad, then drastically incomplete). They can be just plain wrong, but they also lack etymology, synonyms, declension/inflection, conjugation and a lot of other things that should be included. We don't need a bot to add entries, they're useful for other things. — [ ric | opiaterein ] — 22:16, 19 November 2007 (UTC)[reply]

Drastically incomplete isn't the same as bad (as you note). I see nothing wrong with adding drastically incomplete articles; in fact, that's what Connel's plural-form-adding and present-tense-form-adding bots (and whatever else he has) do already. So do many flesh-and-blood users.—msh210℠ 22:25, 19 November 2007 (UTC)[reply]

I try not to be one of those users. I try to add as much as I can find out as possible, and if I can't find out enough to satisfy myself, I just skip it and leave it for someone who knows what they're doing. The bot doesn't know what it's doing. Also, adding 'form of' entries isn't the same as adding base-word entries. Some of the entries are bad, and some are drastically incomplete, but I'm running into a number of bad ones.

Also, flesh-and-blood users can't add as many bad/sub-par entries as a bot can. A bot can add a whole assload of them in a short time, and I consider that "dangerous". A few started by a human can easily be cleaned up in a few minutes, but it's going to take me at least a few days to clean up the Category:Tbot entries (Romanian). — [ ric | opiaterein ] — 22:34, 19 November 2007 (UTC)[reply]

Most users do add simple (what you call "sub-par") entries; and that is perfectly okay. It would be better if you didn't skip entries you felt you couldn't complete, and did make "partial" entries. That is how wikis work. If we waited for someone who could make a "complete" encyclopedia or dictionary entry, we wouldn't get very much. ("nupedia", wiki's predecessor, managed to produce one article ;-)

When something is wrong the source translation table needs to be fixed. Are the Romanian entries in the translations tables generally that bad? Do note that the category has a configurable limit, it can be set to 10 or 20 or even 0. If it is that bad, we can strip out (most or all) of the contents of the cat, but that leaves the translations tables to be fixed.

Also keep in mind that it is a wiki, someone else can do it too. What we are doing here (in part) is trying to get some better quality control on the translations table entries. Robert Ullmann 10:22, 20 November 2007 (UTC)[reply]

In that case, I think before we put this bot to work, we should get the translation tables under control. — [ ric | opiaterein ] — 15:10, 20 November 2007 (UTC)[reply]

This depends on the language. If you are saying the Romanian translations are crappy, you have your work cut out for you. We can have some bot move all of them to ttbc? Eh? ;-) Meanwhile I set the limit for Romanian to 20 (less than what is there), so it won't generate any. For Swahili, Kinyarwanda, Croatian we've set the limit to 1000, to check as much as possible. (It is a hell of a lot easier to check created entries that are known to be new than slog through all the trans tables, not knowing what was checked before ;-). For Italian, SemperBlotto has set an intermediate limit; a dozen other users are doing various things. Robert Ullmann 16:02, 20 November 2007 (UTC)

Seriously, if the Romanian translations are really that awful, I can make you a list to check, and leave the Tbot cat limited until you have sorted them all out. There are 3885 Romanian translations. (Not counting "Roomanian" and the like ;-) Don't know how many already have entries, but I can leave those off the list if desired. Robert Ullmann 22:18, 20 November 2007 (UTC)[reply]

The translations aren't bad, but when you have headers like "proper noun" for languages, Tbot creates the new entries as proper nouns, which languages aren't in Romanian. When it creates adjective articles (from the translations that shouldn't be there like albaneză, albanezi and albaneze) it fails to note that those are forms of a basic word (like albanez). This bot and its entries will be the death of me. ~~[ ric | opiaterein ] — 15:54, 21 November 2007 (UTC)~~[reply]

I forgot to mention that if a word has more than one part-of-speech, it only adds one. — [ ric | opiaterein ] — 16:01, 21 November 2007 (UTC)[reply]

List is User:Robert Ullmann/Trans languages/Romanian; red links are things that might be created. Have at it. Robert Ullmann 13:05, 21 November 2007 (UTC)[reply]

This is the sort of thing I've been looking to discover. (multi-word translations) So, if we/it tagged the apparent phrases some other way, how does a user say "yes, this phrase is right, use it!"? (removing the tag is not sufficient, it would just get tagged again ... ;-) At present, fix the translations entry, and move/delete the new entry. Robert Ullmann 10:22, 20 November 2007 (UTC)[reply]

It can look at the page history and see if the translation was ever tagged; if so, and someone removed the tag, then (one would hope) the translation is okay now. No?—msh210℠ 14:37, 20 November 2007 (UTC)[reply]

Um, read all of the previous versions and see if the tag was there and later removed? And try to determine if it was that line, and if that line said the same thing when the tag was removed? Sure ... ;-) (buying me a new net connection and WMF a new server? Yea!) Robert Ullmann 16:02, 20 November 2007 (UTC)[reply]

Well, it's a bit more feasible as follows. When tbot tags a multi-word entry (as ttbc-tbot or whatever), it adds to a list a line indicating which entry, POS, language, and foreign word was tagged. That's just a text file. And before so tagging, it makes sure that that same info isn't on the list already. ("More feasible", that is, according to my own private little system of logic. I am not a programmer/script author (in case you couldn't tell).)—msh210℠ 06:45, 29 November 2007 (UTC)[reply]
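The text-file list suggested above can be sketched as follows. This only illustrates the proposal, not anything Tbot is known to do: the file layout (one tab-separated line per entry, POS, language and foreign word) and the function names are assumptions made for the example.

```python
import os

def _key(entry, pos, lang, word):
    # One ledger line per tagged translation, tab-separated (assumed layout).
    return "\t".join((entry, pos, lang, word))

def already_tagged(ledger_path, entry, pos, lang, word):
    """True if this exact translation was recorded as tagged before."""
    if not os.path.exists(ledger_path):
        return False
    key = _key(entry, pos, lang, word)
    with open(ledger_path, encoding="utf-8") as ledger:
        return any(line.rstrip("\n") == key for line in ledger)

def record_tag(ledger_path, entry, pos, lang, word):
    """Return True (and record it) if the bot should tag now, else False."""
    if already_tagged(ledger_path, entry, pos, lang, word):
        return False
    with open(ledger_path, "a", encoding="utf-8") as ledger:
        ledger.write(_key(entry, pos, lang, word) + "\n")
    return True
```

With a list like this, a tag that a human has removed is not re-added, because the translation is already on the list; no page history needs to be read.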

Update: So I start cleaning up the Category:Tbot entries (Romanian) and after less than 10 I get frustrated and it's left so much for me to clean up, and start to get lazy and not add things that I would almost always add in my own new entries. I don't mind adding in things like pronunciation and etymology to random entries I run across that don't have them, but for a whole category without etymology, pronunciation, plurals, declension and categories... it makes me want to jump out my window. Which admittedly would solve very little, I'd probably just break my leg. But you understand. :) — [ ric | opiaterein ] — 21:11, 12 December 2007 (UTC)[reply]

If you try to treat it as a cleanup category, in which you are supposed to add everything, well yes, you will probably get tired of it unless you just pick one at random once in a while. It's a wiki. Entries can start minimally. You don't have to do everything yourself. Giving some other users starting points for adding to Romanian entries will help recruit new Romanian editors. Or were you going to do it yourself? Robert Ullmann 11:14, 13 December 2007 (UTC)[reply]

Just because entries can be started with a bare minimum of information doesn't mean that they should or that it's the best way to do it. Even when I'm adding words in languages I don't know, I try to add something other than the POS and the definition.

Another problem I have in this particular case is that not that many people are interested in Romanian. Being (I think) the only flesh-and-blood constant contributor to Romanian entries, I can ensure that the newer entries will have as much info as they can. There's already a lot of stuff left over from the old contributors that will need to be cleaned up in time. The Tbot is just making the pile bigger.

olandeză is not a proper noun. The bot can't distinguish that when creating entries in other languages from entries in English. Can't the Tbot just clean up Translation tables? That would be awesome. — [ ric | opiaterein ] — 15:57, 13 December 2007 (UTC)[reply]

If entries can be started with minimal information, they absolutely should be. Otherwise we lack the entry until someone gets around to the "complete" entry. In a wiki, entries are usually started fairly minimally.

If the size of the "pile" bothers you, that is a problem for you: the task of adding all Romanian words (or any such similar project) to the en.wikt is ahead of you. Since you are the only one? Of course not; there will be many, if they have something to do. Saying all entries must start as complete ensures you will have little help.

Meanwhile, someone looking up olandeză will get a helpful entry (even if the POS is off a bit), instead of "Duh, don't know that word?" until whatever future date you were going to add it. (When were you going to get to it? Think about that?) And it isn't in the way: when it does get to the front of the queue you've set yourself you can just CTRL/A and start typing the "complete" entry.

It's a wiki, all entries are the product of multiple contributors, and the property of none. Robert Ullmann 17:04, 13 December 2007 (UTC)[reply]

Ugh you're killing me dude :P I didn't say they needed to be started as complete entries, just that they should have a little more, hopefully, than the crap minimum :P — [ ric ] opiaterein — 02:37, 14 December 2007 (UTC)[reply]

Romanizations

Can you fix it to not create romanizations (rather transliterations) for Hindi (or for that matter any other language that uses another script) from the translations tables? --Dijan 22:54, 28 November 2007 (UTC)[reply]

The transliterations in the translations tables are wrong? (or that bad?) In the general case we definitely want the transliteration, else it has to be added over again in the entry. Can you tell me more? Robert Ullmann 05:47, 29 November 2007 (UTC)[reply]

Ah, you mean recognize that someone has put only the romanization in the table, and not create entries in the wrong script? (e.g. from coast/Hindi) Yes, very good idea. Robert Ullmann 06:01, 29 November 2007 (UTC)[reply]

This is effectively fixed by the new algorithm and filters. Robert Ullmann 09:46, 16 December 2007 (UTC)

New filter for creating entries

Tbot has a new and improved method of selecting entries to be created. No, I'm not telling yet. Robert Ullmann 11:17, 13 December 2007 (UTC)

See the documentation. Robert Ullmann 09:46, 16 December 2007 (UTC)

FL pronunciations

Hi Tbot, seeing as you include pronunciations too, could you add the |lang=xx parameter to the end of the IPA, e.g. {{IPA|/kup/|lang=nl}} etc.? Thanks. --Keene 15:07, 14 January 2008 (UTC)

Not a bad idea, have to see whether I need to restrict the list, or it handles unknown ~~(pretty sure of the latter)~~. Robert Ullmann 15:12, 14 January 2008 (UTC)[reply]

No, DAVilla just let the default case produce [[w: phonology]]. More crap to fix. Yuck. Robert Ullmann 15:19, 14 January 2008 (UTC)

Done in tbot (not running quite yet), {{IPA}} still needs work. Thanks, good idea. Robert Ullmann 15:29, 14 January 2008 (UTC)[reply]
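The |lang=xx request above amounts to a small text rewrite. A minimal sketch, assuming the pronunciation is written as a simple {{IPA|...}} call with no nested templates; the function name add_ipa_lang and the regular expression are hypothetical and not taken from Tbot.

```python
import re

# Matches a simple {{IPA|...}} call with no nested braces.
IPA_CALL = re.compile(r"\{\{IPA\|([^{}]*)\}\}")

def add_ipa_lang(text, lang):
    """Append |lang=<code> to each {{IPA}} call that lacks a lang= parameter."""
    def repl(match):
        args = match.group(1)
        if any(arg.strip().startswith("lang=") for arg in args.split("|")):
            return match.group(0)  # already has lang=, leave untouched
        return "{{IPA|%s|lang=%s}}" % (args, lang)
    return IPA_CALL.sub(repl, text)
```

For the example in the request, `add_ipa_lang("{{IPA|/kup/}}", "nl")` gives `{{IPA|/kup/|lang=nl}}`.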

Gloss and hyphenation

Hi Robert, I've been going through the Category:Tbot entries (Hungarian) list and checking the entries. The vast majority were fine. A few were incorrect because of incorrect translations. Two observations:

Since the trans table gloss is added to the definition of the FL entry, it becomes even more important to add good glosses. Somewhere I've read that the gloss should be an abbreviated form of the actual definition. There will be cases where the shortened gloss will make sense in the original context, but may appear inaccurate in the new FL entry. For example, morzsa (crumb) is described as a "small piece of biscuit, cake, etc." which to me at the first reading seemed incorrect, but when I read the full definition at the English entry, I understood what happened. It's not a big deal, it can be corrected in both places, we just have to remember that the trans table gloss is used like that.
If the gloss is written well in the first place (one short phrase that distinguishes that definition), it will usually be pretty good. But particularly when the translation is approximate (like this case), it is going to take adjusting anyway.

In one of the entries from the January 2008 run, tbot added the Pronunciation section with IPA. It was correct, but I've noticed that it also added a period for hyphenation, see szeder. My question is, do we still need a separate Hyphenation section? Hyphenation could be added to the IPA or to the actual entry name under the POS. Do we want to duplicate the information? Thanks. --Panda10 22:55, 14 January 2008 (UTC)[reply]
It is using the IPA string it finds, not doing any analysis or parsing; a number of the FL wikts use the periods for hyphenation where we do not (in general). You can leave it, or add Hyphenation (as you have), or just take it out. In no case do we put that sort of stuff in the headword. Robert Ullmann 12:25, 15 January 2008 (UTC)[reply]

Adding categories

Could tbot add categories? --Panda10 00:47, 15 January 2008 (UTC)[reply]

How? Presumably you mean topic categories, since the POS cat is already there. I don't know of any way that Tbot can reasonably identify the category/categories that are appropriate (in the general case). Would be useful. Do you have particular cases we might look at? Robert Ullmann 12:18, 15 January 2008 (UTC)

Yes, I meant topic categories. For example, lentil (hu = lencse). The English entry already contains the Vegetables category. When tbot creates the FL entry, it could add the hu:Vegetables category. I know there will be cases when the word has multiple meanings with different translations and it may be impossible to decide which word goes to which category. Just an idea. --Panda10 12:37, 15 January 2008 (UTC)[reply]

It could do so using {{context}}-based categories like category:Mathematics→hu:Mathematics from {{mathematics}}. Of course, it'd have to identify which sense each trans-table corresponds to, which may well be impossible in general, but should be do-able, I imagine, where there's only one sense listed for a given English/PoS.—msh210℠ 18:34, 15 January 2008 (UTC)[reply]

German audio in Hungarian entry

Tbot added a German audio file to the Hungarian történelem entry. Let me know if this is not the correct place to add these comments. --Panda10 02:37, 15 January 2008 (UTC)[reply]

This is the right place. Look at the Hungarian entry, it doesn't have audio for Hungarian, just for German in the translation table. Never seen that before ;-) I've already fixed this. (After noting the same thing at múzeum). Robert Ullmann 06:56, 15 January 2008 (UTC)[reply]

alt=

Does Tbot grab the alt= text from {{t}} and insert it as head= in {{infl}} on the new page?—msh210℠ 21:12, 17 January 2008 (UTC)[reply]

Yes. Robert Ullmann 16:28, 19 January 2008 (UTC)
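For illustration, carrying alt= over to head= might be sketched like this. It is not Tbot's code: the parser is deliberately naive (it splits on | and so does not handle nested templates or piped links), and the {{infl|lang|pos|head=...}} parameter layout is an assumption made for the example.

```python
def parse_t(call):
    """Split a call like '{{t|la|amo|alt=amō}}' into positional and named parameters."""
    inner = call.strip()[2:-2]          # drop the surrounding {{ and }}
    positional, named = [], {}
    for part in inner.split("|")[1:]:   # skip the template name
        if "=" in part:
            key, value = part.split("=", 1)
            named[key.strip()] = value.strip()
        else:
            positional.append(part.strip())
    return positional, named

def infl_line(call, pos):
    """Build an {{infl}} line, passing alt= from {{t}} through as head=."""
    positional, named = parse_t(call)
    line = "{{infl|%s|%s" % (positional[0], pos)
    if "alt" in named:
        line += "|head=" + named["alt"]
    return line + "}}"
```

A {{t}} call without alt= simply yields an {{infl}} line with no head= parameter.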

template:tbot entry text

I think that this diff, changing the text of template:tbot entry from "was generated automatically" to "was created" was a mistake: It is useful for users to know why the entry is not as complete (or, possibly, reliable) as others.—msh210℠ 19:12, 29 January 2008 (UTC)[reply]

taking translations from {{term}}s

What do you think of having Tbot create entries from {{term}}s as it does from {{t}}s? I mean: If there's a {{term}} with {{{1}}}, {{{lang}}}, and {{{3}}} (the English translation) specified, Tbot can create the entry (assuming it finds the word in the FL wikt, etc., per current practice with {{t}}).—msh210℠ 20:12, 7 February 2008 (UTC)[reply]

комбинация

Somehow it got a funky picture, maybe something buggy in the bot, maybe something unrelated, thought I would let you know. (see history) - [The]DaveRoss 23:53, 15 March 2008 (UTC)[reply]

look at the ru entry linked to ... I'm not sure, but this is the sort of case which (quite properly) needs to be checked ... Robert Ullmann 23:26, 21 March 2008 (UTC)[reply]

baseren

An anonymous user says that baseren is a verb. It's in our translation table s.v. base as a verb sense, and nl:baseren says it's a verb, yet Tbot put a Noun header s.v. baseren. Not sure what went wrong.—msh210℠ 21:29, 31 March 2008 (UTC)[reply]

At the time (13 January) the entry at base had "Transitive verb" which Tbot didn't recognize; so thought it was still in the Noun section. Tbot might be given a list of "stops" that are invalid headers. Robert Ullmann 17:20, 1 April 2008 (UTC)[reply]

Or could recognize Transitive verb, Reflexive verb, and Intransitive verb as verbs.—msh210℠ 17:23, 1 April 2008 (UTC)[reply]

Why bother when they are deprecated and being fixed anyway? Tbot can re-visit the entry later. I've added those 3 to Stops. (Which it already had for other stuff.) Robert Ullmann 17:29, 1 April 2008 (UTC)[reply]
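The effect of the "stops" list described above can be sketched roughly as follows. The header sets and the function name are hypothetical stand-ins; only the three deprecated verb headers are taken from this discussion. A recognized POS header sets the current part of speech, a stop header clears it, and any other header leaves it unchanged.

```python
import re

HEADER = re.compile(r"^(=+)\s*(.*?)\s*=+\s*$")

# Hypothetical lists; the three stop headers are the ones named above.
POS_HEADERS = {"Noun", "Verb", "Adjective", "Adverb", "Proper noun"}
STOPS = {"Transitive verb", "Intransitive verb", "Reflexive verb"}

def pos_for_translations(lines):
    """Return (pos, line) pairs for translation lines; pos is None after a stop."""
    current = None
    found = []
    for line in lines:
        match = HEADER.match(line)
        if match:
            title = match.group(2)
            if title in POS_HEADERS:
                current = title
            elif title in STOPS:
                current = None   # invalid header: do not inherit the earlier POS
            continue             # other headers (e.g. Translations) change nothing
        if line.startswith("*") and "{{t" in line:
            found.append((current, line))
    return found
```

Without the stop, the translations under "Transitive verb" at base would still be attributed to the preceding Noun section, which is what happened with baseren; with it, the caller sees None and can skip the line until the header is fixed.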

adding language sections

Perhaps Tbot shouldn't add a language section when it already exists but with its header an unnecessary link?—msh210℠ 17:11, 1 April 2008 (UTC)[reply]

Yes, I've just fixed that ;-) Robert Ullmann 17:13, 1 April 2008 (UTC)

pl:kula

Tbot adds an image from Polish Wiktionary but not from the Polish language section... Maro 14:57, 6 April 2008 (UTC)[reply]

But it nicely illustrates the Serbian section that follows (and partly overlaps into ;-) (you could've moved it down? ;-) Want a bit of amusement? Look at afghan as of the last Tbot edit, the picture is correct, but ...

Yes, need to teach Tbot a bit more about the language sections; the problem is that the wikts have developed lots of different ways of doing these. I'd really need a table to tell it which of == L2 header, various templates that start with = or == or -, etc. And: have you (or anyone) seen a wikt in which the native language is not the first (or zeroth) L2 section? Robert Ullmann 15:32, 6 April 2008 (UTC)

Um, ours. See, e.g., H.—msh210℠ 17:04, 7 April 2008 (UTC)

pl:aluminium

Tbot added pronunciation but from English section (aluminium (język angielski)), even though there was Polish pronunciation. Maro 16:48, 10 April 2008 (UTC)[reply]

This has been fixed. (Both aspects: not picking up the PL IPA, and picking up things from the English or other language sections) Robert Ullmann 18:29, 15 June 2008 (UTC)[reply]

adjektiv

The bot added this as an adjective (not noun). Very strange. __meco 21:25, 30 April 2008 (UTC)[reply]

Note it is listed under the adjective POS at adjective. A bad translation? Seems to be, I will remove it. (Not in dicts I can check) Robert Ullmann 14:21, 1 May 2008 (UTC)[reply]

Removing its own contributions

Look at this edit of Tbot. It has created a Dutch entry, and then overrode it by a Swedish one. SPQRobin 20:50, 31 May 2008 (UTC)[reply]

I don't know; it should have found the page existed when it tried the second one. (It doesn't "remember" that it just did another language at that title, just gets the page.) The only sequence I can think of is that sometimes new pages don't appear for a few seconds (which can expand to minutes or more if the WMF servers have trouble), and it just got "NoPage" again and wrote the new one. (This can't happen if the page has been around for a while, it will either get the page or get an error with no "edit cookie".) Robert Ullmann 18:23, 15 June 2008 (UTC)[reply]

See vertebrado for an example where it behaves properly, as intended. Robert Ullmann 23:03, 15 June 2008 (UTC)

better instruction please

I came across this at http://en.wiktionary.org/wiki/stat

As a native Swedish speaker with a good grasp of English, I can vouch that "stat" in Swedish is indeed "state" in English. Now, how I mark the entry as "checked" I don't know. It could be that I should remove the template, but that isn't clear from the "more information" page - wouldn't that only make the bot put back the template? Also, if that indeed is how to go about things, please consider not talking about the "check". There is no check - if it's correct, remove the template.

Also, "what to do" should be on top of that "more information" page to make it as easy as possible for users to quickly understand what's asked of them.

Regards, 85.227.226.182 18:57, 24 June 2008 (UTC)

Agenda

Just curious: How does Tbot select the entries on which it works? It's fairly obvious for AF and Interwicket but I still don't see Tbot's pattern. -- Gauss 18:23, 14 November 2008 (UTC)[reply]

Database ID order, for the entries containing the translations. But then it also uses a cache to avoid looking at things again for a while, so it effectively does a subset each time it runs. Will look fairly random ... Robert Ullmann 18:27, 14 November 2008 (UTC)
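To show why ID order plus a cache looks random from outside, here is a minimal sketch. Everything in it is assumed for illustration (the function name, the cache as a plain dict of last-checked timestamps, the recheck_after and limit parameters); none of it is taken from Tbot's source.

```python
def select_batch(pages, cache, now, recheck_after, limit):
    """Pick pages in database-ID order, skipping any checked recently.

    pages: iterable of (page_id, title) pairs
    cache: dict mapping title -> time it was last looked at (updated in place)
    """
    batch = []
    for page_id, title in sorted(pages):
        last = cache.get(title)
        if last is not None and now - last < recheck_after:
            continue                 # looked at recently, skip this run
        batch.append(title)
        cache[title] = now
        if len(batch) >= limit:
            break
    return batch
```

Each run walks the same ordered list, but the cache removes whatever was visited lately, so the titles actually edited in any one run form an irregular subset.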

xs=

Under what conditions does Tbot add xs=? In this edit, it was added for Aragonese and Occitan, which seems overkill to me. These are languages with 2-letter codes and their own Wiktionaries. Is Tbot still relying on t-sect? If so, then why doesn't t-sect handle these languages? --EncycloPetey 05:26, 22 December 2008 (UTC)[reply]

It adds xs= for languages that are not covered by {{t-lang}} and {{t-lan2}}, which are used by {t-sect} to optimize the top 30 or so languages. In this way a translations section will typically not invoke any of the code templates. Trying to add 170+ languages into {t-sect} would make it a de-optimization for most entries; always using the code templates would be a huge overhead for entries with lots of languages. Tbot "knows" what {t-sect} knows, so no-one else need worry about it. If xs is superfluous or missing, {t} will still work correctly, but with more overhead. Robert Ullmann 05:45, 22 December 2008 (UTC)[reply]
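The rule described above can be sketched as a simple lookup. The sets below are tiny hypothetical stand-ins for what {t-sect} covers and for the code-to-name table, and the assumption that xs= carries the language name is mine; only the Aragonese and Occitan cases come from this discussion.

```python
# Hypothetical stand-in for the ~30 languages that {t-sect} optimizes.
COVERED_BY_T_SECT = {"fr", "de", "es"}

# Hypothetical code -> language name table for everything else.
LANGUAGE_NAMES = {"an": "Aragonese", "oc": "Occitan"}

def with_xs(code, word):
    """Format a {{t}} call, adding xs= only for languages outside the fast path."""
    if code in COVERED_BY_T_SECT or code not in LANGUAGE_NAMES:
        return "{{t|%s|%s}}" % (code, word)
    return "{{t|%s|%s|xs=%s}}" % (code, word, LANGUAGE_NAMES[code])
```

Having a 2-letter code or its own Wiktionary does not matter here; what decides it is only whether the language is in the small optimized set.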

Serbo-Croatian in translation tables

I've (we've..) currently been considering merging B/C/S (and prob. soon-coming Montenegrin..) entries to ==Serbo-Croatian==, and there's this problem with mapping to individual project entries. Now, suppose we decided to merge those, and having bs/hr/sr/sh wiktionaries, could Tbot in theory update translation table links of Serbo-Croatian translations to check all of those wiktionaries for a particular spelling in the translation table? I.e., it would add superscript links as e.g. ^{(sh b/c/s)} or ^{(sh bs hr sr)} (less concise version) where each of those would individually be checked against the existence of entry in the corresponding wiktionary? --Ivan Štambuk 12:08, 4 March 2009 (UTC)[reply]

soon-coming?! Bože sačuvaj! The uſer hight Bogorm converſation 19:09, 30 May 2009 (UTC)[reply]

adding language name

Does Tbot convert * {{t|fr|foo}} to * French: {{t|fr|foo}}?—msh210℠ 18:59, 24 March 2009 (UTC)[reply]

No, the line just doesn't parse. AF will flag that entry as needing help. Note that there is a risk of the code being wrong (and the user editing it gets no feedback on the language name selected) so it might be worth a human looking at anyway. (Unlike, for example, where the user puts in the template instead of the language name, sees the name correctly, and it is very safe for AF to fix.) Robert Ullmann 15:28, 14 April 2009 (UTC)[reply]

redirects

Anything you can do about צ'ק? It's a valid redirect, but Tbot keeps editing it. (Same for any other Hebrew entry with a 7-bit apostrophe.)—msh210℠ 20:03, 5 May 2009 (UTC)[reply]

Thanks, I've fixed it to be more restrictive about which redirects it will replace. The primary target is "conversion script" redirects that should be German nouns. Robert Ullmann 13:10, 14 May 2009 (UTC)[reply]

Thanks.—msh210℠ 15:55, 14 May 2009 (UTC)

[1]

For some reason Tbot added File:Wikipedia.png to the page when it was created, even though it never appeared on the Spanish wiktionary page, as far as I can tell. Nadando

It is in the es.wikt page: someone subst'd the Wikipedia template there (:-). I've added it to the image stop list. Robert Ullmann 13:10, 14 May 2009 (UTC)[reply]

Portuguese pronunciation

Hello. Is it possible for you to auto-add pronunciation from pt.wiktionary, for example to população from pt:população? This would be great. Thanks --Volants 10:25, 11 June 2009 (UTC)

xs

I can understand the changes to the Egyptian Arabic in this edit, but what does the seemingly unnecessary "xs=" add that the language code doesn't? --EncycloPetey 15:00, 16 July 2009 (UTC)[reply]

The xs= parameter (which users never have to worry about) generates the section link, without having {t} and friends invoke a large set of code templates for each page. Once Tbot has edited, the trans section should not be invoking any code templates. This saves a lot of full SQL queries rendering the page. But as said, users/editors never have to worry about it. Robert Ullmann 04:21, 17 July 2009 (UTC)[reply]
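
As a rough illustration of the idea, a bot could fill in xs= from a local code-to-name table, so the page never needs to look the name up through a code template. The sketch below is an assumption throughout: the table, the names, and the rule for which codes get xs= are invented for the example and are not Tbot's actual logic.

```python
import re

# Illustrative table only; a real table would hold many more codes.
LANGUAGE_NAMES = {"arz": "Egyptian Arabic"}

# {{t|code|rest}} including the t+ and t- variants, with no nested templates.
T_CALL = re.compile(r"\{\{(t[+-]?)\|([^|{}]+)\|([^{}]*)\}\}")


def add_xs(wikitext):
    """Append xs=<language name> to t-templates whose code is in the table."""
    def fix(m):
        name, code, rest = m.group(1), m.group(2), m.group(3)
        lang = LANGUAGE_NAMES.get(code)
        has_xs = any(p.strip().startswith("xs=") for p in rest.split("|"))
        if lang is None or has_xs:
            return m.group(0)  # unknown code or already done: leave as-is
        return "{{%s|%s|%s|xs=%s}}" % (name, code, rest, lang)

    return T_CALL.sub(fix, wikitext)
```

Running it a second time changes nothing, which matters for a bot that revisits the same pages.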

cleanup French verbs

Hello Tbot. Is it possible for this bot to add {{frconjneeded}} to any French verbs that it creates? See this edit for example. That way my bot, other users and I will be able to conjugate them. --Rising Sun 23:32, 25 August 2009 (UTC)

Okay, but won't take effect for a little while. You can look at Category:Tbot entries (French) and easily find the verbs ;-). I could with a small bit of work make a list for the whole wikt of French verbs that lack desired conjugation or whatever. Robert Ullmann 10:05, 26 August 2009 (UTC)[reply]

It would be great if you could make this list. Another request: is it possible to find all entries that are in two categories at once (in my case, Category:French verbs and Category:Tbot entries (French))? --Rising Sun 13:12, 27 August 2009 (UTC)

μέρος

Just thought I'd let you know about the oddity in this edit, specifically the definition line, in case it was anything which should be worried about. The entry has been fixed. -Atelaes λάλει ἐμοί 21:32, 27 August 2009 (UTC)[reply]

zh->cmn in translations

Why does this conversion happen and what does it achieve? "Chinese" has been wiped out as a language from Wiktionary, so both templates zh and cmn have the word Mandarin in them. Anatoli 00:49, 17 September 2009 (UTC)[reply]

Because we should be using cmn ("zh" really means the Chinese group, even though our template says "Mandarin" for some reason). Conrad's assisted editing should really disallow "zh", or change it to cmn on entry; but in any case Tbot can clean it up. (It also allows "zh-tw" and "zh-cn", which are problematic in that they will require manual cleanup.) Robert Ullmann 09:53, 23 September 2009 (UTC)
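
The cleanup described above amounts to a narrow substitution. A minimal sketch, assuming only the plain {{t}}, {{t+}} and {{t-}} forms need handling (the function names are hypothetical, not Tbot's):

```python
import re

# Matches {{t|zh|, {{t+|zh| and {{t-|zh| but not zh-tw or zh-cn.
ZH_T = re.compile(r"\{\{(t[+-]?)\|zh\|")
# Regional codes that are only flagged, since they need a human decision.
ZH_REGIONAL = re.compile(r"\{\{t[+-]?\|zh-(?:tw|cn)\|")


def zh_to_cmn(wikitext):
    """Rewrite the language code zh to cmn inside t-templates."""
    return ZH_T.sub(r"{{\1|cmn|", wikitext)


def needs_manual_cleanup(wikitext):
    """True if the text still uses zh-tw or zh-cn in a t-template."""
    return ZH_REGIONAL.search(wikitext) is not None
```

Requiring the pipe right after `zh` is what keeps zh-tw and zh-cn out of the automatic rewrite.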

Request for automatic entry creations - other languages

Hi,

Can we add more languages to automatic entry creations? Russian, Chinese and Arabic (and others) seem to be missing out. In my opinion, some basic entry is better than no entry at all. The additional information could be added later. Please say if you think it's a bad idea. I think, the disclaimer is sufficient to show that the entry is not created by a human. Anatoli 05:56, 23 September 2009 (UTC)[reply]

Hi! It isn't creating any entries for Russian and Arabic because it doesn't understand the format of the entries on those wikts well enough to check the translation. I have been meaning to work on that. The situation with Mandarin may be similar, but I haven't looked; I did notice some time ago that while the Mandarin wikt had lots of entries for English et al., it didn't have many for Mandarin itself. For example, see zh:father, but note that there is no zh:父亲 (which is a fairly basic entry, one would think?). So Tbot would not create 父亲 from the translations at father if we didn't have it, as it would not find zh:父亲.

I'll see if I can improve it a bit this afternoon. Robert Ullmann 09:49, 23 September 2009 (UTC)[reply]

Yes, cmn too, please! (fuqin is basic, yes.) --史凡>voice-MSN/skypeme!RSI>typin=hard! 15:49, 23 September 2009 (UTC)

Yes. But as noted, the shortage of cmn/zh entries for words in the zh.wikt is a limiting problem.

It has found a Russian word or two (it crawls the DB slowly, not looking at a word more than once every 35 days). Arabic I'm not sure of: there shouldn't be any reason why it doesn't match entries, so there may just not be that many. I am watching, and I'm going to add a bit of logging so I can see more. Robert Ullmann 11:28, 24 September 2009 (UTC)
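
The "not more than once every 35 days" rule can be pictured as a simple timestamp check. This is only a sketch of the idea; how Tbot actually stores and checks its crawl state is not stated here, and `due_for_check` and `last_checked` are hypothetical names.

```python
from datetime import datetime, timedelta

RECHECK_INTERVAL = timedelta(days=35)  # interval taken from the comment above


def due_for_check(word, last_checked, now):
    """Return True if `word` has never been checked or is due again.

    `last_checked` maps a word to the datetime of its previous check.
    """
    previous = last_checked.get(word)
    return previous is None or now - previous >= RECHECK_INTERVAL
```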

I've found several other language wikts where it needed to know more about the format; continuing to improve it. Robert Ullmann 12:53, 29 September 2009 (UTC)[reply]

I'd like to second Anatoli's request. Even though Chinese entries have relatively complex formatting - and mass-creating entries using a bot would go against this - I still think that would be better than no entries at all. And it's not as if editing them would be such a big issue for us. So please, if you so desire, go ahead and work your magic. The more Chinese entries on wiktionary, the better. Tooironic 02:37, 30 September 2009 (UTC)[reply]

Thank you, Robert. I have seen some Russian entries. They look good for automatic creations. Well done! They are obviously in need of some editing. I am curious: where does the bot get the audio files from? (For example, the Russian entry дышать (dyšát'), "to breathe", has an audio file. It must have come from the Russian Wiktionary.) Please advise when Chinese and Arabic start working. Anatoli 03:08, 30 September 2009 (UTC)

Hi Robert, can more languages be included? I don't see why Thai and Belarusian are left out. I think I've seen translations linked to the th wiki. Anatoli 00:30, 21 January 2010 (UTC)

As noted, it does rely on the FL wikt having useful entries; but both of these have lots of entries. If you can identify case(s) where it ought to have enough information, that would be helpful. Also: note that Tbot is not running right now, I have a local technical problem to fix presently and then it will resume ;-) Robert Ullmann 23:26, 21 January 2010 (UTC)[reply]

This month Tbot is very slow for Russian (perhaps others?). Just my observation. I haven't checked many entries personally but I see many of them were the result of my translations. --Anatoli 04:39, 18 May 2010 (UTC)[reply]

Link for Template:also

Is it possible for Tbot, when it is creating e.g. Celte, to add e.g. {{also|celte}} at the top, for words whose spellings differ only in lowercase versus uppercase? The entry celte already existed when Celte was created, so it might be easy for a bot to notice such cases. --Volants 11:09, 5 October 2009 (UTC)

It already does this when replacing redirects (a common case is German nouns). I've looked at checking other cases, but haven't done anything yet. The general case, of finding all the correct see/also forms, is much harder; I probably won't look at that. Robert Ullmann 13:18, 9 October 2009 (UTC)
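
The simple case requested here (same spelling, differing only in the case of the first letter) could be sketched as below. It deliberately ignores the harder general case of finding all see/also forms. The function name and the `existing_titles` set are hypothetical.

```python
def also_line(title, existing_titles):
    """Return an {{also|...}} line for first-letter case variants of `title`.

    `existing_titles` is the set of page titles assumed to exist already.
    Returns an empty string when no variant exists.
    """
    candidates = {
        title[:1].lower() + title[1:],
        title[:1].upper() + title[1:],
    }
    found = sorted(c for c in candidates if c != title and c in existing_titles)
    if not found:
        return ""
    return "{{also|" + "|".join(found) + "}}"
```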

Entries to be deleted

Why does it say "At some point all of those remaining with tags will be deleted." What does this mean? Will it ever actually happen? Mglovesfun (talk) 11:37, 9 December 2009 (UTC)[reply]

The November 2007 entries were not cross-checked, and there are a number of bad ones. At "some point" we need to figure out whether they should be salvaged or dumped. For example, there is a set of entries for Maltese, quality completely unknown. The Maltese wikt doesn't (or didn't) have entries to check against. Robert Ullmann 05:33, 11 December 2009 (UTC)

So not just "all tbot entries that aren't checked" right, that reassures me. Mglovesfun (talk) 11:01, 12 December 2009 (UTC)[reply]

Cleanup the French Wiktionary

For a year we have been asking the owner of the French Wikipedia real-time cleanup bot to take care of us, without any success (lack of time). Do you think it would be possible for you to do this job, even if only for page blanking (please!)? If so, you can answer in the language of your choice in the paragraph linked from this section title. Thank you for your time. JackPotte 17:49, 11 January 2010 (UTC)

Verb entries

Not sure how practical this is, but verb entries should start with to (to eat, to go, to have) whereas Tbot ones don't. Mglovesfun (talk) 17:30, 16 February 2010 (UTC)[reply]

Well, some verbs should, not all. Not all languages have the "to" form as the main verb form --Rising Sun talk? 20:11, 16 February 2010 (UTC)[reply]

Tbot doesn't create English entries ... I'm not sure how many languages might have something akin to "to ..." for an infinitive form, but the ones I know don't use any such convention. May be pretty much unique to English, and thus moot. Robert Ullmann 12:38, 20 February 2010 (UTC)[reply]

Tbot entries (Occitan)

I've noticed these categorize by month, but not in "Occitan". I wouldn't mind checking some of these. Mglovesfun (talk) 11:01, 20 February 2010 (UTC)[reply]

Now that you've created the cat, they should be turning up there ... Robert Ullmann 12:41, 20 February 2010 (UTC)[reply]

French pronunciations

Most of the French ones on this page are Tbot ones. That's because it's interpreting the |fr as part of the IPA, while (as you know) it's the equivalent of lang=fr. Mglovesfun (talk) 14:20, 21 May 2010 (UTC)

As in renvoi. I see the problem: a bit of regex that is too simple. I will see about fixing it. Robert Ullmann 15:34, 21 May 2010 (UTC)
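
To show the kind of fix involved, here is a minimal sketch that splits the template parameters instead of capturing everything up to the closing braces. It assumes, purely for illustration, that the source template has the shape {{pron|IPA|fr}}; the template name, the parameter order and the function name are assumptions, not Tbot's code.

```python
import re

# Assumed template shape: {{pron|IPA|lang}} with no nested templates.
PRON = re.compile(r"\{\{pron\|([^{}]*)\}\}")


def extract_ipa(wikitext):
    """Return the first positional parameter of {{pron}}, or None.

    A too-simple pattern such as \\{\\{pron\\|(.*?)\\}\\} would capture
    "IPA|fr" as a whole; splitting on pipes keeps the code out of the IPA.
    """
    m = PRON.search(wikitext)
    if m is None:
        return None
    params = [p.strip() for p in m.group(1).split("|")]
    positional = [p for p in params if "=" not in p]
    if not positional or not positional[0]:
        return None  # template present but no IPA filled in
    return positional[0]
```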

Template:jump in gloss

In creating Danish del from part, Tbot incorrectly copied {{jump}} code from the {{trans-top}} gloss. --Bequw → τ 06:30, 26 June 2010 (UTC)[reply]

eˈxem.plo (as a pronunciation)

It would have been better to ask this two years ago, but I kept forgetting- can you please not add this as a pronunciation for Spanish Tbot entries? It seems to be used as a placeholder (or something left over from a preloaded template) on the Spanish wiktionary. Nadando 07:11, 4 August 2010 (UTC)[reply]

Ah, thank you; just the sort of feedback needed. I'll get to that presently :-) Robert Ullmann 11:48, 4 August 2010 (UTC)
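
Filtering out a known placeholder could be as simple as the check below. This is a sketch under assumptions: the stop set holds only the one value reported in this thread, and the names are hypothetical.

```python
# Placeholder values that should never be copied as a real pronunciation.
# Only the value reported above is listed; others would be added as found.
PLACEHOLDER_PRONUNCIATIONS = {"eˈxem.plo"}


def usable_pronunciation(ipa):
    """Return True if `ipa` is non-empty and not a known placeholder."""
    cleaned = ipa.strip().strip("/[]")  # drop surrounding slashes or brackets
    return bool(cleaned) and cleaned not in PLACEHOLDER_PRONUNCIATIONS
```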

Use `{{gloss}}`

Please make Tbot use {{gloss}} in the glosses it adds; that’s what the template is for. H. (talk) 10:52, 22 August 2010 (UTC)

Stumbled over this again today. Haven’t had time? H. (talk) 17:16, 10 January 2011 (UTC)[reply]

Template:frconjneeded

I see here that Tbot does use frconjneeded, though not for other languages. ~~I think it should use {{rfconj}} and under a header,~~ or {{rfinfl|type=conjugation}} as it's been suggested to rename {{rfconj}} and I don't oppose it. Mglovesfun (talk) 12:54, 30 August 2010 (UTC)[reply]