User:Rukhabot

I'm a bot created and controlled by Ruakh. My source code is in Perl, making use of the URI, LWP::UserAgent, and JSON modules, as well as home-grown modules. If you'd like to see my source code, ask, but don't expect me to be released under the GPL anytime soon.

My username is a romanization of a hypothetical Hebrew רוּחֲבּוֹט, which isn't perfectly grammatical Hebrew (as the "b" would be a "v" in really proper Hebrew), but is a quasi-plausible neologism meaning "a bot's wind" or "a bot's spirit". (Ruakh's username, by comparison, is a romanization of the Hebrew word for "wind" or "spirit".)

If you see me do something bad, please leave me a comment on User talk:Rukhabot; I'll notice immediately, and will refuse to make any more edits until Ruakh has seen the comment.

Interwikis

My main task, accounting for the vast majority of my edits, is the addition, removal, and sorting/organizing/formatting of main-namespace interwiki-links. Various random facts about me in my guise as an interwiki-bot:

  • I only edit the English Wiktionary.
  • I don't examine the page-creation, page-deletion, and page-move logs; rather, I operate solely based on database-dumps (enwiktionary-YYYYMMDD-pages-articles.xml.bz2 and PREFIXwiktionary-YYYYMMDD-all-titles-in-ns0.gz). This means that my information about a given Wiktionary is typically up to two weeks out of date. However, before removing an interwiki-link that I think points to a non-existent page, I'll consult the target wiki's API just to confirm that the page still doesn't exist.
  • I arrange interwiki-links in the wikitext according to the following rules:
  • My code is completely distinct from, and independent of, that of any other interwiki-bot.
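
The confirmation step mentioned above (consulting the target wiki's API before removing a link) could look roughly like the following. This is a minimal Python sketch, not my actual Perl source. The function names `existence_query_url` and `page_is_missing` are hypothetical, and the sketch assumes the standard MediaWiki `action=query` endpoint at `/w/api.php` with its default JSON format, in which a nonexistent page carries a `missing` key.

```python
import json
from urllib.parse import urlencode


def existence_query_url(wiki_prefix: str, title: str) -> str:
    """Build a MediaWiki API URL asking about one title on PREFIX.wiktionary.org."""
    params = {"action": "query", "titles": title, "format": "json"}
    return f"https://{wiki_prefix}.wiktionary.org/w/api.php?{urlencode(params)}"


def page_is_missing(api_response_text: str) -> bool:
    """Return True only if every page in the response is flagged as missing."""
    pages = json.loads(api_response_text).get("query", {}).get("pages", {})
    # An empty or unexpected response counts as "not confirmed missing",
    # so a link is kept rather than removed when there is any doubt.
    return bool(pages) and all("missing" in page for page in pages.values())
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `page_is_missing` would complete the check. Treating doubtful responses as "page exists" is a deliberate choice in this sketch: a stale link is less harmful than a wrongly removed one.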

Translation-templates

Another significant task, but not accounting for nearly as many edits as the interwiki-links task, is conversion between {{t}} and {{t+}}. Various random facts about me in my guise as a translation-template-bot:

  • I only edit the English Wiktionary.
  • I only edit pages in the main namespace (regular entries) and the Appendix namespace.
  • I don't examine the page-creation, page-deletion, and page-move logs; rather, I operate solely based on database-dumps (enwiktionary-YYYYMMDD-pages-articles.xml.bz2 and PREFIXwiktionary-YYYYMMDD-all-titles-in-ns0.gz). This means that my information about a given Wiktionary is typically up to two weeks out of date.
  • I only convert between those two templates, plus converting to them from {{t-}}, {{t0}}, and {{tø}}. If a translation does not use any of those templates, it will not be touched.
  • I choose between {{t}} and {{t+}} using the rules you'd expect ({{t+}} when the foreign-language wikt exists and has the entry; {{t}} in all other cases), with a few special cases:
    • When the translation contains an explicit link, I use {{t}} (since {{t+}} doesn't support that case).
    • I know that the language-codes nan, cmn, nb, rup, kmr, and nds-de/nds-nl/pdt correspond to zh-min-nan.wikt, zh.wikt, no.wikt, roa-rup.wikt, ku.wikt, and nds.wikt, so I use {{t+}} for them when appropriate. For example, no:yes exists, so I will convert {{t|nb|yes}} to {{t+|nb|yes}} and {{t|no|yes}} to {{t+|no|yes}}.
    • sr.wikt has a feature whereby, if a Latin-script page doesn't exist, the software will check for the corresponding Cyrillic-script page, and issue an HTTP 301 redirection if the latter exists — and the reverse if a Cyrillic-script page doesn't exist but the Latin-script page does. I'm fully aware of this feature, so I'll write {{t+|sr|...}} if sr:... either is an entry or redirects to one.
    • ku.wikt has the same sort of feature as sr.wikt, but for Latin and Arabic scripts instead of Latin and Cyrillic. I support that as well.
    • zh.wikt, kk.wikt, and iu.wikt have the same sort of feature as sr.wikt and ku.wikt, but in those cases I don't know the mapping rules yet, so for them I change {{t}} to {{t+}} only when the title is an exact match, and for now, I never change {{t+}} to {{t}}.
  • I do not change any formatting outside of the template call.
  • I don't try very hard to understand the subtle complexities of MediaWiki template syntax. I simply look for (approximately) {{t[-+ø0]?[|][a-z-]+[|][^|}=]+ followed by | or }}. So, for example, I will be fooled by {{t+|fr|asfasefasefase|2=le}}, which looks like it links to fr:asfasefasefase, but which actually links to fr:le. However, even in such pathological cases, I won't cause any serious harm — I just might select the wrong template.
  • I don't examine context at all; I'm just as happy to update a {{t}} in a ====Synonyms==== section, or inside a comment, as a properly-used {{t}} in a ====Translations==== section.
  • I have no special behavior for B/C/S/M; for example, I will convert {{t|hr|Leiter}} to {{t+|hr|Leiter}} and will leave {{t|sh|Leiter}} alone.
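
The matching and selection rules above can be sketched as follows. This is an illustrative Python approximation, not my actual Perl source. The names `WIKI_PREFIX`, `NEVER_DOWNGRADE`, `TEMPLATE_RE`, `choose_template`, and `convert` are hypothetical, and `titles_by_wiki` is an assumed in-memory mapping from a wiki prefix to the set of its main-namespace titles (as might be loaded from the all-titles dumps). The script-based redirects on sr.wikt and ku.wikt are not modeled here.

```python
import re

# Language codes whose Wiktionary uses a different prefix (from the list above).
WIKI_PREFIX = {
    "nan": "zh-min-nan", "cmn": "zh", "nb": "no", "rup": "roa-rup",
    "kmr": "ku", "nds-de": "nds", "nds-nl": "nds", "pdt": "nds",
}

# Wikis with script conversion whose mapping rules are unknown: an existing
# {{t+}} is never changed back to {{t}} for these.
NEVER_DOWNGRADE = {"zh", "kk", "iu"}

# Approximation of the quoted pattern: template name, language code, first
# positional parameter, then a lookahead for "|" or "}}".
TEMPLATE_RE = re.compile(r"\{\{(t[-+ø0]?)\|([a-z-]+)\|([^|}=]+)(?=\||\}\})")


def choose_template(name, lang, term, titles_by_wiki):
    """Pick 't+' if the matching wiki exists and has the entry, else 't'."""
    if "[[" in term:  # explicit link: {{t+}} does not support it
        return "t"
    wiki = WIKI_PREFIX.get(lang, lang)
    if term in titles_by_wiki.get(wiki, set()):
        return "t+"
    if name == "t+" and wiki in NEVER_DOWNGRADE:
        return "t+"
    return "t"


def convert(wikitext, titles_by_wiki):
    """Rewrite only the template name; all other text is left untouched."""
    def fix(match):
        name, lang, term = match.groups()
        new_name = choose_template(name, lang, term, titles_by_wiki)
        return "{{" + new_name + "|" + lang + "|" + term
    return TEMPLATE_RE.sub(fix, wikitext)
```

For example, with `titles_by_wiki = {"no": {"yes"}}`, calling `convert("{{t|nb|yes}}", titles_by_wiki)` yields `{{t+|nb|yes}}`. Because the pattern only looks at the first positional parameter, it reads `asfasefasefase` as the term in `{{t+|fr|asfasefasefase|2=le}}`, which is the pathological case described above.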