Longer term tasks and think-tank projects

Pending tasks

  1. For a given word, provide "prev" and "next" words alphabetically in English
  2. JS Preload buttons (row of icons)
    1. figure out way to overlay text on blank button
    2. See if single button can be stretched to all buttons' width so only 1 image is downloaded
  3. JS Translation column balancing to minimize vertical pixels used
  4. JS Language cleanup button - just call Anabel's toolserver tool.
  5. JS Magic inflection button (nah, do as automatic from preload choice? Nah, do as magic stem buttons)
  6. JS Welcome and welcomeip, pediawelcome, userwarn buttons
  7. JS Categories to bottom of page, but above transwikis (sorted!)
  8. JS "subst:" button (find "nearest" "{{".)
  9. JS Auto 2 column the selected lines
  10. JS Auto 3 column the selected lines
  11. JS Auto 4 column the selected lines
  12. JS Link magic buttons: (convert all links on page)
    1. For all page links, add E, 0, +, H, T, D, P, M, W, R links (edit, hist, talk, del, prot, move, whatlinkshere, related changes)
    2. For all user page and user talk page links, add U, T, +, C, BL, B, UL links (user, talk, talk_new, contribs, block log, block, user list/name)
    3. For all talk, Wikt: & help: pages, add E, 0, +, H, P, R links (edit, edit sec 0, edit_new, hist, prot, related)
  13. re-run GutenBot on top 1,000 entries adapting Hippietrail's formatting suggestions (to an extent)
    1. Run one bot to remove the current doobers
    2. Rerun the rankings for top 3,000.
  14. User talk:Connel MacKenzie#Redlinks from the Shakespeare wordlist
  15. Bot runs: (get these scheduled more regularly - each XML dump)
    1. ComparaBot,
    2. SuperlBot,
    3. TheCheatBot,
    4. PluralsBot,
    5. ThirdPersBot,
    6. PastBot
  16. User talk:TheDaveRoss#Maybe a WikiSaurus bot would be useful
  17. Populate WT:RFDA from enwiktionary-latest-meta-history.xml
  18. rewrite English Random redirector in PHP, off toolserver db, not XML dumps (someday. Might be easier to load GT.M MUMPS on toolserver.)
  19. Rewrite auto-column balancer (especially for the "names" indexes) in Javascript so I'm no longer the only one doing it. (Or not, if previous is satisfactory.) Remember "Zemena" (from ru: wiki) for text-area selection.
  20. Devise category vandalism page to identify all category removals.
  21. Work out better method of reporting "top-40" languages vs. dewikified languages (make a finite list!) Update: Hippietrail and Stephen are doing this, so I don't have to. Yay! Wonder what ever became of this.
  22. Get official clarification from Proj. Gut. regarding Webster's 1913. Done: July 2006...A-OK.
  23. Import Webster's 1913.
  24. Preload templates rewrite (each individual template.)
  25. Javascript-ize the preload templates to determine from approximate suffix which to auto-load
  26. Javascript add buttons for each of the preload templates as a row of small buttons (above or to the right of the current edit-box buttons.) Better might be a list of preload templates (horizontal) saying what "suffixed word" the template is for.
  27. Review Help: namespace and double the current content. Let Pathoschild do this?
  28. JS buttons for "history-ize all" and "edit-ize all" links on page. (Hrm, maybe my js should just do that for all wikilinks, no matter what - add a "h", "e" and "t" links to them? For userpages, "t", "+", "c"? Perhaps also WhatLinksHere? Delete? Protect? Move? Watch/Unwatch?) Bookmarklets are the way to go for these, right?
  29. Import these into Wiktionary, somehow.
  30. FmtTransBot: 'bot add the {{trans-top}}/{{trans-mid}}/{{trans-bottom}} to translation sections, and balance columns of entries that do (only change if unbalanced by three or more?)
  31. Import Wiktionary:Dictionary of the Chinook Jargon
  32. Create a Javascript parser, to create a dict object by parsing by headers, arranging sub-objects by blocks of text (that have sub-sub objects as blocks of text) to correspond to the TOC of an entry.
  33. Create a Javascript "re-arranger" to sort headings as per ELE, based on the parsing from previous item in this list.
  34. Get small XML list of entries from all other language Wiktionaries, subtract the entries that exist on en.wikt.
  35. One by one, download all the other language wiktionaries, traverse all entries (once?) and find all that have ==English== or {{-en-}}.
  36. User:ComparBot next in line?
  37. Help Hippietrail refine "personalsidebar.js".
  38. Move User:Connel MacKenzie/custom.js and User:Connel MacKenzie/Preferences to MediaWiki: namespace. Half done 12/2006.
  39. Random page selection enhancements: by PartOfSpeech
  40. Find the MediaWiki: page displayed at login, and add most of {{welcome}} to it, so the first time someone logs in they have at least a hint.
    1. Perhaps think about something similar for anon's 1st page view. (Set a cookie.)
  41. Rewrite spellchecker to use en.Wiktionary's list of English words (minus redirects, minus slang, minus colloquial, minus misspellings) plus the "list of common misspellings" as a stop list...instead of using "ispell" as it does now.
  42. Add Wiktionary jargon (e.g. WT:BP, protologism, POS, etc.) to spellcheck's list of "custom" allowable terms. Also add Wikipedia jargon, since it is being used more heavily there.
  43. Get the more generic things (spellcheck, keypad) out of WT:PREF and into MW:MB proper.
  44. Remove [Citations], replace with (all) [Subpages]. <-- for me only?
  45. Wrap User:Dmcdevit/monobook.js into WT:PREFS
  46. Check status of every four hours, send e-mail whenever enwiktionary/latest changes. Lowest priority...even if I don't check, others nag me. 1/2007
  47. Add User:Annabelleke/Transtool link to WT:PREFS.
  48. Fix /patrol.js to work with main.js for scrunched up Special:Recentchanges display (keep sub-arrays and look through all edits to a term marking them all at once.)
    1. rewrite /patrol.js to honor enhanced layout, or SemperBlotto will never use it. Mark all for a user. Mark all pages that have since been edited by a whitelisted user sysop (Because it should have just been tagged for speedy, cleanup, verification. Or cleaned up.)
  49. Fix ALT-C for the 'new' namespaces added a few months ago.
  50. Finish cleanup (+intos) for Special:Prefixindex/Template:new en & MediaWiki:Noexactmatch
  51. Javascript bookmarklets for Special:Checkuser
    1. honor the new enhancements
    2. export as a comma separated variable file/table
    3. expand all usernames to {{vandal|username}} (and {{proxyip2}}.)
  52. Monobook.js: MOVE to Common.js (only applicable pieces/parts? Or all?)
  53. Monobook.js: Fix the Edittools thing to be dynamic (get code I wrote for bs: back in here!)
  54. Monobook.js: In Categories box at bottom of screen, hide all "Translation to be checked ..." categories. Make it a WT:PREFS perhaps?
  55. WT:PREFS: pull in User:Pill/monobook.js
  57. Fork for RFC then RFD then RFV.
  58. Fix WT:PREFS to not show sysop things if there is no [delete] tab present on prefs page. Add VOA uber-mega rollback thing to sysop section.
  59. Add a silly amount more WT:PREFS for randompage by language, externals (google, artfl, etc.)
  60. Add a "[Check minimum formatting]" button (via WT:PREF, like [check spelling],) that checks for a level two language heading, at least one level three heading (from the approved POS headings in WT:ELE), the pagename in bold after the approved pos heading OR an inflection template, at least one "#" definition line, and no "#" lines anywhere else.
  61. Special:Linksearch
  62. Study up on Meatball
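
The rules listed for the "[Check minimum formatting]" button (item 60) can be sketched roughly as below. This is only an illustration in Python rather than the Javascript the button would need. `APPROVED_POS` is a hypothetical, abbreviated stand-in for the approved POS headings in WT:ELE, and treating any line that starts with "{{" as an inflection template is an assumption, not a rule from ELE.

```python
import re

# Hypothetical, abbreviated stand-in for the approved POS headings in WT:ELE.
APPROVED_POS = {"Noun", "Verb", "Adjective", "Adverb", "Pronoun",
                "Preposition", "Conjunction", "Interjection", "Proper noun"}

# Matches wiki headings such as ==English== or ===Noun===.
HEADING = re.compile(r"^(={2,6})\s*(.*?)\s*\1\s*$")


def check_minimum_formatting(pagename, wikitext):
    """Return a list of problems; an empty list means the entry passes."""
    has_language = has_pos = has_headword = has_definition = False
    stray_definition = False
    in_pos = False  # True while we are directly under an approved POS heading

    for line in wikitext.splitlines():
        m = HEADING.match(line)
        if m:
            level, title = len(m.group(1)), m.group(2)
            if level == 2:
                has_language = True
            in_pos = (level == 3 and title in APPROVED_POS)
            has_pos = has_pos or in_pos
        elif line.startswith("#"):
            if in_pos:
                has_definition = True
            else:
                stray_definition = True
        elif in_pos and ("'''" + pagename + "'''" in line
                         or line.startswith("{{")):
            # Assumption: any template line here counts as an inflection template.
            has_headword = True

    problems = []
    if not has_language:
        problems.append("missing level-two language heading")
    if not has_pos:
        problems.append("missing approved level-three POS heading")
    if not has_headword:
        problems.append("missing bold pagename or inflection template")
    if not has_definition:
        problems.append("missing '#' definition line")
    if stray_definition:
        problems.append("'#' line outside a POS section")
    return problems
```

Returning a list of problems rather than a plain yes/no would let the button say what is wrong, much as [check spelling] points at the offending words.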

Current mania

Rebuilding historic archives of WT:RFV, WT:RFD, WT:RFDO.

Full XML dump 1/16/2008

It is fucking amazing that a 300 GB drive is now so inadequate (i.e. full.) I've finished doing all the housekeeping tasks I can think of. Granted, over 100 GB came from the digital camera. I might zap those all down to lower resolution. (But every time I've done that in the past, I've regretted it.)

I can't really get rid of any of the Project Gutenberg stuff. Starting on collocations research is set back again.

The current full-history XML dump is >17 GB uncompressed. The "current pages" dump (including all the weird namespaces) is less than 700 MB. The next thing I can do, I suppose, is more aggressively thwack older "current pages" downloads that I've been saving for comparison (and because they ain't so big, compressed.)

The fact that I haven't had a complete backup for over two years now is troublesome.

Now, being massively space-constrained poses particular problems. Piping the decompression to my analysis tool, I can read and parse entries/revisions and save the ones of interest. Saving one copy of each WT:RFV revision is somewhere between five and six times larger than simply saving all current revisions of all pages (still running.) I had to write my own buffering XML parser to get it as selective as I need it.  :-(   But what's a terabyte or 500 among 'pedia friends.
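
For anyone wanting to try the same trick, here is a minimal sketch of reading a compressed dump as a stream and keeping only the revisions of one page. It is not the buffering parser described above; it leans on Python's standard `bz2` and `xml.etree.ElementTree.iterparse` instead. The function names and the file name in the usage comment are hypothetical, and it assumes each `<revision>` lists its own `<id>` before the contributor's.

```python
import bz2
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the '{namespace}' prefix that the export schema puts on every tag."""
    return tag.rsplit("}", 1)[-1]


def revisions_of(stream, wanted_title):
    """Yield (revision_id, text) for each revision of one page, while streaming."""
    title = None
    rev_id = None
    text = ""
    in_revision = False
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        name = local(elem.tag)
        if event == "start":
            if name == "page":
                title = None
            elif name == "revision":
                in_revision, rev_id, text = True, None, ""
            continue
        # Everything below handles "end" events, when elem.text is complete.
        if name == "title":
            title = elem.text
        elif name == "id" and in_revision and rev_id is None:
            rev_id = elem.text  # first <id> inside <revision> is the revision id
        elif name == "text" and in_revision:
            text = elem.text or ""
        elif name == "revision":
            if title == wanted_title:
                yield int(rev_id), text
            in_revision = False
            elem.clear()  # free the revision text we no longer need
        elif name == "page":
            elem.clear()


# Usage sketch (hypothetical file name), never holding the whole dump in memory:
# with bz2.open("enwiktionary-meta-history.xml.bz2", "rb") as dump:
#     for rev_id, text in revisions_of(dump, "Wiktionary:Requests for verification"):
#         ...  # save the revisions of interest
```

Because the decompressor feeds the parser directly, nothing uncompressed ever touches the disk; only the revisions that get yielded need to be stored.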

Parsing those apart to get the last REVID in which a particular section appeared has nasty challenges of its own. Page-blanking vandals in the past mean that I have to compare the first couple of lines of those sections to eliminate some of the duplicates. Other terms actually have been listed numerous times. Section headings themselves, for a long time, weren't consistently wikified. (Inconsistently - great.) Long blocks moved from RFV to RFD or RFD to RFV sometimes were softlinked, sometimes not. Long blocks archived to talk pages often were copied to three or four places. Determining if the target page existed when the section was removed (or five minutes later) is a different kind of challenge - I have to pull the deletion log for that page from the live wiki ('cause I don't have the space available to build up a full revision copy here.)
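
One way to picture the bookkeeping is the rough sketch below, which assumes revisions arrive oldest first as (revision id, wikitext) pairs. Keying each section on its de-wikified heading plus its first couple of non-blank lines means a blank-and-restore leaves a single record, while a genuine relisting with different opening text gets its own. All names here are hypothetical, and this is not the actual tool.

```python
import re

# A level-two heading such as ==term== or ==[[term]]== (but not ===sub===).
SECTION = re.compile(r"^==(?!=)\s*(.*?)\s*==\s*$")


def normalize(heading):
    """Strip wikilink brackets so [[term]] and term compare equal."""
    return re.sub(r"\[\[|\]\]", "", heading).strip()


def split_sections(wikitext):
    """Yield (heading, body_lines) for each level-two section on the page."""
    heading, body = None, []
    for line in wikitext.splitlines():
        m = SECTION.match(line)
        if m:
            if heading is not None:
                yield heading, body
            heading, body = m.group(1), []
        elif heading is not None:
            body.append(line)
    if heading is not None:
        yield heading, body


def last_seen(revisions, lines_to_compare=2):
    """Map (term, opening lines) to the last revision id containing that section."""
    seen = {}
    for rev_id, text in revisions:  # assumed to be in chronological order
        for heading, body in split_sections(text):
            non_blank = [line.strip() for line in body if line.strip()]
            opening = tuple(non_blank[:lines_to_compare])
            seen[(normalize(heading), opening)] = rev_id
    return seen
```

A blanked revision contains no sections, so it simply leaves the table untouched. Moves between RFV and RFD, and copies archived to talk pages, would still need separate handling.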

Actually, I probably do. Eliminating non-NS:0 stuff, I might have room for all history. If the WT:RFV full history import ever finishes, I guess I'll see. OK, so all revisions of WT:RFV are just slightly (a couple dozen MB) less than a gigabyte. Ugh...RFD was renamed from its original name - more customization to pick that one up next round. Grrrr.

--Connel MacKenzie 06:20, 7 February 2008 (UTC)