So, I downloaded all of Project Gutenberg.
Yes, I had to buy a new external hard drive first.
The raw frequency count is at Wiktionary:Frequency_lists, which lists words in rank order. Unlike rankings by "headword", it ranks each exact inflected or conjugated form separately, much as Wiktionary has a separate entry for each form of a word.
The following lists indicate words that were undefined in Wiktionary, as of whichever XML dump backup date this was re-run. Since the backup is done monthly, you should not expect to see this updated more than monthly. I will try to add new "runs" toward the top of this page, and keep older runs below.
The "rank" column in these lists reflects their relative rank amongst the words that were undefined in Wiktionary as of 2 October 2005.
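The process described above (count exact word forms, rank them, then keep only the forms with no Wiktionary entry and re-rank those) might look roughly like the following. This is only a minimal sketch, not the script actually used for these runs: the function name `rank_undefined`, the lowercasing, the tokenizing regex, and the idea of passing the dump's entry titles in as a plain set are all assumptions made for illustration.

```python
import re
from collections import Counter

def rank_undefined(text, defined_titles):
    """Rank exact word forms by frequency, keeping only undefined ones.

    `defined_titles` is a hypothetical set of entry titles taken from an
    XML dump; how it is extracted is outside this sketch.
    """
    # Assumption: lowercase everything and treat runs of letters
    # (with internal apostrophes) as words.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

    # Count each exact form separately: no lemmatizing to a headword,
    # so "walks" and "walked" are distinct.
    counts = Counter(words)

    # Overall rank order: most frequent first, ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    # Keep only forms without an entry, then number them from 1 so the
    # rank is relative to the other undefined words.
    missing = [(w, c) for w, c in ranked if w not in defined_titles]
    return [(rank, w, c) for rank, (w, c) in enumerate(missing, start=1)]


if __name__ == "__main__":
    sample = "the cat walks the cat walked the dog"
    for row in rank_undefined(sample, {"the", "cat", "dog"}):
        print(row)
```

Because the rank is recomputed after filtering, a word's number in these lists says nothing about its position in the full frequency list, only about its position among the words still missing at the time of the run.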
As of the 12/14/2008 XML dump, this garners no new entries at all, so is hereby discontinued. --Connel MacKenzie 15:34, 15 December 2008 (UTC)
Only "decent" word remaining:
- ahaua - Maori or Latin? - Neither. If you search Project Gutenberg, it occurs in a transliteration of an indigenous language of the Americas.