Some stats

I counted every occurrence of a binomial scientific name in the English-language books covered by Google's ngram data:

  • 63.12% of binomial occurrences have the genus (first word) found in Wiktionary
    (22832445 of 36173481; Wiktionary data: 2015-01-02)
  • 73.23% of binomial occurrences in books have their epithet (second word) found in Wiktionary
    [26483302 of 36164803; Wiktionary data: 2015-02-19; Ngram data: 2012. Probably an underestimate, as a lot of the long tail is missing: any scientific name used fewer than 40 times (ever) is excluded because of how Google's ngram data is stored, so only 52,625 of the millions of binomials in the Catalogue of Life are included. The Catalogue of Life (2014) was used to define what is and isn't a species. A Latin or translingual entry was required for an entry to count as existing.]
  • 22.60% of binomial occurrences in books are found in Wiktionary (as a two-word binomial entry)
    (8172289 of 36164803; Wiktionary data: 2015-01-02; 5 more entries needed to reach 25%, 55 to reach 30%)
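
For anyone wanting to reproduce this kind of figure, here is a minimal sketch of an occurrence-weighted percentage. It is not the script used for the numbers above; the function name `weighted_coverage`, the dict-of-counts input and the toy data are all assumptions made for illustration.

```python
def weighted_coverage(occurrences, has_entry, part):
    """Return (covered, total) occurrence counts for one part of each binomial.

    occurrences: dict mapping "Genus epithet" -> number of occurrences in books
    has_entry:   set of page titles assumed to exist in the dictionary
    part:        "genus", "epithet" or "binomial"
    """
    covered = total = 0
    for binomial, count in occurrences.items():
        genus, epithet = binomial.split()
        word = {"genus": genus, "epithet": epithet, "binomial": binomial}[part]
        total += count
        if word in has_entry:
            covered += count  # every occurrence counts, not just the distinct name
    return covered, total


# Toy data (hypothetical counts, not taken from the ngram data)
occurrences = {"Homo sapiens": 700, "Escherichia coli": 200, "Quercus alba": 100}
has_entry = {"Homo", "sapiens", "Homo sapiens", "coli", "Quercus"}

for part in ("genus", "epithet", "binomial"):
    covered, total = weighted_coverage(occurrences, has_entry, part)
    print(f"{part}: {100 * covered / total:.2f}% ({covered} of {total})")
```

Because each name is weighted by how often it occurs, a handful of very common names can account for a large share of the percentage.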

Additional epithet entries needed in Wiktionary...

  • to reach 75%: 16 entries (list)
  • to reach 80%: 242 entries (list)
  • to reach 85%: 792 entries
  • to reach 90%: 1920 entries
  • to reach 95%: 4492 entries
  • to reach 99%: 10620 entries
  • to reach 100%: 14575 entries (100% of those commonly found in books)
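
A list like this can be produced by adding the missing epithets in order of frequency, most common first, until each target is met. The sketch below shows one way to do that under stated assumptions: `entries_needed` is a hypothetical helper, the counts are toy values, and targets are whole percentages so the comparison stays in integers.

```python
def entries_needed(epithet_counts, existing, targets):
    """For each target percentage, return how many new entries are needed
    if the most frequently occurring missing epithets are created first."""
    total = sum(epithet_counts.values())
    covered = sum(c for e, c in epithet_counts.items() if e in existing)
    # Occurrence counts of epithets without an entry, largest first
    missing = sorted(
        (c for e, c in epithet_counts.items() if e not in existing),
        reverse=True,
    )
    result = {}
    added = 0
    for target in sorted(targets):
        # Integer comparison avoids floating-point trouble at exact thresholds
        while covered * 100 < target * total and added < len(missing):
            covered += missing[added]
            added += 1
        result[target] = added
    return result


# Toy data (hypothetical counts)
epithet_counts = {"sapiens": 500, "coli": 200, "alba": 150, "rubra": 100, "minor": 50}
existing = {"sapiens"}
print(entries_needed(epithet_counts, existing, [75, 90, 100]))
```

Taking the most frequent missing epithets first gives the smallest number of new entries for each target, which is why the first few percentage points are cheap and the last few are expensive.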

More raw numbers:

  • Of the genera found in books: 18.83% have entries.
    (3157 of 16768)
  • Of the specific epithets found in books: 32.26% have entries.
    (6945 of 21529; the ones that appear in a scientific name which appears at least 40 times in books)
  • Of the 52,625 binomials commonly seen in books, 4.65% (2446) have entries.

