Wiktionary:Searchable external archives
This is a list of durable archives that are searchable for free. It is intended as a resource for finding citations of words to show that they satisfy the Criteria for Inclusion (CFI). Not listed are resources that require paid membership, even after a trial period, or otherwise charge a fee to perform the search or retrieve quotations. This excludes sites such as Amazon.com, which requires a previous purchase to preview material, and Jstor.org, which requires a subscription.
The most commonly cited sources are printed books, magazine and journal articles, and newspapers. Books.google.com is Wiktionary's go-to engine for searching books and some magazines, and scholar.google.com is a good engine for searching academic, scientific and medical journals. (Note that Google Scholar can be used to find mathematical symbols that are otherwise ignored by search engines and hence unfindable: search for the symbol's LaTeX notation, for example \otimes for ⊗.) Issuu.com is a large index of newspapers and magazines. (Note that Google Books, Google Scholar and Issuu sometimes index e-publications which do not exist in print, so mere inclusion in one of these indices is not a guarantee that a source is durably archived.)
Laws are also durably archived, and several websites provide searchable corpora of them:
- Ireland: Houses of the Oireachtas (debates.oireachtas.ie, historical-debates.oireachtas.ie)
- United Nations Educational, Scientific and Cultural Organization
- United States of America, Federal: FindLaw
- United States of America: Legal Information Institute
Audio and video media
Some audio and video media produced in some countries are durably archived by libraries; these include commercially released songs, motion pictures, and television shows. The Internet Movie Script Database (imsdb.com) provides a searchable archive of movie scripts.
Usenet is considered durably archived because its archives are decentralized. It has been accessible continuously since 1980, before the creation of the World Wide Web. It can be accessed through Google Groups.
Other online media: websites are not durable
Websites are not considered durably archived; do not add any web search engines here. Sites such as web.archive.org attempt to archive the Internet where possible, but at present cannot be considered durable because they are at the mercy of the original copyright holders. (Note: citations from the web may be useful if they are particularly good examples of the use of a word or sense, and may be retained for this reason even though they do not help the word meet CFI.)
Media resources such as YouTube, intended for online use only, are not considered durably archived. If the material is taken from another source, such as a movie or television show, cite the original source.
Several institutions maintain corpora of English language works; in alphabetical order, these include:
- Brigham Young University Corpus of Contemporary American English
- British National Corpus
- The Free Library (thefreelibrary.com)
When attempting to attest an obscure, obsolete, or dialectal term, it can be useful to consult the Century Dictionary and Wright's English Dialect Dictionary, as these often provide pointers to books or manuscripts where the terms have been used.
Numerous websites maintain searchable copies of the Hebrew and Greek texts of the Bible, as well as numerous English, Latin, and other-language translations. These include BibleGateway.com, Biblehub.com, and Bible.cc.
Languages other than English
- Austrian literature online (German)
- Biblio (Portuguese)
- Bibliothèque nationale de France (French)
- Germany: Klaus Graf's Zeitungsarchive search engine (German)
  - custom search in several German newspaper archives at once, including:
    - zeus.zeit.de, welt.de, netzeitung., taz.de, berlinonline.de, spiegel.de, stern.de, freitag.de, jungewelt.de, nd-online.de
- Germany: Internet-Links für Journalisten (German) recherchetipps.de
- Germany, Berlin: taz - die tageszeitung (German) www.taz.de
- Vietnam: Thư viện Quốc gia Việt Nam (Vietnamese; look for collections like )