Wiktionary:Searchable external archives
This is a list of durable archives that are searchable for free. It is intended as a resource for finding citations of words as per the Criteria for Inclusion. Not listed are resources that require paid membership, even after a trial period, or that otherwise charge a fee to perform the search or retrieve quotations. This excludes sites such as Amazon.com, which requires a previous purchase to preview material.
Websites in general
Internet addresses are not considered durably archived in the general case. Do not add any web search engines here. However, do not remove any citations of insightful quotations from the Internet, even if the website no longer exists. Sites such as web.archive.org attempt to archive the Internet where possible, although at present such resources cannot be considered durable because they are at the mercy of the original copyright holders.
Usenet (see Wikipedia article) is considered durably archived because of its age. It has been continuously accessible since 1980, before the creation of the World Wide Web. It is now only accessible through Google Groups.
Media resources such as YouTube, intended for online use only, are not considered durably archived. If the material is taken from another source, such as a movie or television show, then cite the original source. In rare cases the clips are professional productions whose publishers have real, non-virtual contact information. Those special cases can be cited, but they are not considered durably archived on those grounds alone.
The following resources are durably archived in print form and are searchable online. Note that many websites also have forums and/or blogs which are not durably archived and should be excluded from searches.
To extend the lists below, add links to websites that archive material, but do not add a <tt>domain.name</tt> unless the website allows robots.
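One way to check whether a website "allows robots" before adding its domain is to read its robots.txt file. The following is only a minimal sketch, assuming that "allows robots" means the site's robots.txt does not disallow generic crawlers; the function name `allows_robots` and the two sample robots.txt texts are hypothetical and are not part of this page's policy. It uses Python's standard `urllib.robotparser` module and parses text you have already downloaded, so it does not itself contact any site.

```python
from urllib.robotparser import RobotFileParser


def allows_robots(robots_txt: str, user_agent: str = "*", path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path.

    Assumption: a site counts as "allowing robots" when its robots.txt
    does not disallow the path for the given user agent.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so split the downloaded text.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical sample files, for illustration only.
SPIDER_FRIENDLY = "User-agent: *\nDisallow:\n"        # empty Disallow: nothing is blocked
NOT_SPIDER_FRIENDLY = "User-agent: *\nDisallow: /\n"  # the whole site is blocked

if __name__ == "__main__":
    print(allows_robots(SPIDER_FRIENDLY))
    print(allows_robots(NOT_SPIDER_FRIENDLY))
```

A site whose robots.txt blocks only part of its content can be checked by passing the relevant path, for example the archive directory listed next to the site's name.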
- Brigham Young University Corpus of Contemporary American English
- British National Corpus
- The Free Library thefreelibrary.com
  - not spider friendly
- British Broadcasting Corporation
- Official Vatican website
- Google Book Search
  - registration in some cases
- Project Gutenberg www.gutenberg.org/etext/
- HathiTrust Digital Library
- Bible: BibleGateway.com
- Computers: Government Computer News www.gcn.com/print/
- Computers: IT: Managing Information Strategies www.misweb.com/magarticle.asp
- Computers: OS: Linux: LinuxInsider www.linuxinsider.com/story/
  - recent articles free, older also searchable but full article at a fee
- United States: The Onion www.theonion.com
- United States, California, central: Modesto Bee
- United States, Illinois, Chicago: Chicago Sun-Times www.suntimes.com
- United States, Massachusetts, Boston: Boston Globe www.boston.com
- United States, New York: Northern New York Historical Newspapers
- United States, Washington, D.C.: Washington Post www.washingtonpost.com
- NPR npr.org
- CNN.com edition.cnn.com rss.cnn.com www.cnn.com
- AOL news.aol.com
- Yahoo! news.yahoo.com
- Ireland: Houses of the Oireachtas debates.oireachtas.ie historical-debates.oireachtas.ie
- United Nations Educational, Scientific and Cultural Organization
- United States of America, Federal: FindLaw
- United States of America: Legal Information Institute
- Internet Movie Script Database imsdb.com
- Austrian literature online (German)
- Biblio (Portuguese)
- Bibliothèque nationale de France (French)
- Germany: Klaus Graf's Zeitungsarchive search engine (German)
  - custom search in several German newspaper archives at once, including: zeus.zeit.de, welt.de, netzeitung., taz.de, berlinonline.de, spiegel.de, stern.de, freitag.de, jungewelt.de, nd-online.de
- Germany: Internet-Links für Journalisten (German) recherchetipps.de
- Germany, Berlin: taz - die tageszeitung (German) www.taz.de
- Vietnam: Thư viện Quốc gia Việt Nam (Vietnamese; look for collections like )