Internet Archive
About
The Internet Archive is a digital library and archival site dedicated to the permanent documentation of, and free public access to, a wide variety of digital artifacts, ranging from websites and music to videos and nearly three million books in the public domain.
History
The Internet Archive[1] was founded in 1996 by American computer engineer and Internet activist Brewster Kahle[2], who also co-founded the web crawling service Alexa Internet around the same time. It began as Kahle's personal project to archive the World Wide Web, Netnews Bulletin Board System and other publicly available software and webpages, but by late 1999, its scope had expanded to include other notable archive collections like the Prelinger Archives and the NASA Images Archive. Kahle's collection was largely kept private on digital tape throughout the 1990s, with researchers and scientists allowed to access the database only on special occasions. Despite its lack of public access, the Internet Archive received press coverage from several U.S. news outlets upon its launch, including the New York Times, Washington Post, Wired Magazine and National Public Radio (NPR).
The database became available for public access on the fifth anniversary of the project in 2001 with the launch of The Wayback Machine[3], a digital time capsule that allows its users to browse multiple versions of web pages archived over time. According to Kahle[4], he was inspired to create the Wayback Machine after visiting the offices of the now-defunct search engine AltaVista and witnessing the company's ambitious plan to store and index everything on the web. Throughout the 2000s, the Internet Archive continued to expand its collection by merging pre-existing databases as well as building new ones.
SOPA / PIPA Protest
On January 18th, 2012, the Internet Archive blacked out its website for twelve hours in protest of the controversial U.S. Internet bills Stop Online Piracy Act (SOPA) and PROTECT IP Act (PIPA), joining many other resource websites and databases such as Wikipedia.
10 Petabytes Milestone
In mid-October 2012, the project reached an important milestone[5] of 10 petabytes (10,000,000,000,000,000 bytes) worth of digital materials in preservation, including millions of digital books, music, movies and webpages.
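The byte count quoted above follows from the decimal (SI) definition of a petabyte as 10^15 bytes. The short sketch below only checks that arithmetic; the constant and function names are made up for illustration and do not come from the Internet Archive.

```python
# Decimal (SI) petabyte: 10^15 bytes. The names below are illustrative only.
BYTES_PER_PETABYTE = 10 ** 15


def petabytes_to_bytes(petabytes: int) -> int:
    """Convert a whole number of decimal petabytes to bytes."""
    return petabytes * BYTES_PER_PETABYTE


milestone_bytes = petabytes_to_bytes(10)
print(f"{milestone_bytes:,}")  # prints 10,000,000,000,000,000
```

The figure assumes decimal units; under the binary convention (pebibytes of 2^50 bytes each), the same "10" would correspond to a somewhat larger byte count.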
San Francisco Office Fire
On November 5th, 2013, the Internet Archive's office in San Francisco, California, caught fire, destroying approximately $600,000 worth of digital scanning equipment and damaging an adjacent apartment complex. According to the official blog post[19], no injuries were reported and the damage was mostly limited to equipment, albeit costly equipment, along with some losses of "physical materials" that were being digitized in the scanning room.
In the blog post, the Internet Archive also announced an emergency fund drive[20] to rebuild its scanning capabilities and called on digitization services to help the group continue its archiving process throughout the recovery.
Features
The Internet Archive mainly consists of its free online services, the Wayback Machine and Archive-It, in addition to a number of specialized media collections acquired over time, most notably the Prelinger Archives, NASA Images Archive, Open Library and Live Music Archive.
The Wayback Machine
The Wayback Machine is the Internet Archive's "three-dimensional index" service that allows its users to search, browse and access snapshots of the World Wide Web archived in its database over time. Since its launch in 2001, the service has archived millions of websites along with their associated data and media. It can be used to see what previous versions of a website looked like, to retrieve source code that has disappeared from a website or to visit websites that no longer exist on the web. Often considered a crucial academic research tool for studying the history of the Internet, its popularity has also led some online communities to use the terms "Wayback Machine" and "Internet Archive" interchangeably.
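As a rough illustration of how dated snapshots are addressed, Wayback Machine links generally take the form web.archive.org/web/<timestamp>/<original URL>, where the timestamp is written as YYYYMMDDhhmmss. The sketch below only builds such a link and makes no network request; the function name and the example page are hypothetical, and whether a snapshot exists for a given page and date is not something the code can confirm.

```python
from datetime import datetime

# Commonly seen prefix for Wayback Machine snapshot links.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a snapshot-style link for page_url at the given moment.

    Hypothetical helper for illustration: it formats a string only and
    does not check that any snapshot was actually archived.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")  # e.g. 20011024120000
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


# Example: a link for example.com as of noon on October 24th, 2001.
print(wayback_url("http://example.com/", datetime(2001, 10, 24, 12, 0, 0)))
```

When no capture exists for the exact timestamp requested, the service typically serves the closest one it has, so the date in the link acts as a target rather than a guarantee.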
Archive-It
Archive-It is a web archiving service that enables individuals and organizations to harvest, catalog and preserve specialized collections of digital media content in an archival format. All collections created by Archive-It partners are also made publicly available with full-text search, and some of them may be periodically indexed into the Internet Archive's general archive. As of mid-2011, the service had reached more than 180 partner institutions in 44 U.S. states and 14 countries, with over 2.7 billion URLs and 1,534 public collections.
Open Library
The Open Library is a free, open-source software project that aims to create a web-based database of every book ever published and archived. It holds at least 23 million catalog records of books and approximately 1.6 million fully readable and downloadable books in the public domain.
Software Museum
On April 13th, 2013, TextFiles founder and archivist Jason Scott announced the launch of the Internet Archive's Software Museum[9] on his blog.[8] According to Scott, the world's largest software repository will host smaller collections including shareware CDs[10], emulators for a number of old-school gaming consoles and computers[11], classic PC games[12], a mirror of the now-defunct gaming site FilePlanet[13] and the Tucows Software Library[14], which boasts more than 33,000 files in its collection. Scott also noted that the Museum's work is not complete, as it still lacks the metadata needed to browse the archive easily. The launch of the Software Museum was subsequently covered by a number of tech news blogs such as Engadget[15], TUAW[16] and VentureBeat.[17]
The Console Living Room
On December 26th, 2013, the Internet Archive launched The Console Living Room[21], an archive of games for several consoles from the 1970s and 1980s, including the Atari 2600, ColecoVision, Magnavox Odyssey and Astrocade systems. In a blog post announcement,[22] archivist Jason Scott revealed that additional classic games would be added to the library in the coming months.
On the following day, the BBC[23] published an article about the new console game library, which was subsequently posted to the /r/technology[24] subreddit. In the first six hours, the post gained over 6,000 upvotes and 350 comments. In the following days, several other news sites reported on the collection, including Mashable,[25] Engadget,[26] BoingBoing[27] and Ars Technica.[28]
Traffic
According to its FAQ page and Alexa, the Internet Archive receives approximately 2.5 million unique visits per day and currently ranks 278th in the U.S. and 222nd globally.
External References
[1] Internet Archive – Digital Library of Free Books, Music, Movies and Wayback Machine
[2] Wikipedia – Internet Archive
[3] Internet Archive – Wayback Machine
[4] Wikipedia – Wayback Machine
[5] Internet Archive Blogs – 10,000,000,000,000,000 bytes archived!
[6] Internet Archive – Open Library
[7] Internet Archive – Archive It
[8] ASCII by Jason Scott – Change Computer History Forever: Well, Here We Are
[9] Internet Archive – Software Archive
[10] Internet Archive – The Shareware CD Archive
[11] Internet Archive – TOSEC: The Old School Emulation Center
[12] Internet Archive – Classic PC games
[13] Internet Archive via Wayback Machine – The Fireplanet: A Fileplanet Mirror
[14] Internet Archive – Tucows Software Library
[15] Engadget – Internet Archive expands software museum, invites you to dig in
[16] Tuaw via Wayback Machine – Internet Archive expands software collection, still needs more metadata
[17] VentureBeat – Internet Archive beefs up its free software museum
[18] Slate – Phew: Internet Not Lost in Fire at Internet Archive
[19] Internet Archive – Scanning Center Fire -- Please Help Rebuild
[20] Internet Archive – Donate to The Internet Archive
[21] Internet Archive – The Console Living Room
[22] Internet Archive – A Second Christmas Morning
[23] BBC – Internet Archive puts classic 70s and 80s games online
[24] Reddit – Internet Archive releases 70s and 80s games
[25] Mashable – Internet Archive Now Lets You Play Vintage
[26] Engadget – Internet Archive starts preserving classic game
[27] BoingBoing – Console Living Room
[28] Ars Technica – Internet Archive releases hundreds of classic game console ROMS