Archive site

An archive site is a type of website that stores past versions of web pages, or information about them, for anyone to view.

=Common Techniques=

Two common techniques for building an archive are (1) using a web crawler and (2) accepting user submissions.

(1) By using a web crawler, the service does not depend on an active community for its content. This lets it build a larger database faster, which usually results in the community growing larger as well. However, web site developers and system administrators can block these robots from accessing certain web pages (using a robots.txt file).
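As a minimal sketch of how a well-behaved crawler might honor such a block, the example below uses Python's standard `urllib.robotparser` module. The robots.txt rules, the `ExampleArchiveBot` user agent name, and the example.com URLs are assumptions made up for illustration; they do not describe any particular archive site's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, as a site administrator might publish it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_archive(parser: RobotFileParser, url: str,
                agent: str = "ExampleArchiveBot") -> bool:
    """Return True if the parsed rules allow this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


parser = build_parser(ROBOTS_TXT)
allowed = may_archive(parser, "http://example.com/index.html")
blocked = may_archive(parser, "http://example.com/private/notes.html")
```

Here `allowed` is true and `blocked` is false, because only paths under `/private/` are disallowed. A real crawler would fetch each site's own robots.txt before requesting pages from it.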

(2) While it can be difficult to start such a service because of potentially low rates of user submission, this approach can yield some of the best results. A crawler can only obtain information that people have bothered to post to the Internet. They may not have posted it because they thought no one would be interested, because they lacked a proper medium, and so on. However, if they see that someone wants their information, they may be more apt to submit it.
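The submission approach can be sketched as a small in-memory store that keeps every submitted copy of a page together with the time it was received. The names `Snapshot`, `SubmissionArchive`, `submit`, and `history` are hypothetical and chosen only for this illustration; a real service would also need persistent storage, moderation, and duplicate handling.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Snapshot:
    url: str
    content: str
    submitted_at: datetime


@dataclass
class SubmissionArchive:
    # Maps each URL to the list of snapshots users have submitted for it.
    _by_url: dict = field(default_factory=dict)

    def submit(self, url, content, submitted_at=None):
        """Store one user-submitted copy of a page and return it."""
        when = submitted_at or datetime.now(timezone.utc)
        snapshot = Snapshot(url, content, when)
        self._by_url.setdefault(url, []).append(snapshot)
        return snapshot

    def history(self, url):
        """Return all snapshots of a URL, oldest first."""
        return sorted(self._by_url.get(url, []), key=lambda s: s.submitted_at)


archive = SubmissionArchive()
archive.submit("http://example.com/", "second version",
               datetime(2005, 6, 1, tzinfo=timezone.utc))
archive.submit("http://example.com/", "first version",
               datetime(2004, 1, 1, tzinfo=timezone.utc))
versions = [s.content for s in archive.history("http://example.com/")]
```

Keeping every submission, instead of overwriting the previous one, is what lets visitors view a page as it looked at different points in the past.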

=Examples=

==Google Groups==

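Google Groups hosts a searchable archive of Usenet postings, which Google acquired from Deja.com in 2001.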

==Internet Archive==

The Internet Archive employs a web crawler to build up its database. It is one of the best-known archive sites.

==TextFiles.com==

[http://www.textfiles.com TextFiles.com] is a large library of old text files maintained by Jason Scott Sadofsky. Its mission is to archive the old documents that floated around the bulletin board systems (BBSes) of his youth and to document other people's experiences on those BBSes.

==PANDORA Archive==

PANDORA, founded in 1996 by the National Library of Australia, stands for Preserving and Accessing Networked Documentary Resources of Australia, which encapsulates its mission. It provides a long-term catalog of selected online publications and web sites that are authored by Australians or cover an Australian topic. The catalog is built using PANDAS (PANDORA Digital Archiving System).

=See Also=

  • Internet Archive's [http://web.archive.org Wayback Machine]
  • - A guide to the Internet Archive's Wayback Machine.
  • Pandora Archive - Wikipedia article on PANDORA
  • [http://pandora.nla.gov.au/ PANDORA] - Official web site
  • [http://www.nla.gov.au/ National Library of Australia] - Hosts PANDORA