Internet Archive


Purpose

The Internet Archive Connector collects historical web page content so that you can compare it over time. It works with any website that has snapshots in the Internet Archive database, which opens up a range of practical marketing uses. For example:

  • How has your competitor's product portfolio developed over time?
  • What titles have been used over time?
  • What images have been used over time?
  • How many interactions, such as shares or comments, have been registered over time?

How To

  1. Make sure you have an API key for PhantomJS Cloud (see "Getting an API key" below).

  2. Enter the website you want to examine and the year to collect historical snapshots from.

  3. Next, click on Insert and SeoTools will list all URLs available for the desired time span.

  4. Inspect one of the collected URLs to work out a scraping expression that extracts the content you are interested in. SeoTools supports several options, for example XPath, JSON, Regex, and CsQuery. The standard HTML functions work as well, for example HTMLTitle and LinkCount.
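
To make step 3 more concrete, here is a minimal Python sketch of how snapshot URLs for a site and year could be listed with the Internet Archive's public Wayback CDX API. This is an illustration only, not the Connector's actual implementation, which this page does not describe. The function names and the rows in SAMPLE_ROWS are invented for the example; in practice the rows would come from fetching the query URL.

```python
from urllib.parse import urlencode

# Public Wayback Machine CDX endpoint.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(site: str, year: int) -> str:
    """Build a CDX query URL for all captures of `site` within `year`."""
    params = {
        "url": site,
        "from": str(year),   # a 4-digit prefix limits results to that year
        "to": str(year),
        "output": "json",    # first row of the JSON response is a header
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_urls(cdx_rows):
    """Turn CDX JSON rows (header row first) into Wayback snapshot URLs."""
    if not cdx_rows:
        return []
    header, *records = cdx_rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in records]


# Hypothetical sample data in the CDX JSON layout (not real captures).
SAMPLE_ROWS = [
    ["urlkey", "timestamp", "original", "mimetype", "statuscode", "digest", "length"],
    ["com,example)/", "20150103120000", "http://example.com/", "text/html", "200", "SAMPLE1", "1200"],
    ["com,example)/", "20150618083000", "http://example.com/", "text/html", "200", "SAMPLE2", "1310"],
]

if __name__ == "__main__":
    print(build_cdx_query("example.com", 2015))
    for url in snapshot_urls(SAMPLE_ROWS):
        print(url)
```

Each resulting URL points at one archived capture, which is the kind of URL you would then inspect in step 4.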

Metrics

The Connector also includes the following metrics:

  • First Seen
  • Last Seen
  • Archived Count
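
This page does not define the three metrics precisely. As a rough sketch, assuming First Seen is the earliest capture, Last Seen is the latest capture, and Archived Count is the number of captures, they could be derived from Wayback-style 14-digit timestamps (YYYYMMDDhhmmss) like this. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime


def archive_metrics(timestamps):
    """Summarise capture timestamps given as YYYYMMDDhhmmss strings.

    Assumed definitions (not confirmed by the Connector documentation):
    first_seen = earliest capture, last_seen = latest capture,
    archived_count = total number of captures.
    """
    if not timestamps:
        return {"first_seen": None, "last_seen": None, "archived_count": 0}
    parsed = sorted(datetime.strptime(t, "%Y%m%d%H%M%S") for t in timestamps)
    return {
        "first_seen": parsed[0],
        "last_seen": parsed[-1],
        "archived_count": len(parsed),
    }


if __name__ == "__main__":
    # Hypothetical timestamps, deliberately out of order.
    sample = ["20150618083000", "20110204101500", "20190930235959"]
    print(archive_metrics(sample))
```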

Getting an API key

  1. Register for a PhantomJS Cloud account.
  2. Activate your account.
  3. Retrieve your API key from the dashboard.

Contribute

This connector suite is open-sourced on GitHub.

