The Wayback Machine - https://web.archive.org/web/20200913231951/https://scrapy.org/
pip install shub
shub login
Insert your Scrapinghub API Key: <API_KEY>

# Deploy the spider to Scrapy Cloud
shub deploy

# Schedule the spider for execution
shub schedule blogspider
Spider blogspider scheduled, watch it running here:
https://app.scrapinghub.com/p/26731/job/1/8

# Retrieve the scraped data
shub items 26731/1/8
{"title":"Improved Frontera: Web Crawling at Scale with Python 3 Support"}
{"title":"How to Crawl the Web Politely with Scrapy"}
...
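The items returned by shub items arrive as a stream of JSON objects, so they can be post-processed with a few lines of Python. The following is a minimal sketch, not part of shub or Scrapy: the helper name parse_items is hypothetical, and it assumes the output has been captured as text and that every item has a "title" field, as in the sample above.

```python
import json


def parse_items(text):
    """Parse consecutive JSON objects separated by optional whitespace.

    Hypothetical helper: accepts one object per line as well as
    objects written back to back on a single line.
    """
    decoder = json.JSONDecoder()
    items = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed object and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        items.append(obj)
    return items


# Sample input copied from the two items shown above.
sample = (
    '{"title":"Improved Frontera: Web Crawling at Scale with Python 3 Support"}\n'
    '{"title":"How to Crawl the Web Politely with Scrapy"}\n'
)

titles = [item["title"] for item in parse_items(sample)]
for title in titles:
    print(title)
```

In practice the text would come from a file or from standard input rather than a hard-coded string, for example after redirecting the command output to a file.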