The Wayback Machine - https://web.archive.org/web/20220920031024/https://github.com/scrapinghub

Scrapinghub

Turn web content into useful data

Pinned

  1. splash Public

    Lightweight, scriptable browser as a service with an HTTP API

    Python, 3.7k stars, 495 forks

  2. dateparser Public

    Python parser for human-readable dates

    Python, 2.1k stars, 421 forks

  3. A client interface for Scrapinghub's API

    Python, 182 stars, 61 forks

  4. extruct Public

    Extract embedded metadata from HTML markup

    Python, 703 stars, 102 forks

  5. spidermon Public

    Scrapy extension for monitoring spider execution.

    Python, 428 stars, 81 forks

  6. A Python binding for CRFsuite

    Python, 741 stars, 216 forks

Repositories