# board-game-scraper

Scraping data about board games from the web. View the data live at Recommend.Games! Install via

```bash
pip install board-game-scraper
```

## Sources
- Board Game Atlas (`bga`)
- BoardGameGeek (`bgg`)
- DBpedia (`dbpedia`)
- Luding.org (`luding`)
- Spielen.de (`spielen`)
- Wikidata (`wikidata`)
## Run scrapers
Requires Python 3. Make sure Pipenv is installed and create the virtual environment:

```bash
python3 -m pip install --upgrade pipenv
pipenv install --dev
pipenv shell
```

Run a spider like so:
```bash
JOBDIR="jobs/${SPIDER}/$(date --utc +'%Y-%m-%dT%H-%M-%S')"
scrapy crawl "${SPIDER}" \
    --output 'feeds/%(name)s/%(time)s/%(class)s.csv' \
    --set "JOBDIR=${JOBDIR}"
```

where `$SPIDER` is one of the IDs above.
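The `JOBDIR` pattern above yields one timestamped directory per run; re-running with the same directory is what lets Scrapy resume a paused crawl instead of starting over. A small Python sketch of the same path layout (the spider ID `bgg` is chosen arbitrarily for illustration):

```python
from datetime import datetime, timezone

# Mirror the shell command's UTC timestamp format, e.g. 2024-01-01T12-00-00.
spider = "bgg"  # one of the spider IDs listed above
stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
jobdir = f"jobs/{spider}/{stamp}"
print(jobdir)
```

Each fresh run gets its own directory under `jobs/<spider>/`, so resuming means passing a previous run's directory rather than generating a new one.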
Run all the spiders with the `run_scrapers.sh` script. Get a list of the running scrapers' PIDs with the `processes.sh` script. You can stop all the running scrapers via

```bash
./processes.sh stop
```

and resume them later.
## Tests
You can run `scrapy check` to perform contract tests for all spiders, or `scrapy check $SPIDER` to test one particular spider. If a test fails, there has most likely been some change on the website and the spider needs updating.
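Scrapy contracts are `@`-prefixed directives in a spider callback's docstring. The sketch below uses a plain function with a made-up URL and field names to show the format, plus a tiny helper that merely extracts the directives (the actual checks are performed by `scrapy check`, not by this helper):

```python
def parse(response=None):
    """Parse a single game page.

    Contract directives read by `scrapy check` (URL and field names are
    hypothetical examples, not taken from the real spiders):
    @url https://example.com/game/13
    @returns items 1 1
    @scrapes name year
    """

def contract_directives(callback):
    # Collect the @-prefixed contract lines from a callback's docstring.
    return [
        line.strip()
        for line in (callback.__doc__ or "").splitlines()
        if line.strip().startswith("@")
    ]

print(contract_directives(parse))
# → ['@url https://example.com/game/13', '@returns items 1 1', '@scrapes name year']
```

When a site's markup changes, the `@scrapes` fields stop being extracted and the contract test fails, which is the signal that the spider needs updating.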
## Board game datasets
If you are interested in using any of the datasets produced by this scraper, take a look at the BoardGameGeek guild. A subset of the data can also be found on Kaggle.
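The feeds are exported as one CSV file per item class; a minimal sketch of loading such a file with the standard library (the column names and rows here are invented for illustration, and the real feeds' schemas may differ):

```python
import csv
import io

# Hypothetical sample mimicking a scraped feed; real columns may differ.
sample = io.StringIO(
    "name,year,bgg_id\n"
    "Carcassonne,2000,822\n"
    "Azul,2017,230802\n"
)

games = list(csv.DictReader(sample))
recent = [g["name"] for g in games if int(g["year"]) >= 2010]
print(recent)  # → ['Azul']
```

For real feeds, replace the in-memory sample with `open("feeds/.../GameItem.csv")` or whatever path your scrape produced.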
## Links
- `board-game-scraper`: This repository
- Recommend.Games: Board game recommender using the scraped data
- `recommend-games-server`: Server code for Recommend.Games
- `board-game-recommender`: Recommender code for Recommend.Games

