In previous articles, we explored how to download and install PrestoDB locally on your machine. In this guide, we take it a step further: you'll learn how to set up and run a single-node Presto cluster using Docker, and connect it to Apache Superset. We'll walk through querying data from multiple sources like MySQL and MongoDB via PrestoDB. Whether you're a developer, data engineer, or BI enthusiast, this step-by-step tutorial will help you build a modern analytics stack with open-source tools and Docker.
Prerequisites:
- A Docker runtime (I am using OrbStack).
- Knowledge of basic Docker commands.
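Step -1: Project Structure:
The project root holds docker-compose.yml and a presto/etc/catalog/ directory containing mysql.properties and mongodb.properties. These paths match the volume mounts of the presto service in Step 2.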
Step -2: Setting Up Docker Compose:
version: "3.8"
services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports:
      - "8088:8088"
    environment:
      SUPERSET_SECRET_KEY: 'supersecretkey'
      PYTHONUNBUFFERED: 1
    depends_on:
      - db
    volumes:
      - superset_home:/app/superset_home
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8088/health"]
      interval: 30s
      timeout: 10s
      retries: 5
    command: >
      /bin/bash -c "
      sleep 10 &&
      superset db upgrade &&
      superset fab create-admin --username admin --firstname Admin --lastname User --email [email protected] --password admin &&
      superset init &&
      superset run -h 0.0.0.0 -p 8088
      "
  db:
    image: postgres:15
    container_name: superset_db
    environment:
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
    volumes:
      - db_data:/var/lib/postgresql/data
  mysql:
    image: mysql:latest
    container_name: mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: testdb
    ports:
      - "3307:3306"
    volumes:
      - mysql_data:/var/lib/mysql
  mongo:
    image: mongo:latest
    container_name: mongodb
    ports:
      - "27018:27017"
    volumes:
      - mongo_data:/data/db
  presto:
    image: prestodb/presto:latest
    container_name: presto
    ports:
      - "8081:8080"
    volumes:
      - ./presto/etc/catalog/mongodb.properties:/opt/presto-server/etc/catalog/mongodb.properties
      - ./presto/etc/catalog/mysql.properties:/opt/presto-server/etc/catalog/mysql.properties
    depends_on:
      - mysql
      - mongo
volumes:
  superset_home:
  db_data:
  mysql_data:
  mongo_data:
Step -3: Creating Presto Catalog Files:
mysql.properties (connects Presto to the MySQL database)
connector.name=mysql
connection-url=jdbc:mysql://mysql:3306
connection-user=root
connection-password=root
mongodb.properties (connects Presto to the MongoDB database)
connector.name=mongodb
mongodb.seeds=mongodb:27017
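Both files must exist at presto/etc/catalog/ before the stack starts. If a bind-mounted host path is missing, Docker creates a directory in its place and the catalog will not load. As a minimal sketch, the files can be generated with a few lines of Python. The helper name write_catalogs is my own, and the file contents are copied from the two listings above.

```python
from pathlib import Path

# Catalog contents copied from the listings above.
CATALOGS = {
    "mysql.properties": (
        "connector.name=mysql\n"
        "connection-url=jdbc:mysql://mysql:3306\n"
        "connection-user=root\n"
        "connection-password=root\n"
    ),
    "mongodb.properties": (
        "connector.name=mongodb\n"
        "mongodb.seeds=mongodb:27017\n"
    ),
}


def write_catalogs(project_root="."):
    """Create presto/etc/catalog/ under project_root and write both catalog files."""
    catalog_dir = Path(project_root) / "presto" / "etc" / "catalog"
    catalog_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, body in CATALOGS.items():
        path = catalog_dir / name
        path.write_text(body, encoding="utf-8")
        written.append(path)
    return written
```

Calling write_catalogs() from the directory that contains docker-compose.yml produces the layout the presto service mounts.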
Step -4: Start all the Services:
- Open a terminal and navigate to the directory that contains docker-compose.yml.
- Run the command below to start all the services in the background. The first run takes 3-5 minutes because Docker has to pull all the images.
docker-compose up -d
- Once all the images are pulled, run the command below to check the status of the containers.
docker ps
You should see the five containers (superset, superset_db, mysql, mongodb, and presto) listed as running. Now, let's confirm that PrestoDB and Apache Superset are listening on their respective ports.
Open a browser and check that Apache Superset responds on port 8088 (http://localhost:8088/) and Presto on port 8081 (http://localhost:8081/).
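The same check can be scripted instead of done in the browser. The sketch below polls both services until they answer. It assumes the host ports from the compose file (8088 and 8081), reuses the /health URL from Superset's healthcheck, and uses Presto's /v1/info endpoint. The function names and retry values are illustrative choices of mine.

```python
import time
import urllib.error
import urllib.request

# Assumed endpoints: /health comes from the Superset healthcheck in the
# compose file; /v1/info is the Presto coordinator's info endpoint.
ENDPOINTS = {
    "superset": "http://localhost:8088/health",
    "presto": "http://localhost:8081/v1/info",
}


def is_up(url, timeout=5):
    """Return True if the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_for(url, attempts=30, delay=10, probe=is_up, sleep=time.sleep):
    """Poll the URL until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


def main():
    for name, url in ENDPOINTS.items():
        state = "ready" if wait_for(url) else "not reachable"
        print(f"{name}: {state} ({url})")
```

Call main() once docker-compose up -d has returned. Superset usually takes the longest, since its container runs the database migrations before it starts serving.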
Step -5: Connecting PrestoDB as a database to Apache Superset:
- Superset doesn't ship with the Presto driver by default, so we need to install it manually. Run the command below to open a shell inside the Superset container.
docker exec -it superset bash
This drops you into a shell inside the superset container.
Next, install pyhive[presto], the Python package that Superset needs to connect to PrestoDB:
pip install "pyhive[presto]"
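- Once the installation is complete, leave the container with the exit command and restart it so Superset picks up the new driver. The package lives in the container's filesystem, so it has to be installed again if the container is recreated (for example, after docker-compose down).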
docker restart superset
- Open Superset in the browser at http://localhost:8088 and log in with the admin account created by the compose file.
Username: admin
Password: admin
- Navigate to Settings -> Database Connections -> Database.
Test the connection, and click CONNECT once you see "Connection looks good".
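The connection form asks for a SQLAlchemy URI. With pyhive, the format is presto://host:port/catalog. Because Superset runs inside the Compose network, it reaches Presto by its service name on the container port (8080), not on the host-mapped port 8081. The small helper below only illustrates how the URI is put together. The function name is hypothetical, and using mysql as the default catalog is my assumption.

```python
def presto_sqlalchemy_uri(host, port, catalog=None, schema=None, user=None):
    """Build a pyhive-style URI: presto://[user@]host:port[/catalog[/schema]]."""
    auth = f"{user}@" if user else ""
    uri = f"presto://{auth}{host}:{port}"
    if catalog:
        uri += f"/{catalog}"
        if schema:
            uri += f"/{schema}"
    return uri


# From inside the Superset container: service name "presto", container port 8080.
print(presto_sqlalchemy_uri("presto", 8080, catalog="mysql"))
```

The printed value is what goes into the SQLAlchemy URI field. From the host machine, the equivalent would use localhost and port 8081.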
Congratulations, everything is running smoothly and Presto is now connected to Apache Superset.
Step -6: Run a SQL query and verify that MySQL and MongoDB are visible as catalogs:
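In Superset's SQL Lab, running SHOW CATALOGS against the Presto connection lists every catalog Presto has loaded. mysql and mongodb should both appear, alongside any catalogs the image ships with. The same check can also be scripted from the host through Presto's REST API. This is a rough sketch using only the standard library. It assumes the host-mapped port 8081, and the user name sent in the X-Presto-User header is arbitrary.

```python
import json
import urllib.request


def http_json(url, body=None, headers=None):
    """POST when a body is given, otherwise GET; return the decoded JSON reply."""
    data = body.encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers or {})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_query(sql, base_url="http://localhost:8081", user="admin", fetch=http_json):
    """Submit SQL to /v1/statement and follow nextUri until the result is complete."""
    page = fetch(f"{base_url}/v1/statement", body=sql, headers={"X-Presto-User": user})
    rows = []
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        rows.extend(page.get("data", []))
        next_uri = page.get("nextUri")
        if not next_uri:
            return rows
        page = fetch(next_uri)


def list_catalogs(**kwargs):
    """Return the catalog names reported by SHOW CATALOGS, sorted."""
    return sorted(row[0] for row in run_query("SHOW CATALOGS", **kwargs))
```

With the stack running, print(list_catalogs()) shows the catalog names. If mysql or mongodb is missing, the corresponding .properties file was most likely not mounted correctly.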
Conclusion:
Follow Presto on its official website, LinkedIn, and YouTube, and join the Slack channel to interact with the community.