
Saurabh Mahawar


Setting Up Presto with Apache Superset using Docker 🐳 : Hands-On Guide

In previous articles, we explored how to download and install PrestoDB locally on your machine. In this guide, we take it a step further: you'll learn how to set up and run a single-node Presto cluster using Docker, and connect it to Apache Superset. We'll walk through querying data from multiple sources like MySQL and MongoDB via PrestoDB. Whether you're a developer, data engineer, or BI enthusiast, this step-by-step tutorial will help you build a modern analytics stack with open-source tools and Docker.

Prerequisites:

  • Docker installed and running (I am using OrbStack).
  • Basic knowledge of Docker commands.

Step -1: Project Structure:

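The project folder needs docker-compose.yml at its top level and the two catalog files, mongodb.properties and mysql.properties, under presto/etc/catalog/ (the paths mounted in Step -2).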

Step -2: Setting Up Docker Compose:

version: "3.8"

services:
  superset:
    image: apache/superset:latest
    container_name: superset
    ports:
      - "8088:8088"
    environment:
      SUPERSET_SECRET_KEY: 'supersecretkey'
      PYTHONUNBUFFERED: 1
    depends_on:
      - db
    volumes:
      - superset_home:/app/superset_home
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8088/health"]
      interval: 30s
      timeout: 10s
      retries: 5
    # admin@example.com is a placeholder e-mail address; replace it with your own.
    command: >
      /bin/bash -c "
      sleep 10 &&
      superset db upgrade &&
      superset fab create-admin --username admin --firstname Admin --lastname User --email admin@example.com --password admin &&
      superset init &&
      superset run -h 0.0.0.0 -p 8088
      "

  db:
    image: postgres:15
    container_name: superset_db
    environment:
      POSTGRES_DB: superset
      POSTGRES_USER: superset
      POSTGRES_PASSWORD: superset
    volumes:
      - db_data:/var/lib/postgresql/data

  mysql:
    image: mysql:latest
    container_name: mysql
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: testdb
    ports:
      - "3307:3306"
    volumes:
      - mysql_data:/var/lib/mysql

  mongo:
    image: mongo:latest
    container_name: mongodb
    ports:
      - "27018:27017"
    volumes:
      - mongo_data:/data/db

  presto:
    image: prestodb/presto:latest
    container_name: presto
    ports:
      - "8081:8080"
    volumes:
      - ./presto/etc/catalog/mongodb.properties:/opt/presto-server/etc/catalog/mongodb.properties
      - ./presto/etc/catalog/mysql.properties:/opt/presto-server/etc/catalog/mysql.properties
    depends_on:
      - mysql
      - mongo

volumes:
  superset_home:
  db_data:
  mysql_data:
  mongo_data:
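
Each ports entry in the compose file is written as host:container, which is why two different addresses appear in this guide: a browser on the host uses the left-hand port, while containers on the Compose network reach each other by service name and the right-hand port. Here is a minimal Python sketch of that rule. The helper names split_port_mapping and addresses are hypothetical, and the mappings are copied from the compose file above.

```python
# Port mappings copied from the docker-compose.yml above ("host:container").
PORTS = {
    "superset": "8088:8088",
    "mysql": "3307:3306",
    "mongo": "27018:27017",
    "presto": "8081:8080",
}


def split_port_mapping(mapping: str) -> tuple[int, int]:
    """Split a "host:container" string into two integers."""
    host_port, container_port = mapping.split(":")
    return int(host_port), int(container_port)


def addresses(service: str) -> dict[str, str]:
    """Return the address to use from the host and from other containers."""
    host_port, container_port = split_port_mapping(PORTS[service])
    return {
        "from_host": f"localhost:{host_port}",
        "from_containers": f"{service}:{container_port}",
    }


if __name__ == "__main__":
    for name in PORTS:
        print(name, addresses(name))
```

This is why the MySQL catalog file points at mysql:3306 even though MySQL is published on port 3307 of the host.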

Step -3: Creating Presto Catalog Files:

mysql.properties (connects Presto to the MySQL database)

connector.name=mysql
connection-url=jdbc:mysql://mysql:3306
connection-user=root
connection-password=root

mongodb.properties (connects Presto to the MongoDB database)

connector.name=mongodb
mongodb.seeds=mongodb:27017
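
If you would rather generate the two catalog files than create them by hand, a short Python sketch can write them to the paths that the compose file mounts. The function name write_catalogs and its root parameter are hypothetical; the file contents are exactly the properties shown above.

```python
from pathlib import Path

# File contents copied from the two catalog files shown above.
CATALOGS = {
    "mysql.properties": (
        "connector.name=mysql\n"
        "connection-url=jdbc:mysql://mysql:3306\n"
        "connection-user=root\n"
        "connection-password=root\n"
    ),
    "mongodb.properties": (
        "connector.name=mongodb\n"
        "mongodb.seeds=mongodb:27017\n"
    ),
}


def write_catalogs(root: str = ".") -> list[Path]:
    """Create presto/etc/catalog under root and write both catalog files."""
    catalog_dir = Path(root) / "presto" / "etc" / "catalog"
    catalog_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for filename, content in CATALOGS.items():
        path = catalog_dir / filename
        path.write_text(content)
        written.append(path)
    return written
```

Calling write_catalogs() from the folder that holds docker-compose.yml produces presto/etc/catalog/mysql.properties and presto/etc/catalog/mongodb.properties. Presto names each catalog after its file, so these two files become the mysql and mongodb catalogs.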

Step -4: Start all the Services:

  • Open a terminal and navigate to the directory that contains docker-compose.yml.


  • Run the command below. It starts all the services in the background; allow 3-5 minutes on the first run while Docker pulls the images.
docker-compose up -d
  • Once all the images are pulled, run the command below to check the status of the containers.
docker ps

Docker Running Container's Status

Orbstack

  • The output should resemble the screenshots above, with all five containers (superset, superset_db, mysql, mongodb, and presto) listed as running. Next, confirm that PrestoDB and Apache Superset are reachable on their respective ports.

  • Open a browser and check that Apache Superset responds on port 8088 (http://localhost:8088/) and Presto on port 8081 (http://localhost:8081/).

Apache Superset is listening on port 8088

Presto is listening on port 8081
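
The same check can be scripted instead of done in a browser. The sketch below, using only the Python standard library, tries to open a TCP connection to each published port; the helper names port_open and check_services are hypothetical, and the ports are the host-side ports from the compose file.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services: dict[str, int], host: str = "localhost") -> dict[str, bool]:
    """Map each service name to whether its published port accepts connections."""
    return {name: port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # Host-side ports published in the docker-compose.yml above.
    for name, ok in check_services({"superset": 8088, "presto": 8081}).items():
        print(f"{name}: {'listening' if ok else 'not reachable'}")
```

An open port only shows that something is listening. Superset may still be running its startup commands, so give it a little time before logging in.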

Step -5: Connecting PrestoDB as a database to Apache Superset:

  • Superset doesn't ship with a Presto driver by default, so the next step is to install one manually. Run the command below to open a shell inside the Superset container.
docker exec -it superset bash
  • You are now inside the Superset container.

  • Install pyhive[presto], the Python package Superset needs to connect to PrestoDB, with the command below.

pip install "pyhive[presto]"
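  • Once the installation is complete, leave the container with the exit command and restart Superset so it picks up the new package. A package installed this way survives docker restart but is lost if the container is recreated.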
docker restart superset
  • Open Superset in the browser at http://localhost:8088 and log in with the admin account created in the compose file.
Username: admin
Password: admin
  • Navigate to Settings -> Database Connections and click + Database to add a new connection.

  • Choose Presto as the database type, enter the SQLAlchemy URI for your Presto server, then click on Test Connection to check the status.

  • Click on CONNECT once you see "Connection looks good".

  • Congratulations, everything is running smoothly and Presto is now connected to Apache Superset.
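
The connection form expects a SQLAlchemy URI, and with PyHive the usual shape is presto://host:port/catalog. The sketch below builds one. It assumes Superset reaches Presto over the Compose network at presto:8080 (the container port, not the host-side 8081) and uses mysql as the default catalog. The helper name presto_uri is hypothetical.

```python
from typing import Optional


def presto_uri(host: str, port: int, catalog: Optional[str] = None) -> str:
    """Build a PyHive-style SQLAlchemy URI: presto://host:port/catalog."""
    base = f"presto://{host}:{port}/"
    return base + catalog if catalog else base


# Assumption: Superset runs in the same Compose project, so it reaches Presto
# by name on the container port (8080), not the host port (8081).
SUPERSET_URI = presto_uri("presto", 8080, "mysql")

if __name__ == "__main__":
    print(SUPERSET_URI)
```

From a client running on the host rather than in a container, the equivalent would be presto_uri("localhost", 8081, "mysql").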

Step -6: Run a SQL Query and Verify That MySQL and MongoDB Are Visible as Catalogs:

Query Executed Successfully with MySQL and MongoDB as catalogs.
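
In Superset's SQL Lab, a query such as SHOW CATALOGS lists the catalogs Presto has loaded. The same check can be run from the host with PyHive, as in the sketch below. It assumes pyhive[presto] is also installed on the host and that Presto is published on localhost:8081 as in the compose file. The function names are hypothetical, and the expected catalog names follow from the two .properties file names.

```python
from typing import Iterable

# Catalog names follow the .properties file names from Step -3.
EXPECTED = ("mysql", "mongodb")


def list_catalogs(cursor) -> list[str]:
    """Run SHOW CATALOGS on a DB-API cursor and return the catalog names."""
    cursor.execute("SHOW CATALOGS")
    return [row[0] for row in cursor.fetchall()]


def missing_catalogs(found: Iterable[str], expected: Iterable[str] = EXPECTED) -> list[str]:
    """Return the expected catalogs that are absent from found."""
    found_set = set(found)
    return [name for name in expected if name not in found_set]


def check_presto(host: str = "localhost", port: int = 8081) -> list[str]:
    """Connect through PyHive and return any expected catalogs that are missing."""
    from pyhive import presto  # requires: pip install "pyhive[presto]"

    cursor = presto.connect(host=host, port=port, catalog="mysql").cursor()
    return missing_catalogs(list_catalogs(cursor))
```

Calling check_presto() should return an empty list when both catalogs are loaded. A non-empty result usually points at a catalog file that was not mounted into the Presto container.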

Conclusion:


Follow Presto on its official website, LinkedIn, and YouTube, and join the Slack channel to interact with the community.

Top comments (2)

Propelius

good post!

Saurabh Mahawar

Thanks @propelius