
I'm working on a data pipeline using Airflow, AWS Redshift, and S3. I'm trying to launch Airflow with Docker, but I'm getting errors when I run docker-compose up.

The error:

Creating airflow_postgres_1 ... done
Creating airflow_webserver_1 ... done
Attaching to airflow_postgres_1, airflow_webserver_1
postgres_1   | The files belonging to this database system will be owned by user "postgres".
postgres_1   | This user must also own the server process.
postgres_1   | 
postgres_1   | The database cluster will be initialized with locale "en_US.utf8".
postgres_1   | The default database encoding has accordingly been set to "UTF8".
postgres_1   | The default text search configuration will be set to "english".
postgres_1   | 
postgres_1   | Data page checksums are disabled.
postgres_1   | 
postgres_1   | fixing permissions on existing directory /var/lib/postgresql/data ... ok
postgres_1   | creating subdirectories ... ok
postgres_1   | selecting default max_connections ... 100
postgres_1   | selecting default shared_buffers ... 128MB
postgres_1   | selecting default timezone ... Etc/UTC
postgres_1   | selecting dynamic shared memory implementation ... posix
postgres_1   | creating configuration files ... ok
postgres_1   | running bootstrap script ... ok
postgres_1   | performing post-bootstrap initialization ... ok
postgres_1   | syncing data to disk ... ok
postgres_1   | 
postgres_1   | WARNING: enabling "trust" authentication for local connections
postgres_1   | You can change this by editing pg_hba.conf or using the option -A, or
postgres_1   | --auth-local and --auth-host, the next time you run initdb.
postgres_1   | 
postgres_1   | Success. You can now start the database server using:
postgres_1   | 
postgres_1   |     pg_ctl -D /var/lib/postgresql/data -l logfile start
postgres_1   | 
postgres_1   | waiting for server to start....LOG:  database system was shut down at 2020-05-20 22:59:54 UTC
postgres_1   | LOG:  MultiXact member wraparound protections are now enabled
postgres_1   | LOG:  database system is ready to accept connections
postgres_1   | LOG:  autovacuum launcher started
postgres_1   |  done
postgres_1   | server started
postgres_1   | CREATE DATABASE
postgres_1   | 
postgres_1   | 
postgres_1   | /usr/local/bin/docker-entrypoint.sh: ignoring /docker-entrypoint-initdb.d/*
postgres_1   | 
postgres_1   | LOG:  received fast shutdown request
postgres_1   | LOG:  aborting any active transactions
postgres_1   | LOG:  autovacuum launcher shutting down
postgres_1   | waiting for server to shut down....LOG:  shutting down
postgres_1   | LOG:  database system is shut down
webserver_1  | DB: postgresql://airflow_user:***@postgres/airflow
webserver_1  | [2020-05-20 22:59:57,188] {db.py:378} INFO - Creating tables
webserver_1  | Traceback (most recent call last):
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2345, in _wrap_pool_connect
webserver_1  |     return fn()
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
webserver_1  |     return _ConnectionFairy._checkout(self)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
webserver_1  |     fairy = _ConnectionRecord.checkout(pool)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
webserver_1  |     rec = pool._do_get()
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
webserver_1  |     self._dec_overflow()
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
webserver_1  |     exc_value, with_traceback=exc_tb,
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
webserver_1  |     raise exception
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
webserver_1  |     return self._create_connection()
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
webserver_1  |     return _ConnectionRecord(self)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
webserver_1  |     self.__connect(first_connect_check=True)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
webserver_1  |     pool.logger.debug("Error on connect(): %s", e)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 69, in __exit__
webserver_1  |     exc_value, with_traceback=exc_tb,
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
webserver_1  |     raise exception
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
webserver_1  |     connection = pool._invoke_creator(self)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
webserver_1  |     return dialect.connect(*cargs, **cparams)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 490, in connect
webserver_1  |     return self.dbapi.connect(*cargs, **cparams)
webserver_1  |   File "/usr/local/lib/python3.7/site-packages/psycopg2/__init__.py", line 127, in connect
webserver_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
webserver_1  | psycopg2.OperationalError: could not connect to server: Connection refused
webserver_1  |  Is the server running on host "postgres" (172.24.0.2) and accepting
webserver_1  |  TCP/IP connections on port 5432?

So far I've tried:

Different sql_alchemy_conn strings in airflow.cfg (all resulted in the same connection-refused error as above):

sql_alchemy_conn = postgresql://redshiftuser:redshiftpassword@host:5439/db
sql_alchemy_conn = postgresql://postgresuser:[email protected]:5432/db
sql_alchemy_conn = postgresql+psycopg2://postgresuser:password@postgres:5432/db
sql_alchemy_conn = postgresql+psycopg2://postgresuser:postgresuserpassword@postgres:5432/db
sql_alchemy_conn = postgresql+psycopg2://postgresuser:postgresspassword@localhost:5432/db
sql_alchemy_conn = postgresql://postgresuser:postgresspassword@localhost:5432/db
sql_alchemy_conn = postgresql+psycopg2://postgresuser:databasepassword@localhost:5432/db
sql_alchemy_conn = postgresql://postgresuser:databasepassword@postgres/db

Within the postgresql.conf file I've changed:

listen_addresses = 'localhost'

to

listen_addresses = '*'

Within the Postgres UI I've tried:

ALTER DATABASE airflow CONNECTION LIMIT 5; ALTER SYSTEM SET listen_addresses = '*';

Relevant part of docker-compose.yaml:

version: '3.7'
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow_user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=airflow
    logging:
      options:
        max-size: 10m
        max-file: "3"
    ports:
      - "5432"

The github repo link: https://github.com/marshall7m/data_engineering_capstone/tree/master/airflow

Can you include a relevant extract from the docker-compose.yml in the question (at least including both services, the images you're running, and any network configuration)? Are you using any sort of scheme to wait for the database to be fully up before connecting to it? Commented May 21, 2020 at 21:30
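In case it helps others hitting the same race (the log above shows the webserver trying to connect while Postgres is still finishing its first-start shutdown/restart), here is a minimal sketch of the kind of wait scheme the comment hints at, using a healthcheck plus depends_on conditions. The webserver service name is assumed from the logs above, and depends_on conditions require a Compose version that supports them (the Compose Specification used by docker compose v2; the plain 3.x file format under docker-compose v1 does not honour the condition form):

# Sketch only: have the webserver wait until Postgres actually answers,
# instead of racing its first-start shutdown/restart cycle.
services:
  postgres:
    image: postgres:9.6
    environment:
      - POSTGRES_USER=airflow_user
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=airflow
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U airflow_user -d airflow"]
      interval: 5s
      timeout: 5s
      retries: 5
  webserver:
    # image/build details omitted; service name assumed from the logs above
    depends_on:
      postgres:
        condition: service_healthy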

1 Answer


You never tried the correct combination of username, password, database, and host as defined in your docker-compose.yaml!

Your host is postgres, which is the service name in the docker-compose.yaml; the other values come from the environment variables defined there.

The connection string should be something like the following:

postgresql://airflow_user:password@postgres:5432/airflow
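If you would rather not edit airflow.cfg inside the image, the same value can be supplied through Airflow's standard AIRFLOW__SECTION__KEY environment-variable override. A minimal sketch, assuming the webserver service from the logs above and port 5432 (the default port the postgres service listens on):

services:
  webserver:
    # image/build details omitted
    environment:
      - AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow_user:password@postgres:5432/airflow
    depends_on:
      - postgres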

Unrelated to the question, but please never commit plain-text passwords to Git (even in a private repo); use Mozilla sops or something similar to encrypt the values. Also avoid default passwords; use pwgen or any other random password generator.
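One way to act on that while staying in Compose is to keep the credentials in a git-ignored env file referenced by the service, and encrypt that file with sops (or similar) before it goes anywhere near a repository. A hypothetical sketch; the postgres.env filename is made up for illustration:

services:
  postgres:
    image: postgres:9.6
    env_file:
      - ./postgres.env   # git-ignored; holds POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB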


1 Comment

Thank you for the answer! I tried the connection string format you mentioned in your answer and it worked! Regarding the security tip you gave, was the plain-text password something I explicitly stated, or was it within the traceback error snippet? The credentials within the docker-compose.yml and connection string were fake, but I appreciate the tip and will be more aware next time.
