Gerson Morales

Automated RDS PostgreSQL Restoration Using GitHub Actions 🐘

This document provides a step-by-step guide for integrating GitHub Actions to automate the restoration of a production AWS RDS PostgreSQL database for use in development environments.

Infrastructure

  1. EKS cluster.
  2. RDS database running in AWS (just for this scenario).
  3. GitHub repository.
  4. AWS IAM service.
  5. AWS S3 bucket.

Scenario

We have an Amazon RDS cluster running a PostgreSQL database, referred to as gerx_db_prod. Additionally, there are several lower-tier environments, such as gerx_db_dev_1, gerx_db_dev_2, and gerx_db_dev_3, that need to be kept up to date with the latest data from the production database. The goal is to enable on-demand synchronization of these environments by leveraging a GitHub Actions workflow.
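
The workflow later in this guide expects a plain-SQL dump named postgres_dump_<yyyymmdd>.sql to already exist in an S3 bucket. If you do not yet produce one, a minimal sketch of a daily dump job could look like this (the prod host name and bucket name are assumptions based on the naming conventions used later in this guide):

#!/bin/bash
# Sketch: dump the production database and upload it to S3.
# Assumes pg_dump and the AWS CLI are installed and PGPASSWORD is set.
set -euo pipefail

DATE=$(date +%Y%m%d)                 # matches the yyyymmdd format the workflow expects
DUMP_FILE="postgres_dump_${DATE}.sql"

pg_dump -h rds.gerx24.usva.prod.gersonplace.com -p 5432 -U root \
        -d gerx_db_prod --no-owner --no-privileges -f "$DUMP_FILE"

aws s3 cp "$DUMP_FILE" "s3://pgdump-my-bucket-name/$DUMP_FILE"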

AWS IAM configuration

We require the use of AWS IAM to create two roles with an established trust relationship. Additionally, we need to define two IAM policies:

First Policy: Policy granting permission to retrieve a pgdump file from an S3 bucket, which will be used for database restoration.

Second Policy: Policy providing access to an Amazon EKS cluster from a GitHub workflow.

I previously wrote a guide on configuring IRSA:

https://dev.to/gerson_morales_3e89188d50/configure-irsa-using-eks-to-access-s3-from-a-pod-513a

The good news is that I'll make this easier for you by showing what the code looks like for each role.

The role that allows pulling objects from the S3 bucket looks like this.

Role name: postgres-db-dev-restore-IRSA

Trusted entities

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::XXXXXXXXXXXXXXXX:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA:sub:sub": "system:serviceaccount:<EKS-NAMESPACE>:<EKS-SERVICE_ACCOUNT>",
                    "oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA:sub:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
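
The OIDC provider ID in the trust policy comes from your cluster. If you don't have it handy, you can read it off the cluster itself (the cluster name here is the one used later in this guide):

# Print the cluster's OIDC issuer URL; the trailing ID goes into the trust policy above.
aws eks describe-cluster --name dev-usva-gerx24-cluster --region us-east-1 \
  --query "cluster.identity.oidc.issuer" --output text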

Policy attached to the role postgres-db-dev-restore-IRSA

{
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:PutObject"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::pgdump_my_bucket_name",
                "arn:aws:s3:::pgdump_my_bucket_name/*"
            ]
        }
    ],
    "Version": "2012-10-17"
}
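
If you prefer the CLI over the console, creating the role and attaching the policy could look like this (the JSON file names are assumptions; save the two documents above into them first):

# Create the IRSA role from the trust policy, then attach the S3 policy inline.
aws iam create-role --role-name postgres-db-dev-restore-IRSA \
  --assume-role-policy-document file://trust-policy.json
aws iam put-role-policy --role-name postgres-db-dev-restore-IRSA \
  --policy-name s3-pgdump-access --policy-document file://s3-policy.json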

The role that allows the GitHub repository's main branch to access the EKS cluster looks like this.

Role name: postgres-db-dev-eks-refresh

Trusted entities

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::xxxxxxxxxxx:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:myrepo:ref:refs/heads/main"
        }
      }
    }
  ]
}
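
Note that this trust policy only works if the GitHub OIDC provider already exists in your AWS account. If it doesn't, it can be registered once per account (the thumbprint below is GitHub's published value at the time of writing; verify it against GitHub's documentation before using it):

# One-time setup: register GitHub's OIDC provider with IAM.
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1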

Policy attached to the role postgres-db-dev-eks-refresh

{
    "Statement": [
        {
            "Action": [
                "eks:DescribeCluster",
                "eks:ListClusters",
                "eks:AccessKubernetesApi"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:eks:us-east-1:xxxxxxxx:cluster/dev-usva-gerx24-cluster"
        },
        {
            "Action": [
                "sts:AssumeRole"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ],
    "Version": "2012-10-17"
}

With the two new IAM roles created, each configured with its respective trust relationship and attached policies, you will need to retrieve their ARNs. These ARNs are required in two specific places, which I will outline next.

The role ARN for postgres-db-dev-restore-IRSA (arn:aws:iam::xxxxxx:role/postgres-db-dev-restore-IRSA) should be associated with the Kubernetes service account responsible for retrieving the pgdump file (postgres_dump_$DATE.sql) from S3. This file will be used to perform the database restoration.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: restore-db-sa
  namespace: restore-db
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxx:role/postgres-db-dev-restore-IRSA

The role ARN for postgres-db-dev-eks-refresh (arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh) must be added to the EKS cluster by updating the aws-auth ConfigMap in the kube-system namespace. The modification should be as follows:

    - "groups":
      - "github-ci-group"
      "rolearn": "arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh"
      "username": "github:db-restore"
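
Rather than editing aws-auth by hand, eksctl can add the same mapping (the cluster name is the one used elsewhere in this guide):

# Map the GitHub role into the cluster without hand-editing the aws-auth ConfigMap.
eksctl create iamidentitymapping \
  --cluster dev-usva-gerx24-cluster --region us-east-1 \
  --arn arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh \
  --username github:db-restore \
  --group github-ci-group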

Finally, we need to configure RBAC to grant the role access exclusively to the namespace where the GitHub-triggered job responsible for database restoration will be deployed.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: restore-db
  name: postgres-db-restore-role
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: postgres-db-restore-rolebinding
  namespace: restore-db
subjects:
  - kind: User
    name: github:db-restore
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: postgres-db-restore-role
  apiGroup: rbac.authorization.k8s.io
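
You can sanity-check the binding by impersonating the mapped user:

# Should print "yes" if the Role and RoleBinding are wired up correctly.
kubectl auth can-i create jobs -n restore-db \
  --as github:db-restore --as-group github-ci-group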

Configure a ConfigMap with a SQL script that grants privileges to postgres_owner_role.

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-db-restore-dev
  namespace: restore-db
data:
  privileges.sql: |
    DO $$
    BEGIN
      -- Grant privileges on all tables in all schemas
      EXECUTE (
        SELECT string_agg(
          'GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA ' || quote_ident(schemaname) || ' TO postgres_owner_role;',
          ' '
        )
        FROM pg_catalog.pg_tables
        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
      );
    END
    $$;
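
If you want to test the script before the job runs it, you can pull it out of the ConfigMap and run it against a dev database yourself (host and database names follow the conventions used elsewhere in this guide):

# Extract the script from the ConfigMap and run it manually against a dev database.
kubectl get configmap postgres-db-restore-dev -n restore-db \
  -o jsonpath='{.data.privileges\.sql}' > privileges.sql
PGPASSWORD='your-db-password' psql -h rds.gerx24.usva.dev.gersonplace.com \
  -p 5432 -U root -d gerx_db_dev_1 -f privileges.sql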

Configure a Kubernetes Secret, in this case an ExternalSecret, containing the database password that will be used in the PGPASSWORD variable.

---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-refresh-dev
  namespace: restore-db
  annotations:
    argocd.argoproj.io/sync-wave: "-12"
spec:
  dataFrom:
  - extract:
      key: db-refresh-dev/db-refresh-dev-secret
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: db-refresh-dev
    template:
      engineVersion: v2
      type: Opaque
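
If you are not running external-secrets, a plain Kubernetes Secret works just as well for this purpose (the password value is a placeholder):

# Alternative to the ExternalSecret above: create the secret directly.
kubectl create secret generic db-refresh-dev -n restore-db \
  --from-literal=PGPASSWORD='your-db-password'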

At this stage, we should be ready to configure the GitHub Actions workflow to execute the database restoration process.

Github workflow configuration

Let’s begin by configuring a GitHub Actions workflow in the repository from which the process will be triggered. This workflow should allow the selection of a lower-tier environment (e.g., dev) that needs to be refreshed with the latest data from the production database.

name: db restore dev [gerx_db_prod]


on:
  workflow_dispatch:
    inputs:
      database:
        description: "gerx_db_dev_x"
        required: true
        type: string
      environment:
        description: "environment"
        default: dev
        type: string
      date:
        description: "Backup date format e.g 20250512 yyyymmdd"
        required: true
        type: string

  ## This can also be used as workflow_call ##
  workflow_call:
    inputs:
      database:
        description: "gerx_db_dev_x"
        required: true
        type: string
      environment:
        description: "environment"
        default: dev
        type: string
      date:
        description: "Backup date format e.g 20250515 year/month/day"
        required: true
        type: string

jobs:
  db-restore-dev:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read

    steps:
      - name: 🔑 Get AWS Creds
        id: aws-creds
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: us-east-1
          role-to-assume: arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh

      - name: Update kubeconfig for EKS
        run: |
          aws eks update-kubeconfig --name ${{ inputs.environment }}-usva-gerx24-cluster --region us-east-1

      - name: Deploy Job
        run: |
          export DB_NAME=${{ inputs.database }}
          export ENV=${{ inputs.environment }}
          export DATE=${{ inputs.date }}
          export PGPASSWORD=${{ secrets.PGPASSWORD }}

          cat <<EOF | envsubst | kubectl apply -f -
          apiVersion: batch/v1
          kind: Job
          metadata:
            name: db-restore-job-$DB_NAME
            namespace: restore-db
            labels:
              app: db-restore-job
          spec:
            ttlSecondsAfterFinished: 300
            template:
              metadata:
                name: db-restore-job
                labels:
                  app: db-restore-job
              spec:
                initContainers:
                - name: copying-pgdump
                  image: amazon/aws-cli
                  command:
                    - /bin/sh
                    - -c
                    - |
                      echo "Copying files from my-bucket"
                      aws s3 cp s3://pgdump-my-bucket-name/ /pg-dump --recursive
                  volumeMounts:
                    - name: pg-dump
                      mountPath: /pg-dump
                containers:
                - name: db-restore
                  image: gerx24/centos-tools:3.0.0
                  env:
                    - name: PGPASSWORD
                      valueFrom:
                        secretKeyRef:
                          name: db-refresh-dev
                          key: PGPASSWORD
                    - name: DB_NAME
                      value: "$DB_NAME"
                    - name: ENV
                      value: "$ENV"
                    - name: DATE
                      value: "$DATE"
                  command: ["/bin/bash", "-c"]
                  args:
                    - |
                      echo "Dropping old database..."
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER DATABASE $DB_NAME OWNER TO root;"

                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '$DB_NAME' AND pid <> pg_backend_pid(); DROP DATABASE $DB_NAME;"

                      echo "Creating new database..."
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "CREATE DATABASE $DB_NAME;"

                      echo "Changing ownership..."
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER DATABASE $DB_NAME OWNER TO postgres_owner_role;"
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "GRANT postgres_owner_role TO postgres_owner_green;"
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER USER postgres_owner_green SET ROLE postgres_owner_role;"
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "GRANT ALL PRIVILEGES ON DATABASE $DB_NAME TO postgres_owner_role;"
                      echo "Restoring database..."
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -f /pg-dump/postgres_dump_$DATE.sql

                      echo "Altering schema ownership..."
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -c "ALTER SCHEMA public OWNER TO postgres_owner_role; ALTER SCHEMA client_side OWNER TO postgres_owner_role; ALTER SCHEMA settings OWNER TO postgres_owner_role;"

                      echo "Running script"
                      psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -f /script/privileges.sql
                  volumeMounts:
                    - name: pg-dump
                      mountPath: /pg-dump
                    - name: privileges-script
                      mountPath: /script
                volumes:
                  - name: pg-dump
                    emptyDir: {}
                  - name: privileges-script
                    configMap:
                      name: postgres-db-restore-dev
                restartPolicy: OnFailure
                serviceAccountName: restore-db-sa
          EOF

      - name: Wait for Job to Succeed [5 minutes check]
        run: |
          echo "Checking status of job db-restore-job-${{ inputs.database }}"
          for i in {1..30}; do
            STATUS=$(kubectl get job db-restore-job-${{ inputs.database }} -n restore-db -o jsonpath="{.status.conditions[?(@.type=='Complete')].status}")
            if [[ "$STATUS" == "True" ]]; then
              echo "✅ Job db-restore-job-${{ inputs.database }} completed successfully."
              exit 0
            fi

            FAILED=$(kubectl get job db-restore-job-${{ inputs.database }} -n restore-db -o jsonpath="{.status.failed}")
            if [[ "$FAILED" -ge 1 ]]; then
              echo "❌ Job db-restore-job-${{ inputs.database }} failed."
              exit 1
            fi

            echo "⏳ Job db-restore-job-${{ inputs.database }} not complete yet... waiting 10 seconds"
            sleep 10
          done

          echo "⏰ Timed out waiting for job to complete."
          exit 1


      - name: Delete Job
        run: |
          kubectl delete job db-restore-job-${{ inputs.database }} -n restore-db
          echo "Job db-restore-job-${{ inputs.database }} completed"

The job above would do the following:

A. You provide the input parameters for the GitHub Action: the database name, the environment, and the date in yyyymmdd format. The workflow exports them as shown below.

          export DB_NAME=gerx_db_dev_1
          export ENV=dev
          export DATE=20250515   # Let's assume you are using a dump created today
          export PGPASSWORD=${{ secrets.PGPASSWORD }}   # A repo secret holding the database password
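
These exports map one-to-one to the workflow_dispatch inputs, so the same run can be started from the GitHub CLI (the workflow file name is an assumption):

# Trigger the restore for gerx_db_dev_1 using a dump from 2025-05-15.
gh workflow run db-restore-dev.yml \
  -f database=gerx_db_dev_1 \
  -f environment=dev \
  -f date=20250515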

secrets.PGPASSWORD -> This is the password used to access the database, saved in the repository as a secret.
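
The secret can be added under the repository's Settings > Secrets and variables > Actions, or with the GitHub CLI:

# Store the database password as a repository secret (prompts for the value).
gh secret set PGPASSWORD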


B. The GitHub Actions run will initiate a Kubernetes Job that begins with an initContainer. This container downloads the postgres_dump_$DATE.sql backup from s3://pgdump-my-bucket-name/ (at this point we assume it already exists in your bucket) and places it in a shared volume. The main container then uses this shared volume to execute a series of psql commands that do the following (a quick pre-check follows the list):

  1. Drop the existing database.
  2. Recreate the database.
  3. Change ownership.
  4. Restore the database (gerx_db_dev_1) from postgres_dump_$DATE.sql.
  5. Add schema permissions.
  6. Finally, run the script to grant privileges on all tables in all schemas.
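
Before triggering the workflow, it is worth confirming that the dump for the chosen date actually exists in the bucket:

# List available dumps; the date in the file name feeds the workflow's date input.
aws s3 ls s3://pgdump-my-bucket-name/ | grep postgres_dump_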

C. Finally, the GitHub Action performs a verification check using kubectl within a 5-minute window to confirm the status of the Kubernetes job. Once the job is confirmed to have succeeded, the action cleans up by deleting the job from the Kubernetes cluster.
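
As a side note, the polling loop in the workflow could be replaced with kubectl's built-in wait, at the cost of a less granular failure message:

# Equivalent one-liner: block until the job completes or 5 minutes elapse.
kubectl wait --for=condition=complete --timeout=300s \
  job/db-restore-job-gerx_db_dev_1 -n restore-db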
