This document provides a step-by-step guide for using GitHub Actions to automate restoring a production AWS RDS PostgreSQL backup into development environments.
Infrastructure
- EKS cluster.
- RDS database running in AWS (just for this scenario).
- GitHub repository.
- AWS IAM.
- AWS S3 bucket.
Scenario
We have an Amazon RDS cluster running a PostgreSQL database, referred to as gerx_db_prod. Additionally, there are several lower-tier environments (such as gerx_db_dev_1, gerx_db_dev_2, and gerx_db_dev_3) that need to be kept up to date with the latest data from the production database. The goal is to enable on-demand synchronization of these environments by leveraging a GitHub Actions workflow.
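This guide assumes a recent dump of gerx_db_prod already exists in S3 under the name postgres_dump_<yyyymmdd>.sql, which is what the restore job looks for. A minimal sketch of producing and uploading such a dump; the production endpoint here is a placeholder:

# Hypothetical: create the plain-SQL dump the restore job expects and upload it to S3
DATE=$(date +%Y%m%d)
pg_dump -h <prod-rds-endpoint> -p 5432 -U root -d gerx_db_prod -f postgres_dump_${DATE}.sql
aws s3 cp postgres_dump_${DATE}.sql s3://pgdump_my_bucket_name/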
AWS IAM configuration
We will use AWS IAM to create two roles, each with an established trust relationship. Additionally, we need to define two IAM policies:
First policy: grants permission to retrieve a pgdump file from an S3 bucket, which will be used for the database restoration.
Second policy: provides access to the Amazon EKS cluster from the GitHub Actions workflow.
I previously wrote a document on configuring IRSA:
https://dev.to/gerson_morales_3e89188d50/configure-irsa-using-eks-to-access-s3-from-a-pod-513a
The good news is that I will make this easy for you by showing what the configuration looks like for each role.
The role that allows pulling objects from the S3 bucket looks like this:
Role name: postgres-db-dev-restore-IRSA
Trusted entities
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::XXXXXXXXXXXXXXXX:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA:sub": "system:serviceaccount:<EKS-NAMESPACE>:<EKS-SERVICE_ACCOUNT>",
          "oidc.eks.us-east-1.amazonaws.com/id/SFAFJFJAFKAFKAFKAFLLSFLAFLAFLAFA:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
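The long ID in the Federated ARN is your cluster's OIDC provider ID. If you need to look it up, one way (assuming the cluster name used later in this guide) is:

# Prints the full issuer URL; the value after /id/ goes into the trust policy
aws eks describe-cluster --name dev-usva-gerx24-cluster \
  --query "cluster.identity.oidc.issuer" --output text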
Policy attached to the role postgres-db-dev-restore-IRSA
{
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::pgdump_my_bucket_name",
        "arn:aws:s3:::pgdump_my_bucket_name/*"
      ]
    }
  ],
  "Version": "2012-10-17"
}
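If you prefer the CLI over the console, here is a sketch of creating this role and attaching the policy. The local file names and the inline policy name are my assumptions; save the two JSON documents above locally first:

# Create the role with the trust policy shown above (file name is hypothetical)
aws iam create-role \
  --role-name postgres-db-dev-restore-IRSA \
  --assume-role-policy-document file://irsa-trust-policy.json
# Attach the S3 permissions as an inline policy (policy name is hypothetical)
aws iam put-role-policy \
  --role-name postgres-db-dev-restore-IRSA \
  --policy-name s3-pgdump-access \
  --policy-document file://irsa-s3-policy.json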
Role to allow access to the EKS cluster from the GitHub repo's main branch.
Role name: postgres-db-dev-eks-refresh
Trusted entities
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::xxxxxxxxxxx:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringLike": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:<org>/<repo>:ref:refs/heads/main"
        }
      }
    }
  ]
}
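This trust policy assumes the GitHub OIDC identity provider already exists in your AWS account. If it does not, it only needs to be created once per account; a sketch (the thumbprint shown is the commonly published GitHub value, verify it yourself; recent AWS behavior ignores thumbprints for this provider anyway):

aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1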
Policy attached to the role postgres-db-dev-eks-refresh
{
  "Statement": [
    {
      "Action": [
        "eks:DescribeCluster",
        "eks:ListClusters",
        "eks:AccessKubernetesApi"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:eks:us-east-1:xxxxxxxx:cluster/dev-usva-gerx24-cluster"
    },
    {
      "Action": [
        "sts:AssumeRole"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ],
  "Version": "2012-10-17"
}
With the two new IAM roles created, each configured with its respective trust relationship and attached policy, you will need to retrieve their ARNs. These ARNs are required in two specific places, which I will outline next.
The role ARN for postgres-db-dev-restore-IRSA
(arn:aws:iam::xxxxxx:role/postgres-db-dev-restore-IRSA) should be associated with the Kubernetes service account responsible for retrieving the pgdump file (postgres_dump_$DATE.sql) from S3. This file will be used to perform the database restoration.
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: restore-db-sa
  namespace: restore-db
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxx:role/postgres-db-dev-restore-IRSA
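After applying the manifest, it is worth confirming the annotation landed; the file name here is an assumption:

kubectl apply -f restore-db-sa.yaml
# The output should include the eks.amazonaws.com/role-arn annotation
kubectl describe sa restore-db-sa -n restore-db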
The role ARN for postgres-db-dev-eks-refresh (arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh) must be added to the EKS cluster by updating the aws-auth ConfigMap in the kube-system namespace. The modification should be as follows:
- "groups":
- "github-ci-group"
"rolearn": "arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh"
"username": "github:db-restore"
Finally, we need to configure RBAC to grant the role access exclusively to the namespace where the GitHub-triggered job responsible for the database restoration will be deployed.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: restore-db
  name: postgres-db-restore-role
rules:
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: postgres-db-restore-rolebinding
  namespace: restore-db
subjects:
  - kind: User
    name: github:db-restore
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: postgres-db-restore-role
  apiGroup: rbac.authorization.k8s.io
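A quick way to check the binding behaves as intended is to impersonate the mapped username:

# Expect "yes": the mapped user can create jobs in restore-db
kubectl auth can-i create jobs -n restore-db --as github:db-restore
# Expect "no": access is scoped exclusively to the restore-db namespace
kubectl auth can-i create jobs -n default --as github:db-restore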
Configure a ConfigMap with the SQL script that grants privileges to postgres_owner_role:
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-db-restore-dev
  namespace: restore-db
data:
  privileges.sql: |
    DO $$
    BEGIN
      -- Grant privileges on all tables in all schemas
      EXECUTE (
        SELECT string_agg(
          'GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA ' || quote_ident(schemaname) || ' TO postgres_owner_role;',
          ' '
        )
        FROM pg_catalog.pg_tables
        WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
      );
    END
    $$;
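To preview the GRANT statements the DO block will generate, you can run the inner query on its own; the dev endpoint below is a placeholder:

psql -h <dev-rds-endpoint> -p 5432 -U root -d gerx_db_dev_1 -c "
  SELECT DISTINCT 'GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA '
         || quote_ident(schemaname) || ' TO postgres_owner_role;'
  FROM pg_catalog.pg_tables
  WHERE schemaname NOT IN ('pg_catalog', 'information_schema');"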
Configure a Kubernetes Secret, or in this case an ExternalSecret, containing the database password that will be used in the PGPASSWORD variable.
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-refresh-dev
  namespace: restore-db
  annotations:
    argocd.argoproj.io/sync-wave: "-12"
spec:
  dataFrom:
    - extract:
        key: db-refresh-dev/db-refresh-dev-secret
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: db-refresh-dev
    template:
      engineVersion: v2
      type: Opaque
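The ExternalSecret pulls its data from AWS Secrets Manager through the aws-secrets-manager ClusterSecretStore. A sketch of creating the backing secret; the JSON key must be PGPASSWORD because the job's secretKeyRef (shown later) reads that key:

aws secretsmanager create-secret \
  --name db-refresh-dev/db-refresh-dev-secret \
  --secret-string '{"PGPASSWORD":"<your-db-password>"}'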
At this stage, we should be ready to configure the GitHub Actions workflow to execute the database restoration process.
GitHub workflow configuration
Let’s begin by configuring a GitHub Actions workflow in the repository from which the process will be triggered. This workflow should allow the selection of a lower-tier environment (e.g. dev) that needs to be refreshed with the latest data from the production database.
name: db restore dev [gerx_db_prod]
on:
  workflow_dispatch:
    inputs:
      database:
        description: "gerx_db_dev_x"
        required: true
        type: string
      environment:
        description: "environment"
        default: dev
        type: string
      date:
        description: "Backup date in yyyymmdd format, e.g. 20250512"
        required: true
        type: string
  ## This can also be used as workflow_call ##
  workflow_call:
    inputs:
      database:
        description: "gerx_db_dev_x"
        required: true
        type: string
      environment:
        description: "environment"
        default: dev
        type: string
      date:
        description: "Backup date in yyyymmdd format, e.g. 20250515"
        required: true
        type: string
jobs:
  db-restore-dev:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - name: 🔑 Get AWS Creds
        id: aws-creds
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-region: us-east-1
          role-to-assume: arn:aws:iam::xxxxxx:role/postgres-db-dev-eks-refresh
      - name: Update kubeconfig for EKS
        run: |
          aws eks update-kubeconfig --name ${{ inputs.environment }}-usva-gerx24-cluster --region us-east-1
      - name: Deploy Job
        run: |
          export DB_NAME=${{ inputs.database }}
          export ENV=${{ inputs.environment }}
          export DATE=${{ inputs.date }}
          export PGPASSWORD=${{ secrets.PGPASSWORD }}
          # Kubernetes object names cannot contain underscores, so derive a safe job name
          export JOB_NAME=db-restore-job-$(echo "$DB_NAME" | tr '_' '-')
          cat <<EOF | envsubst | kubectl apply -f -
          apiVersion: batch/v1
          kind: Job
          metadata:
            name: $JOB_NAME
            namespace: restore-db
            labels:
              app: db-restore-job
          spec:
            ttlSecondsAfterFinished: 300
            template:
              metadata:
                name: db-restore-job
                labels:
                  app: db-restore-job
              spec:
                initContainers:
                  - name: copying-pgdump
                    image: amazon/aws-cli
                    command:
                      - /bin/sh
                      - -c
                      - |
                        echo "Copying files from my-bucket"
                        aws s3 cp s3://pgdump_my_bucket_name/ /pg-dump --recursive
                    volumeMounts:
                      - name: pg-dump
                        mountPath: /pg-dump
                containers:
                  - name: db-restore
                    image: gerx24/centos-tools:3.0.0
                    env:
                      - name: PGPASSWORD
                        valueFrom:
                          secretKeyRef:
                            name: db-refresh-dev
                            key: PGPASSWORD
                      - name: DB_NAME
                        value: "$DB_NAME"
                      - name: ENV
                        value: "$ENV"
                      - name: DATE
                        value: "$DATE"
                    command: ["/bin/bash", "-c"]
                    args:
                      - |
                        echo "Dropping old database..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER DATABASE $DB_NAME OWNER TO root;"
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = '$DB_NAME' AND pid <> pg_backend_pid();"
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "DROP DATABASE $DB_NAME;"
                        echo "Creating new database..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "CREATE DATABASE $DB_NAME;"
                        echo "Changing ownership..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER DATABASE $DB_NAME OWNER TO postgres_owner_role;"
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "GRANT postgres_owner_role TO postgres_owner_green;"
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "ALTER USER postgres_owner_green SET ROLE postgres_owner_role;"
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d postgres -c "GRANT ALL PRIVILEGES ON DATABASE $DB_NAME TO postgres_owner_role;"
                        echo "Restoring database..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -f /pg-dump/postgres_dump_$DATE.sql
                        echo "Altering schema ownership..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -c "ALTER SCHEMA public OWNER TO postgres_owner_role; ALTER SCHEMA client_side OWNER TO postgres_owner_role; ALTER SCHEMA settings OWNER TO postgres_owner_role;"
                        echo "Running privileges script..."
                        psql -h rds.gerx24.usva.$ENV.gersonplace.com -p 5432 -U root -d $DB_NAME -f /script/privileges.sql
                    volumeMounts:
                      - name: pg-dump
                        mountPath: /pg-dump
                      - name: privileges-script
                        mountPath: /script
                volumes:
                  - name: pg-dump
                    emptyDir: {}
                  - name: privileges-script
                    configMap:
                      name: postgres-db-restore-dev
                restartPolicy: OnFailure
                serviceAccountName: restore-db-sa
          EOF
      - name: Wait for Job to Succeed [5 minutes check]
        run: |
          JOB_NAME=db-restore-job-$(echo "${{ inputs.database }}" | tr '_' '-')
          echo "Checking status of job $JOB_NAME"
          for i in {1..30}; do
            STATUS=$(kubectl get job $JOB_NAME -n restore-db -o jsonpath="{.status.conditions[?(@.type=='Complete')].status}")
            if [[ "$STATUS" == "True" ]]; then
              echo "✅ Job $JOB_NAME completed successfully."
              exit 0
            fi
            FAILED=$(kubectl get job $JOB_NAME -n restore-db -o jsonpath="{.status.failed}")
            if [[ "$FAILED" -ge 1 ]]; then
              echo "❌ Job $JOB_NAME failed."
              exit 1
            fi
            echo "⏳ Job $JOB_NAME not complete yet... waiting 10 seconds"
            sleep 10
          done
          echo "⏰ Timed out waiting for job to complete."
          exit 1
      - name: Delete Job
        run: |
          JOB_NAME=db-restore-job-$(echo "${{ inputs.database }}" | tr '_' '-')
          kubectl delete job $JOB_NAME -n restore-db
          echo "Job $JOB_NAME deleted"
The workflow above does the following:
A. You are expected to provide the GitHub Actions input parameters: database, environment, and date in yyyymmdd format, as below.
export DB_NAME=gerx_db_dev_1
export ENV=dev
export DATE=20250515 (let's assume you are using a dump created today)
export PGPASSWORD=${{ secrets.PGPASSWORD }}
secrets.PGPASSWORD is the database password, saved in the repository as a secret.
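If you prefer the command line to the Actions tab, the workflow_dispatch trigger can also be fired with the GitHub CLI; a minimal sketch using the inputs above:

gh workflow run "db restore dev [gerx_db_prod]" \
  -f database=gerx_db_dev_1 \
  -f environment=dev \
  -f date=20250515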
B. The GitHub Actions run will launch a Kubernetes Job that begins with an initContainer. This container downloads the postgres_dump_$DATE.sql backup from s3://pgdump_my_bucket_name/ (at this point, we assume it already exists in your bucket) and places it in a shared volume. The main container then uses this shared volume to execute a series of psql commands that do the following:
1. Drop the existing database.
2. Recreate the database.
3. Change ownership.
4. Restore the database (gerx_db_dev_1) from postgres_dump_$DATE.sql.
5. Add schema permissions.
6. Finally, run the script to grant privileges on all tables in all schemas.
C. Finally, the GitHub Action performs a verification check using kubectl within a 5-minute window to confirm the status of the Kubernetes job. Once the job is confirmed to have succeeded, the action cleans up by deleting the job from the cluster.
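As a design note, the polling loop could be replaced with kubectl wait, which blocks until a single condition is met. The trade-off is that kubectl wait only watches one condition, so a failed job would surface as a timeout rather than failing fast, which is why the workflow polls both the Complete condition and the failed count:

# Alternative sketch; job name assumes database gerx_db_dev_1 with underscores mapped to dashes
kubectl wait --for=condition=complete job/db-restore-job-gerx-db-dev-1 \
  -n restore-db --timeout=300s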