Ashraf Minhaj

S3 Cross-Region Replication with Terraform - S3 High Availability

Introduction

We keep our data in the cloud. It's stored in a single region, but what if that region gets compromised? What if a natural disaster hits the datacenter and our application goes down because its data is gone?
To avoid such scenarios, we should keep a replica of our data in another region; the chance of both regions going down at once is extremely low. And that system has to be automatic, because we can't copy-paste everything every time. That's where AWS Cross-Region Replication comes into play.

What is AWS CRR?

In simple words, S3 Cross-Region Replication (CRR) lets you replicate data automatically from one bucket to another. Here, Cross-Region means the buckets can be in different AWS regions.
Put simply: if we put a file in one bucket, a copy is created (replicated) in another bucket in another region.

Here's how CRR helps -

  • enhances data durability and availability by keeping data in geographically separate locations
  • supports disaster recovery (DR)
  • helps meet compliance requirements

The Game Plan

To set up CRR using Terraform, we need to:

  1. Create two buckets in different regions (source and destination)
  2. Enable versioning on both buckets (replication requires it)
  3. Create an IAM role for replication
  4. Attach a replication policy to that role
  5. Define replication rules on the source bucket

[Image: S3 Cross-Region Replication diagram]

This will be the directory structure for the project -

├── main.tf
├── s3.tf
├── s3_replication.tf

1. Create two buckets in different regions

1.1. Create two providers

We need one provider for each region. An alias is used so we can tell them apart; you can name them anything. So, the main.tf file is -

# Providers for different AWS regions
provider "aws" {
  alias  = "primary"
  region = "us-east-2" # Source region
}

provider "aws" {
  alias  = "secondary"
  region = "eu-north-1" # Destination region
}
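The post jumps straight to the providers; if you're starting from an empty folder, a minimal terraform block pinning the AWS provider (the version constraint here is my assumption) also belongs in main.tf -

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # assumed constraint; any recent version with these resources works
    }
  }
}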

1.2. Create Source and Destination buckets

Let's write the s3.tf file to create two buckets and enable versioning -

# Source S3 Bucket (Primary Region)
resource "aws_s3_bucket" "source_bucket" {
  provider = aws.primary
  bucket   = "primary-bucket-for-replication" # S3 bucket names must be globally unique
}

resource "aws_s3_bucket_versioning" "source_versioning" {
  provider = aws.primary
  bucket   = aws_s3_bucket.source_bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}

# Destination S3 Bucket (Secondary Region)
resource "aws_s3_bucket" "destination_bucket" {
  provider = aws.secondary
  bucket   = "secondary-bucket-for-replication" # S3 bucket names must be globally unique
}

resource "aws_s3_bucket_versioning" "destination_versioning" {
  provider = aws.secondary
  bucket   = aws_s3_bucket.destination_bucket.id
  versioning_configuration {
    status = "Enabled"
  }
}
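Since S3 bucket names are globally unique, the hardcoded names above may already be taken. A common pattern (a sketch using the hashicorp/random provider, not part of the original setup) is to append a random suffix -

# Hypothetical: generate a short random suffix for globally unique bucket names
resource "random_id" "bucket_suffix" {
  byte_length = 4
}

# then e.g. bucket = "primary-bucket-for-replication-${random_id.bucket_suffix.hex}"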

2. Add the replication policy

Now we will create a replication policy that defines which actions are allowed for replication (ReplicateObject, ReplicateDelete, etc.), bind it to an IAM role, and attach that role to the source bucket using aws_s3_bucket_replication_configuration.
Notice the filter block:

filter {
  prefix = "" # empty prefix = replicate all objects
}

This lets you narrow down which objects get replicated. Note that prefix matches a key prefix (like a folder path), not a wildcard pattern such as *.pdf or *.mp4. Here I kept it empty to replicate everything, because - why not, it's a demo!
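For example, to replicate only objects under a documents/ prefix (a hypothetical path), the filter would be -

filter {
  prefix = "documents/" # only keys starting with documents/ are replicated
}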
So, the s3_replication.tf file is -

# IAM Role for Replication
resource "aws_iam_role" "replication_role" {
  name = "s3-replication-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# IAM Policy for Replication
resource "aws_iam_policy" "replication_policy" {
  name        = "s3-replication-policy"
  description = "Allows S3 replication between primary and secondary buckets"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Read the replication configuration and list the source bucket
        Effect   = "Allow"
        Action   = ["s3:GetReplicationConfiguration", "s3:ListBucket"]
        Resource = aws_s3_bucket.source_bucket.arn
      },
      {
        # Read object versions (and their ACLs/tags) from the source bucket
        Effect   = "Allow"
        Action   = ["s3:GetObjectVersionForReplication", "s3:GetObjectVersionAcl", "s3:GetObjectVersionTagging"]
        Resource = "${aws_s3_bucket.source_bucket.arn}/*"
      },
      {
        # Write replicas into the destination bucket
        Effect   = "Allow"
        Action   = ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"]
        Resource = "${aws_s3_bucket.destination_bucket.arn}/*"
      }
    ]
  })
}

# Attach Policy to Role
resource "aws_iam_role_policy_attachment" "replication_policy_attach" {
  role       = aws_iam_role.replication_role.name
  policy_arn = aws_iam_policy.replication_policy.arn
}

# S3 Replication Configuration
resource "aws_s3_bucket_replication_configuration" "replication" {
  provider = aws.primary
  bucket   = aws_s3_bucket.source_bucket.id
  role     = aws_iam_role.replication_role.arn

  rule {
    id     = "cross-region-replication"
    status = "Enabled"

    filter {
      prefix = "" # empty prefix = replicate all objects
    }

    destination {
      bucket        = aws_s3_bucket.destination_bucket.arn
      storage_class = "STANDARD"
    }
    delete_marker_replication {
      status = "Enabled"
    }
  }
}
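Optionally (my addition, not part of the original files), a couple of outputs make it easy to grab the bucket names after terraform apply -

# Optional sketch: expose bucket names for quick verification
output "source_bucket_name" {
  value = aws_s3_bucket.source_bucket.id
}

output "destination_bucket_name" {
  value = aws_s3_bucket.destination_bucket.id
}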

Now, if we put a file in the source bucket (primary-bucket-for-replication), a copy of the object will soon appear automatically in the destination bucket (secondary-bucket-for-replication).

Final thoughts

For a production use case, use a variables file to handle configuration, and keep IAM policies strict: grant only what's necessary.
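As a sketch of that idea (file and variable names are my own), a variables.tf could look like -

# Hypothetical variables.tf: parameterize regions and bucket names
variable "source_region" {
  type    = string
  default = "us-east-2"
}

variable "destination_region" {
  type    = string
  default = "eu-north-1"
}

variable "source_bucket_name" {
  type    = string
  default = "primary-bucket-for-replication"
}

variable "destination_bucket_name" {
  type    = string
  default = "secondary-bucket-for-replication"
}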
Happy Coding!

Top comments (2)

Nevo David

Amazing guide! How can this knowledge be applied to enhance global data resilience strategies?

Ashraf Minhaj

Thank you!
S3 CRR plays a key role in global data resilience by ensuring that critical data is automatically duplicated (replicated) across geographically distant regions. This reduces the risk of total data loss due to regional outages or disasters and helps meet compliance needs. It’s especially valuable for DR (Disaster Recovery) plans, where quick recovery with minimal data loss is crucial.