Migrating a production-grade, high-traffic application from Google Cloud to AWS is no small feat. Here's how our DevOps and engineering teams executed the move.
🧭 Why We Migrated: Business Drivers Behind the Move
Our platform, serving millions of daily users, was running smoothly on GCP. However, evolving business goals, pricing considerations, and long-term cloud ecosystem alignment led us to migrate to AWS.
Key components of our GCP-based stack:
Web Tier: Next.js frontend + Django backend
Databases: MongoDB replica sets, MySQL clusters
Asynchronous Services: Redis, RabbitMQ
Search: Apache Solr for full-text search
Infrastructure: GCP Compute Engine VMs, managed instance groups, HTTPS Load Balancer
Storage: 21 TB of data in Google Cloud Storage (GCS)
📋 Step 0: Creating a Migration Runbook
We treated this as a mission-critical project. Our runbook included:
Stakeholders: CTO, DevOps Lead, Database Architect, Application Owners
Timeline: 8 weeks from planning to cutover
Phases: Network Setup → Data Migration → Database Sync → Application Migration → Cutover
Rollback Plan: Prepared and rehearsed, with timelines for failback
🛠 Infrastructure Mapping: GCP vs AWS
Challenges in Mapping:
GCP allows custom CPU and RAM combinations, while AWS uses fixed instance types (t3, m6i, r6g, etc.)
IOPS differences between GCP SSDs and AWS EBS gp3 volumes required tuning
Cost models vary significantly (especially egress charges from GCP)
We used AWS Pricing Calculator and GCP Pricing Calculator to simulate monthly billing and select cost-optimized instance types.
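To keep instance mapping consistent across environments, it helped to encode the translation from GCP custom machine shapes to the closest fixed AWS instance type. The snippet below is a minimal, illustrative sketch; the candidate list and machine shapes are examples, not our full mapping.

```python
# Illustrative only: pick the smallest AWS instance that covers a GCP
# custom machine shape (vCPUs, memory in GiB). Candidates are a sample,
# assumed sorted from smallest/cheapest to largest.
AWS_CANDIDATES = [
    ("t3.xlarge",   4, 16),
    ("m6i.2xlarge", 8, 32),
    ("r6g.2xlarge", 8, 64),
    ("m6i.4xlarge", 16, 64),
]

def map_gcp_shape(vcpus: int, mem_gib: int) -> str:
    """Return the first AWS instance type that covers a GCP custom shape."""
    for name, cpu, mem in AWS_CANDIDATES:
        if cpu >= vcpus and mem >= mem_gib:
            return name
    raise ValueError("No candidate instance type covers this shape")

# Example: a GCP custom-8-32768 shape (8 vCPUs, 32 GiB RAM)
print(map_gcp_shape(8, 32))  # -> m6i.2xlarge
```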
🌐 Phase 1: AWS Network Infrastructure Setup
AWS Network Infrastructure (ap-south-1)
┌──────────────────────┐
│ GCP / DC VPC │
└────────┬─────────────┘
│
┌─────────▼─────────┐
│ Site-to-Site VPN │
└─────────┬─────────┘
│
┌─────────▼─────────┐
│ VPC │
│ (ap-south-1) │
└─────────┬─────────┘
│
┌──────────────────────────┼─────────────────────────────┐
│ │ │
┌─────────▼─────────┐ ┌──────────▼──────────┐ ┌──────────▼──────────┐
│ Public Subnet AZ1 │ │ Public Subnet AZ2 │ │ Public Subnet AZ3 │
│ - Bastion Host │ │ - NAT Gateway │ │ - Internet Gateway │
└─────────┬─────────┘ └──────────┬──────────┘ └──────────┬──────────┘
│ │ │
▼ ▼ ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│Private Subnet 1│ │Private Subnet 2│ │Private Subnet 3│
│App / DB Tier │ │App / DB Tier │ │App / DB Tier │
└────────────────┘ └────────────────┘ └────────────────┘
Security Groups + NACLs as per GCP mapping
VPC Flow Logs → CloudWatch Logs
🧩 Components Breakdown
| Component | Purpose |
| --- | --- |
| 3 AZs | High availability and fault tolerance |
| Public Subnets | Bastion, NAT, IGW for ingress/egress |
| Private Subnets | Isolated app and DB tiers |
| VPN | Secure hybrid GCP ↔ AWS connectivity |
| Security | Security Groups + NACLs derived from GCP firewall rules |
| Monitoring | VPC Flow Logs + CloudWatch Metrics for visibility |
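As a sketch of the monitoring piece, this is roughly how VPC Flow Logs can be wired into CloudWatch Logs with boto3. The VPC ID, log group name, and IAM role ARN below are placeholders, not our actual values.

```python
import boto3

ec2 = boto3.client("ec2", region_name="ap-south-1")

# Placeholders: substitute your VPC ID, log group, and IAM role ARN.
resp = ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",                      # capture accepted and rejected traffic
    LogDestinationType="cloud-watch-logs",
    LogGroupName="/vpc/flow-logs/prod",
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/vpc-flow-logs-role",
)
print(resp["FlowLogIds"])
```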
📦 Phase 2: Data Migration (GCS → S3)
We migrated over 21 TB of user-generated and application asset data from Google Cloud Storage (GCS) to Amazon S3. Given the scale, this phase required surgical precision in planning, execution, and cost control.
Tools & Techniques Used
AWS DataSync: Chosen for its efficiency, security, and ability to handle large-scale object transfers.
Service Account HMAC Credentials: Used for secure bucket-to-bucket authentication between GCP and AWS (a configuration sketch follows this list).
Phased Sync Strategy:
Note: We carefully validated checksums and object counts after each sync phase to ensure data integrity and avoid overwriting unchanged files.
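A rough sketch of how the DataSync pieces fit together with HMAC credentials, assuming a DataSync agent is already registered. Bucket names, keys, and ARNs are placeholders; the exact task options depend on your sync phase.

```python
import boto3

datasync = boto3.client("datasync", region_name="ap-south-1")

# GCS exposed as an "object storage" location via its S3-compatible endpoint,
# authenticated with the service account's HMAC access key / secret.
gcs_location = datasync.create_location_object_storage(
    ServerHostname="storage.googleapis.com",
    BucketName="my-gcs-assets-bucket",           # placeholder
    AccessKey="GOOG1E_HMAC_ACCESS_KEY",          # placeholder HMAC key
    SecretKey="hmac-secret",                     # placeholder HMAC secret
    AgentArns=["arn:aws:datasync:ap-south-1:123456789012:agent/agent-0abc"],
)

# Destination: the S3 bucket, accessed through an IAM role DataSync can assume.
s3_location = datasync.create_location_s3(
    S3BucketArn="arn:aws:s3:::my-s3-assets-bucket",   # placeholder
    S3Config={"BucketAccessRoleArn": "arn:aws:iam::123456789012:role/datasync-s3-role"},
)

# One task; executions can be re-run per phase to pick up deltas.
task = datasync.create_task(
    SourceLocationArn=gcs_location["LocationArn"],
    DestinationLocationArn=s3_location["LocationArn"],
    Name="gcs-to-s3-assets",
    Options={"VerifyMode": "ONLY_FILES_TRANSFERRED", "TransferMode": "CHANGED"},
)
datasync.start_task_execution(TaskArn=task["TaskArn"])
```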
💡 Smart Optimization Decisions
Selective Data Migration:
Delta Awareness:
Post-Migration S3 Tuning
After the bulk migration was completed, we fine-tuned our S3 environment for cost optimization, data hygiene, and long-term sustainability.
Lifecycle Policies Implemented (a boto3 sketch follows this list):
Automatic archival of infrequently accessed data to S3 Glacier.
Expiry rules for:
Configured S3 Incomplete Multipart Upload Aborts: cleans up orphaned parts left behind by failed uploads.
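The lifecycle rules translate roughly into a configuration like the sketch below. The bucket name, prefixes, and day thresholds are illustrative, not our exact values.

```python
import boto3

s3 = boto3.client("s3")

# Illustrative thresholds and prefixes; tune per bucket and access pattern.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-s3-assets-bucket",   # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-infrequent-access",
                "Filter": {"Prefix": "assets/"},
                "Status": "Enabled",
                # Move cold objects to Glacier after 90 days.
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            },
            {
                "ID": "expire-temp-data",
                "Filter": {"Prefix": "tmp/"},
                "Status": "Enabled",
                # Expire short-lived objects (e.g. temp exports) after 30 days.
                "Expiration": {"Days": 30},
            },
            {
                "ID": "abort-stale-multipart-uploads",
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                # Reclaim storage from uploads that never completed.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
        ]
    },
)
```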
Lessons Learned
Data Volume ≠ Data Complexity: Even though we had the tools for the job, coordinating syncs across staging, pre-prod, and production environments required careful orchestration and monitoring.
Egress and DTO Costs: Data Transfer Out from GCP was a hidden but substantial cost center; plan ahead for this when budgeting.
S3 Behavior Is Not GCS: We had to adjust application logic and IAM policies post-migration to align with S3's object handling, access policies, and permissions model.
🗄 Phase 3: Database Migration
MongoDB Migration
Migrating MongoDB from GCP to AWS was one of the most sensitive components of the move due to its role in powering real-time operations and user sessions.
Our Strategy:
Replica Set Initialization: Set up MongoDB replica sets on AWS EC2 instances to mirror the topology running in GCP.
Oplog-Based Sync: Enabled oplog-based replication between AWS and GCP MongoDB nodes to ensure near real-time data synchronization without full data dumps (see the lag-monitoring sketch at the end of this section).
Hybrid Node Integration: Deployed a MongoDB node in AWS, directly connected to the GCP replica set, acting as a bridge before full cutover.
iptables for Controlled Access: Used iptables rules to restrict write access during the sync period. This allowed inter-DB synchronization traffic only, blocking application-level writes and ensuring data consistency before switchover.
Failover Testing: Conducted multiple failover and promotion drills to validate readiness, with rollback plans in place.
Key Takeaway: Setting up a hybrid node and controlling access at the OS level allowed us to minimize data drift and test production-grade failovers without service disruption.
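During the oplog-based sync, the key signal is how far the AWS node trails the GCP primary. A minimal monitoring sketch with PyMongo might look like the following; the connection string and member names are placeholders, and it assumes a healthy, reachable primary.

```python
from pymongo import MongoClient

# Placeholder URI: connect to any member of the hybrid replica set.
client = MongoClient("mongodb://gcp-primary.internal:27017/?replicaSet=rs0")

status = client.admin.command("replSetGetStatus")

# Find the primary's last applied optime (assumes a primary is present).
primary_optime = None
for member in status["members"]:
    if member["stateStr"] == "PRIMARY":
        primary_optime = member["optimeDate"]

# Report how many seconds each secondary (including the AWS node) is behind.
for member in status["members"]:
    if member["stateStr"] == "SECONDARY":
        lag = (primary_optime - member["optimeDate"]).total_seconds()
        print(f'{member["name"]}: {lag:.0f}s behind primary')
```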
MySQL Migration
The MySQL component required careful orchestration to ensure transactional consistency and minimal downtime.
Our Approach:
Master-Slave Topology: Established a classic master-slave setup on AWS EC2 instances to replicate data from the GCP-hosted MySQL master.
Replication Lag Challenges: One of the major blockers encountered was replication lag during promotion drills, especially under active write-heavy workloads (see the lag-check sketch below).
Controlled Write Freeze: We implemented iptables-based rules at the OS level to block application write traffic, allowing replication to catch up safely before cutover.
Promotion Strategy:
Key Takeaway: Blocking writes via iptables provided a clean buffer for promotion without the risk of in-flight transactions, making the cutover smooth and predictable.
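Knowing when it is safe to promote comes down to watching the replica's lag drop to zero after the write freeze. A sketch of that check with PyMySQL against the AWS replica; host, credentials, and the polling interval are placeholders, and it assumes replication is running (otherwise Seconds_Behind_Master is NULL).

```python
import time
import pymysql

# Placeholder connection details for the AWS replica.
conn = pymysql.connect(host="mysql-replica.aws.internal",
                       user="repl_monitor", password="***",
                       cursorclass=pymysql.cursors.DictCursor)

# Poll until the replica has fully caught up with the (write-frozen) GCP master.
while True:
    with conn.cursor() as cur:
        cur.execute("SHOW SLAVE STATUS")
        row = cur.fetchone()
    lag = row["Seconds_Behind_Master"]   # None if replication is stopped
    print(f"replication lag: {lag}s")
    if lag == 0:
        break   # replica caught up: safe to proceed with promotion
    time.sleep(5)
```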
End of Part 1: Setting the Stage for Migration
You've seen how we architected an AWS environment from scratch, replicated critical systems like MongoDB and MySQL, and seamlessly migrated over 21 TB of assets from GCP to S3, all while optimizing for cost, security, and scalability.
But this was just the calm before the storm.
"Give me six hours to chop down a tree and I will spend the first four sharpening the axe."
Abraham Lincoln
We were well prepared. But would the systems, and the team, hold up during live cutover?
In Part 2: The Real Cutover & Beyond, we'll step into the fire:
What went wrong,
What we had to patch live,
And what we did to walk away from it stronger.
👉 Don't miss it. Follow me on LinkedIn for more deep-dive case studies and real-world DevOps/CloudOps stories like this.