A brief background about the app
It’s an app where users upload their data into categories. The app processes that data, runs some additional calculations and aggregations, and stores everything in a database. After that, users can view the resulting data in charts and grids on the website.
There’s also a requirement to capture that data and upload it as CSV files to an S3 bucket — daily, per account and per category.
There are around 20 categories and 10,000 accounts. So in total, the app needs to upload about 200,000 CSV files to S3 every day.
How it was
It all started with a simple cron job that ran a script every day at 5 AM.
The script did 3 things per account and category:
- Loaded data from the MySQL database
- Prepared the CSV file
- Uploaded it to S3
Here’s roughly what it looked like:
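A simplified Python sketch of that kind of sequential loop — pymysql and boto3 are assumptions here, and the table names are made up, not the actual schema:

```python
# export_all.py, run daily at 5 AM via cron (simplified sketch, not the real code)
import csv
import io
import os

import boto3
import pymysql

S3_BUCKET = "daily-exports"  # illustrative bucket name


def load_accounts(db):
    with db.cursor() as cur:
        cur.execute("SELECT id FROM accounts")        # assumed table name
        return [r[0] for r in cur.fetchall()]


def load_categories(db):
    with db.cursor() as cur:
        cur.execute("SELECT id FROM categories")      # assumed table name
        return [r[0] for r in cur.fetchall()]


def load_data(db, account_id, category_id):
    with db.cursor() as cur:
        cur.execute(
            "SELECT * FROM aggregated_data "          # assumed table name
            "WHERE account_id = %s AND category_id = %s",
            (account_id, category_id),
        )
        return cur.fetchall()


def to_csv(rows):
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue().encode("utf-8")


def main():
    s3 = boto3.client("s3")
    db = pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )

    accounts = load_accounts(db)      # ~10,000 accounts
    categories = load_categories(db)  # ~20 categories

    # One sequential pass over every account/category pair: ~200,000 iterations.
    for account in accounts:
        for category in categories:
            rows = load_data(db, account, category)              # 1. load data from MySQL
            body = to_csv(rows)                                  # 2. prepare the CSV file
            key = f"{account}/{category}.csv"
            s3.put_object(Bucket=S3_BUCKET, Key=key, Body=body)  # 3. upload to S3


if __name__ == "__main__":
    main()
```

Everything runs in a single process on a single machine, one pair at a time — which is exactly where the problems below come from.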
Problems 🙄
- It’s not scalable — any increase in categories or accounts would linearly increase execution time
- It’s not fault tolerant — there’s no built-in retry mechanism or partial-success tracking, so a single failure can mean losing an entire day’s exports
- It has a single point of failure — if the machine dies or the script crashes midway, there’s no recovery or continuation
- There’s no proper monitoring — no visibility into what succeeded, what failed, or how long it took
How it is now
Obviously, the previous solution didn’t scale well and needed a fix — fast.
Since we were already using AWS services, I decided to stay within that ecosystem.
I realized that to get the best scalability and fault tolerance, each CSV file had to be prepared and exported separately.
So, I broke the system down into three core components:
- Export Scheduler — handles everything related to scheduling
- Export Trigger — kicks off the export process for each file
- Export Runner — does the actual work: loads the data, prepares the CSV, and uploads it to S3
Here’s what I came up with, and how the new system works end to end:
- EventBridge kicks things off by emitting a daily export event — this is the trigger that starts the whole export flow.
- That event is picked up by the Export Trigger Lambda, which is responsible for fetching relevant application data from the database and generating export requests — one for each account and category.
- These requests are sent to a queue (or stream), which acts as a buffer and decouples the trigger from the actual export logic.
- The Export Runner Lambda then processes each export request: it loads the data, prepares the CSV file, and uploads it directly to the S3 bucket (both handlers are sketched right after this list).
- Meanwhile, logs, metrics, and alarms are pushed to CloudWatch, giving us visibility into what’s happening at every step — useful for monitoring and debugging.
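To make that flow concrete, here’s a simplified sketch of the two Lambda handlers. It assumes SQS as the queue, Python as the runtime, and boto3 for the AWS calls; the environment variables, table names, and resource names are illustrative, not the actual implementation:

```python
# Sketch of the two Lambda handlers; SQS is assumed as the queue between them.
import csv
import io
import json
import os

import boto3
import pymysql

sqs = boto3.client("sqs")
s3 = boto3.client("s3")

QUEUE_URL = os.environ["EXPORT_QUEUE_URL"]
BUCKET = os.environ["EXPORT_BUCKET"]


def connect():
    return pymysql.connect(
        host=os.environ["DB_HOST"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
    )


def export_trigger_handler(event, context):
    """Runs once a day from the EventBridge rule: fans out one export
    request per account/category pair onto the queue."""
    db = connect()
    with db.cursor() as cur:
        cur.execute("SELECT id FROM accounts")        # assumed table name
        accounts = [r[0] for r in cur.fetchall()]
        cur.execute("SELECT id FROM categories")      # assumed table name
        categories = [r[0] for r in cur.fetchall()]

    export_date = event["time"][:10]  # EventBridge puts the event timestamp in "time"
    requests = [
        {"account_id": a, "category_id": c, "date": export_date}
        for a in accounts
        for c in categories
    ]

    # SQS accepts at most 10 messages per batch call.
    for i in range(0, len(requests), 10):
        batch = requests[i : i + 10]
        sqs.send_message_batch(
            QueueUrl=QUEUE_URL,
            Entries=[
                {"Id": str(i + j), "MessageBody": json.dumps(req)}
                for j, req in enumerate(batch)
            ],
        )


def export_runner_handler(event, context):
    """Consumes export requests from SQS: one CSV file per message."""
    db = connect()
    failures = []
    for record in event["Records"]:
        try:
            req = json.loads(record["body"])
            with db.cursor() as cur:
                cur.execute(
                    "SELECT * FROM aggregated_data "              # assumed table name
                    "WHERE account_id = %s AND category_id = %s",
                    (req["account_id"], req["category_id"]),
                )
                rows = cur.fetchall()
            buf = io.StringIO()
            csv.writer(buf).writerows(rows)
            key = f'{req["date"]}/{req["account_id"]}/{req["category_id"]}.csv'
            s3.put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue().encode("utf-8"))
        except Exception:
            # Report only the failed message; the rest of the batch isn't retried
            # (requires ReportBatchItemFailures on the event source mapping).
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning batchItemFailures from the Runner is what keeps failures isolated: only the messages that actually failed go back onto the queue for another attempt, while every successful export stays done.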
What’s solved 🙂
- It’s scalable — each file is handled separately, so the system can easily grow with more categories or accounts without slowing down.
- It’s fault tolerant — if a single export fails, it won’t stop the others; failures are isolated and can be retried (see the retry wiring sketch after this list)
- It has no single point of failure — the whole process runs on distributed, managed services, so there’s no risk of a script crashing and killing the job.
- It has proper monitoring — with logs, metrics, and alarms, you can track what succeeded, what failed, and how long everything took.
- It’s built on AWS-native tools — everything runs on managed services like Lambda, EventBridge, and S3, so there’s no infrastructure to manage or maintain.
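For the retry side of that, here’s a rough sketch of how the queue and the Runner’s event source mapping could be wired, again assuming SQS; the ARNs, URLs, and names are made up for illustration:

```python
# One-off wiring script; all ARNs, URLs, and names below are made up.
import json

import boto3

sqs = boto3.client("sqs")
lambda_client = boto3.client("lambda")

QUEUE_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/export-requests"
QUEUE_ARN = "arn:aws:sqs:eu-west-1:123456789012:export-requests"
DLQ_ARN = "arn:aws:sqs:eu-west-1:123456789012:export-requests-dlq"

# Messages that still fail after 3 attempts land in the dead-letter queue
# instead of being lost, so they can be inspected and replayed.
sqs.set_queue_attributes(
    QueueUrl=QUEUE_URL,
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": DLQ_ARN, "maxReceiveCount": "3"}
        ),
        "VisibilityTimeout": "900",  # should exceed the Runner's timeout
    },
)

# Let the Runner report per-message failures, so one bad export doesn't
# force the whole batch back onto the queue.
lambda_client.create_event_source_mapping(
    EventSourceArn=QUEUE_ARN,
    FunctionName="export-runner",
    BatchSize=10,
    FunctionResponseTypes=["ReportBatchItemFailures"],
)
```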
Thanks for reading! If you want to stay updated with future posts, hit that follow button. And don’t hesitate to share your ideas or questions in the comments — I’m eager to hear from you!