I am learning backend development by building a simple URL shortener project. My goal is to learn backend concepts and to do things in simple but solid ways, keeping scalability and reliability in mind.
Tech Stack:
- Python
- FastAPI
- PostgreSQL
- SQLAlchemy
a) Handling Click Events:
Currently, when a user clicks a short URL:
- I record the click in a clicks table (url_id, clicked_at, and other metadata).
- After that, the user is redirected to the long URL.
This direct DB insert works fine for a few users, but at thousands or millions of clicks it becomes a bottleneck: every redirect pays for a synchronous write, and frequent small inserts are expensive for the database.
I started thinking about using background tasks and eventual consistency. The idea was:
- Do the important updates immediately.
- Push less critical work to a task queue (e.g. Celery).
However, creating one background task per click is too heavy. So I thought of introducing a buffer system:
- Use Redis to capture click data quickly.
- Have a periodic task fetch the buffered click data in bulk from Redis and bulk-insert it into the DB.
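Roughly what I have in mind, as a sketch (assuming a redis-py client `r`; the key name and the `bulk_insert` callback are placeholders, not real code from my project):

```python
import json

CLICKS_KEY = "clicks:buffer"  # placeholder key name

def record_click(r, url_id, clicked_at):
    # Hot path on redirect: O(1) append to a Redis list, no DB round-trip.
    r.rpush(CLICKS_KEY, json.dumps({"url_id": url_id, "clicked_at": clicked_at}))

def flush_clicks(r, bulk_insert, batch_size=1000):
    # Periodic task: drain the list in batches and hand each batch
    # to a bulk insert. LPOP with a count needs Redis >= 6.2.
    while True:
        raw = r.lpop(CLICKS_KEY, batch_size)
        if not raw:
            break
        bulk_insert([json.loads(item) for item in raw])
```

With SQLAlchemy, `bulk_insert` could be a single multi-row INSERT of the whole batch, so thousands of clicks turn into a handful of statements per flush.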
This looks reasonable for my scale, but I’m not sure if this is the best approach. I have read about Kafka and event-driven architectures, but they feel too complex for my current learning level.
My questions for this part:
- Is the Redis buffer -> periodic bulk insert approach a good and simple way for this use case?
- Are there any better or simpler ways to handle click recording efficiently in small-to-medium projects?
b) Handling Sequentially Related Operations:
When a user is deactivated, all their URLs should also be deactivated. A simple approach is:
- Update user status.
- Update all related URLs.
This works fine for a few users. But if many users are deactivated at once (e.g. 1000 users with 100 URLs each, i.e. 100,000 row updates), this will be slow.
So I thought of making the user update instant, and handling URL updates asynchronously.
Idea 1: Per-user Tasks
- When a user is deactivated, update the user status immediately.
- Push a background task (via Celery) to deactivate all URLs of that user.
- Similarly for reactivation.
This works, but introduces a race condition:
- What if the user is reactivated before the first task runs?
- Example: user status changes from active → deactivated → active within a short time.
Now the queued deactivate task might still run and disable the URLs incorrectly. After that, the queued reactivate task will also run, re-enabling them. So we perform two operations just to end up back in the original state.
I thought of using a versioning system:
- Maintain a version number for each user status change.
- Each task carries the version it was created for.
- Before updating URLs, the task checks if the current user version still matches.
- If not, it skips the update.
# Deactivation worker
if url.last_user_version >= task.version or user.status_version > task.version:
    pass  # stale task: a newer status change exists, skip
else:
    url.active = False
    url.last_user_version = task.version

# Reactivation worker
if url.last_user_version >= task.version or user.status_version > task.version:
    pass  # stale task, skip
else:
    url.active = True
    url.last_user_version = task.version
This prevents outdated tasks from applying stale updates. But if many users change status in a short window, won't the task queue be overloaded?
Idea 2: Scheduled Task Approach:
Instead of per-user tasks, I can have a periodic scheduled task that:
- Scans a small user_status table (user_id, status, version, updated_at), where every status change is recorded per user.
- Compares the current version with the last applied version.
- Applies URL updates only where needed.
The downside is that the task still runs even when there are no changes.
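The scheduled task could look roughly like this. I'm using SQLite here only to keep the sketch self-contained; the same SQL shape works with PostgreSQL/SQLAlchemy. The `last_applied_version` column is an extra I'd add to track what has already been synced:

```python
import sqlite3

def sync_url_status(conn):
    # Find users whose latest status change has not been applied yet.
    rows = conn.execute(
        "SELECT user_id, status, version FROM user_status "
        "WHERE version > last_applied_version"
    ).fetchall()
    for user_id, status, version in rows:
        # One bulk UPDATE per user instead of one task per user.
        conn.execute(
            "UPDATE urls SET active = ? WHERE user_id = ?",
            (1 if status == "active" else 0, user_id),
        )
        # Only mark as applied if no newer change arrived meanwhile.
        conn.execute(
            "UPDATE user_status SET last_applied_version = ? "
            "WHERE user_id = ? AND version = ?",
            (version, user_id, version),
        )
    conn.commit()
```

Because the task always applies the latest recorded status, a deactivate-then-reactivate within one interval collapses into a single no-op or a single update, which sidesteps the race from Idea 1.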
For Idea 1, I thought of using Redis again as an intermediate buffer:
- Store the user_id in a Redis set, so each user appears at most once.
- When a user is deactivated, add their user_id to the buffer.
- If the user is reactivated before the sync runs, remove their user_id from the buffer.
- Instead of per-user tasks, a single periodic task scans the buffer and applies the URL updates.
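A sketch of that buffer with a Redis set (assuming a redis-py client `r`; the key name and the `deactivate_urls` callback are placeholders):

```python
PENDING_KEY = "users:pending_deactivation"  # placeholder key name

def on_user_deactivated(r, user_id):
    r.sadd(PENDING_KEY, user_id)  # set semantics: no duplicates

def on_user_reactivated(r, user_id):
    r.srem(PENDING_KEY, user_id)  # cancels a still-pending deactivation

def sync_pending(r, deactivate_urls, batch=100):
    # Periodic task: atomically claim up to `batch` users, then run
    # one bulk URL update per claimed user.
    claimed = r.spop(PENDING_KEY, batch) or []
    for user_id in claimed:
        deactivate_urls(user_id)
```

One caveat I can see: a reactivation that arrives after SPOP but before the bulk update would still need the version check from Idea 1 to be fully safe.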
Questions:
- Is this Redis buffer + periodic sync approach suitable for this kind of workflow?
- How can I avoid long delays between user deactivation and URL deactivation (so that redirection stops quickly)?
Is there a simpler or better design to handle these at small-to-medium scale?