Today I want to share something unexpected I faced — a stuck EC2 instance that wouldn’t respond to SSH or even the AWS Web CLI. It took me a while to figure out what was wrong, but I learned a lot through the process. If you’re new to AWS or DevOps like me, this might save you some stress!
🧩 The Problem
Everything seemed fine at first. But suddenly:
- I couldn’t SSH into my EC2 instance
- The Web CLI also wouldn’t connect
- The EC2 instance showed as “running” in the console
- Docker containers had been running inside it for the last few days
At first, I thought it was just a temporary glitch. But even after stopping and starting the instance again, nothing worked. I was stuck.
💡 The Root Cause
After digging deeper (and asking ChatGPT!), I learned that:
- My EC2 instance was a t2.micro, which uses CPU credits to manage performance.
- I had left Docker containers running for several days.
- This drained my CPU credit balance to 0.
- When the CPU credits are gone, AWS throttles the instance down to its baseline performance (about 10% of a vCPU on a t2.micro) — which makes SSH and the Web CLI practically unusable.
This completely locked me out of my instance. I couldn’t even access it to shut down Docker.
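If you want to confirm this is what's happening to you, the CPUCreditBalance metric in CloudWatch tells the story. Here's a rough sketch using the AWS CLI; the instance ID and region are placeholders, and the date syntax assumes GNU date on Linux:

# Average CPU credit balance over the last 3 hours, in 5-minute buckets
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUCreditBalance \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistics Average \
  --period 300 \
  --start-time "$(date -u -d '3 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --region us-east-1

If the reported values sit at or near zero, you're in the same throttled state I was.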
🔧 The Solution That Worked
Here’s how I fixed it without terminating the instance:
1. Launch a Temporary EC2 Instance
I created a new t2.micro with the same OS as my original instance.
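If you'd rather do this from the AWS CLI, the launch looks roughly like this; the AMI ID, key pair, security group, and subnet are placeholders for your own values:

aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type t2.micro \
  --key-name my-key \
  --security-group-ids sg-0123456789abcdef0 \
  --subnet-id subnet-0123456789abcdef0 \
  --count 1

One thing to watch: the temporary instance must land in the same Availability Zone as the stuck one, because an EBS volume can only be attached to instances in its own AZ.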
2. Detach the Volume
From the AWS Console, I detached the root volume from my stuck instance.
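The CLI equivalent looks like this (the IDs are placeholders). Note that you have to stop the stuck instance first, since AWS won't detach a root volume from a running instance:

aws ec2 stop-instances --instance-ids i-0aaaaaaaaaaaaaaaa1
aws ec2 detach-volume --volume-id vol-0bbbbbbbbbbbbbbbb2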
3. Attach It to the New Instance
I then attached the volume to the temporary instance as a secondary disk (e.g. /dev/xvdf).
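From the CLI, that's a single attach-volume call; the volume ID and the temporary instance's ID below are placeholders:

aws ec2 attach-volume \
  --volume-id vol-0bbbbbbbbbbbbbbbb2 \
  --instance-id i-0ccccccccccccccc3 \
  --device /dev/xvdf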
4. Mount the Volume
I created a rescue directory and mounted the volume:
sudo mkdir /mnt/rescue
sudo mount /dev/xvdf1 /mnt/rescue
Then I bind-mounted the other required filesystems:
sudo mount --bind /dev /mnt/rescue/dev
sudo mount --bind /sys /mnt/rescue/sys
sudo mount --bind /proc /mnt/rescue/proc
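A quick note before the chroot: the device names can differ from what I saw. On newer Nitro-based instance types the attached volume shows up as an NVMe device instead of /dev/xvdf. Running lsblk on the rescue instance shows what you actually have; if it is NVMe, the first mount command above becomes something like:

lsblk
# e.g. the same partition might appear as /dev/nvme1n1p1 on a Nitro instance
sudo mount /dev/nvme1n1p1 /mnt/rescue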
5. Enter chroot Environment
This allowed me to "enter" the original system:
sudo chroot /mnt/rescue
6. Disable Docker
Inside the chroot environment, I disabled Docker from auto-starting:
systemctl disable docker
Then I exited the chroot:
exit
7. Reattach the Volume
I detached the volume from the temporary instance and re-attached it to the original EC2 instance.
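Here's a CLI sketch of this step, again with placeholder IDs. Unmount everything on the temporary instance first, and make sure the device name matches the original instance's root device (usually /dev/xvda or /dev/sda1; the console shows it as "Root device name"):

sudo umount -R /mnt/rescue   # run this on the temporary instance before detaching
aws ec2 detach-volume --volume-id vol-0bbbbbbbbbbbbbbbb2
aws ec2 attach-volume \
  --volume-id vol-0bbbbbbbbbbbbbbbb2 \
  --instance-id i-0aaaaaaaaaaaaaaaa1 \
  --device /dev/xvda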
8. Reboot and Success 🎉
Now, when I started the instance, SSH worked again because Docker didn’t start, and the system wasn’t overloaded.
✅ What I Learned
- Always monitor CPU credits if you're using t2.micro or other burstable instances (see the alarm example below).
- Don't keep containers running forever unless you have a strong reason.
- It's possible to rescue a stuck instance without terminating it.
- The chroot rescue method is a powerful trick.
- Thanks to ChatGPT, I saved hours and didn't lose my data.
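If you want to act on that first lesson, a CloudWatch alarm on CPUCreditBalance can warn you before the balance hits zero. Here's a sketch; the instance ID, SNS topic ARN, and the threshold of 20 credits are placeholders you'd tune for your own setup:

aws cloudwatch put-metric-alarm \
  --alarm-name ec2-cpu-credits-low \
  --namespace AWS/EC2 \
  --metric-name CPUCreditBalance \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 3 \
  --threshold 20 \
  --comparison-operator LessThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-alerts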
Over to You
Have you ever faced something like this?
How do you monitor or avoid issues like this with EC2 or Docker?
Let’s discuss below or connect with me:
🔗 X
🔗 LinkedIn
Thanks for reading — see you in the next blog! 👋