πŸ” Retrying Failed Requests with Exponential Backoff

In a perfect world, every network request would succeed instantly. But in the real world, APIs fail, networks drop, and servers hiccup. Rather than giving up on the first failure, retrying with an intelligent strategy can make our applications more resilient.

The answer is exponential backoff, a proven algorithm for handling retries efficiently and politely.


🚧 Why Do Requests Fail?

APIs can fail for many transient reasons:

  • 🌐 Network timeouts
  • πŸ“Ά Temporary internet issues
  • 🚦 Rate limiting (429 Too Many Requests)
  • πŸ”§ Server-side overload (5xx errors)

In many of these cases, retrying the request after a delay can succeed.


βœ… The Problem with Naive Retries

Imagine if every client instantly retried after failure β€” the server would be flooded, making recovery even harder.
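For illustration, here is what a naive retry loop looks like; every failure is followed by an immediate retry, with no breathing room for the server (a hypothetical sketch, not something to copy):

// Anti-pattern: immediate retries with no backoff
async function naiveFetch(url, retries = 5) {
  let lastError;
  for (let i = 0; i < retries; i++) {
    try {
      return await fetch(url); // retried instantly on failure
    } catch (err) {
      lastError = err; // no delay, so every client hammers the server at full speed
    }
  }
  throw lastError;
}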

So we need to retry smarter β€” not harder.


⏳ What Is Exponential Backoff?

Exponential backoff increases the wait time between retries exponentially after each failure.

βŒ› Formula:

delay = baseDelay * (2 ^ attemptNumber)
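In code, that formula maps to a one-liner (a small sketch; attempt is zero-indexed here, matching the implementation further down):

// delay grows as baseDelay * 2^attempt, with attempt starting at 0
const backoffDelay = (attempt, baseDelay = 500) => baseDelay * 2 ** attempt;

for (let attempt = 0; attempt < 5; attempt++) {
  console.log(`attempt ${attempt + 1}: ${backoffDelay(attempt)}ms`);
}
// attempt 1: 500ms, attempt 2: 1000ms, attempt 3: 2000ms, attempt 4: 4000ms, attempt 5: 8000ms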

πŸ“ˆ Visualization: Exponential Growth of Delays

For example, with baseDelay = 500ms:

Attempt  Delay (ms)
-------  ----------
1        500
2        1000
3        2000
4        4000
5        8000

And with jitter, those values will vary slightly to avoid thundering herds.

πŸ”€ Add a Bit of Jitter

Without randomness, clients can retry at the exact same time β€” causing a thundering herd problem. Adding jitter (random noise) avoids this.

const jitter = Math.random() * 100;
const delay = baseDelay * 2 ** attempt + jitter;

πŸ§ͺ Sample Implementation: JavaScript (Node.js)

async function fetchWithRetry(url, options = {}, maxRetries = 5, baseDelay = 500) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url, options);
      // Treat 5xx responses (and 429 rate limits) as retryable failures
      if (response.status >= 500 || response.status === 429) {
        throw new Error(`Retryable HTTP error: ${response.status}`);
      }
      return response;
    } catch (error) {
      // Out of attempts: surface the last error to the caller
      if (attempt === maxRetries) {
        throw new Error(`Failed after ${maxRetries + 1} attempts: ${error.message}`);
      }
      // Exponential backoff plus a little jitter to avoid synchronized retries
      const backoff = baseDelay * 2 ** attempt;
      const jitter = Math.random() * 100;
      const delay = backoff + jitter;
      console.warn(`Attempt ${attempt + 1} failed. Retrying in ${delay.toFixed(0)}ms...`);
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}

πŸ’‘ Retryable vs Non-Retryable Errors

Not every error deserves a retry.

Error Code  Description                      Retry?
----------  -------------------------------  ------
500         Internal Server Error            βœ…
503         Service Unavailable              βœ…
429         Too Many Requests (rate limit)   βœ…
400         Bad Request                      ❌
401/403     Unauthorized/Forbidden           ❌

Also, check for Retry-After headers in rate-limited responses:

HTTP/1.1 429 Too Many Requests
Retry-After: 120

You can honor that delay before retrying.
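Putting both ideas together, a small helper can decide whether a failed response is worth retrying and how long to wait, preferring the server's Retry-After hint when present (a sketch; the status list mirrors the table above):

// Returns a delay in ms, or null if the response should not be retried
function retryDelay(response, attempt, baseDelay = 500) {
  const retryableStatuses = [429, 500, 503]; // 502/504 are often treated the same way
  if (!retryableStatuses.includes(response.status)) return null;

  const retryAfter = response.headers.get('Retry-After');
  if (retryAfter && !Number.isNaN(Number(retryAfter))) {
    return Number(retryAfter) * 1000; // the header value is in seconds
  }
  return baseDelay * 2 ** attempt + Math.random() * 100; // fall back to backoff + jitter
}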


🌍 Real-World Examples

  • Stripe API: Automatically retries on 409, 429, and 5xx with exponential backoff.
  • Google Cloud APIs: Recommend exponential backoff with jitter for handling transient errors.
  • AWS SDKs: Use full jitter to randomize retries even more (see the sketch below).
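The "full jitter" variant that AWS describes picks a random delay anywhere between zero and the exponential cap, which spreads retries out even further. A rough sketch:

// Full jitter: delay is uniformly random in [0, min(maxDelay, baseDelay * 2^attempt)]
function fullJitterDelay(attempt, baseDelay = 500, maxDelay = 30000) {
  const cap = Math.min(maxDelay, baseDelay * 2 ** attempt);
  return Math.random() * cap;
}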

πŸ“‹ Best Practices

βœ… Retry only for transient failures

βœ… Use maximum retry caps to avoid infinite loops

βœ… Add jitter to prevent synchronized retries

βœ… Use headers like Retry-After if available

βœ… Log failures with retry metadata for debugging


🧠 Adding Redis for Retry Tracking

When retrying critical or expensive operations, storing retry metadata in Redis can prevent duplicate retries or help in logging/debugging:

// pseudo-code (node-redis / ioredis style)
const retryKey = `retry:${jobId}`;
const attempt = parseInt(await redis.get(retryKey), 10) || 0;
if (attempt > maxRetries) throw new Error('Too many retries');

// store the incremented count with a TTL so stale keys expire on their own
await redis.set(retryKey, attempt + 1, 'EX', retryExpirySeconds);

Redis helps us:

  • Persist retry counts across process restarts
  • Coordinate retries in distributed systems
  • Add TTLs to automatically clear retry metadata
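For coordination across processes, an atomic INCR plus a TTL is usually a better fit than a read-then-write. A minimal sketch, assuming an ioredis client and a hypothetical jobId:

const Redis = require('ioredis');
const redis = new Redis();

async function trackRetry(jobId, maxRetries = 5, ttlSeconds = 3600) {
  const key = `retry:${jobId}`;
  const attempt = await redis.incr(key);   // atomic: safe even with concurrent workers
  if (attempt === 1) {
    await redis.expire(key, ttlSeconds);   // TTL clears stale retry metadata automatically
  }
  if (attempt > maxRetries + 1) {
    throw new Error(`Job ${jobId} exceeded ${maxRetries} retries`);
  }
  return attempt;                          // 1 on the first attempt
}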

🌐 Test APIs to Try This Locally

1. httpstat.us

Returns custom HTTP status codes and delays.

Examples:

  • https://httpstat.us/503 β†’ Simulates 503 Service Unavailable
  • https://httpstat.us/500?sleep=2000 β†’ Simulates 500 with delay
  • https://httpstat.us/429 β†’ Simulates rate limiting
  • https://httpstat.us/200 β†’ Simulates success

2. reqres.in

REST-style fake API for testing:

  • https://reqres.in/api/users/2 β†’ Valid request
  • https://reqres.in/api/users/23 β†’ Returns 404
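For example, you can point the fetchWithRetry helper from earlier at one of these endpoints and watch the backoff kick in (Node 18+ for the built-in fetch):

// Expect a few "Retrying in ...ms" warnings before the final error
fetchWithRetry('https://httpstat.us/503', {}, 3, 500)
  .then((res) => console.log('Success:', res.status))
  .catch((err) => console.error('Gave up:', err.message));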

πŸš€ Final Thoughts

In today’s distributed systems, failures are not exceptions β€” they are expected. Retrying failed requests with exponential backoff makes our systems resilient, polite, and production-grade.

If you're building a system that talks to APIs, don't just retry blindly. Use backoff, use jitter, and always fail gracefully when it's time to stop.


✍️ Bonus: Want to Plug This into Axios?

Here’s a wrapper using Axios and exponential backoff:

const axios = require('axios');

async function axiosRetry(url, options = {}, retries = 3, baseDelay = 300) {
  for (let i = 0; i <= retries; i++) {
    try {
      return await axios(url, options);
    } catch (err) {
      // Give up if we're out of retries or the error isn't transient
      if (i === retries || !shouldRetry(err)) throw err;
      // Exponential backoff with a little jitter
      const delay = baseDelay * 2 ** i + Math.random() * 100;
      await new Promise(r => setTimeout(r, delay));
    }
  }
}

function shouldRetry(err) {
  // No response at all means a network error or timeout: worth retrying
  if (!err.response) return true;
  // Axios rejects on non-2xx; only retry the transient statuses from the table above
  return [429, 500, 503].includes(err.response.status);
}
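Usage looks the same as a plain axios call, just wrapped (here pointing at httpstat.us as a stand-in for a flaky API):

// Retries up to 3 times with backoff, then gives up
axiosRetry('https://httpstat.us/503')
  .then((res) => console.log('Succeeded with status', res.status))
  .catch((err) => console.error('Gave up:', err.message));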
