John Eliud Odhiambo

Understanding Load Shedding

Introduction

In today’s digital landscape, applications must handle unpredictable traffic spikes without collapsing. Whether it’s a sudden surge in users or a distributed denial-of-service (DDoS) attack, systems need mechanisms to protect themselves from being overwhelmed. One such mechanism is load shedding, a defensive strategy that prioritizes critical functionality by selectively rejecting non-essential requests when the system is under stress.

In this article, we’ll explore:

  1. What load shedding is and why it's necessary.
  2. A practical demo that shows load shedding in action.

By the end, you should understand how to apply this technique to keep your services resilient under pressure.

What Is Load Shedding?

Load shedding is the deliberate termination or deferral of non-critical requests to prevent system overload. It ensures that essential services remain available even when demand exceeds capacity.

Why Is It Necessary?

Without load shedding, a system might experience:

  1. Resource exhaustion: Too many requests consume CPU, memory, or network bandwidth.
  2. Cascading failures: One overloaded service can bring down dependent systems.
  3. Degraded performance: Even critical requests slow down, leading to timeouts and errors.

By shedding non-essential load, a system can:

  1. Preserve core functionality (e.g., login, payment processing).
  2. Prevent total outages by avoiding system collapse.
  3. Improve user experience by failing fast instead of degrading slowly.

How Load Shedding Works In Code

Let's walk through a practical implementation in Go and JavaScript. The demo consists of:

  1. Essential services (always available).
  2. Non-essential services (shed under high load).
  3. A global request tracker to enforce limits.

Tracking Active Requests

The shed_wrapper.go file defines a middleware that monitors concurrent requests:

var (
    maxRequests     = int32(50)  // Maximum allowed concurrent requests
    currentRequests int32        // Global counter for active requests
)
  • maxRequests: A threshold beyond which we start shedding.
  • currentRequests: An atomic counter to track active requests safely.
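
If you want to expose the current load elsewhere (a status endpoint or the demo page, for example), a small helper that reads the counter atomically keeps the access race-free. CurrentLoad is a hypothetical addition for illustration, not part of the demo:

// CurrentLoad returns a point-in-time snapshot of the number of in-flight
// requests. Reading with atomic.LoadInt32 avoids racing with the
// increments and decrements performed by the middleware.
func CurrentLoad() int32 {
    return atomic.LoadInt32(&currentRequests)
}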

The Shedding Logic

The ShedWrapper middleware wraps HTTP handlers and applies load shedding:

func ShedWrapper(h http.HandlerFunc, canShed bool) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        // Count this request in, then check the threshold
        if atomic.AddInt32(&currentRequests, 1) > maxRequests && canShed {
            atomic.AddInt32(&currentRequests, -1) // Revert increment before rejecting
            w.WriteHeader(http.StatusServiceUnavailable)
            fmt.Fprintln(w, "Service unavailable due to high load.")
            return
        }
        defer atomic.AddInt32(&currentRequests, -1) // Decrement on exit
        h(w, r) // Proceed: under the limit, or essential and never shed
    }
}

Key Decisions:

  • canShed flag: Determines whether a request can be rejected (e.g., non-essential vs. essential).
  • Atomic operations: Ensure thread-safe increments/decrements.
  • Graceful rejection: Returns HTTP 503 (Service Unavailable) for shed requests.
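
To make the wrapper concrete, here is a rough sketch of how the handlers might be registered. The route paths, handler names, and port are illustrative assumptions; the repository defines the actual wiring:

// main.go (illustrative sketch only; routes and handler names are assumed)
package main

import (
    "log"
    "net/http"
)

func main() {
    mux := http.NewServeMux()

    // Essential route: canShed=false, so the wrapper never rejects it.
    mux.HandleFunc("/essential", ShedWrapper(EssentialHandler, false))

    // Non-essential route: canShed=true, so it returns 503 once the
    // concurrent-request counter passes maxRequests.
    mux.HandleFunc("/non-essential", ShedWrapper(NonEssentialHandler, true))

    log.Fatal(http.ListenAndServe(":8080", mux))
}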

Essential vs. Non-Essential Services

  • Essential Handler (essential_handler.go) always processes requests, even under load:
fmt.Fprintf(w, "Essential service responded! (Load: %v)", simulatedLoad)
  • Non-Essential Handler (non_essential_handler.go) rejects requests when the load exceeds 50:
if simulatedLoad > 50 {
    w.WriteHeader(http.StatusServiceUnavailable)
    fmt.Fprintln(w, "Non-essential service unavailable!")
}
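
The snippet above checks a simulatedLoad value without showing where it comes from. One plausible way to derive it, purely an assumption about the demo, is to read the same atomic counter the wrapper maintains:

// Sketch: a non-essential handler that reads the shared counter directly.
// The demo's real handler may compute simulatedLoad differently.
func NonEssentialHandler(w http.ResponseWriter, r *http.Request) {
    simulatedLoad := atomic.LoadInt32(&currentRequests)
    if simulatedLoad > 50 {
        w.WriteHeader(http.StatusServiceUnavailable)
        fmt.Fprintln(w, "Non-essential service unavailable!")
        return
    }
    fmt.Fprintf(w, "Non-essential service responded! (Load: %v)", simulatedLoad)
}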

Simulating Traffic Spikes

The frontend (script.js) lets users simulate load:

async function simulateLoad(requestCount) {
    for (let i = 0; i < requestCount; i++) {
        // Fire-and-forget (no await), so all requests run concurrently;
        // each one holds the server for ~2 seconds.
        fetch('/simulate-load');
    }
}

Each /simulate-load request (simulate_handler.go) artificially increases load:

func SimulateLoadHandler(w http.ResponseWriter, r *http.Request) {
    time.Sleep(2 * time.Second) // Simulate processing
    w.WriteHeader(http.StatusOK)
}
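
If you prefer to drive the server without the frontend, a small Go client can generate the same kind of spike. The port and path below are assumptions that match the earlier examples:

// loadgen.go: fires concurrent requests at the simulate endpoint (sketch).
package main

import (
    "fmt"
    "net/http"
    "sync"
)

func main() {
    const requestCount = 75 // enough to cross the 50-request threshold

    var wg sync.WaitGroup
    for i := 0; i < requestCount; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            resp, err := http.Get("http://localhost:8080/simulate-load")
            if err != nil {
                fmt.Println("request failed:", err)
                return
            }
            defer resp.Body.Close()
            fmt.Println("status:", resp.StatusCode)
        }()
    }
    wg.Wait()
}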

Putting It All Together

Demo Workflow

  1. User selects a load level (e.g., 75 concurrent requests).
  2. Essential services remain available even under high loads, demonstrating prioritization.

Webpage screenshot demonstrating an essential service online even under high load

  3. The system tracks active requests:
  • If under 50, all services respond normally.
  • If over 50, non-essential requests are rejected.

Webpage screenshot demonstrating a non-essential service online under normal load

Webpage screenshot demonstrating a non-essential service offline under high load, giving priority to the essential service

Try It Yourself

You can experiment with this implementation by checking out the GitHub repository. The demo includes:

  • A frontend to simulate load.
  • Backend logic for essential/non-essential services.
  • Configurable thresholds for testing.

Real-World Applications of Load Shedding

While our demo is simplified, load shedding is used in many production systems:

E-Commerce Platforms

  • Critical: Checkout, payment processing, inventory management.
  • Non-Critical: Product recommendations, reviews, wishlist updates.
  • During Black Friday sales, non-essential features might be temporarily disabled to ensure checkout remains stable.

Cloud APIs

  • Critical: Authentication, billing, core compute services.
  • Non-Critical: Logging, metrics, secondary APIs.
  • If a cloud provider faces unexpected demand, it may throttle non-critical API calls.

Social Media Platforms

  • Critical: Posting content, direct messaging.
  • Non-Critical: "Like" counts, friend suggestions.
  • During viral events, platforms might delay updating engagement metrics to prioritize content delivery.

Financial Systems

  • Critical: Stock trading, fund transfers.
  • Non-Critical: Transaction history, analytics.
  • At market open, trading systems prioritize order execution over historical data queries.

Advanced Considerations

While our demo uses a fixed threshold, real-world systems might:

  • Adjust maxRequests dynamically based on CPU/memory usage (see the sketch after this list).
  • Implement retries for shed requests with exponential backoff.
  • Use circuit breakers to fail fast when dependencies are struggling.
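
As a rough sketch of the first point, a background goroutine could resize the threshold from a runtime signal. The memory limit and the two threshold values are arbitrary assumptions, and the wrapper would then read maxRequests with atomic.LoadInt32 to stay race-free:

// adjustThreshold periodically lowers or restores maxRequests based on
// heap usage (sketch; the limits here are arbitrary).
func adjustThreshold() {
    const memLimit = 500 << 20 // ~500 MB
    for range time.Tick(5 * time.Second) {
        var m runtime.MemStats
        runtime.ReadMemStats(&m)
        if m.HeapAlloc > memLimit {
            atomic.StoreInt32(&maxRequests, 25) // shed more aggressively
        } else {
            atomic.StoreInt32(&maxRequests, 50) // back to the normal ceiling
        }
    }
}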

Conclusion

Load shedding is a proactive way to handle traffic surges. By:

  • Monitoring request volume,
  • Prioritizing critical functions, and
  • Gracefully degrading non-essential features,

we can build systems that fail well instead of collapsing entirely. Try the demo yourself to see load shedding in action!
