The Incident That Woke Us Up
It was 3 AM when the alerts started flooding in.
Our high-traffic API, which usually handled 5,000 requests per second (RPS), was suddenly timing out for 30% of users. Database queries were slow, CPU usage spiked to 95%, and the event loop was lagging by 500ms.
After a frantic hour of checking logs, we realized: Node.js itself wasn’t the problem—our code was.
We had fallen into common Node.js performance traps that only show up under real-world load.
Here’s what we learned—and how we fixed it.
1. Blocking the Event Loop: The Silent Killer
The Symptom:
- High latency during peak traffic.
- Unresponsive API even with low CPU usage.
The Culprit:
A synchronous CSV parser in our user-import feature:
const fs = require('fs');

function parseLargeCSV(filePath) {
  const data = fs.readFileSync(filePath, 'utf8'); // 😱 Blocking call!
  return data.split('\n').map(processRow);
}
Even though it was a rarely used admin feature, it blocked the entire Node.js process for 2-3 seconds, causing request delays.
The Fix:
- Use fs.promises + streams for large files:
const fs = require('fs');
const readline = require('readline');

async function parseLargeCSV(filePath) {
  const rl = readline.createInterface({ input: fs.createReadStream(filePath) });
  for await (const line of rl) {
    processRow(line); // Handle one row at a time without blocking the event loop
  }
}
- Offload CPU-heavy tasks to Worker Threads.
Result: API latency dropped by 40% during peak loads.
2. Memory Leaks: The Slow Death
The Symptom:
- Gradual slowdowns over days.
- Restarts temporarily fixed issues (a classic red flag).
The Culprit:
A misconfigured cache kept growing indefinitely:
const cache = {};
app.get('/data/:id', (req, res) => {
if (!cache[req.params.id]) {
cache[req.params.id] = fetchData(req.params.id); // 🚀 Grows forever!
}
res.json(cache[req.params.id]);
});
The Fix:
- Use an LRU cache with a size cap (like lru-cache). A WeakMap also works, but only when the keys are objects that can be garbage-collected—not string IDs like ours:
const LRU = require('lru-cache');
const cache = new LRU({ max: 1000 }); // Automatically evicts old entries
- Monitor heap usage with the --inspect flag + Chrome DevTools.
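For continuous monitoring between DevTools sessions, a lightweight check built on `process.memoryUsage()` can be logged periodically; this sketch (helper name is ours) just rounds the heap figures to megabytes:

```javascript
// Snapshot current heap usage in MB, suitable for periodic logging or metrics.
function heapSnapshotMB() {
  const { heapUsed, heapTotal } = process.memoryUsage();
  return {
    usedMB: Math.round(heapUsed / 1024 / 1024),
    totalMB: Math.round(heapTotal / 1024 / 1024),
  };
}
```

Emitting these numbers to your metrics system makes the "slow creep to 2GB" pattern visible days before it becomes an outage.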
Result: Memory usage stabilized at 300MB instead of creeping to 2GB+.
3. Promise Hell: Uncontrolled Concurrency
The Symptom:
- Database timeouts under load.
- High pendingPromises count in metrics.
The Culprit:
An unbatched Promise.all fetching 10,000 rows at once:
async function fetchAllUsers(userIds) {
return Promise.all(userIds.map(id => db.query('SELECT * FROM users WHERE id = ?', [id])));
} // 💥 Database gets hammered!
The Fix:
- Batch with p-limit or bluebird's Promise.map:
const limit = require('p-limit');
const concurrency = limit(10); // Max 10 DB queries at once
async function fetchAllUsers(userIds) {
return Promise.all(userIds.map(id => concurrency(() => db.query('...', [id]))));
}
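If you'd rather avoid a dependency, the same idea can be sketched with plain batching—process IDs in fixed-size groups and await each group before starting the next (`fetchOne` here is a hypothetical per-row query function):

```javascript
// Dependency-free concurrency control: run at most `batchSize` queries at once.
async function fetchInBatches(ids, fetchOne, batchSize = 10) {
  const results = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    const batch = ids.slice(i, i + batchSize);
    // Only this batch runs concurrently; the next waits for it to finish
    results.push(...(await Promise.all(batch.map(fetchOne))));
  }
  return results;
}
```

Note that p-limit keeps a rolling window of 10 in-flight tasks, while this batching waits for each whole group, so p-limit typically gives slightly better throughput.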
Result: Database load reduced by 70%, no more timeouts.
4. Poorly Optimized Logging
The Symptom:
- High disk I/O during traffic spikes.
- console.log in production (yes, we did it 😅).
The Culprit:
app.use((req, res, next) => {
console.log(`Incoming: ${req.method} ${req.url}`); // ⚠️ Sync + unbuffered!
next();
});
The Fix:
- Use winston or pino (async, structured logging):
const logger = require('pino')();
app.use((req, res, next) => {
logger.info({ method: req.method, url: req.url }, 'Request');
next();
});
Result: Logging overhead reduced from 15ms to <1ms per request.
Key Takeaways
✅ Never block the event loop (use streams, workers).
✅ Cache wisely (LRU, TTL, WeakMap).
✅ Control concurrency (avoid unlimited Promise.all).
✅ Log efficiently (pino > console.log).
Our API now handles 10,000 RPS without breaking a sweat.
What Node.js performance traps have YOU faced? Let’s discuss in the comments! 👇