DEV Community

DbVisualizer

A Developer’s Guide to SQL NOT IN: Smarter Queries, Faster Results

The NOT IN clause in SQL is useful for filtering out unwanted data. But it comes with caveats—mainly around NULLs, subqueries, and scalability. Here’s a breakdown of how NOT IN behaves in real scenarios and how to use it safely.

Real-World Uses of NOT IN

These examples illustrate how NOT IN works in typical queries:

  • Excluding known values:
SELECT * FROM company.invoices
WHERE issued_by NOT IN ('Jack', 'Josh', 'Matthew');
  • Using a subquery for exclusions:
SELECT username FROM demo_table
WHERE user_id NOT IN (SELECT id FROM demo_table2);
  • Nested joins and subquery filtering:
SELECT * FROM purchases.suppliers
WHERE supplier NOT IN (
  SELECT s.supplier_id
  FROM old_purchases.suppliers s
  -- assumes orders carries a supplier_id column to join on
  INNER JOIN orders o ON o.supplier_id = s.supplier_id
);

These queries are easy to read, but they don’t always scale well, and they can return no rows at all when the list or subquery contains a NULL.
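
The NULL behavior is easiest to see on a tiny dataset. The sketch below reuses the demo_table and demo_table2 names from the subquery example; the column types and sample rows are assumptions made up for illustration, and the multi-row INSERT syntax may need adjusting on some engines.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE demo_table  (user_id INTEGER, username VARCHAR(50));
CREATE TABLE demo_table2 (id INTEGER);

INSERT INTO demo_table  VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
INSERT INTO demo_table2 VALUES (1), (NULL);

-- One might expect 'bob' and 'carol' here, but the query returns no rows.
-- user_id NOT IN (1, NULL) expands to: user_id <> 1 AND user_id <> NULL.
-- A comparison with NULL is never TRUE, so the whole condition is never TRUE.
SELECT username FROM demo_table
WHERE user_id NOT IN (SELECT id FROM demo_table2);
```

A single NULL anywhere in the subquery result is enough to empty the output, which is why the problem often shows up only after the data changes.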

Performance Watchouts

NOT IN struggles on large tables or when queries become complex. Performance or accuracy suffers when:

  • Datasets grow beyond a few million rows.
  • Subqueries involve joins and multiple filters.
  • The subquery returns NULL values, which affects accuracy: NOT IN can then match no rows at all.
  • The database is not optimized for reads or lacks indexing.
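
When a NOT IN subquery hits one of these limits, a commonly used alternative is a correlated NOT EXISTS. The sketch below rewrites the earlier demo_table query that way; it is an illustration rather than a guaranteed speedup, since the optimizer of your database decides whether the two forms run differently.

```sql
-- Same intent as: user_id NOT IN (SELECT id FROM demo_table2)
SELECT t.username
FROM demo_table t
WHERE NOT EXISTS (
  SELECT 1
  FROM demo_table2 x
  WHERE x.id = t.user_id  -- a NULL id never equals anything, so it is simply skipped
);
```

The two forms are not identical: NOT EXISTS ignores NULLs in demo_table2, and it also returns rows whose own user_id is NULL, which NOT IN would leave out whenever the subquery returns any rows.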

Make sure to profile your queries using tools available in your SQL environment or a dedicated SQL client.
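
As a starting point, most engines can show the plan they pick for a query. The statement below uses the EXPLAIN form accepted by PostgreSQL and MySQL; other databases spell it differently (for example EXPLAIN QUERY PLAN in SQLite or EXPLAIN PLAN FOR in Oracle), so treat the exact keyword as an assumption about your environment.

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT username FROM demo_table
WHERE user_id NOT IN (SELECT id FROM demo_table2);
```

Comparing the plan before and after adding an index, or after rewriting the subquery, is usually more informative than timing a single run.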

FAQ

What does NOT IN do?

It filters out rows that match values in a list or subquery.

What makes it problematic?

Large datasets, NULL values, and deeply nested logic can cause performance issues or unexpected results.

When is it safe to use?

On moderate-sized datasets where the list or subquery cannot return NULL.
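
If the subquery column is nullable, one way to meet that condition is to filter NULLs out inside the subquery. A minimal sketch, again using the demo_table and demo_table2 names from the earlier example:

```sql
SELECT username FROM demo_table
WHERE user_id NOT IN (
  SELECT id FROM demo_table2
  WHERE id IS NOT NULL  -- keep NULLs out of the exclusion list
);
```

Declaring the column NOT NULL in the schema gives the same guarantee without having to remember the extra filter.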

Why use a tool like DbVisualizer?

A solid SQL client helps visualize query plans, optimize performance, and prevent logic bugs. DbVisualizer supports most major databases.

Conclusion

SQL NOT IN is easy to implement but needs to be handled with care. It works best on clean, reasonably sized datasets where nulls and performance are not major concerns. As your database grows or your logic gets more complex, keep an eye on query speed and result accuracy.

Check out the article SQL NOT IN: the Good, Bad & the Ugly for more insights.
