You're reading from scikit-learn Cookbook Over 80 recipes for machine learning in Python with scikit-learn

Product type Paperback

Published in Dec 2025

Publisher Packt

ISBN-13 9781836644453

Length 388 pages

Edition 3rd Edition

Languages

Python

Tools

Scikit-learn

Concepts

Machine Learning

Author (1):

John Sukup

Table of Contents (17) Chapters

Preface

1. Chapter 1: Common Conventions and API Elements of scikit-learn

2. Chapter 2: Pre-Model Workflow and Data Preprocessing

3. Chapter 3: Dimensionality Reduction Techniques

4. Chapter 4: Building Models with Distance Metrics and Nearest Neighbors

5. Chapter 5: Linear Models and Regularization

6. Chapter 6: Advanced Logistic Regression and Extensions

7. Chapter 7: Support Vector Machines and Kernel Methods

8. Chapter 8: Tree-Based Algorithms and Ensemble Methods

9. Chapter 9: Text Processing and Multiclass Classification

10. Chapter 10: Clustering Techniques

11. Chapter 11: Novelty and Outlier Detection

12. Chapter 12: Cross-Validation and Model Evaluation Techniques

13. Chapter 13: Deploying scikit-learn Models in Production

14. Chapter 14: Unlock Your Exclusive Benefits

15. Index

16. Other Books You May Enjoy

Understanding Isolation Forest

Isolation Forest is an efficient and scalable algorithm for detecting outliers in high-dimensional datasets. Rather than profiling normal data points and flagging deviations from that profile, it isolates anomalies directly. The algorithm builds an ensemble of random trees: at each node it randomly selects a feature and splits the data at a random threshold between that feature's minimum and maximum values. Because outliers are few and tend to differ significantly from most of the data, they typically need fewer splits to isolate, so a short average path length across the trees signals an anomaly.
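To make the "fewer splits" intuition concrete, here is a minimal sketch of the isolation idea using only NumPy. It is not scikit-learn's implementation: the helper name isolation_depth, the toy data, and the max_depth cap are assumptions made for illustration. It counts how many random axis-aligned splits are needed before a chosen point is alone in its partition.

```python
import numpy as np

def isolation_depth(X, point_idx, rng, max_depth=50):
    """Count random splits needed to isolate X[point_idx] (illustrative helper)."""
    idx = np.arange(X.shape[0])  # indices of points still sharing the partition
    depth = 0
    while len(idx) > 1 and depth < max_depth:
        feature = rng.integers(X.shape[1])           # pick a random feature
        lo, hi = X[idx, feature].min(), X[idx, feature].max()
        if lo == hi:
            break  # simplification: stop if this feature cannot separate the points
        threshold = rng.uniform(lo, hi)              # pick a random split value
        # Keep only the points that fall on the same side as the target point.
        target_side = X[point_idx, feature] < threshold
        idx = idx[(X[idx, feature] < threshold) == target_side]
        depth += 1
    return depth

rng = np.random.default_rng(0)
# Toy data (assumed for illustration): 200 Gaussian inliers plus one far-away point.
X = np.vstack([rng.normal(0, 1, size=(200, 2)), [[8.0, 8.0]]])
outlier_idx = 200
inlier_idx = int(np.argmin(np.linalg.norm(X[:200], axis=1)))  # most central inlier

n_trials = 200
outlier_mean = np.mean([isolation_depth(X, outlier_idx, rng) for _ in range(n_trials)])
inlier_mean = np.mean([isolation_depth(X, inlier_idx, rng) for _ in range(n_trials)])
print(f"mean splits to isolate outlier: {outlier_mean:.2f}")
print(f"mean splits to isolate inlier:  {inlier_mean:.2f}")
```

Averaged over many random trials, the distant point should be isolated in noticeably fewer splits than a point in the dense center, which is the signal Isolation Forest turns into an anomaly score.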

This method is particularly well-suited to large datasets and supports both outlier and novelty detection, making it a versatile tool in the ML toolkit. This recipe uses Isolation Forest to identify both inliers and outliers in a dataset.
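As a rough illustration of both usage styles, the sketch below fits scikit-learn's IsolationForest on a contaminated dataset (outlier detection) and then scores previously unseen points (novelty-style usage). The data, the contamination value, and the variable names are assumptions chosen for this example, not the recipe's own code.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Assumed toy data: a Gaussian cluster of inliers plus uniformly scattered outliers.
X_inliers = rng.normal(0, 1, size=(300, 2))
X_outliers = rng.uniform(-8, 8, size=(15, 2))
X = np.vstack([X_inliers, X_outliers])

# contamination is the expected share of outliers; 0.05 is a guess for this toy data.
model = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)

# Outlier detection: fit on the contaminated data and label the same points.
labels = model.fit_predict(X)          # +1 for inliers, -1 for outliers
scores = model.decision_function(X)    # lower scores mean more anomalous

# Novelty-style usage: label points the model has not seen before.
X_new = np.array([[0.1, -0.2], [7.5, 7.5]])
new_labels = model.predict(X_new)

print("flagged as outliers:", int((labels == -1).sum()))
print("labels for new points:", new_labels)
```

The contamination parameter sets the threshold on the anomaly scores, so it directly controls how many training points are labeled -1; when the true outlier share is unknown, inspecting the raw scores from decision_function is often a safer starting point.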

Getting ready

We’ll generate a synthetic dataset that includes visible outliers. Because we know how the data was generated, we can compare Isolation Forest’s predictions against the known distribution.

Load the libraries:

import numpy as np
import matplotlib.pyplot as plt
from sklearn...
