Eddie Goldman

Efficient Data Processing with PHP Generators

PHP has supported Generators since version 5.5. They let you work with large datasets without holding everything in memory: instead of building a huge array and then looping through it, a Generator function uses yield to produce one item at a time, so only the current value (plus the Generator's own state) needs to be in memory.

Example: Generating Numbers

function getNumbersWithGenerator(int $max): Generator {
    for ($i = 1; $i <= $max; $i++) {
        yield $i;
    }
}

// Only one number is in memory at once:
foreach (getNumbersWithGenerator(1_000_000) as $n) {
    // Process $n…
}

If you built an array of one million numbers first, memory use would grow with the number of elements (many megabytes here). With a Generator, it stays roughly constant no matter how large $max is.
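To make the difference visible, here is a rough measurement sketch. It reuses getNumbersWithGenerator() from above and compares it with range(), using memory_get_usage(). Exact byte counts depend on the PHP version and build, so treat the printed numbers as indicative only.

```php
<?php
declare(strict_types=1);

function getNumbersWithGenerator(int $max): Generator {
    for ($i = 1; $i <= $max; $i++) {
        yield $i;
    }
}

// Memory held after creating the generator (no values produced yet).
$before = memory_get_usage();
$generator = getNumbersWithGenerator(1_000_000);
$generatorBytes = memory_get_usage() - $before;

// Memory held after materializing the same numbers as an array.
$before = memory_get_usage();
$array = range(1, 1_000_000);
$arrayBytes = memory_get_usage() - $before;

// Both approaches produce the same values; only the footprint differs.
$sum = 0;
foreach ($generator as $n) {
    $sum += $n;
}

printf(
    "generator: %d bytes, array: %d bytes, sum: %d\n",
    $generatorBytes,
    $arrayBytes,
    $sum
);
```

The generator object itself is small because it stores only the function's paused state, not the values it will eventually yield.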

Example: Reading a Large CSV File

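function readCsvLineByLine(string $filename): Generator {
    if (!is_readable($filename)) {
        throw new InvalidArgumentException("Cannot read: $filename");
    }
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Failed to open: $filename");
    }

    try {
        // First row is the header; an empty file yields nothing.
        $header = fgetcsv($handle);
        if ($header === false) {
            return;
        }
        while (($row = fgetcsv($handle)) !== false) {
            // Skip rows whose column count differs from the header,
            // since array_combine() requires arrays of equal length.
            if (count($row) !== count($header)) {
                continue;
            }
            yield array_combine($header, $row);
        }
    } finally {
        // Runs even if the caller stops iterating early.
        fclose($handle);
    }
}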

// Use it like this:
foreach (readCsvLineByLine(__DIR__ . '/large-data.csv') as $record) {
    // Each $record is an associative array, e.g. ['id'=>'123', 'name'=>'Alice', …]
    // Process $record without loading the entire file into memory.
}

Because fgetcsv() reads one record at a time and yield hands back one row, PHP never holds the whole file in memory at once.
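Laziness also means a consumer can stop early, and the rest of the file is never processed. The following is a minimal sketch, assuming a hypothetical readLines() helper and a throwaway temp file created just for the demo. The try/finally is one way to make sure the handle is closed even when iteration stops partway.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for this sketch: yields each line of a text file.
function readLines(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Failed to open: $filename");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        // Runs on normal completion and when the generator is destroyed early.
        fclose($handle);
    }
}

// Build a small sample file so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'gen');
file_put_contents($path, "first\nsecond\nthird\nfourth\n");

$firstTwo = [];
foreach (readLines($path) as $line) {
    $firstTwo[] = $line;
    if (count($firstTwo) === 2) {
        break; // Lines three and four are never yielded.
    }
}

unlink($path);
print_r($firstTwo);
```

After the break, the generator is discarded, its finally block runs, and the file handle is released without the loop ever seeing the remaining lines.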

When to Use Generators

  • Fetching thousands of database rows: yield each row instead of fetchAll().
  • Processing large log files or text files line by line.
  • Any situation where you would otherwise build a huge array.
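As an illustration of the database case, here is a small sketch using PDO with an in-memory SQLite database. It assumes the pdo_sqlite extension is available; fetchRows() and the users table are hypothetical names invented for this example. Note that some drivers may still buffer the full result set on the client side by default (pdo_mysql's buffered queries, for example), so the memory savings depend on driver configuration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: yields rows one at a time instead of calling fetchAll().
function fetchRows(PDO $pdo, string $sql): Generator {
    $statement = $pdo->query($sql);
    while (($row = $statement->fetch(PDO::FETCH_ASSOC)) !== false) {
        yield $row;
    }
}

// In-memory SQLite database with made-up sample data.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (?)');
foreach (['Alice', 'Bob', 'Carol'] as $name) {
    $insert->execute([$name]);
}

// Each iteration pulls exactly one row from the statement.
$names = [];
foreach (fetchRows($pdo, 'SELECT id, name FROM users ORDER BY id') as $row) {
    $names[] = $row['name'];
}

print_r($names);
```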

Generators make your PHP code “lazy”: values are produced on demand. This keeps memory usage low and lets processing start before the whole dataset has been read. Next time you need to handle big data in PHP, try a Generator function!
