Iterate large directories efficiently with Python.
python-getdents is a simple wrapper around the Linux system call getdents64 (see man getdents for details). Here's a study on why ls, os.listdir() and others are so slow when dealing with extremely large directories.
- Verify that the implementation works on platforms other than x86_64.
Install:

    pip install getdents

For development:

    python3 -m venv env
    . env/bin/activate
    pip install -e .

Run tests:

    ulimit -v 33554432 && py.test tests/

Or

    ulimit -v 33554432 && ./setup.py test

Usage:

    from getdents import getdents

    for inode, type, name in getdents('/tmp', 32768):
        print(name)

Advanced usage:

    import os
    from getdents import *

    fd = os.open('/tmp', O_GETDENTS)

    for inode, type, name in getdents_raw(fd, 2**20):
        print({
                DT_BLK: 'blockdev',
                DT_CHR: 'chardev ',
                DT_DIR: 'dir     ',
                DT_FIFO: 'pipe    ',
                DT_LNK: 'symlink ',
                DT_REG: 'file    ',
                DT_SOCK: 'socket  ',
                DT_UNKNOWN: 'unknown ',
            }[type], {
                # Flag entries whose inode number is 0.
                True: 'd',
                False: ' ',
            }[inode == 0],
            name,
        )

    os.close(fd)
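
Since both examples yield (inode, type, name) tuples, a small helper can summarize a directory without keeping the names in memory. The following is only a sketch: count_by_type is a hypothetical name, not part of the library. It also assumes that entries with inode 0 (the ones flagged 'd' in the advanced example) are stale slots that should be skipped.

```python
from collections import Counter


def count_by_type(entries):
    """Count directory entries per type code.

    `entries` is any iterable of (inode, type, name) tuples, the shape
    yielded in the usage examples. This helper is hypothetical and is
    not provided by the getdents package.
    """
    counts = Counter()
    for inode, d_type, _name in entries:
        if inode == 0:
            # Assumption: inode 0 marks a stale entry, so it is skipped.
            continue
        counts[d_type] += 1
    return counts


# Possible usage with the library (not executed here):
#     from getdents import getdents, DT_DIR, DT_REG
#     counts = count_by_type(getdents('/tmp', 32768))
#     print(counts[DT_REG], 'files,', counts[DT_DIR], 'directories')
```

Because the helper consumes the iterator one entry at a time, memory use stays bounded by the number of distinct type codes rather than by the number of entries.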