Linked Questions
135 questions linked to/from Lazy Method for Reading Big File in Python?
4 votes · 2 answers · 4k views
Efficient way to read data in python [duplicate]
Possible Duplicate:
Lazy Method for Reading Big File in Python?
I need to read 100 GB (400 million lines) of data from a file line by line. This is my current code, but is there any efficient ...
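A minimal sketch of the lazy approach the linked answers suggest (the function name and file path are illustrative, not from the question): iterating the file object directly streams one line at a time, so memory use stays flat even for a 100 GB file.

```python
def count_nonempty_lines(path):
    """Stream a file line by line; only one line is held in memory."""
    total = 0
    with open(path, "r") as f:
        for line in f:          # the file object is its own lazy line iterator
            if line.strip():
                total += 1
    return total
```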
1 vote · 3 answers · 2k views
How to read file chunk by chunk? [duplicate]
I have test.txt file:
"hi there 1, 3, 4, 5"
When I use Python to read it, how can I read it part by part? For example, first read the first 4 characters, then the next 4, and then all ...
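One plain-loop sketch of fixed-size chunking (the function name is hypothetical): `f.read(n)` returns at most `n` characters, and an empty string signals end of file.

```python
def read_by_four(path):
    """Read a text file four characters at a time with a plain while loop."""
    chunks = []
    with open(path, "r") as f:
        while True:
            part = f.read(4)    # returns at most 4 characters
            if not part:        # empty string means EOF
                break
            chunks.append(part)
    return chunks
```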
1 vote · 1 answer · 741 views
Opening 1GB wave file leads to memory error [duplicate]
Hello stackoverflow users,
Currently I am facing the following problem: I have a function to open a .wav file; it returns the sample rate, length and samples. I have tried it with small files, and it worked ...
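A sketch of the chunked alternative (function name and chunk size are illustrative): instead of `readframes(nframes)` on the whole file, the standard-library `wave` module can pull a bounded number of frames per call, so a 1 GB file never sits in memory at once.

```python
import wave

def frames_in_chunks(path, frames_per_chunk=1024):
    """Yield raw audio frames from a .wav file in fixed-size chunks."""
    with wave.open(path, "rb") as w:
        while True:
            data = w.readframes(frames_per_chunk)
            if not data:        # b"" once all frames are consumed
                break
            yield data
```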
0 votes · 1 answer · 298 views
Manipulating very large text file and clustering analysis [duplicate]
I'm trying to work with a (very) large 45 GB .txt file that cannot be opened in normal text editors.
Data within each row is separated by a space, although there are also spaces within each ...
3 votes · 0 answers · 74 views
Memory error while lowercasing lines in a large textfile [duplicate]
I am trying to lowercase text in a text file. The text file is about 7 GB and I am trying to go through it line by line and lowercase all words. But I get a memory error and I don't understand why ...
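The memory error comes from holding the whole 7 GB of text at once; a sketch of the streaming fix (function and file names are illustrative) lowercases one line at a time into a second file.

```python
def lowercase_file(src, dst):
    """Stream src line by line, writing lowercased lines to dst.
    Only one line is ever in memory, unlike f.read().lower()."""
    with open(src, "r") as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(line.lower())
```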
0 votes · 0 answers · 37 views
how to process binary strings in chunks from memory? [duplicate]
I have the following code:
output = io.BytesIO()
some_function(output) # some_function writes output to file n times
buffer = output.getbuffer()
output.getvalue()
output.getvalue() returns the ...
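A sketch of chunked processing over an in-memory buffer (the function name is hypothetical): `getbuffer()` returns a zero-copy `memoryview` of the `BytesIO` payload, so slicing it yields fixed-size pieces without duplicating the whole buffer the way `getvalue()` does.

```python
import io

def iter_buffer_chunks(buf: io.BytesIO, size=4):
    """Yield fixed-size slices of a BytesIO via its zero-copy memoryview."""
    view = buf.getbuffer()
    for start in range(0, len(view), size):
        yield bytes(view[start:start + size])
```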
378 votes · 21 answers · 466k views
gunicorn: how to resolve "WORKER TIMEOUT"?
I have set up gunicorn with 3 workers, 30 worker connections, and the eventlet worker class. It is set up behind Nginx. After every few requests, I see this in the logs:
[ERROR] gunicorn.error: WORKER ...
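A hypothetical invocation matching the setup described above: gunicorn's `--timeout` flag controls how long a worker may be silent before the master kills and restarts it, which is what the WORKER TIMEOUT log line reports. The app module name and timeout value here are illustrative, not recommendations.

```shell
# Raise --timeout so slow requests are not killed mid-flight;
# "app:application" is a placeholder for the real WSGI entry point.
gunicorn app:application \
    --workers 3 \
    --worker-class eventlet \
    --worker-connections 30 \
    --timeout 120
```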
217 votes · 14 answers · 162k views
Get the MD5 hash of big files in Python
I have used hashlib (which replaces md5 in Python 2.6/3.0), and it worked fine when I opened a file and put its contents into the hashlib.md5() function.
The problem is with very big files whose sizes ...
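A sketch of the standard chunked answer (function name and chunk size are illustrative): `hashlib.md5()` accepts incremental `update()` calls, so the file can be fed in small blocks and never read whole.

```python
import hashlib

def md5_of_file(path, chunk_size=8192):
    """Feed a file to hashlib.md5 in chunks so memory use stays flat."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel: call f.read(chunk_size) until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```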
50 votes · 11 answers · 74k views
How to read file N lines at a time? [duplicate]
I need to read a big file by reading at most N lines at a time, until EOF. What is the most effective way of doing it in Python? Something like:
with open(filename, 'r') as infile:
while not EOF:
...
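One common answer to this, sketched with an illustrative function name: `itertools.islice` pulls at most N lines from the open file per pass, and an empty batch signals EOF, avoiding the `while not EOF` pseudocode above.

```python
from itertools import islice

def batched_lines(path, n):
    """Yield lists of at most n lines until the file is exhausted."""
    with open(path) as f:
        while True:
            batch = list(islice(f, n))  # consumes up to n lines
            if not batch:               # empty list means EOF
                return
            yield batch
```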
64 votes · 8 answers · 62k views
Python how to read N number of lines at a time
I am writing a code to take an enormous textfile (several GB) N lines at a time, process that batch, and move onto the next N lines until I have completed the entire file. (I don't care if the last ...
51 votes · 4 answers · 50k views
Where to use yield in Python best?
I know how yield works; I have seen it used for permutations, and think of that as just a mathematical nicety.
But what is yield's true power? When should I use it? A simple, good example would help.
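A small example of the usual answer (the function is illustrative): `yield` turns a function into a generator, so each value is produced only when the caller asks for it, and the function's state is suspended between values.

```python
def first_squares(limit):
    """A generator: each square is computed on demand, not up front."""
    n = 0
    while True:
        square = n * n
        if square >= limit:
            return              # ends the generator (StopIteration)
        yield square            # pause here until the next value is requested
        n += 1
```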
49 votes · 8 answers · 29k views
Why doesn't Python's mmap work with large files?
[Edit: This problem applies only to 32-bit systems. If your computer, your OS and your python implementation are 64-bit, then mmap-ing huge files works reliably and is extremely efficient.]
I am ...
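A sketch of the 64-bit-friendly pattern the edit note describes (function name is hypothetical): `mmap` maps the file into the address space and the OS pages data in lazily, so searching a huge file never loads it whole.

```python
import mmap

def find_offset(path, needle: bytes):
    """Memory-map a file and search it without reading it into RAM."""
    with open(path, "rb") as f:
        # length 0 maps the whole file; ACCESS_READ keeps it read-only
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle)  # -1 if not found, like bytes.find
```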
41 votes · 4 answers · 30k views
What is the idiomatic way to iterate over a binary file?
With a text file, I can write this:
with open(path, 'r') as file:
for line in file:
# handle the line
This is equivalent to this:
with open(path, 'r') as file:
for line in iter(file....
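The idiom the truncated excerpt is reaching for, sketched with an illustrative wrapper: two-argument `iter()` calls `f.read(blocksize)` repeatedly until it returns the sentinel `b""` at EOF, which is the binary-mode counterpart of `for line in file`.

```python
from functools import partial

def iter_blocks(path, blocksize=1024):
    """Yield fixed-size binary blocks using the iter(callable, sentinel) form."""
    with open(path, "rb") as f:
        # partial(f.read, blocksize) is called until it returns b""
        yield from iter(partial(f.read, blocksize), b"")
```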
25 votes · 6 answers · 42k views
Get progress back from shutil file copy thread
I've got an application from which a file is copied from src to dst:
import shutil
from threading import Thread
t = Thread(target=shutil.copy, args=[ src, dst ]).start()
I wish to have the ...
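`shutil.copy` offers no progress hook, so a common workaround is a hand-rolled chunked copy with a callback; this is a hypothetical sketch (names, chunk size, and callback signature are mine, not shutil's API).

```python
import os
import shutil

def copy_with_progress(src, dst, callback, chunk_size=1024 * 1024):
    """Copy src to dst in chunks, calling callback(bytes_done, total_bytes)
    after each chunk so a GUI or log can show progress."""
    total = os.path.getsize(src)
    done = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            done += len(chunk)
            callback(done, total)
    shutil.copymode(src, dst)   # preserve permission bits like shutil.copy
```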
13 votes · 9 answers · 10k views
Pythonic way to ignore for loop control variable [duplicate]
A Python program I'm writing is to read a set number of lines from the top of a file, and the program needs to preserve this header for future use. Currently, I'm doing something similar to the ...
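A sketch of the usual answers (function name is illustrative): `itertools.islice` grabs exactly the header lines with no loop variable at all, and the commented alternative shows the conventional `_` name for a counter that is never used.

```python
from itertools import islice

def read_header(path, n):
    """Return the first n lines of a file, leaving the rest unread."""
    with open(path) as f:
        header = list(islice(f, n))  # consumes exactly n lines
        # Equivalent, with a throwaway loop variable:
        #   for _ in range(n):
        #       header.append(next(f))
    return header
```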