
I'm trying to figure out a Python memory-handling issue I'm having when filling an array. I'm filling a huge multi-dimensional array of shape [2048, 3000, 256, 76], which I already created, so its memory is already allocated. I fill it in a for loop with random numbers like so:

import numpy as np

# Preallocate the full array up front, then fill one 3-D slice per iteration
myarray = np.zeros((2048, 3000, 256, 76))
for i in range(2048):
    myarray[i, :, :, :] = np.random.normal(0., 1., [3000, 256, 76])

However, when I watch the memory the process is using, it keeps increasing steadily up to the point where I have to kill it. I presume this is because the previous calls to np.random.normal (whose values I have already stored in myarray) are not disposed of. How can I get rid of them? Is it even possible? I've tried running the garbage collector, but that didn't help.

I realize this is a rather basic question, but all my memory-management skills come from C, where staying out of trouble like this was just a matter of freeing arrays/vectors. I don't know how to translate those skills to Python's object creation and disposal beyond the del and gc calls sketched below.
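For illustration (temp is just a stand-in name), this is the kind of cleanup I mean, and it made no difference to the memory usage:

import gc
import numpy as np

myarray = np.zeros((2048, 3000, 256, 76))
temp = np.random.normal(0., 1., [3000, 256, 76])
myarray[0, :, :, :] = temp
del temp      # drop my only reference to the temporary
gc.collect()  # ask the collector to reclaim anything unreachable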

Thanks in advance for any pointers (pun intended)!

PS: This is just a toy snippet distilled from a larger problem. My actual problem involves multithreading, but this should shed some light on it.

1 Answer


Your array is huge: 891 GiB of huge, to be precise. On my system (Windows), I get a MemoryError:

>>> myarray = np.zeros((2048,3000,256,76))
MemoryError: Unable to allocate 891. GiB for an array with shape (2048, 3000, 256, 76) and data type float64
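That figure is just the element count times 8 bytes per float64 element:

>>> 2048 * 3000 * 256 * 76 * 8 / 2**30  # total bytes, converted to GiB
890.625

which rounds to the 891 GiB in the error message.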

which I already created, so its memory is already allocated.

This unfortunately isn't true. On systems other than Windows, I believe np.zeros only reserves virtual address space: the OS does not commit physical pages until you replace the zeros with real data, which is why your memory usage keeps climbing as the loop runs.
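You can watch the lazy commit happen yourself. Here is a minimal sketch, scaled down to an 8 GiB array so it actually fits; it assumes psutil for reading the process's resident set size, but any RSS probe would do:

import numpy as np
import psutil

proc = psutil.Process()

def rss_gib():
    # Resident set size: physical memory the OS has actually committed
    return proc.memory_info().rss / 2**30

a = np.zeros((4096, 4096, 64))  # 8 GiB of float64, but only address space so far
print(f"after np.zeros:     {rss_gib():.2f} GiB resident")

a[:2048] = 1.0  # touch half the pages with real data
print(f"after writing half: {rss_gib():.2f} GiB resident")

On Linux, the first readout should sit near the interpreter's baseline, and the second should jump by roughly 4 GiB, matching the steady climb you see as your loop fills slice after slice.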

