
I know this question has been asked before (I did a pretty thorough search), and I recognize that NumPy intentionally doesn't really want you to do this. I also recognize that if you create a readable NumPy array that references the memory locations of a smaller matrix, the result is no longer a contiguous array, which may cause issues if you do certain things with it (in Numba or Cython, I suppose).

Nonetheless, I'm looking for a smart answer where we can still use this non-contiguous array in calculations, so as not to increase the memory footprint of a larger NumPy array. Yes, it's easiest to just resize the data (which copies it), but that defeats the goal of minimizing memory in RAM. Here is a sample of what I'm doing at a very basic level:

So step 1) here I'm going to generate some random numbers and do it for 12 assets and 1000 simulations and save into the variable a:

import numpy as np
a = np.random.randn(12,1000)

Okay, let's look at its initial shape:

a.shape
(12, 1000)

So now all I want to do is make these EXACT numbers available for, say, 20 iterations (vectorized, not using loops). But I DO NOT want to just make the matrix BIGGER. My goal is to have, instead of a (12, 1000) shape, a (12*20, 1000) shape, with simple replication via pointers (or Python's version of them), not by copying the (12, 1000) matrix into more memory. The same numbers are used 20 times (all at once) in this example when passed into another function, and they never get overwritten either (read-only is fine, with views). I could explain the reason why, but it's pretty complex math; all you need to know is that the function needs the original random numbers replicated exactly. The brainless memory-copy routine would be something like:

b = np.resize(a, (12*20,1000))

Which does what I want on the surface, with the new shape:

b.shape
(240, 1000)

I can check that they are equal with a couple of commands. First, the start of the array vs. the 2nd copy:

np.allclose(b[0:12,:], b[12:24,:])
True

And the end of the array vs. the 1st one:

np.allclose(b[0:12,:], b[228:240,:])
True

So great, that's what I want: a repeat of these random numbers through the whole array. BUT I don't want my memory to blow up. I am using HUGE arrays that can't fit into most PCs' memory (I am a quant developer with a ton of RAM, but end users don't have as much as I do). So let us examine the size in memory of a and b:

a.nbytes
96000

b.nbytes
1920000

That makes perfect sense, since the memory of a has been multiplied by 20 to store all the repeated values, i.e.:

b.nbytes/a.nbytes
20.0

So of course, 20x the memory usage. What I'm trying to get at here is quite simple (well, in other languages): construct b so that the only overhead is the pointers to the 20 replications of a, so that the memory consumed is merely a plus the pointer(s). And of course, the math has to work using this setup. I have seen some tricks using strides, although I am not sure they will work here. I don't want to use loops either (the idea is that it's done in one run, with 20 slightly different inputs). So if ANYONE has figured out a way to do this without using a ton of memory (compared to the base case, here the variable a, versus the replicated array, here the variable b), I would like to know your approach.

Any help is greatly appreciated!

2 Comments
  • This is trivial if you use Python-based containers; the problem is you want a NumPy array, which won't work the way you want it to (I guess you could use object dtype). Commented Nov 29, 2021 at 22:14
  • A bit of work, but worth it I think: make some subclass of NumPy array, adding a repeat parameter to init and overriding indexing; maybe that will suffice for you. Commented Nov 29, 2021 at 22:20

1 Answer


First, your use of resize actually does (summarizing the code):

a = np.concatenate((a,) * 20).reshape(new_shape)
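
A quick check of that equivalence, with the question's shapes (a sketch; np.array_equal just confirms the two constructions match element for element):

import numpy as np
a = np.random.randn(12, 1000)
np.array_equal(np.resize(a, (240, 1000)),
               np.concatenate((a,) * 20))
True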

I'm a little confused about the 20 repeats: you also talk about "20 slightly different inputs". Is that this array, or some other input? Also, what's the point of using a (240, 1000) shape instead of (20, 12, 1000)?

With broadcasting, a (1, 12, 1000) array can behave the same as a (20, 12, 1000) one.
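
For instance, with the question's shapes, a minimal sketch (the leading axis carries the 20 repeats; np.broadcast_to returns a read-only view):

a = np.random.randn(12, 1000)
b = np.broadcast_to(a, (20, 12, 1000))   # no data is copied
np.shares_memory(a, b)                   # True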

A small sample array:

In [646]: arr = np.arange(12).reshape(3,4)
In [647]: arr.shape, arr.strides
Out[647]: ((3, 4), (32, 8))

We can "resize" as you do with repeat:

In [655]: arr1 = arr.repeat(5, 0)
In [656]: arr1.shape, arr1.strides
Out[656]: ((15, 4), (32, 8))

Or repeat on a new leading axis:

In [657]: arr1 = arr[None,:,:].repeat(5, 0)
In [658]: arr1.shape, arr1.strides
Out[658]: ((5, 3, 4), (96, 32, 8))

Or we can use broadcasting to make an equivalent array:

In [660]: arr2 = np.broadcast_to(arr,(5,3,4))
In [661]: arr2.shape, arr2.strides
Out[661]: ((5, 3, 4), (0, 32, 8))

It has the same shape as arr1, but the leading stride is 0.

In [662]: np.allclose(arr2, arr1)
Out[662]: True

arr1 is a copy of the original, but arr2 is a view (of the original arange used to make arr):

In [665]: arr1.base        # no output: arr1.base is None
In [666]: arr2.base
Out[666]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In other words, arr2 doesn't increase the memory footprint of arr.
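
One way to confirm this (np.shares_memory tests whether two arrays overlap in memory; note that nbytes on a broadcast view reports shape times itemsize, the virtual size, not the actual allocation):

np.shares_memory(arr, arr1)   # False: repeat made a copy
np.shares_memory(arr, arr2)   # True: broadcast_to is a view
arr2.nbytes                   # 480, but nothing extra was allocated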

Usually we don't even have to use np.broadcast_to:

arr3 = arr[None, :, :]

is enough, or even arr itself.

For example, we can add a size-5 array to any of these:

In [670]: x = np.arange(5)[:,None,None]
In [671]: np.allclose(x+arr1, x+arr2)
Out[671]: True
In [672]: np.allclose(x+arr1, x+arr[None,:,:])
Out[672]: True
In [673]: np.allclose(x+arr1, x+arr)
Out[673]: True

The result will be the full size, the same as with arr1. Using strides and broadcasting can reduce the size of the initial arrays compared to a repeat, but the final result still holds 5*3*4 values.
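
To illustrate: arr2 itself adds no memory, but the result of an operation on it is fully materialized:

(x + arr2).nbytes   # 480 bytes: a full (5, 3, 4) int64 array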


2 Comments

I think you gathered my question correctly (even with some confusion); my array is actually 3D to begin with, but I made it 2D so people could propose solutions a bit more easily. This array does not change; it's the usage of this array in another function which varies the inputs, but this array stays the same. So your interpretation was correct. I'm still trying to digest your proposed solutions, but I believe one of them does what I was asking, i.e. uses views to replicate an initial matrix without expanding the memory footprint. I will test and likely accept your answer, thanks!
b = a.repeat(20, 0) does have the same strides as a, but when I compare a.nbytes and b.nbytes, b still appears to be 20x bigger in memory; sys.getsizeof shows the same. Maybe I missed something here. I also couldn't make the array view with np.broadcast_to(a, (12*20, 1000)) without an error. I have seen np.lib.stride_tricks.as_strided replicate an array with pointers to the contents of the initial array; I just can't find the proper syntax. I added info to my question (this array is read-only after creating it, so a non-contiguous array view is fine).
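
For reference, a sketch of the as_strided syntax the last comment asks about, using the question's shapes. np.broadcast_to(a, (12*20, 1000)) errors because strides advance by a fixed step per axis, so no strided view of a (12, 1000) array can be (240, 1000); the tiling has to live on its own leading axis:

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.random.randn(12, 1000)

# A (20, 12, 1000) view with a zero leading stride: all 20 "copies"
# point at the same underlying 12x1000 buffer.
b = as_strided(a, shape=(20, 12, 1000),
               strides=(0,) + a.strides, writeable=False)

np.shares_memory(a, b)   # True: no data was copied

Note that reshaping this view to (240, 1000) would silently force a copy, and that nbytes on the view still reports the virtual 1,920,000 bytes, not the actual allocation.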
