I know this question has been asked before (I did a pretty thorough search), and I recognize that Python intentionally doesn't really want you to do this. I also realize that if you create a read-only NumPy array whose entries merely reference the memory locations of a smaller matrix, the result is no longer a contiguous array, which may cause issues with certain tools (Numba or Cython, I suppose).
Nonetheless, I'm looking for a smart answer where we can still use this non-contiguous array in calculations without inflating the memory footprint of the larger NumPy array. Yes, it's easiest to just resize the data (which copies it), but that defeats the goal of minimizing RAM usage. Here is a sample of what I'm doing at a very basic level:
Step 1: generate some random numbers for 12 assets and 1000 simulations, and save them in the variable a:
import numpy as np
a = np.random.randn(12, 1000)
Okay, let's look at its initial shape:
a.shape
(12, 1000)
So now all I want to do is make these EXACT numbers available for, say, 20 iterations (vectorized, not using loops). But I DO NOT want to just make the matrix BIGGER. My goal is to have, instead of a (12, 1000) shape, a (12*20, 1000) shape, with the replication done via pointers (or Python's version of them) rather than copying the (12, 1000) matrix into more memory. The same numbers are used 20 times (all at once) when passed into another function, and they never get overwritten either (read-only is fine, with views). I could explain the reason why, but it's pretty complex math; all you need to know is that the function needs the original random numbers replicated exactly. The brainless memory-copy routine would be something like:
b = np.resize(a, (12*20, 1000))
Which does what I want on the surface, with the new shape:
b.shape
(240, 1000)
I can check that the replicas are equal with a couple of commands. First, the first copy vs. the second:
np.allclose(b[0:12,:], b[12:24,:])
True
And the first copy vs. the last one:
np.allclose(b[0:12,:], b[228:240,:])
True
So great, that's what I want: a repeat of these random numbers throughout the whole array. BUT I don't want my memory to blow up (I am using HUGE arrays that can't fit into most PCs' memory; I am a quant developer with a ton of RAM, but end users don't have as much as I do). So let's examine the sizes of a and b in memory:
a.nbytes
96000
b.nbytes
1920000
Which makes perfect sense: a holds 12 * 1000 float64 values at 8 bytes each (96,000 bytes), and b stores 20 full copies of that, i.e.:
b.nbytes/a.nbytes
20.0
So of course, 20x the memory usage. What I'm trying to get at here is quite simple (well, in other languages): construct b so that the only overhead is the pointers to the 20 replications of a, meaning the memory used is just a plus the pointer(s). Of course, the math has to work using this setup. I have seen some tricks using strides, although I am not sure they will work here (a sketch of the kind of trick I mean is below). I don't want to use loops either (the idea is that it's done in one run, with 20 slightly different inputs). So if ANYONE has figured out a way to do this without using a ton of memory (compared to the base case, here the variable a, versus the replicated array, here the variable b), I would like to know your approach.
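For reference, the stride-based trick I have seen goes something like this (my own sketch using numpy.lib.stride_tricks.as_strided; I am not sure it is safe or applicable to my case):

import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.random.randn(12, 1000)
# Add a leading axis of length 20 with stride 0, so every "copy"
# along that axis re-reads the same 12x1000 block of memory.
b = as_strided(a, shape=(20, 12, 1000), strides=(0,) + a.strides)
b.strides
(0, 8000, 8)
np.shares_memory(a, b)
True

As far as I can tell, a flat (240, 1000) layout cannot be expressed this way, since strides are constant per axis, so the repeat would have to live on its own axis.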
Any help is greatly appreciated!
Maybe a repeat-to style init plus overriding indexing will suffice for you.
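A minimal sketch of one reading of this suggestion, using np.broadcast_to, which returns a read-only, zero-copy view (whether this matches the intended "repeat-to" idea is my assumption):

import numpy as np

a = np.random.randn(12, 1000)
# broadcast_to gives a read-only view with stride 0 on the new
# leading axis; the 12x1000 data is never duplicated in memory.
b = np.broadcast_to(a, (20, 12, 1000))
b.shape
(20, 12, 1000)
np.shares_memory(a, b)
True
np.allclose(b[0], b[19])
True

Two caveats: b.nbytes will still report 1920000 because nbytes is computed from the shape rather than from allocated memory, and reshaping the view to (240, 1000) forces a real copy, so the consuming function has to index with the extra leading axis (b[k] is the k-th replica).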