I have an nxN numpy array where N is around 500_000. Depending on the use case of the program, n can be 1 or, e.g., 1000. I'd like to avoid flooding my code with if clauses on the shape of the array every time I use it. Also, I need the first dimension of this array (n) to match that of other arrays in the code base.

So, for the n=1 case, I initially decided to allocate an nxN array, populate the first row and replicate the data with:

my_array = np.repeat(np.reshape(my_array[0,:],(1,my_array.shape[1])),1000,axis=0)

However, this is rather wasteful and takes up 9 GiB of memory. I'd like to have a 1xN array with the data and a 1000xN pointer array (if such a thing exists in Python) pointing to the 1xN array.

Any ideas? Thanks


Update: The solution with np.broadcast_to() worked brilliantly. Indeed, no need to create the 2D array, which is what I was looking for.

  • @user2357112 why do you think this was not a duplicate? Commented Jun 13, 2024 at 8:52
  • @mozway: Trying to create a 1-dimensional array as a repeating view of another 1-dimensional array doesn't work, but it's a different story for trying to create a 2-dimensional array, repeating the data along an axis that didn't exist or was length-1 in the original array. Commented Jun 13, 2024 at 8:55
  • There are no pointers in NumPy or in Python (only references in Python). In general, indices are enough to simulate pointers. When this is not enough, Cython or a native language like C/C++ is the way to go. Note that indirections (i.e. both pointers and indices) tend to prevent the use of SIMD units and so decrease performance, though using a lot less memory can make code more cache-friendly and increase performance too. Commented Jun 13, 2024 at 8:55
  • @user2357112 maybe I misunderstood what is wanted, I thought making a view of the array as suggested here would be what OP is looking for. Commented Jun 13, 2024 at 8:59
  • @mozway: The answers there do cover what the questioner wants to do here, but the question itself is critically different, enough so that the top answer starts with "you can't do this". I don't think it's a good enough match to be an appropriate dupe target. Commented Jun 13, 2024 at 9:06

1 Answer

There's no such thing as a "pointer array" in NumPy, but you can get what you need with broadcasting.

First off, there's a decent chance you don't actually need to create that 2D array in the first place. With the NumPy broadcasting rules, trying to use a shape-(N,) 1-dimensional array and a shape-(M, N) 2-dimensional array together in the same operation will usually treat the 1-dimensional array exactly like the 2-dimensional array you want to create from it. You can probably just make a 1D array and go ahead and use that.
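As a minimal sketch of that implicit broadcasting, with small illustrative sizes (the names `row` and `other` are placeholders, not from the question):

```python
import numpy as np

N = 6                                   # stand-in for the ~500_000 columns in the question
row = np.arange(N, dtype=float)         # shape (N,): the single row of data
other = np.ones((4, N))                 # shape (4, N): some other array in the code base

# The 1D array is broadcast along axis 0; no (4, N) copy of `row` is built by hand.
result = other * row

# Same values as the explicit, memory-hungry version using np.repeat.
explicit = other * np.repeat(row[np.newaxis, :], 4, axis=0)
print(result.shape)
print(np.array_equal(result, explicit))
```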

If you need a 2D array for some reason, you can use numpy.broadcast_to to create a read-only 2D view of a 1D array, where every row shares the same underlying memory:

import numpy
N = 500_000
a = numpy.zeros(N)  # placeholder for your N-element array
b = numpy.broadcast_to(a, (1000, N))  # view onto a's data, not a copy
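To check that the rows really share memory, one can inspect the view's strides and flags. This is a small sketch with an illustrative N of 8 rather than 500_000:

```python
import numpy as np

N = 8                                   # small illustrative size
a = np.arange(N, dtype=np.float64)
b = np.broadcast_to(a, (1000, N))

print(b.shape)                          # (1000, 8)
print(b.strides)                        # (0, 8): a stride of 0 along axis 0 means every row is the same memory
print(np.shares_memory(a, b))           # True
print(b.flags.writeable)                # False: the broadcast view is read-only

# Writes go through the original array and show up in every row of the view.
a[0] = 42.0
print(b[999, 0])                        # 42.0
```

Because the view is read-only, any code that needs to modify individual rows would still need a real copy (for example via `b.copy()`), which brings back the full memory cost.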

1 Comment

And a (1,N) array will broadcast in the same way (if for some other reason you need it to be 2D).
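One possible way to use this without branching on the shape is sketched below. The helper `match_rows` is hypothetical (it is not a NumPy function) and assumes the input is either (1, N) or already (n, N):

```python
import numpy as np

def match_rows(arr, n):
    """Hypothetical helper: return a read-only view of `arr` with n rows.

    Works for a (1, N) input (the row is repeated virtually) and for an
    (n, N) input (the shape is already correct), so callers need no if clause.
    """
    return np.broadcast_to(arr, (n, arr.shape[1]))

single = np.arange(5, dtype=float).reshape(1, 5)   # the n = 1 case
full = np.ones((3, 5))                             # the n = 3 case

view_single = match_rows(single, 3)
view_full = match_rows(full, 3)
print(view_single.shape, view_full.shape)
```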
