4

This is a simplification of my question. I have a numpy array:

x = np.array([0,1,2,3])

and I have a function:

def f(y): return y**2

I can compute f(x).

Now suppose I really want to compute f(x) for a repeated x:

x = np.array([0,1,2,3,0,1,2,3,0,1,2,3])

Is there a way to do this without creating a repeated version of x and in a way that is transparent to f?

In my particular case, f is an involved function and one of the arguments is x. I would like to be able to calculate f when x is repeated without actually repeating it as it wont fit into memory.

Rewriting f to handle repeated x would be work and I was hoping for a clever way possibly to subclass a numpy array to do this.

Any tips appreciated.

2
  • 1
    To clarify, are you looking for an efficient data structure to hold a matrix of the block form (A A A) for some matrix A? Commented Nov 19, 2011 at 14:58
  • @katrielex yes, that's right. Commented Nov 20, 2011 at 14:12

1 Answer 1

8

You can (almost) do this by using a few tricks with strides.

However, there are some major caveats...

import numpy as np
x = np.arange(4)
numrepeats = 3

y = np.lib.stride_tricks.as_strided(x, (numrepeats,)+x.shape, (0,)+x.strides)

print y
x[0] = 9
print y

So, y is now a view into x where each row is x. No new memory is used, and we can make y as large as we like.

For example, I can do this:

import numpy as np
x = np.arange(4)
numrepeats = 1e15

y = np.lib.stride_tricks.as_strided(x, (numrepeats,)+x.shape, (0,)+x.strides)

...and not use any more memory than the 32 bytes required for x. (y would use ~8 Petabytes of ram, otherwise)

However, if we reshape y so that it only has one dimension, we'll get a copy which will use the full amount of memory. There's no way to describe a "horizontally" tiled view of x using strides and shape, so any shape with less than 2 dimensions will return a copy.

Additionally, if we operate on y in a way that would return a copy (e.g. the y**2 in your example), we'll get a full copy.

For that reason, it makes more sense to operate on things in-place. (e.g. y **= 2, or equivalently x **= 2. Both will accomplish the same thing.)

Even for a generic function, you can pass in x and place the result back in x.

E.g.

def f(x):
    return x**3

x[...] = f(x)
print y

y will be updated, as well, as it's just a view into x.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.