Swap Array Data in NumPy

Question

I have many large multidimensional NP arrays (2D and 3D) used in an algorithm. There are numerous iterations in this, and during each iteration the arrays are recalculated by performing calculations and saving into temporary arrays of the same size. At the end of a single iteration the contents of the temporary arrays are copied into the actual data arrays.

Example:

global A, B # ndarrays
A_temp = numpy.zeros(A.shape)
B_temp = numpy.zeros(B.shape)
for i in xrange(num_iters):
    # Calculate new values from A and B storing in A_temp and B_temp...
    # Then copy values from temps to A and B
    A[:] = A_temp
    B[:] = B_temp

This works fine, however it seems a bit wasteful to copy all those values when A and B could just swap. The following would swap the arrays:

A, A_temp = A_temp, A
B, B_temp = B_temp, B

However there can be other references to the arrays in other scopes which this won't change.

It seems like NumPy could have an internal method for swapping the internal data pointer of two arrays, such as numpy.swap(A, A_temp). Then all variables pointing to A would be pointing to the changed data.

Can you give an example of "other references to the arrays in other scopes which this won't change"? — dmytro
– dmytro, Commented Mar 31, 2012 at 9:49
For example, for the calculation step I could (and will) have multiple threads, called with a function where the fields are passed as arguments. To reduce the overhead of instantiating multiple threads they cycle as well. — coderforlife
– coderforlife, Commented Apr 1, 2012 at 3:57
if I recall correctly, you can't pass the data between threads/processes without copy unless you're using specific features like sharedctypes arrays or so. But I might be wrong... — dmytro
– dmytro, Commented Apr 1, 2012 at 22:04
Sharing between threads is not difficult, in fact does not require any extra work. It just requires you to be extra careful about race conditions (and since I am only reading in the loop and only one thread is writing the arrays this is not really an issue). For multiprocessing some additional work is required, but the Scipy site says "It is possible to share memory between processes, including numpy arrays." (scipy.org/ParallelProgramming). — coderforlife
– coderforlife, Commented Apr 4, 2012 at 4:05
indeed, you need to use sharedctypes arrays for that (I only worked with multiprocessing so I ain't sure about threading of course) — dmytro
– dmytro, Commented Apr 6, 2012 at 13:17

dmytro · Accepted Answer · 2012-03-31 10:03:35Z

2

Even though you way should work as good (I suspect the problem is somewhere else), you can try doing it explicitly:

import numpy as np
A, A_temp = np.frombuffer(A_temp), np.frombuffer(A)

It's not hard to verify that your method works as well:

>>> import numpy as np
>>> arr = np.zeros(100)
>>> arr2 = np.ones(100)
>>> print arr.__array_interface__['data'][0], arr2.__array_interface__['data'][0]
152523144 152228040

>>> arr, arr2 = arr2, arr
>>> print arr.__array_interface__['data'][0], arr2.__array_interface__['data'][0]
152228040 152523144

... pointers succsessfully switched

edited Mar 31, 2012 at 10:03

answered Mar 31, 2012 at 9:52

dmytro

1,30310 silver badges21 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

coderforlife Over a year ago

The frombuffer idea is interesting, however it does require a "reshape(A.shape)" command since it only returns 1D arrays. Besides that though this does not solve the issue with alternate references however. I tried a few different things with it, and what is needed a "setbuffer" function... This has given me the idea to look into views though (which I don't really know too much about).

NPE · Accepted Answer · 2012-03-31 09:58:12Z

1

Perhaps you could solve this by adding a level of indirection.

You could have an "array holder" class. All that would do is keep a reference to the underlying NumPy array. Implementing a cheap swap operation for a pair of these would be trivial.

If all external references are to these holder objects and not directly to the arrays, none of those references would get invalidated by a swap.

answered Mar 31, 2012 at 9:58

NPE

503k114 gold badges970 silver badges1k bronze badges

3 Comments

coderforlife Over a year ago

This is definitely a viable alternative. It just seems silly since the ndarray already has a pointer to the proper data that could be changed... if I cannot figure out a way using views this will be the answer.

NPE Over a year ago

@thaimin: The Pythonic way to swap two things is a, b = b, a. This swaps the references. I can't think of a single standard class that implements a swap interface like the one you're looking for. It's just not done in Python since it's almost never needed.

coderforlife Over a year ago

I understand. I attempted to use views which would work if it let you re-assign the "base" field. I attempted to assign the "data" field which almost worked but doesn't allow swapping (although the first assignment does exactly what I want and a.data, b.data = b.data, a.data does swap "one" of them).

Iguananaut · Accepted Answer · 2015-07-15 00:05:58Z

I realize this is an old question, but for what it's worth you could also swap data between two ndarray buffers (without a temp copy) by performing an xor swap:

A_bytes = A.view('ubyte')
A_temp_bytes = A.view('ubyte')
A_bytes ^= A_temp_bytes
A_temp_bytes ^= A_bytes
A_bytes ^= A_temp_bytes

Since this was done on views, if you look at the original A and A_temp arrays (in whatever their original dtype was) their values should be correctly swapped. This is basically equivalent to the numpy.swap(A, A_temp) you were looking for. It's unfortunate that it requires 3 loops--if this were implemented as a ufunc (maybe it should be) it would be a lot faster.

Collectives™ on Stack Overflow

Swap Array Data in NumPy

3 Answers 3

1 Comment

3 Comments

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

1 Comment

3 Comments

Comments

Linked

Related