I have a 2-dimensional array of integers, which we'll call "A".

I want to create a 3-dimensional array "B" of all 1s and 0s such that:

  • for any fixed (i,j), sum(B[i,j,:]) == A[i,j]; that is, B[i,j,:] contains exactly A[i,j] 1s
  • the 1s are randomly placed in the 3rd dimension.
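
To make the first requirement concrete, here is a minimal sketch of a check that any candidate B could be run through. The helper name `satisfies_spec` is hypothetical (it is not part of NumPy), and it only tests the sum condition, not the randomness of the placement.

```python
import numpy as np

def satisfies_spec(A, B):
    """Return True if B holds only 0s and 1s and sums to A along its last axis."""
    A = np.asarray(A)
    B = np.asarray(B)
    # Every entry of B must be 0 or 1.
    only_bits = np.isin(B, (0, 1)).all()
    # B must have one extra trailing axis, and summing over it must give A.
    sums_match = B.shape[:-1] == A.shape and np.array_equal(B.sum(axis=-1), A)
    return bool(only_bits and sums_match)

# Small hand-built example (hypothetical data): Z = 4.
A = np.array([[0, 1], [2, 3]])
B = np.zeros((2, 2, 4), dtype=int)
B[0, 1, 2] = 1
B[1, 0, [0, 3]] = 1
B[1, 1, [0, 1, 3]] = 1
print(satisfies_spec(A, B))
```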

I know how I would do this using standard Python indexing, but this turns out to be very slow.

I am looking for a way to do this that takes advantage of the features that make NumPy fast.

Here is how I would do it using standard indexing:

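import numpy as np

# X, Y, Z and the integer array A of shape (X, Y) are assumed to be defined already.
B = np.zeros((X, Y, Z))
indexoptions = range(Z)

for i in range(X):
    for j in range(Y):
        # Pick A[i, j] distinct positions along the third axis and set them to 1.
        replacedindices = np.random.choice(indexoptions, size=A[i, j], replace=False)
        B[i, j, replacedindices] = 1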

Can someone please explain how I can do this in a faster way?

Edit: Here is an example "A":

A=np.array([[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4],[0,1,2,3,4]])

In this case X = Y = 5 and Z >= 5.

Trying to make progress on this, I asked a simpler question: stackoverflow.com/questions/26310897/… - but then I realized that my planned np.random.shuffle(np.rollaxis(B, 2)) doesn't shuffle all the rows independently, so this is not quite an answer yet. Building blocks, maybe. :) Commented Oct 11, 2014 at 4:22

1 Answer

Essentially the same idea as @JohnZwinck and @DSM, but with a shuffle function for shuffling a given axis:

import numpy as np

def shuffle(a, axis=-1):
    """
    Shuffle `a` in-place along the given axis.

    Apply numpy.random.shuffle to the given axis of `a`.
    Each one-dimensional slice is shuffled independently.
    """
    b = a.swapaxes(axis,-1)
    # Shuffle `b` in-place along the last axis.  `b` is a view of `a`,
    # so `a` is shuffled in place, too.
    shp = b.shape[:-1]
    for ndx in np.ndindex(shp):
        np.random.shuffle(b[ndx])
    return


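def random_bits(a, n):
    """
    Return an array of 0s and 1s with shape `a.shape + (n,)` in which each
    slice along the last axis holds `a[...]` ones at random positions.
    """
    # Start with a[i, j] ones followed by zeros along the new last axis...
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    # ...then shuffle each of those slices independently.
    shuffle(b)
    return b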


if __name__ == "__main__":
    np.random.seed(12345)

    A = np.random.randint(0, 5, size=(3,4))
    Z = 6

    B = random_bits(A, Z)

    print("A:")
    print(A)
    print("B:")
    print(B)

Output:

A:
[[2 1 4 1]
 [2 1 1 3]
 [1 3 0 2]]
B:
[[[1 0 0 0 0 1]
  [0 1 0 0 0 0]
  [0 1 1 1 1 0]
  [0 0 0 1 0 0]]

 [[0 1 0 1 0 0]
  [0 0 0 1 0 0]
  [0 0 1 0 0 0]
  [1 0 1 0 1 0]]

 [[0 0 0 0 0 1]
  [0 0 1 1 1 0]
  [0 0 0 0 0 0]
  [0 0 1 0 1 0]]]
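
The comparison inside `random_bits` relies on broadcasting, which may be easier to see in isolation. The snippet below is a small illustration with made-up values for `a` and `n`; it shows the array before any shuffling, where each slice along the last axis begins with `a[i, j]` ones followed by zeros.

```python
import numpy as np

a = np.array([[2, 0], [1, 3]])  # illustrative values only
n = 4

# a[..., np.newaxis] has shape (2, 2, 1) and np.arange(n) has shape (4,).
# Broadcasting the comparison gives shape (2, 2, 4); position k along the
# last axis is 1 exactly when k < a[i, j].
bits = (a[..., np.newaxis] > np.arange(n)).astype(int)

print(bits.shape)
print(bits)
```

Because the ones are all packed at the front at this stage, the only remaining work is to shuffle each last-axis slice independently, which is what `shuffle` does.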

3 Comments

Hmmph. I'm annoyed that shuffle doesn't work like I thought it did. Could the Python-level loop be avoided by reshaping to a lower-D object and shuffling that?
@DSM: I share your annoyance! I couldn't find a way to make this work with a single call to np.random.shuffle. (My first version of shuffle--not shown here--is a vectorized Fisher-Yates algorithm, but it is not as clear as this one, and probably a lot slower than this one when the non-axis dimensions are small.)
Thanks! For large arrays this method is more than 100 times faster than the way I was originally doing it.
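
On the question raised in these comments of avoiding the Python-level loop: one possible approach, sketched below, is to draw an independent random key for every element and reorder each slice by the argsort of its keys. This is not the vectorized Fisher-Yates version mentioned above (which is not shown), and no speed claim is made for it. The function name `random_bits_vectorized` and the optional `rng` parameter are hypothetical, and the sketch assumes a NumPy version that provides `np.random.default_rng` and `np.take_along_axis`.

```python
import numpy as np

def random_bits_vectorized(a, n, rng=None):
    """Like random_bits, but shuffles every last-axis slice without a Python loop."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # a[i, j] ones followed by zeros along a new last axis of length n.
    b = (a[..., np.newaxis] > np.arange(n)).astype(int)
    # One uniform random key per element; sorting the keys of a slice
    # yields a random permutation of that slice's positions.
    keys = rng.random(b.shape)
    order = keys.argsort(axis=-1)
    # Reorder each slice of b by its own permutation.
    return np.take_along_axis(b, order, axis=-1)

A = np.array([[0, 1, 2, 3, 4]] * 5)
B = random_bits_vectorized(A, 6, rng=np.random.default_rng(0))
print(np.array_equal(B.sum(axis=-1), A))
```

The trade-off is extra memory for the keys and a sort per slice in exchange for removing the loop. Recent NumPy releases also offer `Generator.permuted(x, axis=...)`, which shuffles slices along an axis independently and may be worth trying where it is available.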
