I have something like np.arange(100000) and I need to retrieve the data between pairs of indexes many times. Currently I am running this, which is slow:

import numpy as np
data = np.arange(100000)
# This array usually contains thousands of slices
slices = np.array( [
       [1, 4],
       [10,20],
       [100,110],
       [1000,1220]
])

# One way I have been doing it
np.take(data, [i for iin, iout in slices for idx in range(iin, iout)])
# The other way
[data[iin:iout] for iin, iout in slices]

Both ways are slow, and I need this to be very fast. I am looking for something like this:

data[slices[:,0], slices[:,1]]

1 Answer

Some timings with your slices and data = np.arange(2000)

Your take version, corrected (the comprehension has to yield idx, not i), along with the equivalent direct indexing:

In [360]: timeit np.take(data, [idx for iin, iout in slices for idx in range(iin,iout)])
10000 loops, best of 3: 92.5 us per loop

In [359]: timeit data[[idx for iin, iout in slices for idx in range(iin,iout)]]
10000 loops, best of 3: 92.2 us per loop

Your 2nd version, corrected by wrapping the list of slices in np.concatenate, is quite a bit faster:

In [361]: timeit np.concatenate([data[iin:iout] for iin,iout in slices])
100000 loops, best of 3: 15.8 us per loop

Using np.r_ to turn the slices into a single index array is not much of an improvement over your 1st version. The second timing below covers building the index alone:

In [362]: timeit data[np.r_[tuple([slice(i[0],i[1]) for i in slices])]]
10000 loops, best of 3: 79 us per loop
In [363]: timeit np.r_[tuple([slice(i[0],i[1]) for i in slices])]
10000 loops, best of 3: 67.5 us per loop

Constructing the index takes the bulk of the time (67.5 us of the 79 us total).
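To rerun the comparison outside IPython, the three approaches above can be wrapped in functions and timed with the standard timeit module. This is only a sketch: the function names (take_listcomp, concat_slices, r_index) are made up for illustration, data and slices are the same small test values as above, and the absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np

# Same small test case as in the timings above.
data = np.arange(2000)
slices = np.array([[1, 4], [10, 20], [100, 110], [1000, 1220]])


def take_listcomp(data, slices):
    # Build a flat Python list of indices, then gather in one call.
    return np.take(data, [idx for iin, iout in slices for idx in range(iin, iout)])


def concat_slices(data, slices):
    # Slice the array piece by piece and join the pieces.
    return np.concatenate([data[iin:iout] for iin, iout in slices])


def r_index(data, slices):
    # Let np.r_ expand the slice objects into one index array.
    return data[np.r_[tuple(slice(i[0], i[1]) for i in slices)]]


# Hypothetical timing harness; 1000 repetitions keeps the run short.
for fn in (take_listcomp, concat_slices, r_index):
    secs = timeit.timeit(lambda: fn(data, slices), number=1000)
    print(f"{fn.__name__}: {secs / 1000 * 1e6:.1f} us per call")
```

All three functions return the same flat array, so they can be swapped freely while testing with a realistic number of slices.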

Of course, the rankings at this size might change for a much larger problem.

Since your slices vary in length, there isn't much hope of generating them all in a vectorized way, that is, 'in parallel'. I don't know whether a Cython implementation would speed it up much.

There are more timings in an answer to an earlier, similar question: https://stackoverflow.com/a/11062055/901925
