Demystify numpy indexing/slicing

Question

Could you please help demystify the following NumPy indexing/slicing behaviours? Thanks!

import numpy as np

arr = np.arange(60).reshape(3,4,5)

print(arr[2, :, 4])     #1

print(arr[[2], :, 4])   #2
print(arr[2, :, [4]])   #3
print(arr[[2], :, [4]]) #4

[44 49 54 59]
[[44 49 54 59]]
[[44 49 54 59]]
[[44 49 54 59]]

#1 is comprehensible, whereas #2, #3 and #4 really confuse me when it comes to the shape of the results ((1,4) arrays). More specifically, when does an inner [] affect the dimensions of the resulting array?

A trickier example:

arr = np.arange(120).reshape(4,6,5)
arr[[1,3], :3, [4,2]]

array([[ 34,  39,  44],
       [ 92,  97, 102]])

Have you looked at numpy.org/doc/stable/user/basics.indexing.html? It describes everything. You're in the "advanced indexing" section.
– Frank Yellin, Commented Dec 11, 2024 at 20:45
#2, #3 and #4 are actually examples of mixed basic-advanced indexing. The size 4 middle slice is 'tacked' on to the end. The other indices select a size 1 dimension. #1 is 'pure' basic, with scalars and a slice.
– hpaulj, Commented Dec 11, 2024 at 21:46
@FrankYellin, thanks for your attention and the hint! In fact, I've combed through that chapter a few times; it turns out that I can read every word, but can't understand the paragraph. Here's an example: "In general, the shape of the resultant array will be the concatenation of the shape of the index array (or the shape that all the index arrays were broadcast to) with the shape of any unused dimensions (those not indexed) in the array being indexed."
– ThxAlot, Commented Dec 12, 2024 at 0:28
In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. numpy.org/doc/stable/user/…
– hpaulj, Commented Dec 12, 2024 at 1:43
I (and others) have been answering this mixed indexing case for a long time, e.g. stackoverflow.com/q/45471197/901925. I really need to favorite the best answer for use as a duplicate.
– hpaulj, Commented Dec 12, 2024 at 1:51

hpaulj · Accepted Answer · 2024-12-12 02:13:55Z

In [118]: arr = np.arange(60).reshape(3,4,5)

Your first example is straightforward basic indexing, with two scalar indices and a slice. The result is a view, and the shape is that of the 2nd dimension, (4,):

In [119]: arr[2, :, 4]
Out[119]: array([44, 49, 54, 59])

Same thing, but a copy, when using an array/list instead of the slice:

In [120]: arr[2, [0,1,2,3], 4]
Out[120]: array([44, 49, 54, 59])
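The view/copy difference between [119] and [120] can be checked directly. This is only a small sketch; it uses np.shares_memory, which is not part of the original transcript, and the variable names are mine:

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)

basic = arr[2, :, 4]                  # scalars and a slice only: basic indexing
advanced = arr[2, [0, 1, 2, 3], 4]    # a list in the index: advanced indexing

print(np.shares_memory(arr, basic))      # True: a view into arr's data
print(np.shares_memory(arr, advanced))   # False: a separate copy

# Writing to the copy leaves arr untouched
advanced[0] = -1
print(arr[2, 0, 4])                      # 44
```

Writing to basic instead would modify arr, since the two share the same buffer.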

If I provide size 1 lists (arrays) for all indices, the result is (1,):

In [121]: arr[[2], [0], [4]]
Out[121]: array([44])

Same if one or more is a scalar:

In [122]: arr[[2], [0], 4]
Out[122]: array([44])

With size 1 lists instead of the scalars, the same (4,) shape - because (1,) broadcasts with (4,):

In [123]: arr[[2], [0,1,2,3], [4]]
Out[123]: array([44, 49, 54, 59])
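To see the broadcasting step in [123] on its own, the index lists can be broadcast by hand. A rough sketch, assuming nothing beyond np.broadcast and np.broadcast_arrays; the names i, j, k and manual are only for illustration:

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)

i = np.array([2])            # shape (1,)
j = np.array([0, 1, 2, 3])   # shape (4,)
k = np.array([4])            # shape (1,)

# The index arrays are broadcast against each other first
bshape = np.broadcast(i, j, k).shape
print(bshape)                # (4,)

# After broadcasting, elements are picked position by position
bi, bj, bk = np.broadcast_arrays(i, j, k)
manual = np.array([arr[a, b, c] for a, b, c in zip(bi, bj, bk)])

print(np.array_equal(manual, arr[i, j, k]))   # True
```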

But if there is a slice in the middle, the shape is (1,4):

In [124]: arr[[2], :, [4]]
Out[124]: array([[44, 49, 54, 59]])

Same if one of those is a scalar:

In [125]: arr[2, :, [4]]
Out[125]: array([[44, 49, 54, 59]])

If I move the [2] out, I get a (4,1) array:

In [126]: arr[2][ :, [4]]
Out[126]: 
array([[44],
       [49],
       [54],
       [59]])

The 2 selects a single plane (index 2, the last one), the 4 comes from the slice, and the 1 from the last dimension.

Generalizing [125] so that the last index has size 2, the result is (2,4):

In [127]: arr[2, :, [1,4]]
Out[127]: 
array([[41, 46, 51, 56],
       [44, 49, 54, 59]])

Both this and [125] are examples where the slice is in the middle, and its dimension is tacked on the end, after the dimensions produced by advanced indexing.
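The same rule accounts for the "trickier" example in the question. The sketch below is just my reading of the rule stated above, with an adjacent-index case added for contrast; res, expected, adjacent and mixed are illustrative names:

```python
import numpy as np

arr = np.arange(120).reshape(4, 6, 5)

# Advanced indices separated by a slice: the broadcast dimension (2,)
# comes first, and the sliced dimension (3,) is tacked on after it.
res = arr[[1, 3], :3, [4, 2]]
print(res.shape)                        # (2, 3)

# Row n is arr[i_n, :3, k_n] for the index pairs (1, 4) and (3, 2)
expected = np.stack([arr[1, :3, 4], arr[3, :3, 2]])
print(np.array_equal(res, expected))    # True

# Advanced indices next to each other: the broadcast dimension takes
# their place, so the leading slice keeps its position.
adjacent = arr[:3, [1, 3], [4, 2]]
print(adjacent.shape)                   # (3, 2)

# A scalar combined with a list is treated like the separated case
mixed = arr[1, :3, [4, 2]]
print(mixed.shape)                      # (2, 3)
```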

As I commented, this has come up periodically for many years.

Without a 'slice in the middle', we get the expected shape - (3,4) from the slices, (1,) from the advanced index:

In [130]: arr[:, :, [4]].shape
Out[130]: (3, 4, 1)

The result is a copy of the data in arr (it does not share memory with arr), but it is itself a view, a transpose, of a new (1,3,4) base array:

In [131]: arr[:, :, [4]].base.shape
Out[131]: (1, 3, 4)

As in the 'slice in the middle' cases, the advanced indexing dimension is first, and the slices are 'tacked' on. But in this case it can transpose it to the desired shape. It's an implementation detail that usually is ignored.
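A hedged way to poke at this: the check below only relies on the result not sharing memory with arr. What .base looks like is an implementation detail, so the sketch prints it rather than assuming a particular shape:

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)
sub = arr[:, :, [4]]

print(sub.shape)                     # (3, 4, 1)
print(np.shares_memory(arr, sub))    # False: the data was copied out of arr

# Implementation detail: the result may be a view of another new array
if sub.base is not None:
    print(sub.base.shape)

# Writing to the result does not touch arr
sub[...] = -1
print(arr[0, 0, 4])                  # 4
```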

While this is probably overwhelming from a basic user perspective, it is interesting insight for more advanced users, especially the last bit.
Dear expert, thank you very much for such detailed demystification! I'm now trying to digest it and hopefully my foggy mind would get cleaned by your great help!

Frank Yellin · Accepted Answer · 2024-12-11 20:51:57Z

The basic idea is that just like x[0] is the first element of the array, you can use x[[1, 3, 4, 6]] to create a new array pulling out four elements of the array. Note that this notation creates a dimension. You also get the same result if [1, 3, 4, 6] is replaced with the 4-element numpy array containing those values.
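As a minimal illustration on a 1-D array (the array x and the other names here are made up for the example):

```python
import numpy as np

x = np.arange(10) * 10

print(x[0])                          # 0, a single element

picked = x[[1, 3, 4, 6]]             # list index: a new 4-element array
same = x[np.array([1, 3, 4, 6])]     # array index: same result
print(np.array_equal(picked, same))  # True

# The result takes the shape of the index array
grid = x[np.array([[1, 3], [4, 6]])]
print(grid.shape)                    # (2, 2)
```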

If you have more than one such array or list in the index, they are merged together. Hence you can use:

x[[1, 2],[3, 4]] to get elements x[1,3] and x[2, 4] as a two-element array. There are numerous numpy functions that return lists of arrays, where the first array is the first index, the second array is the second index, and so forth. You can use those arrays directly in indexing notation.
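A short sketch of both points, using np.where as one such function (the 5x6 array and the divisibility condition are arbitrary choices for the example):

```python
import numpy as np

x = np.arange(30).reshape(5, 6)

# Index lists are paired up element by element: x[1, 3] and x[2, 4]
pair = x[[1, 2], [3, 4]]
print(pair.tolist())                 # [9, 16]

# np.where(condition) returns one index array per dimension,
# which can be passed straight back as an index
rows, cols = np.where(x % 7 == 0)
hits = x[rows, cols]
print(hits.tolist())                 # [0, 7, 14, 21, 28]
```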

He's using mixed basic-advanced indexing, which confuses many users
@hpaulj. Agreed. It never made sense to me until I realized how nicely it fit in with the output of np.where(condition).
