Pandas MultiIndex get all rows with label value

Question

Assume you have a pandas DataFrame with a MultiIndex. You want to get all the rows that have a label with a particular value. How do you do this?

My first thought was a boolean mask...

df[df.index.labels == 1].head()

but this does not work.

Thanks!

You can convert the index back to columns and then filter. It certainly works with a single index. It should work with a MultiIndex too, but I am not sure.
– keiv.fly, Commented Jul 23, 2016 at 22:19
Why the downvote? Is this clearly documented somewhere? Is it unclear? Is it not helpful? It would obviously have helped me. meta.stackoverflow.com/questions/252677/…
– conner.xyz, Commented Jul 23, 2016 at 23:45
I wasn't the one who downvoted, nor do I know who did. But I can say that I've seen this question many times and it has been answered many times. Try stackoverflow.com/search?q=pandas+filter+rows. Someone probably didn't think you put enough effort into the research. If you hover over the downvote button, it says "this question doesn't show any research effort". Hope that helps.
– piRSquared, Commented Jul 24, 2016 at 0:57

Andy Hayden · Accepted Answer · 2016-07-24 06:31:01Z

3

I would use xs (cross-section):

In [11]: df = pd.DataFrame([[1, 2, 3], [3, 4, 5]], columns=list("ABC")).set_index(["A", "B"])

In [12]: df
Out[12]:
     C
A B
1 2  3
3 4  5

then you can take those which have level A equal to 1:

In [13]: df.xs(key=1, level="A")
Out[13]:
   C
B
2  3

Passing drop_level=False filters the rows while keeping the A level in the index:

In [14]: df.xs(key=1, level="A", drop_level=False)
Out[14]:
     C
A B
1 2  3
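
The same call also works on an inner level. The following is a minimal sketch on a slightly larger frame; the sample values and the variable names rows and dropped are made up for illustration and are not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data (not from the answer): two index levels, A and B.
df = pd.DataFrame(
    {"A": [1, 1, 3, 3], "B": [2, 4, 2, 4], "C": [10, 20, 30, 40]}
).set_index(["A", "B"])

# Select every row whose inner level B equals 4, keeping both index levels.
rows = df.xs(key=4, level="B", drop_level=False)
print(rows)

# Without drop_level=False, the B level is removed from the result's index.
dropped = df.xs(key=4, level="B")
print(dropped)
```

Keeping the level is usually what you want when the result is filtered further or concatenated with other slices, since the full index is preserved.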

keiv.fly · Accepted Answer · 2016-07-23 22:36:21Z

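You need to specify which index level to use. Note that labels holds integer positions into levels, not the level values themselves, so the comparison == 1 below selects the rows whose second level has the value 2. In pandas 0.24 and later this attribute is named codes (labels was removed in pandas 1.0). In my example I took the second level (my object is called s because that is the name used on the MultiIndex page of the pandas documentation):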

s[s.index.labels[1]==1]

You can see how the index is constructed by typing:

s.index

The resulting structure is:

MultiIndex(levels=[['bar', 'baz', 'foo', 'qux'], [1, 2]],
       labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1]],
       names=['first', 'second'])

Below I have the full code:

>>> import pandas as pd
>>> import numpy as np
>>> arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
...           [1, 2, 1, 2, 1, 2, 1, 2]]
... 
>>> tuples = list(zip(*arrays))
>>> index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
>>> s = pd.Series(np.random.randn(8), index=index)
>>> s[s.index.labels[1]==1]
first  second
bar    2        -0.304029
baz    2        -1.216370
foo    2         1.401905
qux    2        -0.411468
dtype: float64
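
On recent pandas versions the labels attribute is no longer available, so the snippet above needs codes instead. The following is a sketch of the same idea, assuming pandas 0.24 or later; it uses deterministic values in place of np.random.randn so the result is reproducible, and the names second_codes, second_values, by_code and by_value are illustrative only.

```python
import numpy as np
import pandas as pd

arrays = [["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
          [1, 2, 1, 2, 1, 2, 1, 2]]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second"])
s = pd.Series(np.arange(8.0), index=index)  # deterministic stand-in values

# codes are integer positions into levels, not the label values themselves.
second_codes = np.asarray(index.codes[1])
second_values = index.levels[1]

# Code 1 points at second_values[1], which is the value 2.
by_code = s[second_codes == 1]

# Comparing against the actual level values avoids the code/value mismatch.
by_value = s[index.get_level_values("second") == 2]

print(by_code)
```

Filtering on get_level_values is usually easier to read, because the comparison is written in terms of the values shown in the index rather than their internal positions.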

MaxU - stand with Ukraine · Accepted Answer · 2016-07-24 09:05:33Z

1

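An alternative solution is a boolean mask built with get_level_values, which accepts either a level name or its position: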

In [62]: df = pd.DataFrame({'idx1': ['A','B','C'], 'idx2':[1,2,3], 'val': [30,10,20]}).set_index(['idx1','idx2'])

In [63]: df
Out[63]:
           val
idx1 idx2
A    1      30
B    2      10
C    3      20

In [64]: df[df.index.get_level_values('idx2') == 2]
Out[64]:
           val
idx1 idx2
B    2      10

In [65]: df[df.index.get_level_values(1) == 2]
Out[65]:
           val
idx1 idx2
B    2      10
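
The mask approach extends naturally to several values, and named index levels can also be referenced in DataFrame.query. The following is a small sketch using the same frame as above; the variable names mask, several and queried are illustrative and not from the answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"idx1": ["A", "B", "C"], "idx2": [1, 2, 3], "val": [30, 10, 20]}
).set_index(["idx1", "idx2"])

# Match several label values at once with isin on the level values.
mask = df.index.get_level_values("idx2").isin([2, 3])
several = df[mask]
print(several)

# query() can refer to a named index level as if it were a column.
queried = df.query("idx2 == 2")
print(queried)
```

Both forms keep every index level in the result, unlike xs with its default drop_level=True.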
