Selecting multiple rows of hierarchical DataFrame with Pandas MultiIndex

Question

I have a Pandas DataFrame with a three-level MultiIndex. Suppose I have the following data:

import pandas as pd

df = pd.DataFrame({'ColB': {('A1', 'B1', 1): 'cb1',
  ('A1', 'B1', 2): 'cb2',
  ('A1', 'B2', 1): 'cb3',
  ('A1', 'B2', 2): 'cb4',
  ('A2', 'B1', 1): 'cb5',
  ('A2', 'B1', 2): 'cb6',
  ('A2', 'B2', 1): 'cb7',
  ('A2', 'B2', 2): 'cb8'},
 'colA': {('A1', 'B1', 1): 'ca1',
  ('A1', 'B1', 2): 'ca2',
  ('A1', 'B2', 1): 'ca3',
  ('A1', 'B2', 2): 'ca4',
  ('A2', 'B1', 1): 'ca5',
  ('A2', 'B1', 2): 'ca6',
  ('A2', 'B2', 1): 'ca7',
  ('A2', 'B2', 2): 'ca8'}})

        ColB colA
A1 B1 1  cb1  ca1
      2  cb2  ca2
   B2 1  cb3  ca3
      2  cb4  ca4
A2 B1 1  cb5  ca5
      2  cb6  ca6
   B2 1  cb7  ca7
      2  cb8  ca8

Now I have a MultiIndex object that contains index values for the first two levels only, like

MultiIndex([('A1', 'B2'),
            ('A2', 'B1')],
           )

I want to use that MultiIndex to select all rows whose first two levels match one of its entries, keeping every value of the third level, such as:

        ColB colA
A1 B2 1  cb3  ca3
      2  cb4  ca4
A2 B1 1  cb5  ca5
      2  cb6  ca6

How can I do this? I've been searching for an answer for hours but I still have no clue. Thank you.

jezrael · Accepted Answer · 2020-08-25 05:15:30Z

2

Use Index.isin after removing the third level with MultiIndex.droplevel, then filter by boolean indexing:

mux = pd.MultiIndex.from_tuples([('A1', 'B2'), ('A2', 'B1')])

# assign to a new name so the original df stays available for the next example
out = df[df.index.droplevel(2).isin(mux)]
print (out)
        ColB colA
A1 B2 1  cb3  ca3
      2  cb4  ca4
A2 B1 1  cb5  ca5
      2  cb6  ca6

It works correctly for any index:

mux = pd.MultiIndex.from_tuples([('A1', 'B1'),('A2', 'B2')])

out = df[df.index.droplevel(2).isin(mux)]
print (out)
        ColB colA
A1 B1 1  cb1  ca1
      2  cb2  ca2
A2 B2 1  cb7  ca7
      2  cb8  ca8
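
To make the three steps in this answer easier to follow, here is a self-contained sketch that splits the one-liner apart. The frame is rebuilt with MultiIndex.from_product purely for brevity, and the names first_two, mask and selected are illustrative, not part of the original answer.

```python
import pandas as pd

# Rebuild the question's frame (same rows, same order)
idx = pd.MultiIndex.from_product([['A1', 'A2'], ['B1', 'B2'], [1, 2]])
df = pd.DataFrame(
    {'ColB': [f'cb{i}' for i in range(1, 9)],
     'colA': [f'ca{i}' for i in range(1, 9)]},
    index=idx,
)
mux = pd.MultiIndex.from_tuples([('A1', 'B2'), ('A2', 'B1')])

# Step 1: drop the third level, leaving one (level 0, level 1) pair per row
first_two = df.index.droplevel(2)

# Step 2: test each pair for membership in mux; this gives one boolean per row
mask = first_two.isin(mux)

# Step 3: boolean indexing keeps the matching rows with their full 3-level index
selected = df[mask]
print(selected)
```

Because the mask is aligned with the rows of df, the selected rows come back in the order they have in df, not in the order of the entries in mux.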

answered Aug 25, 2020 at 5:15


2 Comments

jwonlee Over a year ago

Great, but is there a way to get it to work with a mixed index? I mean, I have a disordered MultiIndex and I want the resulting data frame to be disordered as well.

jezrael Over a year ago

@jwonlee - Not sure I understand; it should also work fine with a disordered MultiIndex.
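
If the comment above is asking for the result to follow the order of the entries in the selecting MultiIndex (an assumption, since the request is ambiguous), boolean indexing will not do that, because it keeps the row order of df. One possible sketch is to pull each pair with DataFrame.xs and concatenate the pieces. The names pieces and ordered are illustrative.

```python
import pandas as pd

# Rebuild the question's frame
idx = pd.MultiIndex.from_product([['A1', 'A2'], ['B1', 'B2'], [1, 2]])
df = pd.DataFrame(
    {'ColB': [f'cb{i}' for i in range(1, 9)],
     'colA': [f'ca{i}' for i in range(1, 9)]},
    index=idx,
)

# Keys deliberately listed in a different order from the rows of df
mux = pd.MultiIndex.from_tuples([('A2', 'B1'), ('A1', 'B2')])

# xs selects the rows for one (level 0, level 1) pair;
# drop_level=False keeps all three index levels on the result
pieces = [df.xs(key, level=[0, 1], drop_level=False) for key in mux]

# Concatenating in iteration order makes the result follow mux
ordered = pd.concat(pieces)
print(ordered)
```

Note that xs raises a KeyError for a pair that does not occur in df, whereas the isin approach silently skips it.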

Alex Rajan Samuel · Accepted Answer · 2020-08-25 04:42:03Z

1

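I guess this is what you are looking for. You can try selecting one pair at a time with .loc (plain df[...] would look the tuple up among the columns instead of the rows):

    df.loc[('A1','B1')]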

KR, Alex

answered Aug 25, 2020 at 4:42


sushanth · Accepted Answer · 2020-08-25 05:18:37Z

1

Let's try advanced indexing with a hierarchical index:

df.loc[('A1', 'B2'):('A2','B1')]

Out[56]: 
        ColB colA
A1 B2 1  cb3  ca3
      2  cb4  ca4
A2 B1 1  cb5  ca5
      2  cb6  ca6
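
One caveat worth checking before relying on the slice: it selects a contiguous range of a sorted index, so it matches the requested pairs only when they happen to be adjacent, as they are in the question. The sketch below (frame rebuilt with MultiIndex.from_product for brevity, variable names illustrative) compares the slice with a membership test for two pairs that are not adjacent.

```python
import pandas as pd

# Rebuild the question's frame; from_product yields a lexsorted index,
# which label slicing on a MultiIndex relies on
idx = pd.MultiIndex.from_product([['A1', 'A2'], ['B1', 'B2'], [1, 2]])
df = pd.DataFrame(
    {'ColB': [f'cb{i}' for i in range(1, 9)],
     'colA': [f'ca{i}' for i in range(1, 9)]},
    index=idx,
)

# Two pairs with other pairs sitting between them in the index
mux = pd.MultiIndex.from_tuples([('A1', 'B1'), ('A2', 'B2')])

# The slice returns everything from the first pair through the last pair
by_slice = df.loc[('A1', 'B1'):('A2', 'B2')]

# The membership test returns only rows whose first two levels are in mux
by_isin = df[df.index.droplevel(2).isin(mux)]

print(len(by_slice), len(by_isin))
```

Here the slice covers the whole frame while the membership test keeps only the two requested pairs.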

edited Aug 25, 2020 at 5:18

answered Aug 25, 2020 at 5:12
