Pandas: How to slice a MultiIndex DataFrame?

Question

           name                     address             contact_info    
        first_name  last_name       street  city    mobile      email
    1   AAA             BBB         XXX     YYY     02020       [email protected]
    2   111             222         333     444     239393      [email protected]

I have an Excel file in the above format. I want every column under name, plus only the mobile column under contact_info. Could someone please let me know how I can do this? The following code gives me everything under both name and contact_info:

import pandas as pd
# header=[0, 1] reads the first two rows as a two-level column MultiIndex
df = pd.read_excel("test.xlsx", header=[0, 1], sheet_name="Mapping")
print(df[["name", "contact_info"]])

I am trying to get something like this:

first_name  last_name   mobile
AAA         BBB         02020
111         222         239393
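To try the answers without the spreadsheet, the frame can be rebuilt in memory. This is only a sketch: the email values are placeholders (the originals are not visible in the question), the mobile numbers are kept as strings so the leading zero survives, and the row labels 1 and 2 are assumed to be the index.

```python
import pandas as pd

# Hypothetical stand-in for test.xlsx; email values are placeholders.
columns = pd.MultiIndex.from_tuples([
    ("name", "first_name"),
    ("name", "last_name"),
    ("address", "street"),
    ("address", "city"),
    ("contact_info", "mobile"),
    ("contact_info", "email"),
])
df = pd.DataFrame(
    [
        ["AAA", "BBB", "XXX", "YYY", "02020", "a@example.com"],
        ["111", "222", "333", "444", "239393", "b@example.com"],
    ],
    index=[1, 2],
    columns=columns,
)

# Selecting by top-level labels returns every sub-column under them,
# which is why the attempt in the question also brings back email.
both = df[["name", "contact_info"]]
print(both)
```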

BENY · Accepted Answer · 2018-02-18 00:02:44Z

3

By using IndexSlice + concat (with axis=1, so the two selections are joined column-wise instead of being stacked as rows padded with NaN):

idx = pd.IndexSlice
pd.concat([df.loc[:, idx['name', :]], df.loc[:, idx[:, 'mobile']]], axis=1)
Out[104]: 
        name           contact_info
  first_name last_name       mobile
1        AAA       BBB         2020
2        111       222       239393

answered Feb 18, 2018 at 0:02


2 Comments

cs95 Over a year ago

Hmm, looks like concat cannot be avoided here.

BENY Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ Yep, because of the MultiIndex.

cs95 · Accepted Answer · 2018-02-18 00:07:39Z

3

You can use df.xs here:

i = df.xs('name', axis=1)
j = df.xs('mobile', axis=1, level=-1)

pd.concat([i, j], axis=1)

  first_name last_name  contact_info
1        AAA       BBB          2020
2        111       222        239393
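Note that the last column above is labelled contact_info rather than mobile, because xs drops the level it selected on. One possible variation, sketched on a hypothetical in-memory copy of the data (placeholder email values, mobile stored as strings), keeps both levels with drop_level=False and then removes the top one:

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; email values are placeholders.
columns = pd.MultiIndex.from_tuples([
    ("name", "first_name"), ("name", "last_name"),
    ("address", "street"), ("address", "city"),
    ("contact_info", "mobile"), ("contact_info", "email"),
])
df = pd.DataFrame(
    [["AAA", "BBB", "XXX", "YYY", "02020", "a@example.com"],
     ["111", "222", "333", "444", "239393", "b@example.com"]],
    index=[1, 2], columns=columns,
)

# xs on the top level drops it, leaving first_name and last_name.
names = df.xs("name", axis=1)

# drop_level=False keeps ('contact_info', 'mobile') intact, so dropping
# level 0 afterwards leaves the sub-column label 'mobile'.
mobile = df.xs("mobile", axis=1, level=-1, drop_level=False).droplevel(0, axis=1)

result = pd.concat([names, mobile], axis=1)
print(result)
```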

answered Feb 18, 2018 at 0:07


1 Comment

Gaurang Shah Over a year ago

Is there any way I could do this without concat?
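One way to avoid concat, offered as a sketch rather than as something proposed in this thread, is to build a boolean mask over the column levels and pass it to .loc. The sample frame below is a hypothetical stand-in for the spreadsheet (placeholder email values, mobile stored as strings).

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; email values are placeholders.
columns = pd.MultiIndex.from_tuples([
    ("name", "first_name"), ("name", "last_name"),
    ("address", "street"), ("address", "city"),
    ("contact_info", "mobile"), ("contact_info", "email"),
])
df = pd.DataFrame(
    [["AAA", "BBB", "XXX", "YYY", "02020", "a@example.com"],
     ["111", "222", "333", "444", "239393", "b@example.com"]],
    index=[1, 2], columns=columns,
)

top = df.columns.get_level_values(0)  # name, name, address, ...
sub = df.columns.get_level_values(1)  # first_name, last_name, ...

# Keep a column if it sits under 'name' or if its sub-label is 'mobile'.
mask = (top == "name") | (sub == "mobile")

# A single .loc call selects all three columns; droplevel flattens the header.
result = df.loc[:, mask].droplevel(0, axis=1)
print(result)
```

The mask is evaluated against the column labels only, so it stays cheap however many rows the sheet has.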

piRSquared · Accepted Answer · 2018-02-18 01:22:17Z

Option 1
Simplest I could think of would be column slicing:

df['name'].join(df['contact_info']['mobile'])

  first_name last_name  mobile
1        AAA       BBB  020202
2        111       222  239393

Option 2
pd.DataFrame.filter

df.filter(regex='name|mobile')

        name           contact_info
  first_name last_name       mobile
1        AAA       BBB       020202
2        111       222       239393

And we can drop the level

d = df.filter(regex='name|mobile')
d.columns = d.columns.droplevel(0)
d

  first_name last_name  mobile
1        AAA       BBB  020202
2        111       222  239393

Nice, the filter idiom is useful here, will commit to memory.

ZaxR · Accepted Answer · 2018-02-18 01:36:30Z

0

Not sure why you want to avoid concat, but this does it:

df = pd.read_excel("multi-index-test.xlsx", header=[0, 1], sheet_name="Mapping")
df.drop('address', level=0, axis=1).drop('email', level=1, axis=1)

This uses the level argument of DataFrame.drop to remove labels from a specific level of the column MultiIndex.
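A minimal sketch of the drop-based approach on a hypothetical in-memory copy of the data (placeholder email values, mobile stored as strings, sub-column assumed to be spelled email as in the question):

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet; email values are placeholders.
columns = pd.MultiIndex.from_tuples([
    ("name", "first_name"), ("name", "last_name"),
    ("address", "street"), ("address", "city"),
    ("contact_info", "mobile"), ("contact_info", "email"),
])
df = pd.DataFrame(
    [["AAA", "BBB", "XXX", "YYY", "02020", "a@example.com"],
     ["111", "222", "333", "444", "239393", "b@example.com"]],
    index=[1, 2], columns=columns,
)

# Drop the whole 'address' group on level 0, then 'email' on level 1.
result = df.drop("address", axis=1, level=0).drop("email", axis=1, level=1)
print(result)
```

Because this lists what to remove rather than what to keep, any column later added to the sheet would also appear in the result.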

edited Feb 18, 2018 at 1:36

answered Feb 18, 2018 at 0:43


ZaxR · Accepted Answer · 2018-02-18 00:18:46Z

0

What you're looking for just requires basic indexing on a MultiIndex, along with concat. Here's an example:

df = pd.read_excel("multi-index-test.xlsx", header=[0, 1])
df1 = df[["name"]]
df2 = df['contact_info', 'mobile']
pd.concat([df1, df2], axis=1)

I believe this solution has the benefit of being 1) simple and 2) general.

answered Feb 18, 2018 at 0:18
