Pandas MultiIndex selection of hierarchical columns

Question

Goal: Take raw data pulled from Eurostat via pandas-datareader and reshape it so that it keeps a pandas DatetimeIndex as the index and has the countries as columns.

Code:

import datetime

import pandas as pd
import pandas_datareader as web

start = datetime.datetime(1900, 1, 1)
end = datetime.date.today()
df2 = web.DataReader('tipsii20', 'eurostat', start=start, end=end)
df2.columns

Looking at the columns, we can see that we are working with a MultiIndex:

MultiIndex(levels=[[u'Rest of the world'], [u'Net liabilities (liabilities minus assets)'], [u'Net external debt'], [u'Percentage of gross domestic product (GDP)'], [u'Unadjusted data (i.e. neither seasonally adjusted nor calendar adjusted data)'], [u'Austria', u'Belgium', u'Bulgaria', u'Croatia', u'Cyprus', u'Czech Republic', u'Denmark', u'Estonia', u'Finland', u'France', u'Germany (until 1990 former territory of the FRG)', u'Greece', u'Hungary', u'Ireland', u'Italy', u'Latvia', u'Lithuania', u'Luxembourg', u'Malta', u'Netherlands', u'Poland', u'Portugal', u'Romania', u'Slovakia', u'Slovenia', u'Spain', u'Sweden', u'United Kingdom'], [u'Annual']], labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 4, 5, 10, 6, 7, 11, 25, 8, 9, 3, 12, 13, 14, 16, 17, 15, 18, 19, 20, 21, 22, 26, 24, 23, 27], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], names=[u'PARTNER', u'STK_FLOW', u'BOP_ITEM', u'UNIT', u'S_ADJ', u'GEO', u'FREQ'])

I would like to transform this dataset so that it keeps its DatetimeIndex but uses only the values of the 'GEO' level as the column labels. Is df2.xs the right tool for this?

What are start and end?

– jezrael, Commented Oct 27, 2017 at 12:04
Thanks, just added the start and end objects.

– Merv Merzoug, Commented Oct 27, 2017 at 12:06

jezrael · Accepted Answer · 2017-10-27 12:17:30Z

2

You can use droplevel:

df2.columns = df2.columns.droplevel([0,1,2,3,4,6])

Another solution, if you know the level name, similar to Bharath Shetty's solution:

df2.columns = df2.columns.get_level_values('GEO')
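The two approaches above can be compared on a small stand-in frame. This is only a sketch: the frame `toy`, its three level names, and the variables `by_position` and `by_name` are made up for illustration and are much smaller than the real Eurostat result, which has seven column levels.

```python
import pandas as pd

# Hypothetical stand-in for the Eurostat frame: three column levels,
# of which only 'GEO' actually varies.
columns = pd.MultiIndex.from_tuples(
    [("Percentage of GDP", "Austria", "Annual"),
     ("Percentage of GDP", "Belgium", "Annual")],
    names=["UNIT", "GEO", "FREQ"],
)
toy = pd.DataFrame(
    [[28.0, -121.2], [24.0, -118.8]],
    index=pd.to_datetime(["2010-01-01", "2011-01-01"]),
    columns=columns,
)

# Approach 1: drop every level except 'GEO', addressed by position.
by_position = toy.copy()
by_position.columns = by_position.columns.droplevel([0, 2])

# Approach 2: keep only the 'GEO' level, addressed by name.
by_name = toy.copy()
by_name.columns = by_name.columns.get_level_values("GEO")

# Both leave a flat Index named 'GEO' and do not touch the row index.
print(by_position.columns.equals(by_name.columns))
print(by_name.head())
```

Addressing the level by name tends to be more robust than by position if the number or order of levels in the downloaded data ever changes.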

edited Oct 27, 2017 at 12:17

answered Oct 27, 2017 at 12:12


4 Comments

Bharath M Shetty Over a year ago

This is so simple

Bharath M Shetty Over a year ago

I hate the documentation now. They say it should be an int, but it can also take strings?

jezrael Over a year ago

Yes, it can be string or int, all are valid values.

Bharath M Shetty Over a year ago

It said int: pandas.pydata.org/pandas-docs/stable/generated/…. Can you edit the documentation?

Bharath M Shetty · Accepted Answer · 2017-10-27 12:22:42Z

Use pd.DataFrame with get_level_values(5), since GEO is at position 5 of the column levels (counting from zero), in case you want to keep the original DataFrame for future reference, i.e.:

ndf = pd.DataFrame(df2.values, index=df2.index, columns=df2.columns.get_level_values(5))

Or assign the columns directly from the level values:

df2.columns = df2.columns.get_level_values(5)

Output:

print(ndf.head().iloc[:,:4])

GEO          Austria  Belgium  Bulgaria  Cyprus
TIME_PERIOD                                    
2010-01-01      28.0   -121.2      37.1    70.9
2011-01-01      24.0   -118.8      29.6   127.1
2012-01-01      25.8   -102.7      25.4   137.2
2013-01-01      20.1    -88.4      21.6   140.0
2014-01-01      20.0    -71.1      18.3   136.1
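If the aim is to leave the original frame untouched, a non-mutating alternative may also be worth considering. The sketch below assumes pandas 0.24 or later, where DataFrame.droplevel is available; the frame `toy`, its level names, and the variable `flat` are hypothetical and only mimic the shape of the Eurostat data.

```python
import pandas as pd

# Hypothetical stand-in for the Eurostat frame (the real one has seven levels).
columns = pd.MultiIndex.from_tuples(
    [("Percentage of GDP", "Austria", "Annual"),
     ("Percentage of GDP", "Belgium", "Annual")],
    names=["UNIT", "GEO", "FREQ"],
)
toy = pd.DataFrame(
    [[28.0, -121.2], [24.0, -118.8]],
    index=pd.to_datetime(["2010-01-01", "2011-01-01"]),
    columns=columns,
)

# DataFrame.droplevel returns a new frame; 'toy' keeps its full MultiIndex.
flat = toy.droplevel(["UNIT", "FREQ"], axis=1)

print(flat.columns.tolist())   # country names only
print(toy.columns.nlevels)     # original column levels are still there
```

This gives the same kind of result as building a new DataFrame from the values, index, and level values, without having to pass the pieces separately.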
