25

Here is my code:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
...                    'key2': ['one', 'two', 'one', 'two', 'one'],
...                    'data1': np.random.randn(5),
...                    'data2': np.random.randn(5)})

>>> new_df = df.groupby(['key1', 'key2']).mean().unstack()
>>> print(new_df)
         data1               data2
key2       one       two       one       two
key1
a    -0.070742 -0.598649 -0.349283 -1.272043
b    -0.109347 -0.097627 -0.641455  1.135560 
>>> print(new_df.columns)
MultiIndex(levels=[[u'data1', u'data2'], [u'one', u'two']],
       labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
       names=[None, u'key2'])

As you can see, a DataFrame with MultiIndex columns differs from an ordinary DataFrame. How do I access the data in it?

  • Though the documentation is not easy to follow (the explanations are buried in an "advanced indexing" section), keep in mind that multilevel indexing is based on tuple indices. Accessing data therefore requires loc and tuples, even though there are ambiguous shortcuts that use neither loc nor tuples. Commented Jan 2, 2021 at 16:15

2 Answers

27

Accessing data in a MultiIndex DataFrame works much like it does on an ordinary DataFrame. For example, to read the value at row a, column (data1, two), you can use either new_df['data1']['two']['a'] or new_df.loc['a', ('data1', 'two')].

Please read the official docs for more details.
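
To make the two access styles concrete, here is a minimal sketch. It rebuilds the frame from the question but swaps the random numbers for fixed values (an assumption made only so the results are reproducible), and the variable names chained, direct, data1_block and ones are illustrative.

```python
import pandas as pd

# Same shape as the question's frame, but with fixed values instead of
# np.random.randn so the results are reproducible (illustrative data).
df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "key2": ["one", "two", "one", "two", "one"],
    "data1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "data2": [10.0, 20.0, 30.0, 40.0, 50.0],
})
new_df = df.groupby(["key1", "key2"]).mean().unstack()

# Each column label of a MultiIndex is a tuple, one entry per level.
column_labels = list(new_df.columns)

# Chained lookup: three separate indexing steps (column block, column, row).
chained = new_df["data1"]["two"]["a"]

# Single lookup: row label first, then the full column tuple.
direct = new_df.loc["a", ("data1", "two")]

# Selecting only the outer level returns a sub-frame with columns one/two.
data1_block = new_df["data1"]

# Cross-section on the inner level: every "one" column, keyed by data1/data2.
ones = new_df.xs("one", axis=1, level="key2")

print(column_labels)
print(chained, direct)
print(ones)
```

Both lookups return the same value here. The loc form does the lookup in one step, which is generally the safer choice if you later want to assign to that cell.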


-2

This might help you explore and visualize the data:

unstacked = multi_indexDataFrame.unstack().dropna()
unstacked.plot(kind="bar")
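
For context, here is a rough sketch of what unstack() produces when applied to a frame shaped like the question's new_df. The values are made up for illustration, and the plotting call is left commented out because it assumes matplotlib is installed.

```python
import pandas as pd

# A frame shaped like new_df in the question, with illustrative values.
new_df = pd.DataFrame(
    [[3.0, 2.0, 30.0, 20.0], [3.0, 4.0, 30.0, 40.0]],
    index=pd.Index(["a", "b"], name="key1"),
    columns=pd.MultiIndex.from_product(
        [["data1", "data2"], ["one", "two"]], names=[None, "key2"]
    ),
)

# The row index has a single level, so unstack() moves it under the column
# levels and returns a Series keyed by (data column, key2, key1).
unstacked = new_df.unstack().dropna()

print(unstacked)

# One bar per (data column, key2, key1) combination; requires matplotlib.
# unstacked.plot(kind="bar")
```

Each entry of the resulting Series is addressed by a three-part tuple, for example unstacked[("data1", "two", "a")].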
