I have the following DataFrame, with each observation on a separate row.

import pandas as pd

df = pd.DataFrame({'geo': ['US', 'US', 'US', 'NY', 'NY', 'NY', 'NY', 'CT', 'CT'],
                   'series': ['a', 'a', 'b', 'a', 'a', 'b', 'b', 'a', 'b'],
                   'value': [1, 2, 3, 7, 4, 3, 4, 12, 13],
                   'date': ['3/1', '3/2', '3/1', '3/1', '3/2', '3/1', '3/2', '3/1', '3/2']})

  date geo series  value
0  3/1  US      a      1
1  3/2  US      a      2
2  3/1  US      b      3
3  3/1  NY      a      7
4  3/2  NY      a      4
5  3/1  NY      b      3
6  3/2  NY      b      4
7  3/1  CT      a     12
8  3/2  CT      b     13

What I want: to reorganize the DataFrame so that "date" is the index and "geo" and "series" form MultiIndex columns. That is:

     US  US  NY  NY  CT  CT
     a   b   a   b   a   b
3/1  1   3   7   3   12  nan
3/2  2  nan  4   4  nan  13

What I've tried: I set the index to date, geo and series and then called "unstack", but it gives me a "duplicate value" error.

1 Answer

Normally you can use set_index() followed by unstack():

df.set_index(['date','geo','series'])['value'].unstack(['geo','series'])

Output:

geo      US        NY         CT      
series    a    b    a    b     a     b
date                                  
3/1     1.0  3.0  7.0  3.0  12.0   NaN
3/2     2.0  NaN  4.0  4.0   NaN  13.0

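unstack raises a duplicate-entries error when the same combination of date, geo and series appears on more than one row. The sample above has no such rows, which is why it reshapes cleanly, so your real data must contain something like: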

date geo series value
 3/1  US      a     1     
 3/1  US      a     2
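As a minimal sketch of the failure, the snippet below builds a tiny frame containing only the two clashing rows shown above (hypothetical data, not taken from the question) and shows that unstack refuses to reshape it. The `raised` flag is just an illustrative name.

```python
import pandas as pd

# Hypothetical data: both rows share the key (3/1, US, a).
df = pd.DataFrame({'date': ['3/1', '3/1'],
                   'geo': ['US', 'US'],
                   'series': ['a', 'a'],
                   'value': [1, 2]})

try:
    df.set_index(['date', 'geo', 'series'])['value'].unstack(['geo', 'series'])
    raised = False
except ValueError as exc:
    # pandas cannot place two values into the same output cell.
    raised = True
    print(exc)
```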

To confirm that, run this on your full data:

df.duplicated(['date','geo','series']).any()
# True if any (date, geo, series) combination appears more than once
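If the check returns True, it can help to look at the offending rows before deciding how to handle them. A small sketch, again on hypothetical data, using keep=False so that every row of a duplicated group is listed rather than only the repeats:

```python
import pandas as pd

# Hypothetical data: the first two rows share the same (date, geo, series) key.
df = pd.DataFrame({'date': ['3/1', '3/1', '3/2'],
                   'geo': ['US', 'US', 'US'],
                   'series': ['a', 'a', 'a'],
                   'value': [1, 2, 3]})
keys = ['date', 'geo', 'series']

# keep=False flags all members of each duplicated group.
dupes = df[df.duplicated(keys, keep=False)].sort_values(keys)
print(dupes)
```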

Depending on what you want to do with the duplicates, you can aggregate them with groupby before unstacking:

# mean:
(df.groupby(['date','geo','series'])
   ['value'].mean()
   .unstack(['geo','series'])
)
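
An equivalent route, if you prefer a single call, is pivot_table, which aggregates duplicates and reshapes in one step. This is a sketch on hypothetical data; the choice of 'mean' is an assumption, and 'sum', 'first' or 'last' may suit your data better.

```python
import pandas as pd

# Hypothetical data: (3/1, US, a) appears twice with values 1 and 2.
df = pd.DataFrame({'date': ['3/1', '3/1', '3/2', '3/1'],
                   'geo': ['US', 'US', 'US', 'NY'],
                   'series': ['a', 'a', 'a', 'b'],
                   'value': [1, 2, 3, 7]})

# Duplicated keys are averaged; missing combinations become NaN.
wide = df.pivot_table(index='date',
                      columns=['geo', 'series'],
                      values='value',
                      aggfunc='mean')
print(wide)
```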