I have a sample DataFrame like the one below:

import pandas as pd

df1 = pd.DataFrame({'Gender':['Male','Male','Male','Male','Female','Female','Female','Female','Male','Male','Male','Male','Female','Female','Female','Female'],
                    'Year':[2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009],
                    'rate':[2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6,2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6],
                    'Heading':['TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123',
                               'TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456'],
                    'target':[31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2,31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2],
                    'day_type':['wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend']})

I would like to transpose/pivot it to get the output shown below, but the following code raises an error:

df1.pivot(index='Year', columns='Heading', values='rate')

With the help of an SO post, I wrote the following, but I am not sure how to make it work with three index columns:

df1 = df1.pivot_table(index=['Year','Gender','day_type'],columns='Heading',values='rate').unstack()
df1.columns = ['_'.join(i) for i in df1.columns.tolist()]

I expect my output to look like the table shown below, where each year is a row and all the corresponding entries for that year become columns.

Please note that I haven't filled in the values, as the column structure of the table is what matters.

[Image: expected output table, values not filled in]

1 Answer

Try it with map. You also need to unstack two levels (Gender and day_type):

df1 = df1.pivot_table(index=['Year','Gender','day_type'],columns='Heading',values='rate').unstack([1,2])
df1.columns=df1.columns.map('_'.join)
df1
      TNMAB123_Female_wk  ...  TNMAB456_Male_wkend
Year                      ...                     
2008                 5.6  ...                  3.2
2009                 3.5  ...                  6.7
[2 rows x 8 columns]
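
For reference, here is a minimal self-contained sketch of the same approach on a reduced, made-up sample (the frame `df` and its values are illustrative assumptions, not the full data from the question). It unstacks the levels by name, which should be equivalent to the positional `[1,2]` above:

```python
import pandas as pd

# Reduced, made-up sample with the same column layout as df1 in the question
df = pd.DataFrame({
    'Year':     [2008, 2008, 2009, 2009],
    'Gender':   ['Male', 'Female', 'Male', 'Female'],
    'day_type': ['wk', 'wk', 'wk', 'wk'],
    'Heading':  ['TNMAB123'] * 4,
    'rate':     [2.3, 5.6, 4.5, 3.5],
})

# Heading becomes the column level; Gender and day_type are then moved
# from the row index into the columns, leaving Year as the only row level
wide = (
    df.pivot_table(index=['Year', 'Gender', 'day_type'],
                   columns='Heading', values='rate')
      .unstack(['Gender', 'day_type'])
)

# Flatten the 3-level column index (Heading, Gender, day_type) into strings
wide.columns = wide.columns.map('_'.join)
print(wide)
```

Naming the levels instead of using positions keeps the call readable if the order of the index columns changes later.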

6 Comments

Fantastic, it works. The only issue is that a few of my columns appear twice in the result: there is an extra copy of the same column, but with empty/blank values. Upvoted.
df1.pivot_table(index=['Year','Gender','day_type'],columns='Heading',values='rate',aggfunc='sum') @TheGreat
May I know why we need aggfunc='sum'?
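@TheGreat pivot only reshapes; pivot_table with aggfunc aggregates duplicate entries, here by summing them. Alternatively, you can do df1=df1.sum(level=0,axis=1) (the level argument was removed in pandas 2.0; there, df1.T.groupby(level=0).sum().T is a comparable form).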
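
To illustrate the difference between pivot and pivot_table on duplicated keys, here is a small sketch. The frame `dup` and its values are made up for the example:

```python
import pandas as pd

# Made-up rows: the (Year, Heading) pair (2008, 'TNMAB123') occurs twice
dup = pd.DataFrame({
    'Year':    [2008, 2008, 2009],
    'Heading': ['TNMAB123', 'TNMAB123', 'TNMAB123'],
    'rate':    [2.0, 4.0, 5.0],
})

# pivot only reshapes, so duplicated index/column pairs raise a ValueError
try:
    dup.pivot(index='Year', columns='Heading', values='rate')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates duplicates; the default aggfunc is the mean
by_mean = dup.pivot_table(index='Year', columns='Heading', values='rate')

# aggfunc='sum' adds the duplicated entries together instead
by_sum = dup.pivot_table(index='Year', columns='Heading', values='rate',
                         aggfunc='sum')
```

Which aggregation is appropriate depends on what the duplicated rows mean in the real data.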
No, the issue right now is, for example, that I have two columns: realdata123_Female_wk and realdata123 _Female_wk. Note the gap before the underscore. The former has proper values, whereas the latter (with the stray space) is empty.
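
If the near-duplicate columns come from stray whitespace in the Heading values (an assumption; the thread does not confirm the cause), stripping the strings before pivoting is one possible fix. The frame `raw` below is made up for illustration:

```python
import pandas as pd

# Made-up data: the second Heading value carries a trailing space
raw = pd.DataFrame({
    'Year':     [2008, 2008],
    'Gender':   ['Female', 'Female'],
    'day_type': ['wk', 'wk'],
    'Heading':  ['realdata123', 'realdata123 '],
    'rate':     [1.5, 2.5],
})

# Remove leading/trailing whitespace so both rows share one Heading value
raw['Heading'] = raw['Heading'].str.strip()

wide = (
    raw.pivot_table(index=['Year', 'Gender', 'day_type'],
                    columns='Heading', values='rate', aggfunc='sum')
       .unstack(['Gender', 'day_type'])
)
wide.columns = wide.columns.map('_'.join)
print(wide.columns.tolist())
```

After stripping, the two rows fall under the same Heading, so only one realdata123_Female_wk column is produced.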