1

I have a Dataframe that looks as follows:


Serial : Status   : Date     : Comment
---------------------------------------
306    :  in      : 4/25/22  : Reported 
306    :  out     : 3/15/22  : Reported
306    :  damaged : 2/1/22   : Reported
307    :  in      : 7/6/22   : 
309    :          :          : 
310    :  out     : 6/5/22   : Not Reported
311    :  out     : 5/5/22   : Not Reported
311    :  in      : 4/5/22   : 

I would like to reshape it to look as follows:

Serial : Status   : Date     : Comment        : Status_1   : Date_1     : Comment_1. : Status_2   : Date_2     : Comment_2
-----------------------------------------------------------------------------------------------------------------------
306    :  in      : 4/25/22  : Reported       :  out       : 3/15/22    : Reported   :  damaged   : 2/1/22     : Reported
307    :  in      : 7/6/22   : 
309    :          :          : 
310    :  out     : 6/5/22   : Not Reported
311    :  out     : 5/5/22   : Not Reported   :  in        : 4/5/22     : 

I have tried using Transpose and Pivot tables, but am unable to get the desired output. I have also tried iterating over the Dataframe by row and column and assigning values to new columns but that hasn't worked as expected either. How can I achieve the desired output ?

1
  • Something along the lines of the answers to this question Commented Aug 1, 2022 at 17:09

1 Answer 1

1
import pandas as pd
import numpy as np
import itertools
df = pd.DataFrame({
    'Serial':[306,306,306,307,309,310,311,311],
    'Status':['in','out','damage','in',np.nan,'out','out','in'],
    'Date':['4/25/22','3/15/22','2/1/22','7/6/22',np.nan,'6/5/22','5/5/22','4/5/22'],
    'Comment':['Reported','Reported','Reported',np.nan,np.nan,'Not Reported','Not Reported',np.nan]
})
df
###
   Serial  Status     Date       Comment
0     306      in  4/25/22      Reported
1     306     out  3/15/22      Reported
2     306  damage   2/1/22      Reported
3     307      in   7/6/22           NaN
4     309     NaN      NaN           NaN
5     310     out   6/5/22  Not Reported
6     311     out   5/5/22  Not Reported
7     311      in   4/5/22           NaN
df['group_count'] = df.groupby('Serial').cumcount()
temp = df.pivot(index=['Serial'],  columns=['group_count'])
temp.columns = temp.columns.droplevel(0)
max_count = df.groupby('Serial').count().values.max()
output = temp.loc[:,[i for i in range(max_count)]]

# Rename columns
name_list = [[f"Status_{i}",f"Date_{i}",f"Comment_{i}"] for i in range(3)]
name_list = list(itertools.chain.from_iterable(name_list))

output.columns = name_list
output

enter image description here




If you wanna fill NA

output.fillna('', inplace=True)
output

enter image description here

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.