1

Related to this question, I've been trying to use melt but without success..

I've got a DataFrame with 1 row, like this:

   A   B   C   total   date   A_size   B_size   C_size   total_size
0  4   2   5    11 2019-01-01  123      456      789        1368

Which I'd like to turn into this (at this point I don't care about date anymore):

      Values     Sizes
A        4        123
B        2        456
C        5        789
total    11       1368

I've got something terribly hacky that does the job, but it's not flexible. I'd like to be able to add D and D_size without having to modify the downstream code.

Hacky code:

def format_table(todays_metadata: pd.DataFrame):
    todays_metadata_reformat = todays_metadata.loc[:, 'A':'total'] # hardcoded 'A'
    todays_metadata_reformat.index = ['Values']
    sizes = todays_metadata.loc[:, 'A_size':'total_size'] # hardcoded 'A_size'
    sizes.index = ['Sizes']
    sizes.columns = todays_metadata_reformat.columns
    todays_metadata_reformat = 
    todays_metadata_reformat.append(sizes).transpose()
    return todays_metadata_reformat

1 Answer 1

1

You can check the index for position of the total column by pd.Index.get_loc and create slices:

df1=df.drop('date',1)
i=df1.columns[:df1.columns.get_loc('total')+1] # ['A', 'B', 'C', 'total']
j=df1.columns[df1.columns.get_loc('total')+1:] #['A_size','B_size','C_size','total_size']

Then melt and concat:

m=df1[j].melt(value_name='size')
m.index=m.pop('variable').str.split('_').str[0]
pd.concat([df1[i].melt().set_index('variable'),m],axis=1)

EDIT:

another way using df.filter():

df1=df.drop('date',1)
m=df.filter(like='size')

x=df1[df1.columns.difference(m.columns,sort=False)].melt().set_index('variable')
y=m.rename(columns=lambda x: x.split('_')[0]).melt(value_name='Sizes').set_index('variable')
print(pd.concat([x,y],axis=1))

           value  Sizes
variable              
A             4    123
B             2    456
C             5    789
total        11   1368
Sign up to request clarification or add additional context in comments.

3 Comments

Is there a way to make it more robust, i.e. not rely on ordering of the columns? I'd like it to be flexible enough that anything appended with _size it can link to the Values of that object, if at all possible
@Guus added another way, please check if it scales for you
that's great, thanks! I had to replace the x.split('_') with x.replace('_size', '') as my actual values could contain underscores as well. But that's a beautifully flexible solution

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.