0

I'm working in Python with a pandas DataFrame similar to:

REQUESET_ID | DESCR | TEST | TEST_DESC | RESULT |
         1  |     1 |   T1 | TEST_1    |  2.0   |           
         1  |     2 |   T2 | TEST_2    |  92.0  |                       
         2  |     1 |   T1 | TEST_1    |  8.0   |
         3  |     3 |   T3 | TEST_3    |  12.0  |
         3  |     4 |   T4 | TEST_4    |  45.0  |

What I want is a final dataframe like this:

 REQUESET_ID | DESCR_1 | TEST_1 | TEST_DESC_1 | RESULT_1 | DESCR_2 | TEST_2 | TEST_DESC_2 | RESULT_2 |
          1  |       1 |     T1 |   TEST_1    |  2.0     |       2 |   T2   |   TEST_2    |  92.0    | 
          2  |       1 |     T1 |   TEST_1    |  8.0     |   NaN   |  NaN   |     NaN     | Nan      |  
          3  |       3 |     T3 |    TEST_3   |  12.0    |       4 |   T4   |   TEST_4    |  45.0    |

How I should implement that as a method working with DataFrames. I understand that if I try to do it with a merge instead of having 4x2 columns added beacuse the value_counts method of the REQUEST_ID will return 2, will add the 4 columns for each entry in the request column.

1 Answer 1

1

Assign a new column with cumcount, then do stack + unstack

s=df.assign(col=(df.groupby('REQUESET_ID').cumcount()+1).astype(str)).\
       set_index(['REQUESET_ID','col']).unstack().sort_index(level=1,axis=1)
s.columns=s.columns.map('_'.join)
s
             DESCR_1  RESULT_1 TEST_1  ... RESULT_2  TEST_2  TEST_DESC_2
REQUESET_ID                            ...                              
1                1.0       2.0     T1  ...     92.0      T2       TEST_2
2                1.0       8.0     T1  ...      NaN     NaN          NaN
3                3.0      12.0     T3  ...     45.0      T4       TEST_4
[3 rows x 8 columns]
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.