I have a DataFrame like this:

+-------+-----------+
| File  |  Column   |
+-------+-----------+
| File1 | FirstName |
| File1 | LastName  |
| File2 | ID        |
| File2 | City      |
| File2 | State     |
+-------+-----------+

How can I group by the File column and spread each group's Column values across columns, with one row per file? For example:

+-------+-----------+----------+-------+
| File  |   Col1    |   Col2   | Col3  |
+-------+-----------+----------+-------+
| File1 | FirstName | LastName | NaN   |
| File2 | ID        | City     | State |
+-------+-----------+----------+-------+

I'm thinking I need to pivot it and pass File as the index and Column as the values:

df.pivot(index='File', columns='', values='Column')

But here's where I'm stumped - I'm unsure what to pass for the columns parameter, or even if pivot is what I need.

  • Have you had a look at this Commented May 21, 2019 at 23:22
  • @razdi Thank you, that was very helpful! I did not find that earlier. Commented May 21, 2019 at 23:29

2 Answers

Here is one way to do it, thanks to @razdi's comment and @WeNYoBen's comment here.

import pandas as pd
df = pd.DataFrame([["File1", "FirstName"],
                   ["File1", "LastName"],
                   ["File2", "ID"],
                   ["File2", "City"],
                   ["File2", "State"], ],
                  columns=["File", "Column"])

df = pd.pivot_table(df, index=['File'], columns=df.groupby(['File']).cumcount().add(1), values=['Column'], aggfunc='sum')
print(df)
#           Column
#                1         2      3
# File
# File1  FirstName  LastName    NaN
# File2         ID      City  State

df = df.reset_index()
print(df)
#       File     Column
#                   1         2      3
# 0  File1  FirstName  LastName    NaN
# 1  File2         ID      City  State

df.columns = ["Col" + str(i) for i in range(len(df.columns))]
print(df)
#     Col0       Col1      Col2   Col3
# 0  File1  FirstName  LastName    NaN
# 1  File2         ID      City  State
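As a possible variation on the approach above, the question's original `df.pivot` call can be completed by giving it a per-group counter as the `columns` argument. This is only a sketch: the helper column name `n` and the result name `wide` are made up for illustration, and it keeps the `File` column name rather than renaming it to `Col0`.

```python
import pandas as pd

df = pd.DataFrame(
    [["File1", "FirstName"],
     ["File1", "LastName"],
     ["File2", "ID"],
     ["File2", "City"],
     ["File2", "State"]],
    columns=["File", "Column"],
)

wide = (
    # Number the rows within each File group: 1, 2, 3, ...
    # ("n" is a hypothetical helper column, not part of the original data.)
    df.assign(n=df.groupby("File").cumcount().add(1))
      # Each counter value becomes one output column.
      .pivot(index="File", columns="n", values="Column")
      # Turn the labels 1, 2, 3 into Col1, Col2, Col3.
      .add_prefix("Col")
      # Drop the leftover "n" label on the column axis.
      .rename_axis(columns=None)
      # Move File from the index back into a regular column.
      .reset_index()
)
print(wide)
```

The resulting columns are `File`, `Col1`, `Col2`, `Col3`, with `NaN` in `Col3` for `File1`. Because the counter is unique within each group, no aggregation function is needed here, which is why plain `pivot` is enough.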

df = pd.pivot_table(df, index=['File'], columns=df.groupby(['File']).cumcount().add(1), values=['Column'], aggfunc='sum')
df.columns = df.columns.map('{0[0]}{0[1]}'.format)

Found the answer using:

Pandas - Convert columns to new rows after groupby
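To see what the `columns.map` line in this answer does in isolation, here is a minimal sketch that builds the two-level column index by hand (an assumption made for illustration, standing in for the `pivot_table` result) and flattens it with the same format string.

```python
import pandas as pd

# Hand-built stand-in for the two-level columns that pivot_table produces:
# the outer level is the values name, the inner level is the counter.
cols = pd.MultiIndex.from_tuples([("Column", 1), ("Column", 2), ("Column", 3)])

# Each label is a tuple; '{0[0]}{0[1]}' joins its two parts into one string.
flat = cols.map("{0[0]}{0[1]}".format)
print(list(flat))  # ['Column1', 'Column2', 'Column3']
```

The flattened names take their prefix from the `values` name, so they come out as `Column1`, `Column2`, `Column3` rather than `Col1`, `Col2`, `Col3`.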
