
I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(index=['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04'])
df["ticker"] = ['TSLA', 'TSLA', 'IBM', 'IBM']
df["price"] = ['1000', '1200', '101', '108']
df["volume"] = ['100000', '123042', '1087878', '108732']
df["marketcap"] = ['1.2T', '1.4T', '30B', '35B']
df.index.rename('Date', inplace=True)

df:
                 ticker  price  volume   marketcap
    Date
    2018-01-01   TSLA    1000   100000   1.2T
    2018-01-02   TSLA    1200   123042   1.4T
    2018-01-03   IBM     101    1087878  30B
    2018-01-04   IBM     108    108732   35B

I would like the tickers to become column level 0, with price, volume, and marketcap as column level 1. The DataFrame should look like this:

df:
                 TSLA                        IBM
                 price  volume  marketcap    price  volume   marketcap
    Date
    2018-01-01   1000   100000  1.2T         NaN    NaN      NaN
    2018-01-02   1200   123042  1.4T         NaN    NaN      NaN
    2018-01-03   NaN    NaN     NaN          101    1087878  30B
    2018-01-04   NaN    NaN     NaN          108    108732   35B

How can this be done using pandas?

1 Answer


You can use set_index and unstack, followed by a little cleanup to move the tickers to the top column level and sort the columns by ticker:

(df.set_index('ticker', append=True)
   .unstack('ticker')
   .swaplevel(axis=1)
   .sort_index(axis=1, level=0, sort_remaining=False)
)

Output (the tickers are sorted alphabetically, so IBM comes before TSLA):

ticker       IBM                     TSLA                  
           price   volume marketcap price  volume marketcap
Date                                                       
2018-01-01   NaN      NaN       NaN  1000  100000      1.2T
2018-01-02   NaN      NaN       NaN  1200  123042      1.4T
2018-01-03   101  1087878       30B   NaN     NaN       NaN
2018-01-04   108   108732       35B   NaN     NaN       NaN
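
If the column order from the question matters (TSLA before IBM), one possible variation is to skip the sort and reindex the columns explicitly. This is only a sketch and not part of the answer above: the names `wide`, `tickers`, and `fields` are made up for illustration, and it assumes the tickers should appear in order of first occurrence.

```python
import pandas as pd

# Rebuild the DataFrame from the question (values kept as strings, as posted).
df = pd.DataFrame(
    {
        "ticker": ["TSLA", "TSLA", "IBM", "IBM"],
        "price": ["1000", "1200", "101", "108"],
        "volume": ["100000", "123042", "1087878", "108732"],
        "marketcap": ["1.2T", "1.4T", "30B", "35B"],
    },
    index=pd.Index(
        ["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"], name="Date"
    ),
)

# Same reshape as in the answer, without the final sort.
wide = (
    df.set_index("ticker", append=True)
      .unstack("ticker")
      .swaplevel(axis=1)
)

# Assumption: tickers in order of first appearance, fields in original column order.
tickers = df["ticker"].unique()
fields = [c for c in df.columns if c != "ticker"]
wide = wide.reindex(
    columns=pd.MultiIndex.from_product([tickers, fields], names=["ticker", None])
)

print(wide)
```

The `reindex` call only reorders existing columns here, so the values are the same as in the sorted output.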

2 Comments

What if I have duplicate entries, i.e. the same date for two different tickers? How would I handle that? I am getting "ValueError: Index contains duplicate entries, cannot reshape".
Having duplicate dates across different tickers is not a problem; it only becomes one when a date is duplicated within a single ticker. In that case you would need to either build an extra index level or drop the duplicates.
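
The two options mentioned in that comment could look roughly like the following. This is a hedged sketch on made-up data, not the commenter's own code: the frame `dup`, the helper column `n`, and the choice of `keep="last"` are all assumptions for illustration, and `Date` is a regular column here rather than the index.

```python
import pandas as pd

# Hypothetical data: 2018-01-01 appears twice for TSLA (the problematic case)
# and once for IBM (sharing a date across tickers is fine).
dup = pd.DataFrame(
    {
        "Date": ["2018-01-01", "2018-01-01", "2018-01-01", "2018-01-02"],
        "ticker": ["TSLA", "TSLA", "IBM", "IBM"],
        "price": [1000, 1010, 101, 108],
    }
)

# The duplicated (Date, ticker) pair makes unstack fail.
try:
    dup.set_index(["Date", "ticker"]).unstack("ticker")
    raised = False
except ValueError:
    raised = True

# Option 1: drop duplicates, keeping one row per (Date, ticker).
deduped = dup.drop_duplicates(subset=["Date", "ticker"], keep="last")
wide_last = (
    deduped.set_index(["Date", "ticker"])
           .unstack("ticker")
           .swaplevel(axis=1)
)

# Option 2: add an extra index level that numbers repeats within each pair.
dup["n"] = dup.groupby(["Date", "ticker"]).cumcount()
wide_all = (
    dup.set_index(["Date", "n", "ticker"])
       .unstack("ticker")
       .swaplevel(axis=1)
)

print(wide_last)
print(wide_all)
```

Option 1 discards rows, so it only fits when the repeated rows are genuinely redundant; option 2 keeps every row at the cost of a two-level row index.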
