2

I would like to add a column SUM to the df1 below. It's a Datetime MultiIndex and the new column SUM should return the sum of the price row.

multex = pd.MultiIndex.from_product([['price',
 'weight','quantity','portfolio'] ,df1.index],names=['Date', 'Stats'])

new_df = pd.DataFrame(index=multex, columns= df1.columns.values.tolist())

Subsequently would like to add a row SUM as well that returns the same value. I've tried the following so far:

df1['SUM']= df1.ix['price'].sum(axis=1) 


                           A      B      C        D     E

2006-04-28 00:00:00                                    
            price        69.62  69.62  6.518   65.09  69.62
            weight
            std
2006-05-01 00:00:00  
            price         71.5   71.5  6.522   65.16   71.5
            weight
            std
2006-05-02 00:00:00   
            price        72.34  72.34  6.669   66.55  72.34
            weight
            std
2
  • I would update your question so people can easily reproduce your dataframe. Also, what was the result of df1['SUM']= df1.ix['price'].sum(axis=1) Commented Sep 19, 2016 at 3:20
  • HI. thanks. thats the error that I get: ValueError: shape mismatch: value array of shape (2939,) could not be broadcast to indexing result of shape (2939,2) Commented Sep 19, 2016 at 3:46

1 Answer 1

3

You can use first DataFrame.sort_index, because error:

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (1)'

Then use slicers:

df1 = df1.sort_index()

idx = pd.IndexSlice
df1['SUM'] = df1.loc[idx[:,'price'],:].sum(axis=1)
print (df1)
                                A      B      C      D      E      SUM
Date                Stats                                             
2006-04-28 00:00:00 price   69.62  69.62  6.518  65.09  69.62  280.468
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN
2006-05-01 00:00:00 price   71.50  71.50  6.522  65.16  71.50  286.182
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN
2006-05-02 00:00:00 price   72.34  72.34  6.669  66.55  72.34  290.239
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN

df1['SUM'] = df1.loc[(slice(None), slice('price')),:].sum(axis=1)
print (df1)
                                A      B      C      D      E      SUM
Date                Stats                                             
2006-04-28 00:00:00 price   69.62  69.62  6.518  65.09  69.62  280.468
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN
2006-05-01 00:00:00 price   71.50  71.50  6.522  65.16  71.50  286.182
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN
2006-05-02 00:00:00 price   72.34  72.34  6.669  66.55  72.34  290.239
                    std       NaN    NaN    NaN    NaN    NaN      NaN
                    weight    NaN    NaN    NaN    NaN    NaN      NaN
Sign up to request clarification or add additional context in comments.

15 Comments

perfect! thnk you!
Glad can help you!
Do you need df1.loc[('', 'SUM'),:] = df1.loc[(slice(None), slice('price')),:].sum() ?
And what about df1.loc[(None, 'SUM'),:] = df1.loc[(slice(None), slice('price')),:].sum() ?
df2 = df1.loc[(slice(None), slice('price')),:].sum().to_frame().T df2.index = pd.MultiIndex.from_arrays([None,'Sum']) print (pd.concat([df1,df2]))
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.