As the title implies, I would like to add an empty row to my MultiIndex DataFrame. The first level index needs to have a defined index value and the second level index needs to be np.nan. The values in the columns need to be np.nan.

Consider the following:

import pandas as pd
import numpy as np

iterables = [['foo'], ['r_1', 'r_2', 'r_3']]
idx = pd.MultiIndex.from_product(iterables, names=['idx_1', 'idx_2'])
data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
df = pd.DataFrame(data, idx, columns=['col_1', 'col_2', 'col_3'])
df
Out[93]:
             col_1  col_2  col_3
idx_1 idx_2                     
foo   r_1        1      2      3
      r_2        4      5      6
      r_3        7      8      9

I would normally append a Series if this were not a MultiIndex, like this:

s = pd.Series(
    [np.nan, np.nan, np.nan], 
    index=['col_1', 'col_2', 'col_3'], 
    name='bar'
)
df.append(s)
Out[95]:
            col_1  col_2  col_3
(foo, r_1)    1.0    2.0    3.0
(foo, r_2)    4.0    5.0    6.0
(foo, r_3)    7.0    8.0    9.0
bar           NaN    NaN    NaN

In this case, my MultiIndex is converted to tuples. I can't pass ignore_index=True to the append method because that removes the MultiIndex. I feel like I'm close, yet so far.

My output should look like this:

# some magic
Out[96]:
             col_1  col_2  col_3
idx_1 idx_2
foo   r_1    1.0    2.0    3.0
      r_2    4.0    5.0    6.0
      r_3    7.0    8.0    9.0
bar   NaN    NaN    NaN    NaN

(It is also acceptable for the second-level index to be None.)

How do I do this?

Using Python 3.6 and Pandas 0.20.3.

1 Answer

Use setting with enlargement:

df.loc[('bar', ''), ['col_1', 'col_2', 'col_3']] = np.nan
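For anyone who wants to check the enlargement approach end to end, here is a minimal self-contained sketch. It rebuilds the question's frame and then assigns to a row label that does not exist yet. The integer columns should come back as floats, since NaN cannot be stored in an integer column.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
idx = pd.MultiIndex.from_product(
    [['foo'], ['r_1', 'r_2', 'r_3']], names=['idx_1', 'idx_2']
)
df = pd.DataFrame(
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
    index=idx,
    columns=['col_1', 'col_2', 'col_3'],
)

# ('bar', '') is not in the index yet, so .loc adds it as a new row
df.loc[('bar', ''), ['col_1', 'col_2', 'col_3']] = np.nan

print(df)
print(df.index.tolist())
```

Note that this modifies df in place, unlike append, which returns a new frame.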

Or pass a tuple as the Series name:

s = pd.Series(
    [np.nan, np.nan, np.nan], 
    index=['col_1', 'col_2', 'col_3'], 
    name=('bar', np.nan)
)

print (df.append(s))
             col_1  col_2  col_3
idx_1 idx_2                     
foo   r_1      1.0    2.0    3.0
      r_2      4.0    5.0    6.0
      r_3      7.0    8.0    9.0
bar   NaN      NaN    NaN    NaN

s = pd.Series(
    [np.nan, np.nan, np.nan], 
    index=['col_1', 'col_2', 'col_3'], 
    name=('bar', '')
)

print (df.append(s))
             col_1  col_2  col_3
idx_1 idx_2                     
foo   r_1      1.0    2.0    3.0
      r_2      4.0    5.0    6.0
      r_3      7.0    8.0    9.0
bar            NaN    NaN    NaN
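A note for newer pandas versions: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the append calls above only run on older releases such as the 0.20.3 mentioned in the question. A rough equivalent with pd.concat might look like the sketch below. The names new_idx, empty_row and result are only illustrative, and depending on the pandas version, concatenating an all-NaN frame may emit a FutureWarning.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question
idx = pd.MultiIndex.from_product(
    [['foo'], ['r_1', 'r_2', 'r_3']], names=['idx_1', 'idx_2']
)
df = pd.DataFrame(
    [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
    index=idx,
    columns=['col_1', 'col_2', 'col_3'],
)

# One-row frame whose MultiIndex entry is ('bar', NaN), reusing df's level names
new_idx = pd.MultiIndex.from_tuples([('bar', np.nan)], names=df.index.names)
empty_row = pd.DataFrame(np.nan, index=new_idx, columns=df.columns)

# concat keeps the MultiIndex because both frames have two index levels
result = pd.concat([df, empty_row])
print(result)
```

Use ('bar', '') instead of ('bar', np.nan) in from_tuples if you prefer an empty string in the second level.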
2 Comments

Ah damn. I was using a list in the name argument and Series was throwing an error: Series.name must be a hashable type.
