
Trying to Drop Null Values Pertaining to Index in Multi-level Index

I'm trying to drop all IDs (the first index level, i.e. level=0 in pandas) that contain any nulls in the "Data" column.

As an example, I've created a sample dataframe below:

import pandas as pd
import numpy as np

ids = ['0', '0', '0', '0', '0', '1','1','1','1','1','2','2','2','2','2','3','3','3','3','3']
dates = ['1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21']
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]

df = pd.DataFrame(data=data, index=[ids,dates], columns=['Data'])

I'm looking to clean the dataframe so that it returns only the rows where ID is 1 or 2, because those are the only IDs with no null values for any of the dates in the second level of the index.

I've tried df.dropna(subset=['Data'], inplace=True), but that only drops the individual rows with null values, not every row belonging to the affected ID.

What is the best way to drop all rows pertaining to an index if any of those rows has a null value in a Pandas Dataframe?

    similar to @wwnde's answer: df.loc[~df.groupby(level=0).Data.transform(lambda x: x.isna().any())] Commented Jan 27, 2021 at 20:37
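The transform-based idea in the comment above can be sketched as follows. This is a minimal variant, not the commenter's exact code: it calls isna() before grouping so the mask stays boolean, and it rebuilds the sample frame from the question so it runs on its own. The names has_nan and clean are only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question (IDs '0'..'3', five dates each).
ids = [i for i in "0123" for _ in range(5)]
dates = ["1/1/21", "1/2/21", "1/3/21", "1/4/21", "1/5/21"] * 4
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]
df = pd.DataFrame(data=data, index=[ids, dates], columns=["Data"])

# Boolean mask aligned to df: True on every row of an ID whose group
# contains at least one NaN in "Data".
has_nan = df["Data"].isna().groupby(level=0).transform("any")

# Keep only the rows of IDs that have no NaN at all.
clean = df[~has_nan]
```

Because the mask has the same index as df, it can also be combined with other row conditions before indexing.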

2 Answers


Identify any index (level 0) group that contains a NaN and filter those groups out:

df.groupby(level=0).filter(lambda x:~(x.isna().any()))

As suggested by @sammywemmy, you can also keep only the groups that have no NaNs by using x.notna().all(). Code below:

df.groupby(level=0).filter(lambda x: (x.notna().all()))




          Data
1 1/1/21   5.0
  1/2/21   6.0
  1/3/21   7.0
  1/4/21   8.0
  1/5/21   9.0
2 1/1/21  10.0
  1/2/21  11.0
  1/3/21  12.0
  1/4/21  13.0
  1/5/21  14.0

3 Comments

This also works: df.groupby(level=0).filter(lambda x: (x.notna().all()))
If I want to reference the "Data" column specifically (my actual data has more columns), how could I apply that?
Let's try df.groupby(level=0).filter(lambda x:~(x['Data'].isna().any()))
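To illustrate the column-specific filter on a frame with more than one column, here is a minimal sketch. The "Other" column and its values are hypothetical, added only to show that NaNs outside "Data" do not affect the result. Wrapping the test in bool() and using notna().all() is a small deviation from the comment above, chosen so that the filter function returns a plain Python boolean.

```python
import numpy as np
import pandas as pd

ids = [i for i in "0123" for _ in range(5)]
dates = ["1/1/21", "1/2/21", "1/3/21", "1/4/21", "1/5/21"] * 4
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]

# Hypothetical extra column with a NaN under ID '1'; it should be ignored.
other = [float(i) for i in range(20)]
other[6] = np.nan

df = pd.DataFrame(
    {"Data": data, "Other": other},
    index=pd.MultiIndex.from_arrays([ids, dates]),
)

# Keep a group only if its "Data" column has no NaN; other columns are not checked.
clean = df.groupby(level=0).filter(lambda g: bool(g["Data"].notna().all()))
```

ID '1' survives even though its "Other" column contains a NaN, because only "Data" is inspected.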

I was also able to use df.loc[~df.index.get_level_values(0).isin(df.loc[df['Data'].isna()].index.get_level_values(0))], but that is much less pythonic.
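The same idea can be split into two steps, which may read more easily. The sketch below is one possible rewrite, not the only one: it first collects the IDs that have a NaN, then drops them by level. The names bad_ids and clean are illustrative, and the sample frame is rebuilt from the question so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

ids = [i for i in "0123" for _ in range(5)]
dates = ["1/1/21", "1/2/21", "1/3/21", "1/4/21", "1/5/21"] * 4
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]
df = pd.DataFrame(data=data, index=[ids, dates], columns=["Data"])

# Step 1: IDs (level 0 labels) that have at least one NaN in "Data".
bad_ids = df[df["Data"].isna()].index.get_level_values(0).unique()

# Step 2: drop every row whose level 0 label is one of those IDs.
clean = df.drop(index=bad_ids, level=0)
```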
