Drop Indexes with Nulls in Column in Pandas

Question

Trying to Drop Null Values Pertaining to Index in Multi-level Index

I'm trying to drop every ID (the first index level, level=0 in pandas) that has any nulls in the "Data" column.

As an example, I've created a sample dataframe below:

import pandas as pd
import numpy as np

ids = ['0', '0', '0', '0', '0', '1','1','1','1','1','2','2','2','2','2','3','3','3','3','3']
dates = ['1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21','1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21']
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]

df = pd.DataFrame(data=data, index=[ids,dates], columns=['Data'])

I'm looking to clean the dataframe so that it returns only the rows where the ID is 1 or 2, because those are the only IDs with no null values for any of the dates in the second level of the index.

I've tried df.dropna(subset=['Data'], inplace=True), but that only drops the individual rows with null values, not every row belonging to the affected ID.

What is the best way to drop all rows for an index value if any of those rows has a null value in a Pandas DataFrame?

Similar to @wwnde's answer: df.loc[~df.groupby(level=0).Data.transform(lambda x: x.isna().any())]
– sammywemmy, Commented Jan 27, 2021 at 20:37
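The mask-based idea in this comment can also be written without a lambda. The following is a minimal sketch, not the commenter's exact code: it flags nulls once, then uses transform('any') to broadcast a per-ID "has a null" flag back onto every row. The names has_null and clean are illustrative, and the sample frame is rebuilt from the question so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question: 4 IDs x 5 dates.
ids = [str(i) for i in range(4) for _ in range(5)]
dates = ['1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21'] * 4
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]
df = pd.DataFrame(data=data, index=[ids, dates], columns=['Data'])

# True on every row whose ID (index level 0) has at least one null in "Data".
has_null = df['Data'].isna().groupby(level=0).transform('any')

# Keep only the rows of IDs that have no nulls at all.
clean = df[~has_null]

print(clean.index.get_level_values(0).unique().tolist())
```

Because has_null is an ordinary boolean Series aligned to df, it can be inspected or reused before any rows are dropped.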

wwnde · Accepted Answer · 2021-01-27 21:04:15Z

2

Identify the index values that have any NaN and keep only the groups where that is not the case:

df.groupby(level=0).filter(lambda x:~(x.isna().any()))

As suggested by @sammywemmy, you can also keep only the groups that have no NaNs by using x.notna().all():

df.groupby(level=0).filter(lambda x: (x.notna().all()))




          Data
1 1/1/21   5.0
  1/2/21   6.0
  1/3/21   7.0
  1/4/21   8.0
  1/5/21   9.0
2 1/1/21  10.0
  1/2/21  11.0
  1/3/21  12.0
  1/4/21  13.0
  1/5/21  14.0

edited Jan 27, 2021 at 21:04

answered Jan 27, 2021 at 20:37

wwnde


3 Comments

sammywemmy Over a year ago

This also works: df.groupby(level=0).filter(lambda x: (x.notna().all()))

DJHeels Over a year ago

If I want to reference the "Data" column specifically (my actual data has more columns), how could I apply that?

wwnde Over a year ago

Let's try: df.groupby(level=0).filter(lambda x:~(x['Data'].isna().any()))
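To illustrate the column-specific filter on a frame with more than one column, here is a minimal sketch. The second column, "Other", is hypothetical and added only for demonstration; it contains a NaN for ID '1' that the filter is meant to ignore. The sketch uses `not` on the scalar result instead of `~`, which is a stylistic assumption rather than part of the original suggestion.

```python
import numpy as np
import pandas as pd

ids = [str(i) for i in range(4) for _ in range(5)]
dates = ['1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21'] * 4
index = pd.MultiIndex.from_arrays([ids, dates])

df = pd.DataFrame(
    {
        'Data': [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
                 10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19],
        # Hypothetical extra column: its single NaN sits in ID '1'.
        'Other': [0.0] * 5 + [np.nan] + [0.0] * 14,
    },
    index=index,
)

# The lambda returns one boolean per group, based on "Data" only,
# so nulls in other columns do not affect which IDs are kept.
clean = df.groupby(level=0).filter(lambda g: not g['Data'].isna().any())

print(clean.index.get_level_values(0).unique().tolist())
```

ID '1' is kept even though "Other" has a null there, because only "Data" is checked.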

DJHeels · Accepted Answer · 2021-01-27 21:15:47Z

0

I was also able to use df.loc[~df.index.get_level_values(0).isin(df.loc[df['Data'].isna()].index.get_level_values(0))], but that is much less pythonic.
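The same isin-based logic may be easier to read when split into two steps. This is a minimal sketch under the question's sample data; the intermediate name bad_ids is illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the sample frame from the question.
ids = [str(i) for i in range(4) for _ in range(5)]
dates = ['1/1/21', '1/2/21', '1/3/21', '1/4/21', '1/5/21'] * 4
data = [np.nan, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10, 11, 12, 13, 14, np.nan, 16, 17, 18, 19]
df = pd.DataFrame(data=data, index=[ids, dates], columns=['Data'])

# Step 1: collect the IDs (index level 0) that have a null in "Data".
bad_ids = df.loc[df['Data'].isna()].index.get_level_values(0).unique()

# Step 2: keep every row whose ID is not in that set.
clean = df.loc[~df.index.get_level_values(0).isin(bad_ids)]

print(list(bad_ids))
print(clean.index.get_level_values(0).unique().tolist())
```

Naming the intermediate result also makes it easy to report which IDs were dropped.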

answered Jan 27, 2021 at 21:15

DJHeels
