
I have a dataframe df as follows:

ID  IndentNo    PO_Ref_No
 1  10023       470089AB
 2  10023       470089DC
 3  10023   
 4  10024       674005TT
 5  10024       674005LP
 6  10024       674005TN

Objective: I want to drop all rows for IndentNo = 10024, because every one of its 3 rows already has a PO_Ref_No.

So the resultant df would look like:

ID  IndentNo    PO_Ref_No
 1  10023       470089AB
 2  10023       470089DC
 3  10023       

How can I do this efficiently? I tried the following:

import numpy as np
import pandas as pd

df['Flag'] = np.where(pd.isnull(df['PO_Ref_No']), 1, 0)
df = df.loc[df['Flag'] != 1]

But this also removes ID 3 of IndentNo 10023.

Any clue would be helpful.

  • df[df['IndentNo'].isin(df[df['PO_Ref_No'].isna()]['IndentNo'].unique())] should work. Commented Aug 25, 2021 at 10:14

1 Answer


The goal is to keep only the groups in which at least one row is missing a PO_Ref_No (i.e., discard a group once all of its rows have a PO_Ref_No):

You can flag the missing values with Series.isna and test whether each group contains at least one NaN using GroupBy.transform with 'any':

df = df[df['PO_Ref_No'].isna().groupby(df['IndentNo']).transform('any')]
print(df)
   ID  IndentNo PO_Ref_No
0   1     10023  470089AB
1   2     10023  470089DC
2   3     10023       NaN

Alternatively, collect the IndentNo values of all groups that contain a NaN, then filter the original IndentNo column with Series.isin for membership:

df = df[df['IndentNo'].isin(df.loc[df['PO_Ref_No'].isna(), 'IndentNo'])]
print(df)
   ID  IndentNo PO_Ref_No
0   1     10023  470089AB
1   2     10023  470089DC
2   3     10023       NaN

Slower, but also possible, is DataFrameGroupBy.filter:

df = df.groupby('IndentNo').filter(lambda x: x['PO_Ref_No'].isna().any())
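For reference, here is a minimal end-to-end sketch of the first approach, reconstructing the sample dataframe from the question (column values are taken from the example above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'IndentNo': [10023, 10023, 10023, 10024, 10024, 10024],
    'PO_Ref_No': ['470089AB', '470089DC', np.nan,
                  '674005TT', '674005LP', '674005TN'],
})

# True for every row whose IndentNo group contains at least one missing PO_Ref_No
mask = df['PO_Ref_No'].isna().groupby(df['IndentNo']).transform('any')

# keep only those groups: 10024 is dropped because all of its rows are filled
out = df[mask]
print(out)
```

Only the three IndentNo 10023 rows survive, including the row with the missing PO_Ref_No.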