Pandas dataframe, get the row and index for a column meeting certain conditions

Question

I have the below df:

import pandas as pd
import numpy as np

output = [
    ['Owner', 'Database', 'Schema', 'Table', 'Column', 'Comment', 'Status'],
    ['', 'DEV', 'AIRFLOW', 'TASK_INSTANCE', '_LOAD_DATETIME', 'Load datetime'],
    ['', 'DEV', 'AIRFLOW', 'TEST', '_LOAD_FILENAME', 'load file name', 'ADDED'],
    ['', 'DEV', 'AIRFLOW', 'TEST_TABLE', 'TEST_COL', 'COMMENT TEST'],
]


df = pd.DataFrame(output[1:], columns=output[0])



query_list = []
empty_status_idx = []

for index, row in df.iterrows():
    if row['Status'] is None:
        sql = f"ALTER TABLE {row['Table']} ALTER {row['Column']} COMMENT {row['Comment']}; "
        # idx = np.where(df["Status"] is None)
        # idx = df.index[df['Status']]
        idx = df.iloc[df['Status']]
        empty_status_idx.append(idx)
        print(f'idx: {idx}')
       
        query_list.append(sql)
query_list

I see the below error with idx:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

What I want to see is a list of the positions of the None cells:

empty_status_idx = [0, 2]

I got the idx expressions above from some of the answers to this Stack Overflow question.

quasi-human · Accepted Answer · 2022-02-13 15:46:51Z


You can also use NumPy's np.where function as follows:

import numpy as np
empty_status_idx = np.where(df.Status.isnull())[0].tolist()

[0, 2]
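A self-contained sketch of the np.where approach, using a reduced stand-in for the question's frame (only three columns, with explicit None values in place of the short rows). One point worth noting: np.where returns integer positions, not index labels. The two coincide here only because the frame has the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

# Reduced version of the question's frame; None marks a missing Status.
df = pd.DataFrame(
    {
        "Table": ["TASK_INSTANCE", "TEST", "TEST_TABLE"],
        "Column": ["_LOAD_DATETIME", "_LOAD_FILENAME", "TEST_COL"],
        "Status": [None, "ADDED", None],
    }
)

# isnull() gives a boolean Series; np.where returns a tuple with one array
# of positions per dimension, so [0] picks out the row positions.
mask = df["Status"].isnull()
empty_status_idx = np.where(mask)[0].tolist()
print(empty_status_idx)  # [0, 2]
```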


4 Comments

KristiLuna Over a year ago

thanks! is there a way to add this into the loop after the sql variable instead?

quasi-human Over a year ago

You're welcome. I don't understand the meaning of "to add this into the loop after the sql variable". Can you explain more?

KristiLuna Over a year ago

sorry never mind I confused myself! works perfectly thank you!

quasi-human Over a year ago

Okay, good luck with your project!

sophocles · Accepted Answer · 2022-02-13 15:50:05Z


You are overcomplicating it:

empty_status_idx = df[df['Status'].isnull()].index.tolist()

Out[65]: [0, 2]
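If the loop from the question is still wanted, here is a hedged sketch of how the indices could be collected inside it. The TypeError appears to come from passing the Status values themselves to iloc, which expects integer positions or a boolean mask. Inside the loop, the index yielded by iterrows() is already the row label, so it can be appended directly, and pd.isna replaces the `is None` test so that NaN is caught too. The frame is a reduced stand-in for the question's data, and the SQL string is copied from the question as-is.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Table": ["TASK_INSTANCE", "TEST", "TEST_TABLE"],
        "Column": ["_LOAD_DATETIME", "_LOAD_FILENAME", "TEST_COL"],
        "Comment": ["Load datetime", "load file name", "COMMENT TEST"],
        "Status": [None, "ADDED", None],
    }
)

query_list = []
empty_status_idx = []

for index, row in df.iterrows():
    # pd.isna is True for both None and NaN
    if pd.isna(row["Status"]):
        sql = f"ALTER TABLE {row['Table']} ALTER {row['Column']} COMMENT {row['Comment']}; "
        query_list.append(sql)
        empty_status_idx.append(index)  # index is this row's label

# Same labels without a loop, via a boolean mask
mask_idx = df[df["Status"].isnull()].index.tolist()

print(empty_status_idx)  # [0, 2]
print(mask_idx)          # [0, 2]
```

Both lists hold index labels. With a non-default index they would differ from the positions that np.where returns.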

edited Feb 13, 2022 at 15:50

answered Feb 13, 2022 at 15:34
