Python pandas df index error when column empty

Question

For a project I am importing CSV files into SQLite with the help of a pandas DataFrame.

It's a very big file (914 columns), so I want to split it by selecting columns. I do this with a pandas DataFrame.

This works fine, except when there are no values in a column; then I get an index error. I don't know in advance whether a column will be empty.

def limit_rubric1(df):
    limit_df = df[[
        "Company",
        "RUB 1",
        "RUB 2",
        "RUB 3",
        "FileBase"]].fillna(value=0)
    # limit_df = limit_df.reset_index(drop=True)
    return limit_df

This is the error I get:

  File "C:\Users\xxx\PycharmProjects\MF\venv\lib\site-packages\pandas\core\indexes\base.py", line 6176, in _raise_if_missing
    raise KeyError(f"{not_found} not in index")
KeyError: "['RUB 2'] not in index"

Note, this is not caused by the column being empty. As the error says, there is no column named RUB 2 in this df.
– dm2, Commented Jan 2, 2023 at 9:11
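
The distinction made in the comment can be checked with a small sketch. The data below is hypothetical (not the asker's file): "RUB 1" exists but holds no values, while "RUB 2" is absent from the frame altogether.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "RUB 1" is present but empty, "RUB 2" does not exist.
df = pd.DataFrame({"Company": ["A", "B"], "RUB 1": [np.nan, np.nan]})

# Selecting an existing but empty column works; fillna replaces the NaN values.
empty_ok = df[["Company", "RUB 1"]].fillna(value=0)

# Selecting a label that is not in df.columns raises KeyError.
try:
    df[["Company", "RUB 2"]]
    missing_raised = False
except KeyError:
    missing_raised = True

# List the requested labels that are absent, to see what the KeyError refers to.
wanted = ["Company", "RUB 1", "RUB 2"]
missing = [c for c in wanted if c not in df.columns]
```

Printing `missing` before selecting is a quick way to find out which of the requested columns the CSV did not contain.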

jezrael · Accepted Answer · 2023-01-02 09:10:53Z


It depends on what you need. If RUB 2 does not exist and you need this column in the output, filled with 0, use:

def limit_rubric1(df):
    return df.reindex(columns=[
        "Company",
        "RUB 1",
        "RUB 2",
        "RUB 3",
        "FileBase"], fill_value=0)

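A minimal sketch of what reindex does here, using a hypothetical three-column frame. One detail worth noting: fill_value only applies to the columns that reindex creates, so NaN values in columns that already exist are left alone. If those should become 0 as well, as in the question's original function, a fillna call can be chained afterwards.

```python
import pandas as pd

# Hypothetical frame: "RUB 2" and "RUB 3" are absent, "RUB 1" has one NaN.
df = pd.DataFrame({
    "Company": ["A", "B"],
    "RUB 1": [1.0, None],
    "FileBase": ["f1", "f2"],
})
wanted = ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"]

# Absent columns are created and filled with 0; existing NaN values stay NaN.
out = df.reindex(columns=wanted, fill_value=0)

# Chain fillna to also replace NaN values in the columns that already existed.
out_filled = out.fillna(value=0)
```
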
Or, if you need only the columns that exist:

def limit_rubric1(df):
    # Keep only the requested labels that are present in df.
    cols = df.columns.intersection(
        ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"], sort=False)
    return df[cols].fillna(value=0).reset_index(drop=True)
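
A short sketch of the intersection variant on hypothetical data, together with a plain list comprehension as an alternative. The list comprehension is not part of the answer; it is shown because it keeps the columns in the order of the requested list regardless of how they are ordered in the file.

```python
import pandas as pd

# Hypothetical frame: column order differs from the wanted list, "Other" is extra.
df = pd.DataFrame({
    "FileBase": ["f1", "f2"],
    "Company": ["A", "B"],
    "RUB 1": [1.0, None],
    "Other": [5, 6],
})
wanted = ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"]

# Variant from the answer: only labels present in both df.columns and wanted.
cols = df.columns.intersection(wanted, sort=False)
limited = df[cols].fillna(value=0).reset_index(drop=True)

# Alternative (an assumption, not from the answer): follow the order of wanted.
ordered = [c for c in wanted if c in df.columns]
limited_ordered = df[ordered].fillna(value=0)
```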

