For a project I am importing CSV files into SQLite with the help of a pandas DataFrame.

It's a very big file (914 columns), so I want to split it by selecting subsets of columns. I do this with a pandas DataFrame.

This works fine, except when a column has no values; then I get an index error. I don't know in advance whether a column will be empty.

def limit_rubric1(df):
    limit_df = df[[
        "Company",
        "RUB 1",
        "RUB 2",
        "RUB 3",
        "FileBase"]].fillna(value=0)
    # limit_df = limit_df.reset_index(drop=True)
    return limit_df

This is the error I get:

  File "C:\Users\xxx\PycharmProjects\MF\venv\lib\site-packages\pandas\core\indexes\base.py", line 6176, in _raise_if_missing
    raise KeyError(f"{not_found} not in index")
KeyError: "['RUB 2'] not in index"

Note that this is not caused by the column being empty. As the error says, there is no column named RUB 2 in this df. Commented Jan 2, 2023 at 9:11

1 Answer

It depends on what you need. If RUB 2 does not exist and you need this column in the output, filled with 0, use:

def limit_rubric1(df):
    return df.reindex(columns=[
        "Company",
        "RUB 1",
        "RUB 2",
        "RUB 3",
        "FileBase"], fill_value=0)

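To see what reindex does here, a minimal sketch with made-up data may help. The sample frame and the names `wanted`, `limited` and `limited_filled` are assumptions for illustration only. One thing worth knowing: `fill_value` applies only to the columns that reindex creates, so NaN values already present in existing columns stay as they are unless `fillna` is chained afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: "RUB 2" is absent and "RUB 3" contains a NaN.
df = pd.DataFrame({
    "Company": ["A", "B"],
    "RUB 1": [1.0, 2.0],
    "RUB 3": [np.nan, 4.0],
    "FileBase": ["x.csv", "y.csv"],
    "Other": [9, 9],
})

wanted = ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"]

# reindex creates the missing "RUB 2" column and fills it with 0,
# and drops columns that are not in the list ("Other").
limited = df.reindex(columns=wanted, fill_value=0)

# NaN values that already existed (here in "RUB 3") are left alone,
# so fillna is chained to match the behaviour of the original function.
limited_filled = limited.fillna(0)
```

If the zeros from `fillna` are not wanted for existing gaps, the last step can simply be dropped.
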
Or, if you need only the columns that actually exist:

def limit_rubric1(df):
    # Keep only the requested columns that actually exist in df.
    cols = df.columns.intersection(
        ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"], sort=False)
    return df[cols].fillna(value=0).reset_index(drop=True)
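
Since the column list is not known to match the file in advance, it can also be useful to report which requested columns are absent before selecting. This is only a sketch on made-up data; the names `wanted`, `missing` and `present` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: "RUB 2" is absent from this frame.
df = pd.DataFrame({
    "Company": ["A", "B"],
    "RUB 1": [1, 2],
    "RUB 3": [3, 4],
    "FileBase": ["x.csv", "y.csv"],
})

wanted = ["Company", "RUB 1", "RUB 2", "RUB 3", "FileBase"]

# Requested columns that df does not contain.
missing = [col for col in wanted if col not in df.columns]

# Requested columns that df does contain.
present = df.columns.intersection(wanted, sort=False)

print("missing:", missing)
print("present:", list(present))
```

Selecting with `df[present]` then cannot raise the KeyError from the question, because every label in `present` comes from `df.columns`.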