I need your help with a pandas problem:

I am currently extracting data from APIs, and the results contain gaps in their ranks.

However, I need to account for these gaps in the dataset by filling them with an average value.

So I need to insert a row into my dataframe for each missing rank.

Illustration:

Here's what my data looks like:

   rank timestamp value
0    1     21:50  3450
1    4     21:40  3442
2    5     21:41  5964
3    6     14:27  5258
4    7     13:10  3001
5    8     14:02  2782

Ranks 2 and 3 are missing.

So here's what I'm trying to get:

   rank timestamp value
0    1     21:50  3450
1    2      NaN   avg
2    3      NaN   avg
3    4     21:40  3442
4    5     21:41  5964
5    6     14:27  5258
6    7     13:10  3001
7    8     14:02  2782

I know approximately how to deal with columns, but I have no idea how to deal with rows.

Do you have any idea?

I have already tried using "append", but then I struggle to reindex my dataframe :/

1 Answer

You can use reindex to add missing ranks and fillna to fill missing values.

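import numpy as np
# Reindex on every rank from min to max (missing ranks become NaN rows), then fill 'value' with the rounded mean
df = df.set_index('rank').reindex(np.arange(df['rank'].min(), df['rank'].max()+1)).reset_index()
df['value'] = df['value'].fillna(df['value'].mean()).round()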


   rank timestamp   value
0     1     21:50  3450.0
1     2       NaN  3983.0
2     3       NaN  3983.0
3     4     21:40  3442.0
4     5     21:41  5964.0
5     6     14:27  5258.0
6     7     13:10  3001.0
7     8     14:02  2782.0
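
If you want to check this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same reindex and fillna steps. The names `full_range` and `filled` are only illustrative, and I am assuming the ranks are integers and "average" means the mean of the values that are present.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; ranks 2 and 3 are missing.
df = pd.DataFrame({
    "rank": [1, 4, 5, 6, 7, 8],
    "timestamp": ["21:50", "21:40", "21:41", "14:27", "13:10", "14:02"],
    "value": [3450, 3442, 5964, 5258, 3001, 2782],
})

# Every rank from the smallest to the largest one observed (illustrative name).
full_range = np.arange(df["rank"].min(), df["rank"].max() + 1)

# Reindexing on rank inserts an all-NaN row for each rank absent from the data.
filled = df.set_index("rank").reindex(full_range).reset_index()

# Replace the missing values with the mean of the observed ones, rounded.
filled["value"] = filled["value"].fillna(filled["value"].mean()).round()

print(filled)
```

Note that the value column ends up as float, because the reindex step introduces NaN before the fill. If you need integers back, adding `.astype(int)` after `.round()` should do it.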

2 Comments

Oh yes! That's perfect! Thank you very much :)
@Diev, thank you. If the question was answered completely, don't forget to mark it as accepted by ticking the check box next to the question. Happy coding!
