I am trying to fill in the missing data in my pandas DataFrame. However, this data can only be filled in a non-standard way. I have marked the missing data with -1, and I want to replace each -1 as described below.

Column A Column B
12 11
99 -1
43 34
23 -1
65 -1
17 42
12 66
99 -1
43 22
23 -1
65 -1
17 42

I want to replace every missing value (every -1) with the next available positive value from the same column.

Column A Column B
12 11
99 34
43 34
23 42
65 42
17 42
12 66
99 22
43 22
23 42
65 42
17 42

I can get the desired output with df['col'].shift(1) when the number of consecutive -1s is constant, but that won't work here because the -1s are placed randomly.

The data I am dealing with is fairly large.

3 Answers

3

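You could use replace with method='bfill' for this. Note that the method argument of replace is deprecated in recent pandas releases (2.1 and later), so this form may emit a warning or be unavailable there.

df['ColumnB'] = df['ColumnB'].replace(-1, method='bfill')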

print(df)

Sample Output
    ColumnA  ColumnB
0        12       11
1        99       34
2        43       34
3        23       42
4        65       42
5        17       42
6        12       66
7        99       22
8        43       22
9        23       42
10       65       42
11       17       42
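
As a rough sketch of the same back-fill without relying on replace(..., method=...), the -1 sentinel can be masked to NaN first and then back-filled. The column names follow this answer's sample output; the cast back to int64 assumes the column does not end in -1 (otherwise a NaN would remain and the cast would fail).

```python
import pandas as pd

df = pd.DataFrame({
    "ColumnA": [12, 99, 43, 23, 65, 17, 12, 99, 43, 23, 65, 17],
    "ColumnB": [11, -1, 34, -1, -1, 42, 66, -1, 22, -1, -1, 42],
})

# Turn the -1 sentinel into NaN, then pull the next valid value upward.
filled = df["ColumnB"].mask(df["ColumnB"].eq(-1)).bfill()

# NaN forces a float dtype; cast back to integers once nothing is missing.
df["ColumnB"] = filled.astype("int64")

print(df["ColumnB"].tolist())
```

Both mask and bfill are vectorized, so this should also be reasonable for a fairly large frame.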

2

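Use replace() + bfill(). On recent pandas versions, where the downcast argument is deprecated, calling .bfill() and then casting with .astype(int) should give the same result as long as no NaN remains:

df['Column B'] = df['Column B'].replace(-1, float('nan')).bfill(downcast='infer')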

output of df:

    Column A  Column B
0         12        11
1         99        34
2         43        34
3         23        42
4         65        42
5         17        42
6         12        66
7         99        22
8         43        22
9         23        42
10        65        42
11        17        42
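
One edge case worth checking: if the column ends in -1, there is no later value to back-fill from, so that row stays missing. A minimal sketch, using a short hypothetical series (not the data from the question) and the nullable "Int64" dtype so the remaining values stay whole numbers:

```python
import numpy as np
import pandas as pd

# Hypothetical data whose last entry is the -1 sentinel.
s = pd.Series([11, -1, 34, -1], name="Column B")

# Replace the sentinel with NaN and back-fill; the trailing entry
# has no later value to copy, so it remains NaN.
filled = s.replace(-1, np.nan).bfill()

# The nullable "Int64" dtype holds integers alongside a missing marker (<NA>).
filled_int = filled.astype("Int64")

print(filled_int)
```

What to do with such a leftover (drop it, forward-fill it, or keep it missing) depends on the data, so it is left unfilled here.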

1

Use pd.Series.bfill after replacing -1 with NaN:

In [22]: import numpy as np

In [23]: import pandas as pd

In [24]: s = pd.Series([11, -1, 34, -1, -1, 42, 66, -1, 22, -1, -1, 42])

In [26]: s.replace({-1: np.nan}).bfill()
Out[26]:
0     11.0
1     34.0
2     34.0
3     42.0
4     42.0
5     42.0
6     66.0
7     22.0
8     22.0
9     42.0
10    42.0
11    42.0
dtype: float64
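
The result above is float64 because NaN cannot live in an integer column. If several columns use the same -1 marker, the same idea can be applied to the whole frame in one pass and cast back afterwards. This is only a sketch: "Column C" is a hypothetical extra column added for illustration, and the final cast assumes no column ends in -1.

```python
import pandas as pd

df = pd.DataFrame({
    "Column A": [12, 99, 43, 23],
    "Column B": [11, -1, -1, 42],
    "Column C": [-1, 5, -1, 7],  # hypothetical column, not in the question
})

# Mask every -1 in the frame, back-fill each column independently,
# then restore integer dtypes (valid only when no NaN is left).
filled = df.mask(df.eq(-1)).bfill().astype("int64")

print(filled)
```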
