Pandas: Loop through rows to update column value

Question

Here is what my sample DataFrame looks like:

>>> df
  point    x      y
0  0.1   NaN    NaN
1  0.2   NaN    NaN
2  0.3   5.0    NaN
3  0.4   NaN    NaN
4  0.5   NaN    1.0
5  0.6   NaN    NaN
6  0.7   1.0    1.0
7  0.8   NaN    NaN
8  0.9   NaN    NaN
9  1.1   NaN    NaN
10 1.2   NaN    NaN
11 1.3   NaN    NaN
12 1.4   NaN    2.0
13 1.5   NaN    NaN
14 1.6   NaN    NaN
15 1.7   NaN    NaN
16 0.1   NaN    NaN
17 0.2   NaN    NaN
18 0.3   NaN    NaN
19 0.4   NaN    NaN
20 0.5   NaN    NaN
21 0.6   2.0    NaN
22 0.7   NaN    NaN
23 1.1   NaN    NaN

From this DataFrame I want to update the point values. The condition is: when x or y is not NaN, the point value in the immediately following row should be replaced by the current row's point value, and the point values after that should be renumbered so that the cycle (.1 to .6) continues from there. For example, at row index 2, point = 0.3 and x = 5.0, so the next point value should also be 0.3 instead of 0.4. Then at row index 4, point = 0.5 becomes 0.4, and the same rule keeps applying to the rows that follow.

OUTPUT I want:

  point    x      y
0  0.1   NaN    NaN
1  0.2   NaN    NaN
2  0.3   5.0    NaN
3  0.3   NaN    NaN
4  0.4   NaN    1.0
5  0.4   NaN    NaN
6  0.5   1.0    1.0
7  0.5   NaN    NaN
8  0.6   NaN    NaN
9  1.1   NaN    NaN
10 1.2   NaN    NaN
11 1.3   NaN    NaN
12 1.4   NaN    2.0
13 1.4   NaN    NaN
14 1.5   NaN    NaN
15 1.6   NaN    NaN
16 0.1   NaN    NaN
17 0.2   NaN    NaN
18 0.3   NaN    NaN
19 0.4   NaN    NaN
20 0.5   NaN    NaN
21 0.6   2.0    NaN
22 0.6   NaN    NaN
23 1.1   NaN    NaN

Code I tried:

import pandas as pd
df = pd.read_csv("data.csv")
df['point'] = df.groupby() #Don't know how should I approach

For example, the cycles are 0.1-0.6, 1.1-1.6, 2.1-2.6 and so on. Within a .1-.6 cycle any value might appear several times consecutively, but the order must be preserved, e.g. 0.1 0.1 0.2 0.3 0.4 0.4 0.5 0.5 0.6 1.1 ...
– python noob, Commented Jul 9, 2021 at 14:30

Corralien · Accepted Answer · 2021-07-09 20:12:21Z

1

Can you try this:

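# True on the row right after any row where x or y holds a value
mask = df[['x', 'y']].notna().any(axis=1).shift(1, fill_value=False)
# Integer part of point identifies the cycle (0 for 0.x, 1 for 1.x, ...)
point = df['point'].astype(int)
# New group id each time the cycle changes
group = point.sub(point.shift(1)).ne(0).cumsum()

# Subtract 0.1 for every flagged row seen so far within the same cycle
df['point'] = df['point'].sub(mask.groupby(group).cumsum().div(10)).round(1)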

>>> df
    point    x    y
0     0.1  NaN  NaN
1     0.2  NaN  NaN
2     0.3  5.0  NaN
3     0.3  NaN  NaN
4     0.4  NaN  1.0
5     0.4  NaN  NaN
6     0.5  1.0  1.0
7     0.5  NaN  NaN
8     0.6  NaN  NaN
9     1.1  NaN  NaN
10    1.2  NaN  NaN
11    1.3  NaN  NaN
12    1.4  NaN  2.0
13    1.4  NaN  NaN
14    1.5  NaN  NaN
15    1.6  NaN  NaN
16    0.1  NaN  NaN
17    0.2  NaN  NaN
18    0.3  NaN  NaN
19    0.4  NaN  NaN
20    0.5  NaN  NaN
21    0.6  2.0  NaN
22    0.6  NaN  NaN
23    1.1  NaN  NaN
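
One way to see what each intermediate Series in the snippet above holds is to print them side by side. The sketch below is illustrative only: it rebuilds rows 0-9 of the question's sample by hand, and the names cycle, offset and steps are introduced here rather than taken from the answer.

```python
import numpy as np
import pandas as pd

# Rows 0-9 of the question's sample data, rebuilt by hand
df = pd.DataFrame({"point": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.1]})
df["x"] = np.nan
df["y"] = np.nan
df.loc[[2, 6], "x"] = [5.0, 1.0]
df.loc[[4, 6], "y"] = [1.0, 1.0]

# Flag the row that follows any row where x or y is set
mask = df[["x", "y"]].notna().any(axis=1).shift(1, fill_value=False)
# Integer part of point labels the cycle; a change starts a new group
cycle = df["point"].astype(int)
group = cycle.ne(cycle.shift()).cumsum()
# Running count of flagged rows, restarted in every group
offset = mask.astype(int).groupby(group).cumsum()

# Collect the intermediate values next to the original and corrected points
steps = pd.DataFrame({
    "point": df["point"],
    "mask": mask,
    "group": group,
    "offset": offset,
    "new_point": (df["point"] - offset / 10).round(1),
})
print(steps)
```

The offset column grows by one on each flagged row (rows 3, 5 and 7 here) and drops back to zero at row 9, where the 1.x cycle starts. Dividing it by 10 and subtracting it from point is what turns 0.4 into 0.3 at row 3 and 0.9 into 0.6 at row 8.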

edited Jul 9, 2021 at 20:12

answered Jul 9, 2021 at 11:25

Corralien


1 Comment

python noob Over a year ago

Thanks for this, but it still didn't solve my problem. The point values should also be renumbered. For instance, at row index 4, point should be 0.4 instead of 0.5. There is a cycle from _.1 to _.6 that must be followed. Please take a look at my sample output.

Shaig Hamzaliyev · Accepted Answer · 2021-07-09 13:52:09Z

0

I tried something out. Since I don't have your data, I first created a small DataFrame of my own. I tried to follow your description (it was a little confusing for me as a non-native speaker). The code below is not very generic, but it should work for your case, and I think you can solve the problem with this idea.

import numpy as np
import pandas as pd
df = np.zeros((8, 3))
f = np.random.randint(8, size=8)

df[:, 0] = f
df[:, 1:] = np.nan
df[1, 1] = 5
df[3, 1:] = 4

df = pd.DataFrame(df)
print(df)

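# Stop one row early so that row i + 1 always exists
for i in range(len(df) - 1):
    # Row i holds a value in at least one of the non-point columns
    if df.iloc[i, 1:].notna().any():
        print(i)
        # Copy this row's point into the next row (avoids chained assignment)
        df.iloc[i + 1, 0] = df.iloc[i, 0]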
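
Since the title asks for a loop, here is a minimal row-by-row sketch that also renumbers the rows after each repeat. It assumes that the integer part of the original point identifies the cycle and that the lag resets whenever a new cycle begins. The names renumber_points, has_value and lag are hypothetical, and the data is rows 9-23 of the question's sample, rebuilt by hand.

```python
import numpy as np
import pandas as pd

def renumber_points(df, cols=("x", "y")):
    """Return a copy of df in which 'point' repeats once after each flagged row."""
    out = df.copy()
    # True where at least one of the watched columns holds a value
    has_value = out[list(cols)].notna().any(axis=1).to_numpy()
    points = out["point"].to_numpy()
    new_points = points.copy()
    lag = 0            # number of 0.1 steps the current cycle is held back
    prev_cycle = None
    for i in range(len(out)):
        cycle = int(points[i])      # 0 for 0.x, 1 for 1.x, ...
        if cycle != prev_cycle:     # a new cycle starts: forget the lag
            lag = 0
            prev_cycle = cycle
        elif has_value[i - 1]:      # previous row was flagged: repeat its point
            lag += 1
        new_points[i] = round(points[i] - lag / 10, 1)
    out["point"] = new_points
    return out

# Rows 9-23 of the question's sample (assumed values, typed in by hand)
sample = pd.DataFrame({
    "point": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7,
              0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1.1],
})
sample["x"] = np.nan
sample["y"] = np.nan
sample.loc[3, "y"] = 2.0    # original row 12
sample.loc[12, "x"] = 2.0   # original row 21

result = renumber_points(sample)
print(result)
```

A plain loop is slower than a vectorized groupby on large frames, but the rule is easier to adjust here, for example if the lag should carry over into the next cycle instead of resetting.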

answered Jul 9, 2021 at 13:52

Shaig Hamzaliyev


2 Comments

python noob Over a year ago

Sorry, I can't relate this code to my problem.

Shaig Hamzaliyev Over a year ago

Until the loop, the code just creates a DataFrame, since I do not have your data. The loop implements your idea.
