Using Pandas to assign specific values

Question

I have the following dataframe:

data = {'id': [1, 2, 3, 4, 5, 6, 7, 8],
        'stat': ['ordered', 'unconfirmed', 'ordered', 'unknwon', 'ordered', 'unconfirmed', 'ordered', 'back'],
        'date': ['2021', '2022', '2023', '2024', '2025','2026','2027', '1990']    
}
df = pd.DataFrame(data)
df

I am trying to get the following data frame:

Unfortunate I am not successful so far and I used the following commands (for loops) for only stat==ordered:

y0 = np.zeros((len(df), 8), dtype=int)
y1 = [1990]
if stat=='ordered':
    for i in df['id']:
        for j in y1:
            if df.loc[i].at['date'] in y1:
                  y0[i][y1.index(j)] = 1
            else:
                  y0[i][y1.index(j)] = 0

But unfortunately it did not returned the expected solution and beside that it takes a very long time to do the calculation. I tried to use gruopby, but it could not fgure out either how to use it perporly since it is faster than using for loops. Any idea would be very appreiciated.

I do not understand how you arrive at those numbers in those columns. Does that not matter and you just want those values and are asking how to do it? — piRSquared
– piRSquared, Commented Apr 14, 2022 at 15:54
I did it manuely, but in case my data column is almost has 30k entries. The solution below would not be good enough to cover the issue. I tried to solve it ,but I was not able to fix it… any idea about it? Thank you. — Adam
– Adam, Commented Apr 19, 2022 at 9:17

piRSquared · Accepted Answer · 2022-04-14 15:57:58Z

IIUC:

df.join(
    pd.get_dummies(df.date).cumsum(axis=1).mul(
        [1, 2, 1, 3, 1, 2, 1, 0], axis=0
    ).astype(int)

)

   id         stat  date  1990  2021  2022  2023  2024  2025  2026  2027
0   1      ordered  2021     0     1     1     1     1     1     1     1
1   2  unconfirmed  2022     0     0     2     2     2     2     2     2
2   3      ordered  2023     0     0     0     1     1     1     1     1
3   4      unknwon  2024     0     0     0     0     3     3     3     3
4   5      ordered  2025     0     0     0     0     0     1     1     1
5   6  unconfirmed  2026     0     0     0     0     0     0     2     2
6   7      ordered  2027     0     0     0     0     0     0     0     1
7   8         back  1990     0     0     0     0     0     0     0     0

Collectives™ on Stack Overflow

Using Pandas to assign specific values

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related