I have a pandas DataFrame with a time index and want to normalize each value in a column by the maximum value observed up to that date and time.

import pandas as pd

# An example input df
rng = pd.date_range('2020-01-01', periods=8)
a_lst = [2, 4, 3, 8, 2, 4, 10, 2]
df = pd.DataFrame({'date': rng, 'A': a_lst})
df.set_index('date', inplace=True, drop=True)

(A possible solution is to iterate over the rows, subset the rows up to the current one, and divide the current value by the maximum of that subset, but that would be inefficient.)
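
For reference, the row-by-row approach described above might look roughly like the sketch below. It assumes the index is sorted by date, and the column name `A_loop` is made up for illustration. It rescans all earlier rows at every step, so the work grows quadratically with the number of rows.

```python
import pandas as pd

rng = pd.date_range('2020-01-01', periods=8)
df = pd.DataFrame({'A': [2, 4, 3, 8, 2, 4, 10, 2]}, index=rng)

# For each timestamp, take the max over all rows up to and including it.
# Label-based slicing with .loc includes the end label.
normalized = []
for ts in df.index:
    running_max = df.loc[:ts, 'A'].max()
    normalized.append(df.at[ts, 'A'] / running_max)

# 'A_loop' is a hypothetical column name used only in this sketch.
df['A_loop'] = normalized
```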

1 Answer


You are looking for cummax, the cumulative maximum:

df['A_normalized'] = df['A']/df['A'].cummax()

Output:

             A  A_normalized
date                        
2020-01-01   2          1.00
2020-01-02   4          1.00
2020-01-03   3          0.75
2020-01-04   8          1.00
2020-01-05   2          0.25
2020-01-06   4          0.50
2020-01-07  10          1.00
2020-01-08   2          0.20
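
Note that cummax follows row order, not the index values, so "observed to that date" only holds if the rows are in chronological order. A minimal sketch, assuming the frame might arrive unsorted (the `shuffled`, `ordered`, and `alt` names are made up for illustration), sorts the index first and also shows `expanding().max()` as an equivalent way to get the running maximum:

```python
import pandas as pd

# Same values as in the question, with rows deliberately out of date order.
rng = pd.date_range('2020-01-01', periods=8)
df = pd.DataFrame({'A': [2, 4, 3, 8, 2, 4, 10, 2]}, index=rng)
shuffled = df.iloc[[3, 0, 7, 1, 6, 2, 5, 4]]

# Sort chronologically so the cumulative max only looks at earlier dates.
ordered = shuffled.sort_index()
ordered['A_normalized'] = ordered['A'] / ordered['A'].cummax()

# An expanding window's max is the same running maximum.
alt = ordered['A'] / ordered['A'].expanding().max()
```

If the running maximum can be zero, the division may yield inf or NaN, so those rows may need separate handling.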