0

I have been using pandas library for data manipulation. And I am stuck somewhere while doing the below calculation.

I have below table in my excel file which contains two columns and I need to create third column (cap). I need to do this excel calculation of third column in my program using python pandas.

First I will create the below dataframe DF which contains two columns and I need to create third column in df using Excel formulas in python.

Data frame (DF):-

   Period       rates     
  01-01-2021   0.0028    
  01-02-2021   0.0082   
  01-03-2021   0.0020   
  01-04-2021   0.0043    
  01-05-2021   0.0066   

Excel Table:-

     A             B         C
1   Period       rates     cap 
2   01-01-2021   0.0028    =if(month(A2)=04,1,(1+$B3)*C3)
3   01-02-2021   0.0082    =if(month(A3)=04,1,(1+$B4)*C4)
4   01-03-2021   0.0020    =if(month(A4)=04,1,(1+$B5)*C5)
5   01-04-2021   0.0043    =if(month(A5)=04,1,(1+$B6)*C6)
6   01-05-2021   0.0066    =if(month(A6)=04,1,(1+$B7)*C7)


I have just created the third column (cap) to understand the formula. And I need to do this in my python program.

6
  • 1
    This post should answer your question stackoverflow.com/questions/40353519/… Commented Aug 2, 2021 at 17:06
  • I would recommend using the 'apply' function from pandas. [pandas.pydata.org/pandas-docs/stable/reference/api/… doc link) Commented Aug 2, 2021 at 17:08
  • I think we will have to use something like cumulative function to do this because it uses the below cell value to get current cell value. Commented Aug 2, 2021 at 17:11
  • And we will have to reverse this column because it uses the down cell value. If it had normal formula then I would have definitely done this using python pandas. But it's not that easy I think. Commented Aug 2, 2021 at 17:14
  • In cell C2 it uses the C3 value. Please see the excel formula. Commented Aug 2, 2021 at 17:16

1 Answer 1

2

Assuming the "period" column in dataframe has already been converted to the datetime object, then by simply defining custom function and using df.apply() would most likely do your job.

Example: (please also change the custom function correctly since I did not include multiply by the cap value below term)

import pandas as pd

df = pd.DataFrame({
        'period': ['01-01-2021', '01-02-2021', '01-03-2021', '01-04-2021', '01-05-2021'],
        'rates': [0.0028, 0.0082, 0.0020, 0.0043, 0.0066]
    })

def cap_criteria(row):
    if row['period'].month == 4:
        return 1
    else:
        return (1 + row['rates'])

df['cap'] = df.apply(cap_criteria, axis=1)

print(df)

Output:

      period   rates     cap
0 2021-01-01  0.0028  1.0028
1 2021-02-01  0.0082  1.0082
2 2021-03-01  0.0020  1.0020
3 2021-04-01  0.0043  1.0000
4 2021-05-01  0.0066  1.0066

If the "period" column is in string format, you can converted it by:

df['period'] = pd.to_datetime(df['period'], format='%d-%m-%Y')

or

df['period'] = pd.to_datetime(df['period'], format='%m-%d-%Y')

depending on which date convention you are using.

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks. But could you please explain the else part in your program. Actually in my given excel formula I am using (1+$B3)*C3) and I think it should all be in else part of your program but there is only 1 + row['rates'].
And second thing in my cell C2 I am using down cell value which is C3.
Good one SayYoungMan +1.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.