Hello, I am working with two dataframes and need to apply a custom function, but I'm getting the following error: ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0'). I think I understand why this happens, but I don't know how to fix it.
The first dataframe contains a list of all workable days for the current year:
print(df_workable)
           Date  workable_day  inv_workable_day  day  month
1    2019-01-02           1.0              22.0    2      1
2    2019-01-03           2.0              21.0    3      1
3    2019-01-04           3.0              20.0    4      1
6    2019-01-07           4.0              19.0    7      1
7    2019-01-08           5.0              18.0    8      1
..          ...           ...               ...  ...    ...
364  2019-12-31          20.0               1.0   31     12
The second dataframe contains some day values and a flag:
print(df)
       day_a1  wday_a1  iwday_a1       flag
0        24.0      4.0       6.0        2.1
1         NaN      NaN       NaN        NaN
3        31.0     22.0       1.0        2.2
4        27.0     18.0       5.0  3.3.2.1.3
26816    25.0     19.0       5.0          1
26817    31.0      NaN       NaN        3.2
I'm trying to apply a function that returns a date from either dataframe depending on multiple conditions (for simplicity, it returns the placeholder strings "this" and "that" here). This is the function:
def rec_date(row):
    if row['flag'] == '2.1':
        if df_workable[df_workable['workable_day'] == int(row['wday_a1']) & df_workable['month'] == 1]['day'] <= dt.datetime.today().day:
            val = "this"
        else:
            val = "that"
    else:
        val = "Still missing"
    return val
The issue arises with condition 2.1: for each row of df, I need to check a condition against df_workable. As I understand it, the error occurs because the filter on df_workable produces a Series rather than a single value, so the if statement doesn't know which element to test and pandas asks for a reduction (.all(), .any(), etc.). However, I don't want to iterate over or reduce df_workable; I simply want to extract the single value corresponding to:
df_workable[df_workable['workable_day'] == 4 & df_workable['month'] == 1]['day']
(I'm hard-coding 4 because it is the first value that would be passed from df['wday_a1'].) The output of that lookup should be 7, and comparing that value to dt.datetime.today().day, which is 10 today, should return True. I've tested both pieces individually and they return the expected output. The problem only appears when I apply the function over the dataframe, for (I believe) the reasons explained above.
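For reference, here is a minimal sketch of that single-value lookup. It assumes a small stand-in for df_workable built from the January rows shown above, wraps each comparison in parentheses (in Python, & binds more tightly than ==), and uses a hypothetical today_day = 10 in place of dt.datetime.today().day so the result is reproducible. The names mask, matches, day_value and today_day are illustrative only.

```python
import pandas as pd

# Stand-in for df_workable, using only the January rows printed above.
df_workable = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07", "2019-01-08"]
        ),
        "workable_day": [1.0, 2.0, 3.0, 4.0, 5.0],
        "inv_workable_day": [22.0, 21.0, 20.0, 19.0, 18.0],
        "day": [2, 3, 4, 7, 8],
        "month": [1, 1, 1, 1, 1],
    },
    index=[1, 2, 3, 6, 7],
)

# Each comparison is parenthesised so that & combines two boolean Series.
mask = (df_workable["workable_day"] == 4) & (df_workable["month"] == 1)

# The filtered column is still a Series, even when only one row matches.
matches = df_workable.loc[mask, "day"]

# Extract a plain Python scalar before using it in a comparison or an `if`.
day_value = int(matches.iloc[0])

today_day = 10  # hypothetical stand-in for dt.datetime.today().day
result = day_value <= today_day

print(day_value, result)
```

Comparing the scalar day_value, rather than the filtered Series, is what keeps the truth value unambiguous.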
After applying the function, I expect to get this:
df['rec_date'] = df.apply(rec_date,axis=1)
       day_a1  wday_a1  iwday_a1       flag       rec_date
0        24.0      4.0       6.0        2.1           this
1         NaN      NaN       NaN        NaN  Still missing
3        31.0     22.0       1.0        2.2  Still missing
4        27.0     18.0       5.0  3.3.2.1.3  Still missing
26816    25.0     19.0       5.0          1  Still missing
26817    31.0      NaN       NaN        3.2  Still missing
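One possible way to get this output is sketched below. This is not a definitive fix: it assumes flag is stored as strings, that at most one row of df_workable matches a given workable day in January, and it adds a hypothetical today_day parameter (fixed at 10 here) instead of calling dt.datetime.today().day, so the example is reproducible. The sample frames are small stand-ins built from the rows shown above.

```python
import numpy as np
import pandas as pd

# Stand-ins for the two dataframes, built from the rows printed above.
df_workable = pd.DataFrame(
    {
        "workable_day": [1.0, 2.0, 3.0, 4.0, 5.0],
        "day": [2, 3, 4, 7, 8],
        "month": [1, 1, 1, 1, 1],
    },
    index=[1, 2, 3, 6, 7],
)

df = pd.DataFrame(
    {
        "day_a1": [24.0, np.nan, 31.0, 27.0, 25.0, 31.0],
        "wday_a1": [4.0, np.nan, 22.0, 18.0, 19.0, np.nan],
        "iwday_a1": [6.0, np.nan, 1.0, 5.0, 5.0, np.nan],
        "flag": ["2.1", np.nan, "2.2", "3.3.2.1.3", "1", "3.2"],
    },
    index=[0, 1, 3, 4, 26816, 26817],
)


def rec_date(row, today_day):
    """Return a placeholder label for one row of df (today_day is hypothetical)."""
    if row["flag"] == "2.1":
        # Parenthesise each comparison so & combines two boolean Series.
        mask = (df_workable["workable_day"] == int(row["wday_a1"])) & (
            df_workable["month"] == 1
        )
        matches = df_workable.loc[mask, "day"]
        # Compare a single scalar, not the whole Series, to avoid the ValueError.
        if not matches.empty and matches.iloc[0] <= today_day:
            return "this"
        return "that"
    return "Still missing"


# Extra positional arguments for the function are passed through `args`.
df["rec_date"] = df.apply(rec_date, axis=1, args=(10,))
print(df)
```

With these stand-in frames, only the first row has flag '2.1', so it is the only row that reaches the lookup; rows with no match in df_workable would fall through to "that" under this sketch.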