I have a dataset with several columns. It looks like this:
ID = [97,97,97,19,19,33,33,33,33,33,11]
DATE = [2018-09,2018-09,2020-02,2021-01,2021-01,2017-04,2017-04,2017-04,2019-09,2019-09,2022-05]
Destination_1 = [BRAZIL,BRAZIL,BRAZIL,ARGENTINA,MOROCCO,INDONESIA,USA,BRAZIL,EGYPT,LEBANON,USA]
Destination_2 = [BRAZIL,URUGUAY,SINGAPORE,VENEZUELA,THAILAND,PERU,ECUADOR,USA,ALGERIA,EGYPT,CANADA]
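For anyone who wants to reproduce this, the columns above can be loaded into pandas roughly as follows. This is only a sketch: storing DATE as plain "YYYY-MM" strings is an assumption, since the actual dtype of that column isn't shown.

```python
import pandas as pd

# Sample data copied from the lists above.
# Assumption: DATE is stored as "YYYY-MM" strings, not as datetimes.
df = pd.DataFrame({
    "ID": [97, 97, 97, 19, 19, 33, 33, 33, 33, 33, 11],
    "DATE": ["2018-09", "2018-09", "2020-02", "2021-01", "2021-01",
             "2017-04", "2017-04", "2017-04", "2019-09", "2019-09", "2022-05"],
    "Destination_1": ["BRAZIL", "BRAZIL", "BRAZIL", "ARGENTINA", "MOROCCO",
                      "INDONESIA", "USA", "BRAZIL", "EGYPT", "LEBANON", "USA"],
    "Destination_2": ["BRAZIL", "URUGUAY", "SINGAPORE", "VENEZUELA", "THAILAND",
                      "PERU", "ECUADOR", "USA", "ALGERIA", "EGYPT", "CANADA"],
})

print(df)
```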
I would like to create a new column OUTPUT based on the following condition:
for each ID, if at the same DATE at least one Destination_1 value equals a Destination_2 value, then the output should be TRUE, otherwise FALSE.
The desired result for the OUTPUT column should then be:
OUTPUT = [TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE]
This means that if, on the same DATE, an ID has at least one destination in common between Destination_1 and Destination_2, the condition is satisfied and the output is TRUE.
How can I do this? So far I have tried:
for sublist in df["ID"]:
    if (df["Destination_1"] == df["Destination_2"]).any():
        print("True")
    else:
        print("False")
but this doesn't work for me, and I don't understand how to add the second condition on DATE without getting an error.
Could someone help me please?
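One possible reading of the condition, offered as a minimal sketch rather than a definitive answer: group the rows by ID and DATE, check whether any Destination_1 value in the group also appears among that group's Destination_2 values, and give every row of the group the same flag. The helper name `has_common_destination` and the string format of DATE are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question (DATE as "YYYY-MM" strings is an assumption).
df = pd.DataFrame({
    "ID": [97, 97, 97, 19, 19, 33, 33, 33, 33, 33, 11],
    "DATE": ["2018-09", "2018-09", "2020-02", "2021-01", "2021-01",
             "2017-04", "2017-04", "2017-04", "2019-09", "2019-09", "2022-05"],
    "Destination_1": ["BRAZIL", "BRAZIL", "BRAZIL", "ARGENTINA", "MOROCCO",
                      "INDONESIA", "USA", "BRAZIL", "EGYPT", "LEBANON", "USA"],
    "Destination_2": ["BRAZIL", "URUGUAY", "SINGAPORE", "VENEZUELA", "THAILAND",
                      "PERU", "ECUADOR", "USA", "ALGERIA", "EGYPT", "CANADA"],
})


def has_common_destination(group: pd.DataFrame) -> bool:
    """Hypothetical helper: True if the two destination columns share a value."""
    return bool(set(group["Destination_1"]) & set(group["Destination_2"]))


# Compute one flag per (ID, DATE) group.
flags = {key: has_common_destination(group)
         for key, group in df.groupby(["ID", "DATE"])}

# Give every row the flag of the group it belongs to.
df["OUTPUT"] = [flags[(row_id, date)] for row_id, date in zip(df["ID"], df["DATE"])]

print(df)
```

Under this group-level reading, all three rows for ID 33 on 2017-04 come out TRUE, because USA appears in both columns within that group. The desired OUTPUT listed in the question has FALSE for the first two of those rows, so the intended rule may be stricter than this sketch assumes.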