Python: Create a new column based on different conditions

Question

I have a dataset with different columns. It looks like that:

ID = [97,97,97,19,19,33,33,33,33,33,11] 
DATE = [2018-09,2018-09,2020-02,2021-01,2021-01,2017-04,2017-04,2017-04,2019-09,2019-09,2022-05] 
Destination_1 = [BRAZIL,BRAZIL,BRAZIL,ARGENTINA,MOROCCO,INDONESIA,USA,BRAZIL,EGYPT,LEBANON,USA] 
Destination_2 = [BRAZIL,URUGUAY,SINGAPORE,VENEZUELA,THAILAND,PERU,ECUADOR,USA,ALGERIA, EGYPT,CANADA]

I would like to create a new column OUTPUT based on the following condition:

for each ID if at the same DATE at least one of the DESTINATION_1 equals DESTINATION_2 then the output sould be TRUE otherwise FALSE.

The desired result for the OUTPUT column should then be:

OUTPUT=[TRUE, TRUE, FALSE , FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE]

meaning that if at the same DATE at least one destination is common to each ID then the condition is satisfied and the output is TRUE.

How could I do so? So far I tried:

   for sublist in df["ID"]:
  if (df["Destination_1"] == df["Destination_2"]).any():
    print("True")
  else :
    print("False")

but this doesn't work for me and I do not get how to add the second condition concerning the DATE without getting an error.

Could someone help me please?

Ynjxsjmh · Accepted Answer · 2022-05-11 16:46:14Z

You can try groupby and use isin to check value existence then assign

out = (df.groupby(['ID', 'DATE'])
       .apply(lambda g: g.assign(OUTPUT=[g['Destination_1'].isin(g['Destination_2']).any()]*len(g))))

print(out)

    ID     DATE Destination_1 Destination_2  OUTPUT
0   97  2018-09        BRAZIL        BRAZIL    True
1   97  2018-09        BRAZIL       URUGUAY    True
2   97  2020-02        BRAZIL     SINGAPORE   False
3   19  2021-01     ARGENTINA     VENEZUELA   False
4   19  2021-01       MOROCCO      THAILAND   False
5   33  2017-04     INDONESIA          PERU    True
6   33  2017-04           USA       ECUADOR    True
7   33  2017-04        BRAZIL           USA    True
8   33  2019-09         EGYPT       ALGERIA    True
9   33  2019-09       LEBANON         EGYPT    True
10  11  2022-05           USA        CANADA   False

thanks for your help! It works perfectly, can I ask you where to find good documentation concerning the lambda function please?
@Cai I would refer to How to Use Python Lambda Functions – Real Python, but I don't why it needs to be registered to read now. Anyway, lambda function isn't really a much complicated concept any tutorial you find on the web I think is enough.

Collectives™ on Stack Overflow

Python: Create a new column based on different conditions

1 Answer 1

2 Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Related