Filtering pandas results using where clause but getting error

Question

I am trying to write a condition check for tagging technical terms. I use a dictionary for the lookup and do a fuzzy match against it. My dataframe looks something like this:

Word         Entity  Score  NER_Tag  technology  similarity
Stonetrust   CRR     0.90   MISC     xxx         90
Wilkes       CRR     0.80   ORG      xxx         60
linux        xxx     0.70   LOC      xxx         70
SILVER  INC  xxx     0.88   PER      xxx         80
PO BOX 988   xxx     0.99   MISC     xxx         70
LA 70520     xxx     0.67   PER      xxx         50
02/12/2019   xxx     0.23   MISC     xxx         100

I need to check the conditions below and create a new column with the final tags:

if similarity score = 100 then final_tag = TECH
if NER_Tag = MISC and similarity score >= 95 then final_tag = TECH

To do this, I wrote the code below:

filter1 = df1['similarity'] == 100
filter2 = (df1['NER_Tag'] == 'MISC') & (df1['similarity'] >= 95)

df1['Final_NER']  = np.where(filter1, filter2, 'TECH', df1['NER_Tag'])

I am not getting the correct output; instead I get the error below:

TypeError: where() takes from 1 to 3 positional arguments but 4 were given

Is there a better way of writing this logic?

jezrael · Accepted Answer · 2021-06-28 08:52:28Z

You are close. Use numpy.select if you want to pass multiple values for multiple conditions:

df1['Final_NER'] = np.select([filter1, filter2], ['TECH', 'TECH'], default=df1['NER_Tag'])

Or use | for a bitwise OR between both conditions, which is simpler here:

df1['Final_NER']  = np.where(filter1 | filter2, 'TECH', df1['NER_Tag'])
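
For reference, here is a minimal self-contained sketch of the combined-condition approach, run against the sample rows from the question. Only the columns the logic needs are rebuilt, and the variable names via_select and via_where are made up for illustration. The original error came from passing the two filters as separate positional arguments; np.where expects a single condition followed by the two alternatives.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (only the columns the logic uses).
df1 = pd.DataFrame({
    "Word": ["Stonetrust", "Wilkes", "linux", "SILVER  INC",
             "PO BOX 988", "LA 70520", "02/12/2019"],
    "NER_Tag": ["MISC", "ORG", "LOC", "PER", "MISC", "PER", "MISC"],
    "similarity": [90, 60, 70, 80, 70, 50, 100],
})

filter1 = df1["similarity"] == 100
filter2 = (df1["NER_Tag"] == "MISC") & (df1["similarity"] >= 95)

# np.select: the first matching condition wins; rows matching none keep NER_Tag.
via_select = np.select([filter1, filter2], ["TECH", "TECH"],
                       default=df1["NER_Tag"])

# np.where: one combined boolean condition, then the two alternatives.
via_where = np.where(filter1 | filter2, "TECH", df1["NER_Tag"])

df1["Final_NER"] = via_where
print(df1)
```

With this sample data only the last row (similarity 100) is retagged as TECH; every other row keeps its NER_Tag. Both variants should agree whenever every condition maps to the same value, which is why the | form is enough here.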
