I am trying to write a condition check for tagging technical terms. I have used a dictionary for lookup and fuzzy matching. My dataframe looks something like this:
Word         Entity  Score  NER_Tag  technology  similarity
Stonetrust   CRR     0.90   MISC     xxx         90
Wilkes       CRR     0.80   ORG      xxx         60
linux        xxx     0.70   LOC      xxx         70
SILVER INC   xxx     0.88   PER      xxx         80
PO BOX 988   xxx     0.99   MISC     xxx         70
LA 70520     xxx     0.67   PER      xxx         50
02/12/2019   xxx     0.23   MISC     xxx         100
I need to check the conditions below and create a new column with the final tags:
- if similarity = 100, then final_tag = TECH
- if NER_Tag = MISC and similarity >= 95, then final_tag = TECH
To do this I wrote the code below:
filter1 = df1['similarity'] == 100
filter2 = (df1['NER_Tag'] == 'MISC') & (df1['similarity'] >= 95)
df1['Final_NER'] = np.where(filter1, filter2, 'TECH', df1['NER_Tag'])
I am not getting the correct output; instead I get the error below:
TypeError: where() takes from 1 to 3 positional arguments but 4 were given
Is there a better way of writing this logic?
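For reference, np.where accepts a single condition followed by the two value arguments, which is why passing two filters raises the TypeError. A minimal sketch of one possible approach, assuming the two rules are meant to be combined with OR and that rows matching neither rule keep their NER_Tag (the sample frame below is rebuilt from the table above, with the Entity, Score and technology columns left out):

```python
import numpy as np
import pandas as pd

# Sample rows modelled on the table in the question (only the columns the rules need).
df1 = pd.DataFrame({
    "Word": ["Stonetrust", "Wilkes", "linux", "SILVER INC",
             "PO BOX 988", "LA 70520", "02/12/2019"],
    "NER_Tag": ["MISC", "ORG", "LOC", "PER", "MISC", "PER", "MISC"],
    "similarity": [90, 60, 70, 80, 70, 50, 100],
})

filter1 = df1["similarity"] == 100
filter2 = (df1["NER_Tag"] == "MISC") & (df1["similarity"] >= 95)

# np.where(condition, x, y) takes one condition, so the two masks are OR-ed first.
# Assumption: rows matching neither rule fall back to their existing NER_Tag.
df1["Final_NER"] = np.where(filter1 | filter2, "TECH", df1["NER_Tag"])
```

If more tags with different outputs are added later, np.select with a list of conditions and a list of choices may be easier to extend than nested np.where calls.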