0

I am trying to filter rows in dataframe by multiple strings and I have searched and found this

search_values = ['vba','google']
df[df[0].str.contains('|'.join(search_values), case=False)]

But this I think based on finding either of the two strings vba or google. How can I join the both strings to be used AND not OR. I mean that the filter should be done if both the strings are there in the column so if this sentence for example I mean vba will be with google in one sentence. This row would be selected because it has both vba and google

1 Answer 1

1

The contains function uses regex to find the rows that match the string. Using | is like OR in regex. If you want rows that contain both, this should work:

search_values = ['vba','google']
df[df[0].str.contains(f'{search_values[0]}.*{search_values[1]}|{search_values[1]}.*{search_values[0]}', case=False)]

The .* part means that any string can be found between the searched terms. So we try to find rows with 'vba', some string and then 'google', or rows with 'google', some string and then 'vba'.

Sign up to request clarification or add additional context in comments.

3 Comments

Amazing. Thanks a lot. How can I add more terms to the list search_values? Is there an easy way?
There are some other ways to do this that might be easier while searching for multiple terms. You can look at this question for example: stackoverflow.com/questions/4389644/…
Thanks a lot. I am lost in the link you posted.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.