
I am trying to scrape the text of an H2 tag whose heading starts with "benefits of", so it could be "benefits of toys", "benefits of cups", etc.

The html code is:

<h2 class="DrugOverview__title___1OwgG">Benefits of Toys</h2>

The code I've tried so far is:

        benefit = soup.find('h2',text='Benefits of')
        q = benefit.get_text(strip=True)

How do I solve this? Also keep in mind that the h2 class can't be used for scraping in this situation (due to other issues).

1 Answer

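You can use a regular expression to match any string that contains specific text. Passing text='Benefits of' as a plain string only matches a heading whose text is exactly "Benefits of", which is why the original find() returns None.

In the code below, strs holds the input HTML: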

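import re
from bs4 import BeautifulSoup

strs = '<h2 class="DrugOverview__title___1OwgG">Benefits of Toys</h2><h2 class="DrugOverview__title___1OwgG">Benefits of kids</h2>'
soup = BeautifulSoup(strs, 'html.parser')
# Match any text node that contains "Benefits of"
pattern = re.compile(r'Benefits of')
# string= is the newer name for the text= argument (bs4 4.4.0 and later)
benefit = soup.find_all(string=pattern)
print(benefit)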

Output:

['Benefits of Toys', 'Benefits of kids']
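
Since the question asks for H2 headings that start with "benefits of" in any capitalisation, a slightly stricter variant may be useful. The following is a minimal sketch, not a definitive solution: the sample HTML, the extra "Side effects" heading and the paragraph are made up for illustration, and it assumes each heading contains only plain text (a heading with nested tags has no single string and would be skipped).

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample input: one matching heading, one non-matching heading,
# and a paragraph that contains the phrase but is not an <h2>.
html = (
    '<h2 class="DrugOverview__title___1OwgG">Benefits of Toys</h2>'
    '<h2 class="DrugOverview__title___1OwgG">Side effects of Toys</h2>'
    '<p>Benefits of reading this paragraph</p>'
)
soup = BeautifulSoup(html, 'html.parser')

# Anchor the pattern at the start of the text and ignore case.
starts_with_benefits = re.compile(r'^\s*benefits of', re.IGNORECASE)

# Giving the tag name restricts the search to <h2>; the result is a list of tags.
headings = soup.find_all('h2', string=starts_with_benefits)

# Extract the cleaned text from each matching tag.
titles = [h.get_text(strip=True) for h in headings]
print(titles)
```

Because the search is limited to h2 by tag name, the class attribute is never needed, which fits the constraint in the question.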