Python/Selenium - how to loop through hrefs in <li>?

Question

Web URL: https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times

I want to parse the HTML of the page above.

I want to get all hrefs within the <li> elements, along with the highlighted text. I tried this code:

elementList = driver.find_element_by_class_name('block-wysiwyg').find_elements_by_tag_name("li")
for i in range(len(elementList)):
    driver.find_element_by_class_name('blcokwysiwyg').find_elements_by_tag_name("li").get_attribute("href")

But the block returned none.

Can anyone please help me with the above code?

SIM · Accepted Answer · 2020-04-16 01:46:51Z

2

I suppose the following will fetch you the required content.

import requests
from bs4 import BeautifulSoup

link = 'https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times'

r = requests.get(link)
soup = BeautifulSoup(r.text,"html.parser")
for item in soup.select(".block-wysiwyg li"):
    item_text = item.get_text(strip=True)
    item_link = item.select_one("a[href]").get("href")
    print(item_text,item_link)
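
One caveat with the loop above: item.select_one("a[href]") returns None for any <li> that has no link, and calling .get on None raises AttributeError. The raw href may also be relative, since requests does not resolve it the way a browser does. A minimal sketch of a guard, assuming a hypothetical helper name safe_href (not part of bs4 or requests) and assuming some hrefs on the page may be relative:

```python
from urllib.parse import urljoin

PAGE_URL = "https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times"

def safe_href(anchor, base=PAGE_URL):
    """Return an absolute URL for an anchor, or None when the <li> has no link.

    `anchor` is whatever item.select_one("a[href]") returned: a tag-like
    object exposing .get("href"), or None if nothing matched.
    safe_href is a hypothetical helper name used for illustration only.
    """
    if anchor is None:
        return None
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the page URL.
    return urljoin(base, anchor.get("href"))
```

Inside the loop this would replace the item_link line with item_link = safe_href(item.select_one("a[href]")).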

answered Apr 16, 2020 at 1:46


4 Comments

Bangbangbang Over a year ago

Hi @SIM, I don't know why bs4 does not work on my end; the block raises SSLError: HTTPSConnectionPool(host='www.ipsos.com', port=443): Max retries exceeded with url:

SIM Over a year ago

I ran it just now and it worked. If you still get the same error, try passing headers, like requests.get(link,headers={"User-Agent":"Mozilla/5.0"}).

Bangbangbang Over a year ago

I figured out the bs4 issue: requests.get(link, verify=False) works.

Bangbangbang Over a year ago

Thanks for all the help!

Jack Fleeting · Accepted Answer · 2020-04-16 00:10:06Z

1

Try it this way:

coronas = driver.find_element_by_xpath("//div[@class='block-wysiwyg']/ul/li")
hr = coronas.find_element_by_xpath('./a')
print(coronas.text)
print(hr.get_attribute('href'))

Output:

The coronavirus is touching the lives of all Americans, but race, age, and income play a big role in the exact ways the virus — and the stalled economy — are affecting people. Here's what that means.
https://www.ipsos.com/en-us/america-under-coronavirus
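
The snippet above uses find_element (singular), so it only reaches the first <li>. To loop through every item, find_elements (plural) is needed. That also explains why the code in the question fails: find_elements returns a list, and a list has no get_attribute; besides, the href lives on the <a> inside each <li>, not on the <li> itself (and 'blcokwysiwyg' is a typo for 'block-wysiwyg'). Here is a rough sketch of the loop, assuming the same page structure as the XPath above. The function name collect_list_items is made up for illustration. It uses the find_elements(by, value) form, because the find_element_by_* helpers were removed in Selenium 4.3.

```python
# "xpath" is the value of selenium.webdriver.common.by.By.XPATH; the literal
# is used so the helper has no import of its own. In real code, By.XPATH is
# the more readable choice.
XPATH = "xpath"

def collect_list_items(driver):
    """Return a (text, href) pair for every <li> in the block-wysiwyg div.

    collect_list_items is a hypothetical helper name, not a Selenium API.
    """
    results = []
    # find_elements returns a list (empty when nothing matches), so it does
    # not raise NoSuchElementException the way find_element does.
    for li in driver.find_elements(XPATH, "//div[@class='block-wysiwyg']/ul/li"):
        # Look for links anywhere under this <li>; the leading "." keeps
        # the search relative to the current element.
        anchors = li.find_elements(XPATH, ".//a")
        href = anchors[0].get_attribute("href") if anchors else None
        results.append((li.text, href))
    return results

# Usage with a real browser session (not run here):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   driver.get("https://www.ipsos.com/en-us/knowledge/society/covid19-research-in-uncertain-times")
#   for text, href in collect_list_items(driver):
#       print(text, href)
```

Items without a link come back with None as the href instead of stopping the loop.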

answered Apr 16, 2020 at 0:10


2 Comments

Bangbangbang Over a year ago

The block raises NoSuchElementException: Message: no such element: Unable to locate element... I suspect something is wrong with the path?

Jack Fleeting Over a year ago

@Bangbangbang No, the xpath works (just tried it again). Can't explain why it doesn't work on your side.
