Waiting for a table to load completely using selenium with python

Question

I want to scrape data from a table on a page, and the table is the only part I care about. Earlier I was using Mechanize, but some of the data was sometimes missing, especially at the bottom of the table. After some searching, I found that this may be because Mechanize does not handle jQuery/Ajax.

So I switched to Selenium today. How do I wait for one and only one table to load completely and then extract all links from that table using Selenium and Python? Waiting for the complete page to load takes some time, so I only want to ensure that the data in the table is loaded. My current code:

driver = webdriver.Firefox()
for page in range(1, 2):
    driver.get("http://somesite.com/page/"+str(page))
    table = driver.find_element_by_css_selector('div.datatable')
    links = table.find_elements_by_tag_name('a')
    for link in links:
        print link.text

alecxe · Accepted Answer · 2014-08-09 20:26:10Z

Use WebDriverWait to wait until the table is located:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

...
wait = WebDriverWait(driver, 10)
# presence_of_element_located takes a single (by, value) locator tuple
table = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div.datatable')))

This would be an explicit wait.

Alternatively, you can make the driver wait implicitly:

An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
driver.implicitly_wait(10)  # wait up to 10 seconds while trying to locate elements
for page in range(1, 2):
    driver.get("http://somesite.com/page/" + str(page))
    table = driver.find_element(By.CSS_SELECTOR, 'div.datatable')
    links = table.find_elements(By.TAG_NAME, 'a')
    for link in links:
        print(link.text)

Thanks :) Does the explicit wait with presence_of_element_located ensure that the table is fully loaded and not just partially? Sorry if this is a silly question; I have no idea about it. Also, I want to wait only for the table and not for other elements on the page. Once the table has loaded, I should go ahead instead of waiting for other elements.
@user3215014 if presence_of_element_located doesn't make a difference here, set up an implicit wait; it should help.
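
On the question in the comment above: presence_of_element_located only confirms that the matched element exists in the DOM, so it says nothing about whether rows loaded by Ajax have finished arriving. One possible workaround, sketched here under assumptions, is a custom wait condition that polls until the number of links in the table stops changing. The class name table_links_stable, its checks parameter, and the helper scrape_table_links are hypothetical and not part of Selenium; the div.datatable selector is taken from the question.

```python
class table_links_stable(object):
    """Hypothetical wait condition (not part of Selenium).

    Becomes true once the number of links inside the table is non-zero
    and has stayed the same for `checks` consecutive polls.
    """

    def __init__(self, selector, checks=3):
        self.selector = selector
        self.checks = checks
        self._last_count = None
        self._stable_polls = 0

    def __call__(self, driver):
        # "css selector" is the string value of By.CSS_SELECTOR
        links = driver.find_elements("css selector", self.selector + " a")
        count = len(links)
        if count > 0 and count == self._last_count:
            self._stable_polls += 1
        else:
            self._stable_polls = 0
        self._last_count = count
        # A falsy return value makes WebDriverWait poll again
        return links if self._stable_polls >= self.checks else False


def scrape_table_links(url, selector="div.datatable", timeout=10):
    """Open `url`, wait for the link count to settle, return the link texts."""
    # Imported here so the condition above can be reused without a browser
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(url)
        links = WebDriverWait(driver, timeout).until(table_links_stable(selector))
        return [link.text for link in links]
    finally:
        driver.quit()
```

With WebDriverWait's default poll interval of 0.5 seconds, checks=3 means the link count has to stay the same for roughly 1.5 seconds before the wait returns. A stable count is only a heuristic; if the page shows a loading indicator, waiting for that to disappear is probably more reliable.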

Alex Woolford · Accepted Answer · 2014-08-09 20:27:06Z

Perhaps you could use Selenium's expected conditions (http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp), e.g.

>>> from selenium import webdriver
>>> from selenium.webdriver.common.by import By
>>> from selenium.webdriver.support.ui import WebDriverWait
>>> from selenium.webdriver.support import expected_conditions as EC 
>>> 
>>> ff = webdriver.Firefox()
>>> ff.get("http://www.datatables.net/examples/data_sources/js_array.html")
>>> try:
...     element = WebDriverWait(ff, 10).until(EC.presence_of_element_located((By.ID, "example")))
...     print(element.text)
... finally:
...     ff.quit()
... 

Engine Browser Platform Version Grade
Gecko Firefox 1.0 Win 98+ / OSX.2+ 1.7 A
Gecko Firefox 1.5 Win 98+ / OSX.2+ 1.8 A
Gecko Firefox 2.0 Win 98+ / OSX.2+ 1.8 A
Gecko Firefox 3.0 Win 2k+ / OSX.3+ 1.9 A
Gecko Camino 1.0 OSX.2+ 1.8 A
Gecko Camino 1.5 OSX.3+ 1.8 A
Gecko Netscape 7.2 Win 95+ / Mac OS 8.6-9.2 1.7 A
Gecko Netscape Browser 8 Win 98SE+ 1.7 A
Gecko Netscape Navigator 9 Win 98+ / OSX.2+ 1.8 A
Gecko Mozilla 1.0 Win 95+ / OSX.1+ 1 A
