0

I'm new to Python and I need to get the data from a table on a web page and put it into a list.

I've tried everything, and the best I got is:

import urllib.request
from bs4 import BeautifulSoup

url = "http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-taxas-referenciais-bmf-enUS.asp?Data=11/22/2017&Data1=20171122&slcTaxa=APR#"
soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'lxml')
rows = list()
for tr in soup.findAll('table'):
    rows.append(tr)

Any suggestions?

2
  • There is an option to download the Excel file. It is better to work with the xlsx files. Do you really need to read from the HTML? Commented Nov 23, 2017 at 15:36
  • No. The xlsx file is ok. Commented Nov 23, 2017 at 15:57

2 Answers

1

You're not far off!

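First, make sure you have the right version of BeautifulSoup installed, which is BeautifulSoup4, for example with apt-get install python3-bs4 (assuming you're on Ubuntu or Debian and running Python 3). The 'lxml' parser used in the code also needs the lxml package (python3-lxml on those systems).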

Then isolate the td elements of the HTML table and clean the data up a bit. For example, remove the first 3 elements of the list, which are useless, and remove the stray '\n' entries:

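import urllib.request  # plain "import urllib" does not guarantee urllib.request is loaded
from bs4 import BeautifulSoup

url = "http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/lum-taxas-referenciais-bmf-enUS.asp?Data=11/22/2017&Data1=20171122&slcTaxa=APR#"
soup = BeautifulSoup(urllib.request.urlopen(url).read(), 'lxml')
rows = list()
# Collect the string of every direct child of each table
# (.string is None for a tag that has more than one child)
for table in soup.find_all('table'):
    for child in table:
        rows.append(child.string)
# Drop the first 3 entries, which are not useful
temp_list = rows[3:]
# Drop the bare newline strings
final_list = [element for element in temp_list if element != '\n']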

I don't know exactly which data you want to extract, so from here you need to work on your Python list (called final_list here).

Hope it's clear.
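
If you would rather end up with one list per table row instead of a single flat list, something along these lines might help. This is only a sketch: SAMPLE_HTML and its values are made up to stand in for the real page, table_to_rows is a hypothetical helper name, and it uses the built-in 'html.parser' backend so it runs without lxml. The real page's markup may need different selectors.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real page (an assumption, not the actual table)
SAMPLE_HTML = """
<table>
  <tr><th>Days</th><th>Rate</th></tr>
  <tr><td>1</td><td>7.39</td></tr>
  <tr><td>30</td><td>7.05</td></tr>
</table>
"""


def table_to_rows(html):
    """Return the table contents as a list of rows, each a list of cell texts."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Take header and data cells alike, stripping surrounding whitespace
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:  # skip rows that contain no cells
            rows.append(cells)
    return rows


rows = table_to_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Searching for tr and then td/th directly avoids having to filter out the '\n' strings afterwards, because only the cell tags are visited.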


1 Comment

Worked here. Thank you.
1

There is a Download option at the end of the web page. If you can download the file manually, you are good to go.

If you want to access different dates automatically, and since the download is driven by JavaScript, I suggest using Selenium to download the xlsx files through Python.

Once you have the xlsx file, you can read the data with a library such as openpyxl or pandas and do what you want. (XlsxWriter can only write xlsx files, not read them.)
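
On the point about different dates: the URL in the question already carries the date in its query string, so one option might be to build that URL per date. The sketch below assumes, judging only from the example URL, that Data is MM/DD/YYYY and Data1 is YYYYMMDD; build_url is a hypothetical helper, and whether the site accepts arbitrary dates this way is not confirmed.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = (
    "http://www2.bmf.com.br/pages/portal/bmfbovespa/lumis/"
    "lum-taxas-referenciais-bmf-enUS.asp"
)


def build_url(day, rate_code="APR"):
    """Build the page URL for a given date (parameter formats are assumptions)."""
    params = {
        "Data": day.strftime("%m/%d/%Y"),   # assumed MM/DD/YYYY, as in the example URL
        "Data1": day.strftime("%Y%m%d"),    # assumed YYYYMMDD, as in the example URL
        "slcTaxa": rate_code,
    }
    # safe="/" keeps the slashes unescaped, matching the URL in the question
    return BASE_URL + "?" + urlencode(params, safe="/")


print(build_url(date(2017, 11, 22)))
```

The resulting string can then be passed to urllib.request.urlopen, or to Selenium if the page really needs a browser.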

1 Comment

Thank you for sharing Selenium.
