I am new to Python and I am trying to scrape a website. I am able to log in to the website and get an HTML page, but I don't need the whole page; I just need the hyperlinks in one specific table.
I have written the code below, but it gets all the hyperlinks on the page, not only the ones in that table.
soup = BeautifulSoup(the_page)
for table in soup.findAll('table', {'id': 'ctl00_Main_lvMyAccount_Table1'}):
    for link in soup.findAll('a'):
        print link.get('href')
Can anyone help me see where I am going wrong?
Below is the HTML of the table:
<table id="ctl00_Main_lvMyAccount_Table1" width="680px">
<tr id="ctl00_Main_lvMyAccount_Tr1">
<td id="ctl00_Main_lvMyAccount_Td1">
<table id="ctl00_Main_lvMyAccount_itemPlaceholderContainer" border="1" cellspacing="0" cellpadding="3">
<tr id="ctl00_Main_lvMyAccount_Tr2" style="background-color:#0090dd;">
<th id="ctl00_Main_lvMyAccount_Th1"></th>
<th id="ctl00_Main_lvMyAccount_Th2">
<a id="ctl00_Main_lvMyAccount_SortByAcctNum" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$SortByAcctNum','')">
<font color=white>
<span id="ctl00_Main_lvMyAccount_AcctNum">Account number</span>
</font>
</a>
</th>
<th id="ctl00_Main_lvMyAccount_Th4">
<a id="ctl00_Main_lvMyAccount_SortByServAdd" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$SortByServAdd','')">
<font color=white>
<span id="ctl00_Main_lvMyAccount_ServiceAddress">Service address</span>
</font>
</a>
</th>
<th id="ctl00_Main_lvMyAccount_Th5">
<a id="ctl00_Main_lvMyAccount_SortByAcctName" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$SortByAcctName','')">
<font color=white>
<span id="ctl00_Main_lvMyAccount_AcctName">Name</span>
</font>
</a>
</th>
<th id="ctl00_Main_lvMyAccount_Th6">
<a id="ctl00_Main_lvMyAccount_SortByStatus" href="javascript:__doPostBack('ctl00$Main$lvMyAccount$SortByStatus','')">
<font color=white>
<span id="ctl00_Main_lvMyAccount_AcctStatus">Account status</span>
</font>
</a>
</th>
<th id="ctl00_Main_lvMyAccount_Th3"></th>
</tr>
<tr>
<td>
Thanks in advance.
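
A likely cause, sketched under some assumptions: the inner loop in the code above calls soup.findAll('a'), which searches the whole document again, so the outer table filter has no effect. Searching from the table element instead should restrict the results to links inside it, including links in nested tables. The snippet below assumes Python 3 and the bs4 package (BeautifulSoup 4, where find_all is the current name for findAll); the_page here is a trimmed stand-in for the real page, and the "/outside" link is hypothetical, added only to show that it gets excluded.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the real page. The "/outside" link is hypothetical
# and exists only to show that links outside the table are skipped.
the_page = """
<a href="/outside">outside the table</a>
<table id="ctl00_Main_lvMyAccount_Table1" width="680px">
  <tr><td>
    <table id="ctl00_Main_lvMyAccount_itemPlaceholderContainer">
      <tr><th>
        <a href="javascript:__doPostBack('ctl00$Main$lvMyAccount$SortByAcctNum','')">Account number</a>
      </th></tr>
    </table>
  </td></tr>
</table>
"""

soup = BeautifulSoup(the_page, "html.parser")

# Look up the one table of interest by its id.
table = soup.find("table", {"id": "ctl00_Main_lvMyAccount_Table1"})

hrefs = []
if table is not None:
    # Search from the table element, not from soup, so only
    # descendants of this table are considered.
    hrefs = [link.get("href") for link in table.find_all("a")]

for href in hrefs:
    print(href)
```

In the original Python 2 code, the equivalent change would be to replace soup.findAll('a') in the inner loop with table.findAll('a').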