7

My understanding is that PythonAnywhere supports a headless Firefox browser, but you need

from pyvirtualdisplay import Display

And so you can connect using

with Display():
    while True:
        try:
            driver = webdriver.Firefox()
            break
        except:
            time.sleep(3)

And I connect just fine. However, after I start using the driver with

with Display():
    while True:
        try:
            driver = webdriver.Firefox()
            break
        except:
            time.sleep(3)
    wb=load_workbook(r'/home/hoozits728/mutual_fund_tracker/Mutual_Fund_Tracker.xlsx')
    ws=wb.get_sheet_by_name('Tactical')

    for i in range(3, ws.max_row+1):
        if ws.cell(row=i,column=2).value is not None:
            driver.get('https://finance.yahoo.com/quote/' + ws.cell(row=i,column=2).value + '/performance?ltr=1')
            oneyear=driver.find_element_by_css_selector('#Col1-0-Performance-Proxy > section > div:nth-child(2) > div > div:nth-child(5) > span:nth-child(2)').text
            threeyear=driver.find_element_by_css_selector('#Col1-0-Performance-Proxy > section > div:nth-of-type(2) > div > div:nth-of-type(6) > span:nth-of-type(2)').text
            fiveyear=driver.find_element_by_css_selector('#Col1-0-Performance-Proxy > section > div:nth-of-type(2) > div > div:nth-of-type(7) > span:nth-of-type(2)').text
            ws.cell(row=i,column=10).value=oneyear
            ws.cell(row=i,column=11).value=threeyear
            ws.cell(row=i,column=12).value=fiveyear

           … and so on …

I get this error after just a little while

[Screenshot of the error traceback; image not included]

For what it's worth, this code works perfectly fine on my local machine. Also, I am a paying member, so there should be no whitelist issues.

9
  • Update the question with the error stack trace Commented Sep 16, 2018 at 12:09
  • @New contributor Can you see the image okay? I had trouble copying every line at once from the pythonanywhere console. Commented Sep 16, 2018 at 12:29
  • It sounds like the browser is crashing; maybe it's something to do with the contents of the page? What happens if you hit a different URL, say https://www.google.com/? Commented Sep 17, 2018 at 14:07
  • @GilesThomas You are onto something! Apparently there is a problem with my url; however, entering the url directly into a browser yields a proper webpage. This is what the code resolves to: finance.yahoo.com/quote/USCBX/performance?ltr=1 Commented Sep 17, 2018 at 14:19
  • I was just about to reply, but I see you've tracked down the problem in your answer below -- have upvoted it :-) Commented Sep 19, 2018 at 11:27

2 Answers

5

I have recently learned that Yahoo has blocked PythonAnywhere from running any web scraping scripts. I assume this is true for all AWS servers and those who use them, but I am not 100% certain of this. I hope this helps anyone who comes across this question.

https://www.pythonanywhere.com/forums/topic/5724/#id_post_52307
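
To check whether a failure is specific to one site, as suggested in the comments on the question, a control URL can be loaded before the real one. This is only a rough sketch: `probe` is a hypothetical helper, and `driver` is assumed to be an already started Selenium driver (anything with a `get(url)` method works).

```python
def probe(driver, urls):
    """Load each URL in order and record whether the request succeeded."""
    results = {}
    for url in urls:
        try:
            driver.get(url)
            results[url] = "ok"
        except Exception as exc:  # with Selenium this is usually a WebDriverException
            results[url] = "failed: " + type(exc).__name__
    return results
```

Calling it with https://www.google.com/ first and the Yahoo URL second shows whether the browser works at all before it touches the suspect page. If the browser itself has crashed, every later URL will also be reported as failed, so the order matters.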


0

You're getting that error because Selenium is unable to connect to the browser you created. If you're running the first chunk of code and then the second chunk, the display has already been closed by the time the second chunk runs, and that would probably cause the browser to crash.

You need to run the code that uses the browser inside the with block.

There is an example on the PythonAnywhere help pages that shows how to do all this in the most reliable way.
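
For illustration, here is a minimal sketch of that structure. It is not the example from the help pages. The helper names `start_driver`, `fetch_titles` and `main` are made up for this answer, and the display and driver factories are passed in as arguments so the logic can be tried without a real browser. It also caps the number of start-up retries instead of looping forever.

```python
import time


def start_driver(make_driver, attempts=5, delay=3.0):
    """Try to start a browser a bounded number of times, then give up."""
    last_error = None
    for _ in range(attempts):
        try:
            return make_driver()
        except Exception as exc:  # narrow this to WebDriverException in real use
            last_error = exc
            time.sleep(delay)
    raise RuntimeError("browser did not start") from last_error


def fetch_titles(urls, make_display, make_driver):
    """Do all browser work while the virtual display is still open."""
    titles = {}
    with make_display():
        driver = start_driver(make_driver)
        try:
            for url in urls:
                driver.get(url)
                titles[url] = driver.title
        finally:
            driver.quit()  # close the browser before the display goes away
    return titles


def main():
    # Imported here so the helpers above can be used without these packages.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    print(fetch_titles(["https://www.google.com/"], Display, webdriver.Firefox))
```

On PythonAnywhere you would call `main()`. The key points are that every `driver.get(...)` happens inside the `with` block and that `driver.quit()` runs in a `finally` clause, so a failed page load does not leave a stray Firefox process behind.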

2 Comments

But it is all running inside with Display():.
I have updated my question to make this more clear. I do appreciate your response.
