9
votes
Accepted
AniPop - The anime downloader
I have no familiarity with any of the libraries used here, so I can't comment on their usage.
What I will mention though is the giant chunk of ...
8
votes
Accepted
Simple yet Hackable WhatsApp API [UNOFFICIAL]
Here is a collection of things to improve.
Code Style Notes
Overall, there are a number of PEP 8 code style violations, like lack of spaces around operators, blank lines, spaces before the ...
8
votes
Accepted
Python script that reboots the router every 600 seconds
Interesting script.
Let's talk about reliability.
You have a pair of concerns:
A reboot (attempt) must happen every ten minutes, no matter what.
Four .wait.until()'...
6
votes
Accepted
Extracting some names from a webpage
Overall, it looks clean, but here are some potential improvements:
XPaths generally don't handle multi-valued class attributes well - you would have to work around it with ...
6
votes
Accepted
Scraping NHL Individual stats
I will make a few points about the code as code, but the first thing to consider in any sort of data collection is respecting the rights of the owners of the data.
In particular the terms of service ...
6
votes
Accepted
Scraping Instagram - Download posts, photos - videos
More constants
This:
self.url: str = 'https://www.instagram.com/{name}/'
appears to be a constant, so it can join the others in class scope. While you're ...
5
votes
Accepted
Making a reservation using Selenium/PhantomJS
You can try approaching the problem without using an actual browser and sending HTTP requests using requests and parsing HTML using, say, ...
5
votes
Accepted
Scraper to deal with some complicated site with ads
There is room for improvement, as there always is:
avoid bare exception handling. Instead of having an except clause without specifying exception classes to handle, you can handle ...
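As a minimal sketch of the point about bare exception handling: the function name `parse_price` and its inputs are hypothetical, not taken from the reviewed scraper. A bare `except:` would also swallow `KeyboardInterrupt` and `SystemExit`, so the sketch names only the exceptions it expects.

```python
def parse_price(text):
    """Convert a scraped price string to a float, or return None if malformed.

    Hypothetical helper for illustration only.
    """
    try:
        return float(text.strip().lstrip("$"))
    except (ValueError, AttributeError):
        # ValueError: the text is not numeric.
        # AttributeError: the element was missing, so text is None.
        return None
```

Any other exception, such as a genuine bug elsewhere, still propagates instead of being silently hidden.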
5
votes
Accepted
Parsing a slow-loading webpage with scrapy in combination with selenium
The spider is readable and understandable. I would only extract some of the things into separate methods for readability. For example, the "infinite scroll" should probably be just defined in a ...
5
votes
Scraping NHL Individual stats
A few more comments to add after @Josiah's good answer:
You can simplify the driver management in scrape_nhl_standings by using a context manager. It seems like ...
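One possible shape for context-managed driver handling is sketched below. `FakeDriver` and `managed_driver` are hypothetical stand-ins so the example runs without a browser; they are not the code from the answer. The point is only that `quit()` runs even when the body raises.

```python
from contextlib import contextmanager


class FakeDriver:
    """Hypothetical stand-in for a browser driver; records whether quit() ran."""

    def __init__(self):
        self.closed = False

    def get(self, url):
        if not url.startswith("http"):
            raise ValueError(f"bad URL: {url}")

    def quit(self):
        self.closed = True


@contextmanager
def managed_driver(factory=FakeDriver):
    """Create a driver and guarantee it is shut down on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the with-body raises.
        driver.quit()
```

With this in place, a scraping function can wrap its work in `with managed_driver() as driver:` and drop any explicit cleanup code.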
5
votes
Crawling a court website and downloading records
Log paths
It is not a good idea to hard-code this:
...
5
votes
Finding solutions on GitHub and Stack Overflow
Selenium
Selenium imposes a lot of overhead and complexity that you don't need to deal with. If you need to scrape, use requests + ...
5
votes
Finding solutions on GitHub and Stack Overflow
For simple content scraping without JavaScript or AJAX content, try Scrapy, which encourages best practices. Scrapy uses Python classes by default, as it is a Python framework.
Easy tutorial to learn Scrapy:
Scrapy ...
5
votes
Accepted
Game scraper for Steam
Break up logic into functions
Having separate functions for each of the following steps will make the code easier to read.
Get game names
Scrape game information
Display game information
Guard your ...
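The three steps listed above might be split roughly as follows. All function names are hypothetical, and the `catalog` dictionary stands in for the real scraping, which is not shown in this excerpt.

```python
def get_game_names(raw_input_text):
    """Step 1: turn a comma-separated string into a clean list of names."""
    return [name.strip() for name in raw_input_text.split(",") if name.strip()]


def scrape_game_info(name, catalog):
    """Step 2: look up one game; `catalog` is a stand-in for the real scrape."""
    return {"name": name, "price": catalog.get(name, "unknown")}


def format_game_info(info):
    """Step 3: build the display line for one game."""
    return f"{info['name']}: {info['price']}"


def run(raw_input_text, catalog):
    """Chain the three steps and return the lines to display."""
    names = get_game_names(raw_input_text)
    return [format_game_info(scrape_game_info(n, catalog)) for n in names]
```

Each step can then be tested or replaced on its own, for example by swapping the lookup in `scrape_game_info` for a real request.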
5
votes
Accepted
Web Scraping using Selenium and Python
Repetition 1
You have several XPaths that are largely the same, especially those that share a common prefix, like these:
...
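A minimal sketch of factoring out a shared XPath prefix follows. The table id and column positions are invented for illustration, since the original expressions are elided above.

```python
# Hypothetical shared prefix; the real one is not shown in the excerpt.
ROW_PREFIX = "//table[@id='results']//tr"


def cell_xpath(column):
    """Return the XPath for the given 1-based column under the shared prefix."""
    return f"{ROW_PREFIX}/td[{column}]"


# Named locators built from the single prefix definition.
XPATHS = {
    "name": cell_xpath(1),
    "price": cell_xpath(2),
    "rating": cell_xpath(3),
}
```

If the page layout changes, only `ROW_PREFIX` needs to be updated.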
5
votes
Multi-Page Web Scraping Code Using Selenium with Multithreading
Performance
I will skip over the usual suggestions concerning adding docstrings to the module and functions, using type hinting and adding comments where they would be useful to the reader and proceed ...
4
votes
Accepted
2048 Webgame bot
There isn't really much here to comment on, since there isn't even AI involved.
The only thing that stands out to me is your use of a silent "catch-all" try block:
...
4
votes
Accepted
Uploading songs from website to database and then to Spotify
Some simple suggestions after a first look at scraper.py:
class Song defines a method called ...
4
votes
Accepted
Elegant Method for Sleeping with Selenium Webscraper
WebDriverWait
Your intuition about time.sleep() not being ideal here is correct. A better alternative is to use the Selenium <...
4
votes
Accepted
Scraping Instagram with selenium, extract URLs, download posts
Requests raising
This pattern:
if search.ok:
...
else:
search.raise_for_status()
is redundant. Just call ...
4
votes
Crawl a website and download records
So a few tips:
Generally, functions go at the top level or as object or class methods. But doing that does mean you have to pass more things to the function, and can't rely on the closure (variables ...
4
votes
Accepted
Looping through keywords and multiple pages in Selenium
I'm going to give a partial answer that ignores most of what you've done, because the approach is Selenium-based when that is really not needed here. Compared to some of your other questions, this ...
4
votes
Extract flight arrival data from web
First: As @PavloSlavynskyy warns, FlightRadar24 has attempted to make it very clear that they do not want you to scrape their site or use their API. Among other verbiage, they say:
Copyright (c) 2014-...
4
votes
Accepted
Scraping a dynamic website with Scrapy (or Requests) and Selenium
the search field for this site is dynamically generated
That doesn't matter, since - if you bypass the UI - the form field name itself is not dynamic. Even if you were to keep using Selenium, it ...
4
votes
Web scraping data.cdc.gov for COVID-19 Data with Selenium in Python
Don't scrape. Delete all of your code. Go to that page and download one of the export types. XML is richer and has more fields, but CSV is more compact.
4
votes
Accepted
Web scraping data.cdc.gov for COVID-19 Data with Selenium in Python
In my opinion, Selenium isn't the right tool for web scraping much (probably, most) of the time. It turns out that even when websites use javascript, you can usually figure out what that js is doing ...
4
votes
Scraping campsite availability from a webpage using vba with selenium
If you aren't locked into using Selenium, you can accomplish something similar to this using a web request. Web Requests should be faster as they don't need to worry about rendering any items on ...
4
votes
Accepted
Crawler/scraper for soccer match results
Are you using Python 2? If not, the UTF-8 declaration at the top of your file isn't needed.
Obviously there is a lot of duplication as you repeat the same code to do the same thing. All you need is to write ...
3
votes
Accepted
Measure website home page total network size in bytes
You're using the wrong shebang. According to this Stack Overflow answer:
Correct usage for Python 3 scripts is:
#!/usr/bin/env python3
This defaults to ...