Newest 'web-scraping' Questions

3 votes

1 answer

125 views

Multi-Page Web Scraping Code Using Selenium with Multithreading

I have written a web scraping script using Selenium to crawl blog content from multiple URLs. The script processes URLs in batches of 1000 and uses multithreading with ThreadPoolExecutor to ...

Minnie

31

asked Dec 24, 2024 at 0:31

5 votes

2 answers

697 views

Readability and error handling improvements for Python web scraping class

Description: I recently wrote a Python script to download files from the Library of Congress (LOC) based on a search query. The code fetches metadata, extracts file ...

IntegerEuler

253

asked Nov 19, 2024 at 0:04

4 votes

1 answer

100 views

Scraping the calendar of some public libraries from their websites

I've been learning some Haskell as an amateur (to be precise: I started programming with this language, and it has been a year or less since I started seriously). So far, I have realised only small ...

user665110

141

asked Aug 3, 2024 at 13:34

2 votes

1 answer

88 views

Scrapy Spider for fetching product data from multiple pages of a website

I have written a Scrapy spider to scrape product data from a website. The spider navigates through multiple pages to reach a specific product and extracts details such as the product name, price, ...

I DON'T KNOW

29

asked Jun 27, 2024 at 20:02

3 votes

2 answers

99 views

Validating a web crawlers page visits with a decorator

I am writing a crawler that is going to end up in production, and I was trying to come up with a way to validate its page visits. It scrapes ASP.NET pages, so each scraping process involves a few ...

Gustavo Costa

33

asked May 12, 2024 at 20:21

5 votes

3 answers

839 views

code format and steps web scraping using beautiful soup

I've done some simple web scraping and want to make sure all my steps are correct. Is it considered clean code? Is there a better way to use the multi-page scraping feature? ...

Lpython

51

asked May 9, 2024 at 9:35

3 votes

1 answer

108 views

Scraping website with Python and Selenium to collect data from dynamic website

Summary: The code scrapes the website and collects the data to store it in CSV. It also downloads selected information that is available for download in PDF format. The details and the entire code are ...

sangharsh

269

asked Apr 9, 2024 at 20:41

0 votes

2 answers

178 views

Drayage Webscraper: Limited to table structure

This is my first working scraper. I'm sure a lot can be improved. My biggest question is how can I better specify what data to pull? All the data I'm currently grabbing is needed, but I couldn't ...

wigglesthe3rd

17

asked Apr 8, 2024 at 23:16

2 votes

1 answer

78 views

A selenium web scraper to package NBA data

I'm building a Selenium web scraper for basketball-reference.com that takes a player name and returns data as either JSON or a pandas DataFrame object. The class in question is one of many that ...

BluffShove

21

asked Mar 24, 2024 at 0:25

4 votes

1 answer

121 views

Java classes for downloading all in-coming/out-going links of an article in the Wikipedia article graph

(The entire project is on GitHub.) Introduction: This project provides facilities for generating in-coming or out-going links in a given Wikipedia page. Code ...

coderodde

31.9k

asked Mar 20, 2024 at 11:26

5 votes

1 answer

212 views

Scraping the Divar.ir

I've written code to scrape Divar, which is an equivalent of eBay in Iran. I have a few questions: Am I doing the error handling and logging OK? Is there a better way to optimize this code? (note ...

Amirhossein Rezaei

543

asked Sep 11, 2023 at 23:31

1 vote

2 answers

202 views

Web scraping spider

I'm currently working on my first web scraping project, and I need to scrape a lot of websites. With my current code it takes more than a day, but for my project I need to scan the same websites every 5 ...

Max

27

asked Jul 12, 2023 at 21:13

4 votes

2 answers

209 views

Enum to deserialize HTML sizes from JSON with serde

I added an enum for my webscraper to deserialize data from a JSON field that represents an HTML image size, which can either be an unsigned int like 1080 or a ...

Richard Neumann

6,421

asked Jun 20, 2023 at 19:17

2 votes

1 answer

108 views

Automatically extract useful cars from car site

I am using Puppeteer to extract data and see when a car that meets my requirements shows up; this is what I have done so far. I would like some basic syntax advice, or more advanced tips as well. I tried to ...

Mah Neh

79

asked Apr 19, 2023 at 20:12

2 votes

0 answers

74 views

Simplified HTML parsing for LEGO features

The goal is to extract the Features section from a LEGO product page. In the Features section, usually there's a header (...

alvas

709

asked Apr 7, 2023 at 7:29
