Questions tagged [scrapy]
Scrapy is a fast, open-source, high-level screen scraping and web crawling framework written in Python, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
37 questions
2
votes
1
answer
88
views
Scrapy Spider for fetching product data from multiple pages of a website
I have written a Scrapy spider to scrape product data from a website. The spider navigates through multiple pages to reach a specific product and extracts details such as the product name, price, ...
2
votes
1
answer
76
views
Save data in item within Scrapy and Python
I need to store data in a Scrapy Item in parts, updating only the information I have found so far.
It works, but the code looks too verbose. I repeat three lines very often:
...
3
votes
0
answers
79
views
Gather product information from the Apotheek Dedeyne web site
I'm learning Python Scrapy, and I'm scraping product information from the following website: https://apotheekdedeyne.be/
This project is for personal learning & ...
1
vote
1
answer
95
views
Parsing an XML tree of categories
I have a parse function that parses a tree of categories. I wrote it in the simplest way possible and am now struggling to refactor it.
Every nested loop is doing the same stuff but appending ...
4
votes
1
answer
1k
views
Scraping a dynamic website with Scrapy (or Requests) and Selenium
I am trying to use Scrapy for one of the sites I've scraped before using Selenium over here.
Because the search field for this site is dynamically generated and requires the user to hover the cursor ...
3
votes
0
answers
57
views
Scraping websites for presence of particular words
I'm new to Python and Scrapy, and I'm trying to write code that checks whether a website contains a specific word. I need to scrape many websites with different layouts, so the search is a general ...
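A minimal sketch of the layout-independent word check described in this question, not the asker's code. It assumes the page HTML has already been downloaded (for example, `response.text` inside a Scrapy callback) and uses only the standard library. The names `page_contains_word` and `_TextExtractor` are hypothetical, chosen for illustration.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def page_contains_word(html, word):
    """Return True if `word` appears as a whole word in the page text.

    Hypothetical helper: the match is case-insensitive and ignores markup,
    so it does not depend on any particular page layout.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None
```

Matching on whole words avoids false positives from substrings; dropping the `\b` anchors would turn it into a plain substring search if that is what is wanted.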
3
votes
0
answers
380
views
Microservice for scraping images with celery
I made a project that scrapes images asynchronously and saves them in a container. I can access them through a volume. Scrapy finds the images on a given web page.
Any tips are welcome. But first I would ...
4
votes
1
answer
125
views
Small web scraper to read product highlights from given URLs
I have written a Python scraping program using the Scrapy framework. The following code reads a list of URLs that need to be crawled and gets product highlights from the list. It also follows the next ...
3
votes
1
answer
160
views
Web Scraping Dynamically Generated Content Python
This is a small web scraping project I made in 2 hours that targets the website remote.co. I am looking for improvements to my code. I know about the inconsistency with the WebDriverWait and ...
2
votes
0
answers
52
views
Use Scrapy to find duplicate images across pages
I'm trying to use Scrapy to find image URLs that are used more than once across all pages of a website.
This is my spider:
...
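The asker's spider is elided above, so the following is only a rough sketch of the aggregation step, under the assumption that the crawl has already produced a mapping from each page URL to the image URLs found on it. It treats "used more than once" as "appears on more than one page", which is also an assumption. The function name `find_shared_images` and the example URLs are hypothetical.

```python
from collections import defaultdict


def find_shared_images(page_images):
    """Return image URLs that appear on more than one page.

    `page_images` maps a page URL to an iterable of image URLs found on
    that page (assumed input shape). The result maps each shared image URL
    to the sorted list of pages that use it.
    """
    pages_by_image = defaultdict(set)
    for page_url, image_urls in page_images.items():
        for image_url in image_urls:
            # A set ignores repeats of the same image on a single page.
            pages_by_image[image_url].add(page_url)
    return {
        image_url: sorted(pages)
        for image_url, pages in pages_by_image.items()
        if len(pages) > 1
    }


# Hypothetical crawl result, for illustration only.
crawl = {
    "https://example.com/a": ["https://example.com/logo.png", "https://example.com/1.jpg"],
    "https://example.com/b": ["https://example.com/logo.png"],
}
shared = find_shared_images(crawl)
```

In a Scrapy project this aggregation would typically live in an item pipeline or run over the exported feed, since a spider callback only sees one page at a time.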
4
votes
1
answer
125
views
Sourcing data format from multiple different structures
Problem
I want to read the data into a dictionary
person = {
    'name': 'John Doe',
    'email': '[email protected]',
    'age': 50,
    'connected': False
}
The ...
2
votes
2
answers
2k
views
Creating a csv file using scrapy
I've created a script using Python with Scrapy to parse movie names and their years, spread across multiple pages, from a torrent site. My goal here is to write the parsed data to a CSV ...
5
votes
3
answers
10k
views
Writing to a csv file in a customized way using scrapy
I've written a script in Scrapy to grab different names and links from different pages of a website and write those parsed items ...
7
votes
1
answer
1k
views
Parsing different categories using Scrapy from a webpage
I've written a script in Python Scrapy to parse different "model", "country" and "year" of various bikes from a webpage. There are several subcategories to track to reach ...
2
votes
0
answers
194
views
Handling IndexError using lambda function within scrapy
I've written a script using Python's Scrapy library to parse some fields from craigslist. The spider I've created here is far simpler than what is usually considered ideal for review. However, I'...