Questions tagged [xpath]

Ask Question

The primary purpose of XPath is to address parts of an XML document. It also provides basic facilities for manipulation of strings, numbers and booleans. XPath uses a compact, non-XML syntax. XPath operates on the abstract, logical structure of an XML document, rather than its surface syntax.

37 questions

1 vote

1 answer

119 views

Counting Ads in webpage using XPath and EasyList in Python

I have the following the function the retrieves a given webpage and returns the number of adverts that are on the page using a shortened version of EasyList (17,000 rules). Using multiprocessing, this ...

NGH

asked Mar 13, 2023 at 3:01

1 vote

1 answer

62 views

Extracting information from HTML with XSLT 3.0 when data is grouped visually as siblings in a td separated by blank lines

I have a work-in-progress where I'm using XSLT 3 to extract information from some preprocessed archaic HTML. I'd like to produce JSON showing the relationships between the various entities for further ...

Forensic_07

asked Feb 1, 2022 at 2:24

2 votes

1 answer

551 views

Using XSLT 3.0 to extract information from real-world HTML and produce JSON

For work, I extract information from HTML and XML sources to save in databases. The objective is to generate JSON representing the source document's information and its relationships in order to 1) ...

Forensic_07

asked Jan 20, 2022 at 22:13

3 votes

1 answer

255 views

Getting xpath indices to count forward to preserve table structure

Scraping tables from the web get complicated when there are 2 or more values in a cell. In order to preserve the table structure, I have devised a way to count the row-number index of its xpath, ...

Sati

asked Jun 7, 2021 at 16:53

-2 votes

1 answer

114 views

Java xPath: simple expression evaluation [closed]

This code is using XPath to expose a library function that can check whether an XML file contains a string or an expression: ...

mattobob

asked Jun 13, 2020 at 7:56

2 votes

1 answer

109 views

Flattening XML and selecting nodes based on input to convert to CSV

I have some XML that I want to flatten based on its XPath. The task is to acquire certain nodes that are passed as input. That said, I have to get to parent nodes and look for nodes that might be part ...

noobie-php

asked Aug 29, 2019 at 12:38

2 votes

1 answer

1k views

Beautifulsoup and lxml (xpath) too slow with respect to regex when parsing HTML

I agree that using regex to parse HTML is not a good way, in particular I am worried about their fragility with respect to change in the HTML. The problem is that any alternatives are really too slow....

Ruggero Turra

asked Jul 12, 2019 at 8:35

2 votes

1 answer

5k views

Scrapy crawler to parse data recursively

I've written a script in python scrapy to parse "name" and "price" of different products from a website. Firstly, it scrapes the links of different categories from the upper sided bar located in the ...

SIM

2,501

asked Aug 5, 2017 at 16:33

5 votes

1 answer

3k views

Extracting necessary records from LinkedIn

I wanted to create a scraper in python which can fetch required data from LinkedIn. I tried with python in many different ways but I could not make it until I used selenium in combination with. ...

MITHU

asked Jul 17, 2017 at 13:15

3 votes

1 answer

6k views

Scraping content from a javascript enabled website with load more button

The script I've written is able to scrape name, address, phone and web address from a webpage using python and selenium. The main barrier I had to face was to exhaust the load more button to get the ...

SIM

2,501

asked Jul 14, 2017 at 8:52

2 votes

2 answers

305 views

Web-crawler for yellowpage designed using python

I've written some code to scrape name, address, and phone number from yellowpage using python. This scraper has got input parameter. If the input is properly filled in and the filled in url exists in ...

SIM

2,501

asked Jul 13, 2017 at 14:27

2 votes

1 answer

612 views

Scraping data unveiling a button from craigslist

I've written some code to parse the names and phone numbers from craigslist. It starts from the link in "m_url" then goes one layer deep to parse the name and then again another layer deep to parse ...

SIM

2,501

asked Jul 12, 2017 at 19:26

2 votes

1 answer

9k views

Scraping table contents from a webpage using vba with selenium

My script is able to harvest full contents of a table from a webpage with javascript encrypted using vba in combination with selenium. The table has got a drop-down option from where the full contents ...

SIM

2,501

asked Jul 11, 2017 at 13:12

3 votes

1 answer

1k views

Web scraper for parsing names and email addresses from Yellowpage

I've written a script using python to parse the names and email addresses of different pizza shops in USA. I am very new in writing classes using python so I'm not very sure I didn't do anything wrong ...

SIM

2,501

asked Jul 7, 2017 at 11:41

13 votes

3 answers

15k views

Evaluating an XPath with document.evaluate() to get an array of nodes

The Problem Statement: Filter all nodes with an existing attribute that starts with a specific string (temp for example purposes). Print an array of node string ...

alecxe

17.5k

asked Jul 7, 2017 at 3:47

15 30 50 per page

2 3 Next

Stack Exchange Network

Questions tagged [xpath]

Counting Ads in webpage using XPath and EasyList in Python

Extracting information from HTML with XSLT 3.0 when data is grouped visually as siblings in a td separated by blank lines

Using XSLT 3.0 to extract information from real-world HTML and produce JSON

Getting xpath indices to count forward to preserve table structure

Java xPath: simple expression evaluation [closed]

Flattening XML and selecting nodes based on input to convert to CSV

Beautifulsoup and lxml (xpath) too slow with respect to regex when parsing HTML

Scrapy crawler to parse data recursively

Extracting necessary records from LinkedIn

Scraping content from a javascript enabled website with load more button

Web-crawler for yellowpage designed using python

Scraping data unveiling a button from craigslist

Scraping table contents from a webpage using vba with selenium

Web scraper for parsing names and email addresses from Yellowpage

Evaluating an XPath with document.evaluate() to get an array of nodes

Hot Network Questions