Parsel
Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.
It supports:
- CSS and XPath expressions for HTML and XML documents
- JMESPath expressions for JSON documents
Find the Parsel online documentation at https://parsel.readthedocs.org.
Example:
>>> from parsel import Selector
>>> text = """
... <html>
... <body>
... <h1>Hello, Parsel!</h1>
... <ul>
... <li><a href="http://example.com">Link 1</a></li>
... <li><a href="http://scrapy.org">Link 2</a></li>
... </ul>
... <script type="application/json">{"a": ["b", "c"]}</script>
... </body>
... </html>"""
>>> selector = Selector(text=text)
>>> selector.css("h1::text").get()
'Hello, Parsel!'
>>> selector.xpath("//h1/text()").re(r"\w+")
['Hello', 'Parsel']
>>> for li in selector.css("ul > li"):
...     print(li.xpath(".//@href").get())
...
http://example.com
http://scrapy.org
>>> selector.css("script::text").jmespath("a").get()
'b'
>>> selector.css("script::text").jmespath("a").getall()
['b', 'c']
Parsel Documentation Contents
- Installation
- Usage
- API reference
- History
  - 1.10.0 (2024-12-16)
  - 1.9.1 (2024-04-08)
  - 1.9.0 (2024-03-14)
  - 1.8.1 (2023-04-18)
  - 1.8.0 (2023-04-18)
  - 1.7.0 (2022-11-01)
  - 1.6.0 (2020-05-07)
  - 1.5.2 (2019-08-09)
  - 1.5.1 (2018-10-25)
  - 1.5.0 (2018-07-04)
  - 1.4.0 (2018-02-08)
  - 1.3.1 (2017-12-28)
  - 1.3.0 (2017-12-28)
  - 1.2.0 (2017-05-17)
  - 1.1.0 (2016-11-22)
  - 1.0.3 (2016-07-29)
  - 1.0.2 (2016-04-26)
  - 1.0.1 (2015-08-24)
  - 1.0.0 (2015-08-22)
  - 0.9.6 (2015-08-14)
  - 0.9.5 (2015-08-11)
  - 0.9.4 (2015-08-10)
  - 0.9.3 (2015-08-07)
  - 0.9.2 (2015-08-07)
  - 0.9.1 (2015-08-04)
  - 0.9.0 (2015-07-30)