GitHub - scrapy/parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Parsel

Parsel is a BSD-licensed Python library to extract and remove data from HTML and XML using XPath and CSS selectors, optionally combined with regular expressions.

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example:

>>> from parsel import Selector
>>> selector = Selector(text="""<html>
        <body>
            <h1>Hello, Parsel!</h1>
            <ul>
                <li><a href="http://example.com">Link 1</a></li>
                <li><a href="http://scrapy.org">Link 2</a></li>
            </ul>
        </body>
        </html>""")
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
['Hello', 'Parsel']
>>> for li in selector.css('ul > li'):
...     print(li.xpath('.//@href').get())
http://example.com
http://scrapy.org

