scraper
Here are 6,041 public repositories matching this topic...
Unless I missed something, the documentation doesn't explain how to query document metadata (searching "site:montferret.dev metadata" through Google returned nothing, and neither did grepping the source code).
As an example, I tried to query the og:url metadata.
I tried variations of //meta[property='og:url']::attr(content), with or without the leading //, and with or without the `attr(conte
Is your feature request related to a problem? Please describe.
A new contributor will feel overwhelmed when they try to contribute to this project.
Describe the solution you'd like
Add an architecture.md as described in this blog post
Describe alternatives you've considered
Not having architecture.md.
**Add
scoreText is broken
Tested with the search and list methods: scoreText shows ' Rated 4.3 stars out of five stars ' instead of 4.3.
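As a possible workaround for the report above, the numeric rating can be pulled out of the label text with a regular expression. This is only a minimal sketch: `parse_score` is a hypothetical helper, not part of any scraper library, and it assumes the label uses a dot as the decimal separator and that the first number in the string is the rating.

```python
import re
from typing import Optional


def parse_score(score_text: str) -> Optional[float]:
    """Return the first number found in a rating label, or None if there is none.

    Hypothetical helper for illustration; assumes a dot decimal separator.
    """
    # Match an integer with an optional fractional part, e.g. "4" or "4.3".
    match = re.search(r"\d+(?:\.\d+)?", score_text)
    return float(match.group()) if match else None


print(parse_score(" Rated 4.3 stars out of five stars "))  # prints 4.3
```

Because "five" is spelled out in the label, the only digits in the string belong to the rating itself, which is why taking the first match is enough here.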

Formed in 2009, the Archive Team (not to be confused with the archive.org Archive-It Team) is a rogue archivist collective dedicated to saving copies of rapidly dying or deleted websites for the sake of history and digital heritage. The group is composed entirely of volunteers and interested parties, and has expanded into a large number of related projects for saving online and digital history.

When looking up an attribute with .attr(), the attribute name should be lowercased before it is looked up in the .attribs object.
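The behaviour requested above can be illustrated with a small model. This is a sketch under stated assumptions, not the library's actual code: `Element` is a hypothetical stand-in for a parsed node, and it assumes attribute names are stored lowercased in `attribs`, as HTML parsers normally do.

```python
from typing import Dict, Optional


class Element:
    """Hypothetical stand-in for a parsed HTML element (not a real library class)."""

    def __init__(self, attribs: Dict[str, str]) -> None:
        # Assumption: the parser stores attribute names in lowercase.
        self.attribs = {name.lower(): value for name, value in attribs.items()}

    def attr(self, name: str) -> Optional[str]:
        # Lowercase the requested name so "HREF" and "href" resolve to the same key.
        return self.attribs.get(name.lower())


el = Element({"href": "https://example.com", "data-id": "42"})
print(el.attr("HREF"))  # prints https://example.com
```

Without the `name.lower()` call in `attr`, the lookup for "HREF" would miss the stored "href" key and return `None`.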