Focused crawls are collections of frequently updated web crawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
Pymeta will search the web for files on a domain, download them, and extract their metadata. This technique can be used to identify domains, usernames, software and version numbers, and naming conventions.
I have mostly tested htmldate on a set of English, German and French web pages I had run into by surfing or during web crawls. There are definitely further web pages and cases in other languages for which the extraction of a date doesn't work so far.
Please install the dateparser library beforehand, as it significantly extends linguistic coverage: `pip` or `pip3 install -U dateparser` or `pi
A cross-platform library and command-line tool that extracts the currently playing track in Traktor and optionally outputs to a file with configurable formatting.
Grav AutoSEO is a plugin for Grav that automatically fills in a page's description and keywords metadata from its content. It also adds Open Graph (used by Facebook, Google Plus, etc.) and Twitter Cards metadata.
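To make the idea concrete, here is a minimal Python sketch of deriving description, keywords, Open Graph and Twitter Cards tags from a page's text. It is not the plugin's code (Grav plugins are written in PHP); the function names, the 160-character limit, the stop-word list and the frequency-based keyword choice are all assumptions made for illustration.

```python
import re
from collections import Counter
from html import escape

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are", "was", "from"}


def summarize(text, max_chars=160):
    """Collapse whitespace; if too long, cut at a word boundary and add '...'."""
    flat = " ".join(text.split())
    if len(flat) <= max_chars:
        return flat
    cut = flat[:max_chars].rsplit(" ", 1)[0]
    return cut.rstrip(",.;:") + "..."


def top_keywords(text, limit=5):
    """Return the most frequent words of 3+ letters that are not stop words."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(limit)]


def build_meta_tags(title, content):
    """Build standard, Open Graph and Twitter Cards meta tags as HTML strings."""
    description = summarize(content)
    keywords = ", ".join(top_keywords(content))
    # Open Graph tags use the `property` attribute; the others use `name`.
    pairs = [
        ("name", "description", description),
        ("name", "keywords", keywords),
        ("property", "og:title", title),
        ("property", "og:description", description),
        ("name", "twitter:card", "summary"),
        ("name", "twitter:title", title),
        ("name", "twitter:description", description),
    ]
    return [
        f'<meta {attr}="{key}" content="{escape(value, quote=True)}">'
        for attr, key, value in pairs
    ]


if __name__ == "__main__":
    page_text = "Grav is a flat-file CMS. Grav plugins extend Grav pages with metadata."
    for tag in build_meta_tags("Grav & SEO", page_text):
        print(tag)
```

Values are HTML-escaped before being placed in the `content` attribute, so a title containing `&` or quotes does not break the markup.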