Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
A web page article parser which returns an object containing the article's formatted text and other attributes including sentiment, keyphrases, people, places, organisations, spelling suggestions, in-article links, meta data & lighthouse audit results.
A python project (with nlp integration) to denoise any news article and strip off any images, advertisement from it giving a basic and hassle free article. It provides a 'smart view' for web-view in mobile devices with heading, keywords and text. Powered with newspaper3k.