I have mostly tested trafilatura on a set of English, German, and French web pages I came across while surfing or during web crawls. There are certainly further web pages and cases in other languages for which the extraction does not yet work.
Corresponding bug reports can be filed either as a list in an issue like this one or in the code as XPath expressions in [xpaths.py](https://github.com
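As a minimal sketch of how a failing case might be reproduced before reporting it (the URL here is a placeholder, not one of the reported pages), trafilatura's documented `fetch_url` and `extract` functions can be combined like this:

```python
import trafilatura

# Placeholder URL: substitute a page where extraction fails.
url = "https://example.com/some-article"

# Download the page, then run the main extraction.
downloaded = trafilatura.fetch_url(url)
if downloaded is not None:
    result = trafilatura.extract(downloaded)
    # An empty or None result indicates a case worth reporting.
    print(result)
```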
Configuration for attribute selection uses JMESPath. Better documentation is needed to understand and use this configuration.
Additionally, provide some examples of different configurations, such as the sketch below.
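As an illustration of the kind of example the documentation could include, here is a minimal, hypothetical sketch of selecting attributes with the `jmespath` Python package; the data structure and expressions are invented for demonstration and are not taken from the actual configuration:

```python
import jmespath

# Hypothetical scraped record; the real configuration's data shape may differ.
data = {
    "article": {
        "title": "Sample headline",
        "authors": [{"name": "A. Writer"}, {"name": "B. Editor"}],
        "meta": {"published": "2023-05-01", "tags": ["news", "tech"]},
    }
}

# JMESPath expressions select individual attributes from the record.
title = jmespath.search("article.title", data)
authors = jmespath.search("article.authors[].name", data)
date = jmespath.search("article.meta.published", data)

print(title, authors, date)  # Sample headline ['A. Writer', 'B. Editor'] 2023-05-01
```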
The program scrapes article content from the web, given either a single URL or a text file containing a set of URLs. This project uses the newspaper3k and python-docx libraries. The output is a neatly formatted Word document in '.docx' format containing the contents of the article.
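A minimal sketch of this pipeline, assuming the standard newspaper3k and python-docx APIs (the URL and output filename are placeholders):

```python
from newspaper import Article  # newspaper3k
from docx import Document      # python-docx

url = "https://example.com/news/some-article"  # placeholder input URL

# Fetch and parse the article with newspaper3k.
article = Article(url)
article.download()
article.parse()

# Write the title and body text into a Word document with python-docx.
doc = Document()
doc.add_heading(article.title, level=1)
doc.add_paragraph(article.text)
doc.save("article.docx")
```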
Nebula Expired Article Hunter is a marketing tool you can use to retrieve expired content from www.archive.org, a.k.a. the Wayback Machine. You can use this kind of content to grow your blog with evergreen information, improve your marketing campaigns without investing in writing services, or for whatever else you find it useful.
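The tool's internals are not described here, but as a rough sketch of how archived captures can be located programmatically, the Wayback Machine's public CDX API can be queried like this (the domain is a placeholder):

```python
import json
import urllib.parse
import urllib.request

domain = "example.com"  # placeholder: the expired domain to look up

# Query the Wayback Machine CDX API for archived captures of the domain.
params = urllib.parse.urlencode({
    "url": f"{domain}/*",
    "output": "json",
    "filter": "statuscode:200",
    "limit": "10",
})
with urllib.request.urlopen(f"https://web.archive.org/cdx/search/cdx?{params}") as resp:
    rows = json.load(resp)

# The first row is the header; each remaining row describes one capture.
header, captures = rows[0], rows[1:]
for row in captures:
    record = dict(zip(header, row))
    print(record["timestamp"], record["original"])
```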
Scrape Yılmaz Özdil articles and create a Markov model to generate newspaper articles in the style of Yılmaz Özdil. A Turkish text dataset creator for data science and NLP projects.
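As a sketch of the Markov-model idea (independent of this project's actual code), a simple word-level Markov chain over a scraped corpus can generate text like this; the corpus below is placeholder text:

```python
import random
from collections import defaultdict

def build_model(text, order=2):
    """Map each tuple of `order` consecutive words to its possible successors."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model

def generate(model, length=30):
    """Random-walk the chain, starting from a random state."""
    state = random.choice(list(model))
    out = list(state)
    for _ in range(length):
        successors = model.get(state)
        if not successors:
            break
        out.append(random.choice(successors))
        state = tuple(out[-len(state):])
    return " ".join(out)

# In practice, corpus would be the concatenated scraped articles.
corpus = "örnek bir metin örnek başka bir metin " * 20  # placeholder text
print(generate(build_model(corpus)))
```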