# nutch
Here are 19 public repositories matching this topic...
An OCR Search Engine With Tesseract, Nutch, Solr, and PHP
Updated Jan 25, 2019 - JavaScript
rrgirish commented on Oct 4, 2015
I tried crawling a couple of sites with the Nutch crawler. It reports that it has crawled ~13000 pages. When I click the Visualize button, the Kibana dashboard says I have to configure an index pattern.
"logstash-*" doesn't seem to work.
Does this need log.io to work? What values can I enter in the index field to see the visualization?
Is there any documentation for this part?
How to use Apache Nutch without the command line
Updated Apr 17, 2019 - Java
Link ranking with Apache Giraph for Apache Nutch
Updated Nov 26, 2019 - Java
Python port of Nutch that allows controlling Apache Nutch via its REST API.
Updated Dec 2, 2015 - Python
A very simple search engine "specialised" in searching financial news.
Updated Dec 5, 2016 - Shell
Quickly and easily launch Apache Solr linked with Apache Nutch in separate Docker containers.
Updated Dec 3, 2015
Nutch 1.x Indexer Plugin that runs against ES6.7
Updated Aug 12, 2019 - Java
A spatial search website that allows users to search documents from the FBI Vault website. It extracts the most frequently occurring location in each document, loads the geo-tagged data into Apache Solr to index the documents, and visualizes search results using the Google Maps API.
Updated Sep 11, 2014 - Java
Simple crawler using Apache Nutch and Elasticsearch
Updated May 27, 2020 - Shell
Search engine knowledge systems (搜索引擎知识体系).
Updated Feb 22, 2020
Updated Dec 3, 2019 - Java


Issue Description
It would be useful to override the whole config file from the command line so that many options could be updated in one place.
How to reproduce it
Environment and Version Information
All environments.
External links for reference
Contributing
I'll fix this.