The Wayback Machine - https://web.archive.org/web/20220328125410/https://github.com/Norconex

Pinned

  1. collector-http Public

    Norconex Web Crawler (or spider) is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.

    Java 141 63

  2. collector-filesystem Public

    Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search …

    Java 16 11

  3. importer Public

    Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allo…

    Java 28 22

Repositories

  • importer Public

    Norconex Importer is a Java library and command-line application meant to "parse" and "extract" content out of a file as plain text, whatever its format (HTML, PDF, Word, etc.). In addition, it allows you to perform any manipulation on the extracted text before using it in your own service or application.

    Java 28 Apache-2.0 22 9 1 Updated Mar 24, 2022
  • collector-filesystem Public

    Norconex Filesystem Collector is a flexible crawler for collecting, parsing, and manipulating data ranging from local hard drives to network locations into various data repositories such as search engines.

    Java 16 11 4 2 Updated Feb 9, 2022
  • collector-core Public

    Collector-related code shared between different collector implementations

    Java 5 Apache-2.0 15 6 0 Updated Feb 5, 2022
  • committer-sql Public

    Implementation of Norconex Committer for SQL (JDBC) databases.

    Java 1 Apache-2.0 5 1 0 Updated Feb 5, 2022
  • commons-maven-parent Public

    Maven parent POM for many Norconex Maven projects.

    JavaScript 0 Apache-2.0 0 0 0 Updated Jan 5, 2022
  • collector-http Public

    Norconex Web Crawler (or spider) is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.

    Java 141 Apache-2.0 63 18 0 Updated Jan 5, 2022
  • committer-solr Public

    Solr implementation of Norconex Committer. Should also work with any Solr-based products, such as LucidWorks.

    Java 3 Apache-2.0 5 7 0 Updated Jan 5, 2022
  • committer-neo4j Public

    Implementation of Norconex Committer for Neo4j.

    Java 2 Apache-2.0 1 2 0 Updated Jan 4, 2022
  • committer-idol Public

    Autonomy IDOL implementation of Norconex Committer.

    Java 4 Apache-2.0 2 0 0 Updated Jan 4, 2022
  • committer-elasticsearch Public

    Implementation of Norconex Committer for Elasticsearch.

    Java 9 Apache-2.0 6 9 0 Updated Jan 4, 2022
