# information-retrieval
Here are 1,281 public repositories matching this topic...
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
python
machine-learning
information-retrieval
data-mining
ocr
deep-learning
image-processing
cnn
pytorch
lstm
optical-character-recognition
crnn
scene-text
scene-text-recognition
easyocr
Updated Mar 3, 2021 - Python
Apache Lucene and Solr open-source search software
Updated Mar 7, 2021 - Java
Fetches system/theme information in terminal for Linux desktop screenshots.
Updated Mar 6, 2021 - Shell
Accelerated deep learning R&D
python
infrastructure
machine-learning
natural-language-processing
information-retrieval
research
reinforcement-learning
computer-vision
deep-learning
text-classification
distributed-computing
image-processing
pytorch
image-classification
metric-learning
recommender-system
object-detection
image-segmentation
reproducibility
text-segmentation
Updated Mar 7, 2021 - Python
Learning to Rank in TensorFlow
Updated Feb 4, 2021 - Python
Deep neural network to extract intelligent information from invoice documents.
information-retrieval
deep-neural-networks
deep-learning
invoices
keras
information-extraction
classification
invoice
billing
deeplearning
keras-neural-networks
invoice-pdf
invoice-management
keras-tensorflow
invoice-software
invoice-insight
invoice-parser
Updated Feb 25, 2021 - Python
tholor commented Feb 26, 2021
Is your feature request related to a problem? Please describe.
If you work on domain corpora, collecting additional training data to improve your reader or retriever models is very helpful.
For collecting training data, there are two main options:
a) Manual Labelling
b) User feedback for "live predictions"
Option b) is particularly promising if you don't have enough time or resources for a).
Python Keyphrase Extraction module
python
natural-language-processing
information-retrieval
keyword
computational-linguistics
keyword-extraction
keyphrase-extraction
keyphrase
Updated Feb 2, 2021 - Python
A collection of research on knowledge graphs
natural-language-processing
information-retrieval
paper
survey
knowledge-graph
question-answering
representation-learning
cross-modal
knowledge-graph-completion
ner
dialogue-systems
reasoning
relation-extraction
commonsense
temporal-knowledge-graph
recommendation-systems
meta-relational-learning
Updated Nov 24, 2020 - JavaScript
A curated list of papers dedicated to neural text (semantic) matching.
Updated Dec 6, 2020 - HTML
Straightforward fuzzy matching, information retrieval and NLP building blocks for JavaScript.
machine-learning
natural-language-processing
information-retrieval
clustering
record-linkage
fuzzy-matching
deduplication
Updated Jan 21, 2021 - JavaScript
A Lucene toolkit for replicable information retrieval research
Updated Feb 24, 2021 - Java
Hardware-accelerated vector-based search engine. Available as an HTTP service or as an embedded library.
search
search-engine
machine-learning
information-retrieval
nlu
vector-space-model
vector-space
search-algorithms
resin
nlu-engine
Updated Jan 18, 2021 - C#
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.microsoft.com/pubs/226585/cikm2014_cdssm_final.pdf.
Updated Jun 5, 2017 - Python
nlp
natural-language-processing
information-retrieval
deep-learning
transformers
pytorch
artificial-intelligence
question-answering
reading-comprehension
bert
Updated Apr 30, 2020 - Python
PISA: Performant Indexes and Search for Academia
Updated Feb 27, 2021 - C++
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
natural-language-processing
information-retrieval
corpus
language-detection
embeddings
named-entity-recognition
normalizer
spell-check
persian-language
stemmer
dependency-parser
persian-nlp
part-of-speech-tagger
morphological-analysis
persian-stemmer
shallow-parser
Updated Feb 7, 2021
Telegram group scraper tool. Fetches all information about group members.
linux
information-retrieval
telegram
python3
promotion
termux
information-gathering
smsbomber
termux-tool
telegram-scraper-bot
telegram-scraper
Updated Feb 25, 2021 - Python
Tools and recipes to train deep learning models and build services for NLP tasks such as text classification, semantic search ranking and recall fetching, cross-lingual information retrieval, and question answering.
Updated Dec 24, 2018 - Python
Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.
linux
information-retrieval
ip-location
ip-geolocation
termux
hacking-tool
linux-tools
information-gathering
hacking-tools
termux-tool
termux-hacking
ip-tracer
gnuroot-debian
Updated Nov 13, 2020 - PHP
RMDL: Random Multimodel Deep Learning for Classification
machine-learning
information-retrieval
text-mining
data-mining
deep-neural-networks
deep-learning
text-classification
tensorflow
keras
cnn
dnn
recurrent-neural-networks
classification
rnn
image-classification
ensemble-learning
convolutional-neural-networks
multimodel
Updated Feb 19, 2021 - Python
Information Gathering Instagram.
python
linux
instagram
information-retrieval
scraper
osint
python3
instagram-scraper
termux
information-gathering
termux-tool
Updated Feb 25, 2021 - Python
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
search
search-engine
distributed-systems
information-retrieval
big-data
spark
solr
web-crawler
nutch
tika
sparkles
Updated Mar 6, 2021 - Java
word2vec, sentence2vec, machine reading comprehension, dialog system, text classification, pretrained language model (e.g., XLNet, BERT, ELMo, GPT), sequence labeling, information retrieval, information extraction (e.g., entity, relation and event extraction), knowledge graph, text generation, network embedding
information-retrieval
text-classification
word2vec
text-generation
information-extraction
knowledge-graph
network-embedding
sequence-labeling
dialogue-systems
sentence2vec
machine-reading-comprehension
pretrained-language-model
Updated Jan 11, 2021 - OpenEdge ABL
Extract subdomains from SSL certificates in HTTPS sites.
dns
ssl
information-retrieval
tool
https
certificates
discovery
subdomain
ssl-certificate
infosec
pentesting
pentest
ssl-certificates
pentest-scripts
pentest-tool
extract-subdomains
Updated Oct 11, 2020 - Python
Scrape any website, article or RSS/Atom Feed with ease!
Updated Jul 25, 2020 - Elixir
A math-aware search engine.
Updated Mar 7, 2021 - C
BitMagic Library
c
c-plus-plus
information-retrieval
cmake
algorithm
avx
bit-manipulation
simd
integer-compression
sparse-vectors
sparse-matrix
bit-array
indexing-engine
bit-vector
adjacency-matrix
associative-array
sparse-vector
Updated Feb 18, 2021 - C++
allRank is a framework for training learning-to-rank neural models based on PyTorch.
python
machine-learning
information-retrieval
deep-learning
pytorch
transformer
ranking
learning-to-rank
ndcg
click-model
Updated Dec 10, 2020 - Python
Not a high priority at all, but it'd be more sensible for such a tutorial/testing utility corpus to be implemented elsewhere, maybe under /test/ or some other data- or doc-related module, rather than in gensim.models.word2vec.
Originally posted by @gojomo in RaRe-Technologies/gensim#2939 (comment)