Focused crawls are collections of frequently updated web crawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
The code used for the experiments in the paper "On Optimally Partitioning Variable-Byte Codes", by Giulio Ermanno Pibiri and Rossano Venturini, published in IEEE TKDE 2019.
CS 582 Information Retrieval at the University of Illinois at Chicago. Multithreaded crawling of the UIC domain, inverted index, PageRank, and SEO with Context Pseudo-Relevance Feedback.
The code used for the experiments in the paper "Clustered Elias-Fano Indexes", by Giulio Ermanno Pibiri and Rossano Venturini, published in ACM TOIS 2017.
An alternative to the Elasticsearch engine, written in Go and intended for small sets of documents. It builds an inverted index over the documents and stores that index in Redis.
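
For readers unfamiliar with the term, the following is a minimal sketch of an inverted index, not code from any of the projects listed here. It is written in Python for brevity (the project above is in Go), and a plain in-memory dictionary stands in for Redis. The names `tokenize`, `build_index`, and `search`, and the sample documents, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)  # assumption: an in-memory dict stands in for Redis
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted IDs of documents containing every query term (AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    # .get avoids creating empty entries for unknown terms
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical sample documents, keyed by document ID
docs = {
    1: "Redis stores the index",
    2: "Go builds an inverted index",
    3: "Redis and Go together",
}
index = build_index(docs)
print(search(index, "index"))     # [1, 2]
print(search(index, "redis go"))  # [3]
```

In a store such as Redis, each term's posting set could plausibly map to one set-valued key, so a multi-term query becomes a set intersection; that mapping is an assumption about a reasonable design, not a description of how the project above is implemented.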