The Wayback Machine - https://web.archive.org/web/20200825000127/https://github.com/topics/dedupe
Here are 52 public repositories matching this topic...
Fast, secure, efficient backup program
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity resolution.
Updated
Aug 13, 2020
Python
Deduplication tool for yarn.lock files
Updated
Aug 24, 2020
JavaScript
Make CSS easier and more maintainable by using JavaScript
Updated
Aug 1, 2020
TypeScript
A powerful duplicate file finder and an enhanced fork of 'fdupes'.
A toolkit for record linkage and duplicate detection in Python
Updated
Jun 4, 2020
Python
🆔 Command line tool for deduplicating CSV files
Updated
Mar 31, 2020
Python
🆔 Examples for using the dedupe library
Updated
May 6, 2020
Python
Finding and deleting near-duplicate images based on perceptual hash.
Updated
Feb 14, 2020
Python
📧 CLI to deduplicate mails from maildir folders.
Updated
Aug 1, 2020
Python
A simple command line interface to the datamade/dedupe library.
Updated
Oct 22, 2019
Jupyter Notebook
Self-contained C# library for data deduplication using SQLite
Utilities for de-duping Django model instances
Updated
Aug 21, 2020
Python
Duplicate code finder for Elixir
Updated
Jan 26, 2017
Elixir
👓 @tipe/apollo-dedup-batch-http-link: batches multiple operations into a single HTTP dedup request. Instead of sending a single operation, it sends an array of operations to the server.
Updated
Apr 14, 2018
JavaScript
Address Variable Type for dedupe
Updated
Mar 31, 2020
Python
Name Variable Type for dedupe
Updated
Jun 30, 2020
Python
Updated
Mar 31, 2020
Python
Remove duplicates from your Pocket list
A filesystem for reading Android dedupe backup
UI for WatsonDedupe library
Dedupe Variable for Fuzzy Categories
Updated
Mar 31, 2020
Python
Deduplicate strings from JavaScript files
Updated
Jul 18, 2020
TypeScript
Improved and modern fdupes alternative
A command-line tool for deduplicating entries in a file or stream with constant memory usage
Sort, uniq, reverse, and randomize data
Updated
Jul 8, 2020
JavaScript
If somebody has some time for FUSE benchmarking: