deduplication

The borg files cache can be rather large, because it keeps some information about every file that has been processed recently.

lz4 is a very fast compression/decompression algorithm, so we could try using it to lower the in-memory footprint of the files cache entries.

Before implementing this, we should check how big the savings typically are, to determine whether it is worth doing.
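
One way to run that check is to serialize a sample of cache entries and compare raw and compressed sizes. The sketch below is only an illustration: the entry layout (`fake_entry`, a 28-byte header followed by 32-byte chunk ids) is a hypothetical stand-in, not borg's actual files cache format, and the random chunk ids are synthetic. It uses the third-party `lz4` package when it is installed and otherwise falls back to zlib level 1 as a stand-in codec, so numbers from the fallback say nothing about lz4 itself.

```python
import os
import struct
import zlib

try:
    import lz4.frame as _lz4  # third-party "lz4" package, optional

    def compress(data: bytes) -> bytes:
        return _lz4.compress(data)

    CODEC = "lz4"
except ImportError:
    # Stand-in codec so the sketch still runs without lz4 installed.
    def compress(data: bytes) -> bytes:
        return zlib.compress(data, 1)

    CODEC = "zlib level 1 (stand-in, not lz4)"


def fake_entry(n_chunks: int) -> bytes:
    """Serialize one HYPOTHETICAL cache entry: inode, size, mtime_ns, chunk count, chunk ids."""
    header = struct.pack(
        "<QQQI", 123456, n_chunks * 2**21, 1_600_000_000_000_000_000, n_chunks
    )
    # Chunk ids are hash values, modelled here as random 32-byte strings.
    chunk_ids = b"".join(os.urandom(32) for _ in range(n_chunks))
    return header + chunk_ids


def measure(entries):
    """Return (raw_bytes, compressed_bytes, ratio) when each entry is compressed separately."""
    raw = sum(len(e) for e in entries)
    packed = sum(len(compress(e)) for e in entries)
    return raw, packed, packed / raw


if __name__ == "__main__":
    sample = [fake_entry(n) for n in (1, 1, 2, 4, 16, 64)]
    raw, packed, ratio = measure(sample)
    print(f"{CODEC}: {raw} -> {packed} bytes (ratio {ratio:.2f})")
```

For a meaningful answer, the synthetic entries would have to be replaced with entries dumped from a real files cache, since the achievable ratio depends heavily on how much of each entry is hash data.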

I am syncing my local repo to a Wasabi S3 bucket daily. I capture the output of the cron job and have it sent to myself by e-mail. The output of the sync-to command is really lengthy, and I am not actually interested in it.

How about adding the --no-progress option to this command, so that it outputs only errors and not the whole progress?
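
Until such an option exists, a possible workaround on the cron side is a small wrapper that discards the command's output on success and reports it only when the command fails. This is a generic sketch, not part of the tool in question; `run_quiet` is a hypothetical helper name, and the command to run is passed in by the caller.

```python
import subprocess
import sys


def run_quiet(cmd):
    """Run cmd and return (exit_code, report); the report is empty when cmd succeeds."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    # Keep the captured output only if the command failed.
    report = "" if result.returncode == 0 else result.stdout
    return result.returncode, report


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: run_quiet.py COMMAND [ARGS...]")
    code, report = run_quiet(sys.argv[1:])
    if report:
        print(report, end="")
    sys.exit(code)
```

Since cron typically sends mail only when a job produces output, wrapping the sync job this way should result in an e-mail only for failed runs.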

Is your feature request related to a problem? Please describe.
Currently, MapType columns are not supported for Spark DataFrames.

Describe the solution you'd like
Add support for MapType Spark DataFrame columns


Right now, dduper has a minimal test script to check basic functionality; see ci/gitlab/*.sh. Enhance it to add RAID tests.

Here are 230 public repositories matching this topic...

restic / restic

borgbackup / borg

prometheus / alertmanager

openvenues / libpostal

arsenetar / dupeguru

sahib / rmlint

witten / borgmatic

jbruchon / jdupes

dpc / rdedup

kopia / kopia

Yomguithereal / talisman

J535D165 / recordlinkage

J535D165 / data-matching-software

mattilyra / LSH

cupcakearmy / autorestic

dm-vdo / kvdo

F483 / dejavu

dm-vdo / vdo

zouzias / spark-lucenerdd

kdeldycke / mail-deduplicate

tsileo / blobstash

alephdata / fingerprints

jvirkki / dupd

fake-name / IntraArchiveDeduplicator

moj-analytical-services / splink

elemental-lf / benji

usc-isi-i2 / rltk

openvenues / lieu

Ghatage / horcrux

Lakshmipathi / dduper
