The Wayback Machine - https://web.archive.org/web/20210503022515/https://github.com/topics/deduplication
#

deduplication

Here are 230 public repositories matching this topic...

ThomasWaldmann commented Jan 27, 2021

The borg files cache can be rather large, because it keeps some information about every file that has been processed recently.

lz4 is a very fast compression/decompression algorithm, so we could try using it to lower the in-memory footprint of the files cache entries.

Before implementing this, we should check how big the savings typically are, to determine whether it is worth doing.
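
One way to run that check could look like the rough sketch below. It is only an illustration under several assumptions: the entry layout built by `make_entry` is hypothetical and is not borg's real files cache format, the helper names are made up for this example, and `zlib` from the standard library stands in for lz4 so the snippet runs without extra packages. Any `bytes -> bytes` compress function from an lz4 binding could be passed in its place.

```python
import os
import struct
import zlib


def make_entry(inode, size, mtime_ns, n_chunks):
    """Build one synthetic cache entry.

    Hypothetical layout (NOT borg's real format): a small fixed header
    followed by n_chunks random 32-byte chunk ids.
    """
    header = struct.pack("<BQQQ", 0, inode, size, mtime_ns)
    chunk_ids = b"".join(os.urandom(32) for _ in range(n_chunks))
    return header + chunk_ids


def measure_savings(entries, compress=zlib.compress):
    """Compress each entry on its own and return (raw, packed, savings).

    `compress` is any bytes -> bytes function; zlib is only a stand-in
    for lz4 here. Savings can be negative if entries grow.
    """
    raw = sum(len(e) for e in entries)
    packed = sum(len(compress(e)) for e in entries)
    return raw, packed, 1.0 - packed / raw


if __name__ == "__main__":
    entries = [
        make_entry(1000 + i, 4096 * i, 1_600_000_000_000_000_000 + i, 1 + i % 4)
        for i in range(1000)
    ]
    raw, packed, savings = measure_savings(entries)
    print(f"raw={raw} bytes, compressed={packed} bytes, savings={savings:.1%}")
```

Entries are compressed one at a time because that matches the idea of shrinking individual in-memory cache entries. Since chunk ids are hash-like and therefore close to incompressible, running this against real cache entries rather than synthetic ones is what would actually answer the question.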

budachst commented Apr 13, 2021

I am syncing my local repo to a Wasabi S3 bucket daily. I capture the output of the cron job and have it sent to myself as an e-mail. The output of the sync-to command is really… lengthy, and I am actually not interested in it.

How about adding a --no-progress option to this command, so that it outputs only errors and not the whole progress?

zouzias commented Apr 21, 2019

Is your feature request related to a problem? Please describe.
Currently, MapType columns are not supported for Spark DataFrames.

Describe the solution you'd like
Add support for MapType Spark DataFrame columns

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other co
