I have a friend who is a photographer, and as a result she has a lot of large files. Our current arrangement is that some of her files are delivered to my house, but I don't have the storage capacity to hold all of them.
Because we are talking terabytes of data, a full backup can't be kept at her place. We do currently have backups of the post-processed files.
What I would like to do is keep a rolling, x-month backup of these files at her place. I've got space on a drive there, so I just need a script. The script needs to run at least daily and keep the 250GB of available space on that drive filled with the most recent raw files, so that the disk stays nearly full.
I've been trying tar, as it has a built-in --newer option. However, that just creates a massive tarfile every day: when the same job runs tomorrow, it produces another massive tarfile which may well be identical to yesterday's, and so on. It seems very inefficient.
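What I've been running is roughly this (made-up paths; as far as I know GNU tar accepts relative date strings like "60 days ago" for --newer):

    # Each run writes a fresh archive of everything from the last two months.
    tar --create --newer="60 days ago" --file=/mnt/backup/raws-$(date +%F).tar /path/to/raws
    # Tomorrow's run produces another near-identical multi-gigabyte tarfile.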
My initial thought was rsync, but it doesn't seem to have its own built-in time options. There are ways you can faff about using a find command and piping in the results, but apparently this doesn't preserve the directory structure at the target.
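The sort of pipeline I mean is something like this (again, made-up paths); it picks up the right files, but they all land flat in the top of the target:

    # Every file modified in the last 60 days is copied, but each one lands
    # directly in /mnt/backup/ with its original folder path stripped.
    find /path/to/raws -type f -mtime -60 -exec rsync -a {} /mnt/backup/ \;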
How is this not just “a thing” in Unix? Am I missing something? Ironically, this would be trivial in Windows: robocopy <source> <destination> /mir /maxage:<date>.
To summarise:
- We have a source tree full of large files
- At any time we may get a new folder of large files
- We have a hard drive which is not as big as the tree of files, but is big enough to hold the last 2 months' worth of files
- I want those files copied, as frequently as I choose, to the drive
- The folder structure needs to be retained
- When a file becomes 2 months + 1 day old, delete it
- Net result: I always have the last 2 months' worth of files on the drive, no matter what's added to the source (a rough sketch of what I'm imagining follows below)
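This is roughly what I imagine a hand-rolled daily job would look like: an untested sketch with made-up paths (/path/to/raws and /mnt/rolling-backup), using GNU find and rsync.

    #!/bin/sh
    # Untested sketch: keep a rolling ~60 days of raw files on the backup drive.
    SRC=/path/to/raws          # made-up source tree
    DEST=/mnt/rolling-backup   # made-up mount point of the 250GB drive

    # Copy everything modified in the last 60 days, preserving the directory
    # structure relative to $SRC (GNU find's %P prints the path minus $SRC).
    find "$SRC" -type f -mtime -60 -printf '%P\0' |
        rsync -a --from0 --files-from=- "$SRC"/ "$DEST"/

    # Delete anything on the backup drive older than 60 days...
    find "$DEST" -type f -mtime +60 -delete
    # ...and remove any directories left empty as a result.
    find "$DEST" -mindepth 1 -type d -empty -delete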