I'm exploring different ways to add some consistency to a file deployment operation. The current situation is:
A current version folder, which contains approximately 100K different files:

- /current
    - /path1
        - file1
        - file2
        - ...
    - /path2
        - fileX
        - ...
    - /path1
An update folder, which contains around 100 files:

- /update
    - /path1
        - file1
    - /path2
        - fileX
        - ...
    - /path1
The final goal is to deploy all the files from the update folder into the current folder, and I insist on the "all": either none of the files should be copied if an error occurs during the operation, or all of them should be deployed so the operation can be flagged as successful.
In an ideal world, the scenario I'm looking for would be an "atomic" rsync that returns either a failure or a success exit code depending on what happened during the operation, and that ensures the system instantly sees the current directory as the newer version once the rsync finishes (i.e. no intermediate state during the copy, even in the event of a power cut or the like).
From my understanding, atomic multi-file operations of this kind are not available on most UNIX systems, so I can consider that the ideal case clearly won't be reached; I'm trying to approximate this behaviour as closely as possible.
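As far as I can tell, the only primitive that is atomic here is the rename of a single path on one filesystem, which is what the "rename tmp to current" step in the solutions below relies on; a minimal sketch with hypothetical /data paths:

```sh
# Assumption: both paths are on the same filesystem, so mv boils down to a
# single rename(2) call, which POSIX guarantees to be atomic: the new tree
# appears under the target name in one step, with no half-copied state.
mv /data/current.tmp /data/current   # requires that /data/current was moved away first
```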
I explored different solutions for this:
- `cp -al` to mirror the `current` directory to a `tmp` directory, then copy all the files from the `update` directory into it, then remove `current` and rename `tmp` to `current`.
- `rsync` (so far the most pertinent), using the `--link-dest` option to create an intermediary folder with hard links to the `current` directory's files. Basically the same as the previous case, but probably much cleaner as it doesn't require any `cp` (a sketch of this variant follows after this list).
- `atomic-rsync`: I came across an existing Perl script, atomic-rsync, which supposedly does this kind of operation, but it ends up taking only the files present in the `update` directory into account, getting rid of the "delta" files of the `current` folder.
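To make the `--link-dest` option concrete, here is a minimal sketch of that second variant, assuming `current` and `update` both sit under a hypothetical `/data` directory on a single filesystem (needed both for the hard links and for the final renames); the `current.tmp` and `current.old` names are purely illustrative:

```sh
#!/bin/sh
set -e   # stop at the first failing command so nothing half-done gets swapped in

cd /data

# 1. Build a hard-linked mirror of current and apply the update on top of it.
#    --link-dest turns every unchanged file into a hard link to ../current
#    instead of a real copy, so no file data is duplicated.
rm -rf current.tmp
rsync -a --link-dest=../current current/ current.tmp/
rsync -a update/ current.tmp/
# rsync writes changed files to a temporary name and renames them into place,
# so the hard-linked originals inside current/ are never touched.

# 2. Swap the trees. Each mv is a single atomic rename(2), but there is still
#    a short window between the two renames where current/ does not exist,
#    which is exactly the residual non-atomicity I'm trying to minimise.
mv current current.old
mv current.tmp current
rm -rf current.old
```

With `set -e`, a failure in either rsync aborts the script before the swap, which gives the single pass/fail exit status I'm after and leaves the original `current` untouched.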
Both solutions seem to work, but I have no confidence in using either of them in a real production use case; my worry is that creating 100K hard links might be very slow, or somehow costly or useless.
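If it helps to quantify that concern, the link-creation cost can be measured directly on the real tree before committing to either approach (again with hypothetical /data paths):

```sh
# Link-only mirror: no file data is copied, only ~100K directory entries and
# inode link counts are written.
time cp -al /data/current /data/current.tmp
find /data/current.tmp -type f | wc -l   # sanity check, should match current
rm -rf /data/current.tmp
```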
I also know that a very consistent solution would be to use snapshots, and there are plenty of options for that, but it's not acceptable in my case because of the disk size (the disk is ~70GB and the current folder already takes ~60GB).
I have run out of options within my knowledge; is there any (better) way to achieve the expected goal?
`current` is 60GB, how large is `update`? Is it possible to simply use a larger drive?