I'm exploring different ways to add some consistency to a file deployment operation. The current situation is:
A current version folder, which contains approximately 100K different files:

- /current
    - /path1
        - file1
        - file2
        - ...
    - /path2
        - fileX
        - ...
    - /path1
An update folder, which contains around 100 files:

- /update
    - /path1
        - file1
    - /path2
        - fileX
        - ...
    - /path1
The final goal is to deploy all the files from the update folder into the current folder, and I insist on the "all": either none of the files should be copied if an error occurs during the operation, or all of them should be deployed so the operation can be flagged as successful.
In an ideal world, the scenario I'm looking for would be an "atomic" rsync that returns either a failure or a success exit code depending on what happened during the operation, and that ensures the system instantly sees the current directory as the newer version once the rsync finishes (i.e. no intermediate state during the copy, even in the event of a power cut or the like).
From my understanding, atomic multi-file operations of this kind are not available on most UNIX systems, so I can consider that the ideal case clearly won't be reached; I'm trying to approximate this behaviour as closely as possible.
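As far as I can tell, the only primitive that is atomic here is the rename of a single path on one filesystem, which is what the "rename tmp to current" step in the solutions below relies on; a minimal sketch with hypothetical /data paths:

```sh
# Assumption: both paths are on the same filesystem, so mv boils down to a
# single rename(2) call, which POSIX guarantees to be atomic: the new tree
# appears under the target name in one step, with no half-copied state.
mv /data/current.tmp /data/current   # requires that /data/current was moved away first
```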
I explored different solutions for this:
- `cp -al` to mirror the `current` directory to a `tmp` directory, then copy all the files from the `update` directory into it, then remove `current` and rename `tmp` to `current`.
- `rsync` (so far the most pertinent), using the `--link-dest` option to create an intermediary folder with hard links to the `current` directory's files. Basically the same as the previous case, but probably much cleaner as it doesn't require any `cp` (a sketch of this variant follows after this list).
- `atomic-rsync`: I came across an existing Perl script, atomic-rsync, which supposedly does this kind of operation, but it ends up taking only the files present in the `update` directory into account, getting rid of the "delta" files of the `current` folder.
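To make the `--link-dest` option concrete, here is a minimal sketch of that second variant, assuming `current` and `update` both sit under a hypothetical `/data` directory on a single filesystem (needed both for the hard links and for the final renames); the `current.tmp` and `current.old` names are purely illustrative:

```sh
#!/bin/sh
set -e   # stop at the first failing command so nothing half-done gets swapped in

cd /data

# 1. Build a hard-linked mirror of current and apply the update on top of it.
#    --link-dest turns every unchanged file into a hard link to ../current
#    instead of a real copy, so no file data is duplicated.
rm -rf current.tmp
rsync -a --link-dest=../current current/ current.tmp/
rsync -a update/ current.tmp/
# rsync writes changed files to a temporary name and renames them into place,
# so the hard-linked originals inside current/ are never touched.

# 2. Swap the trees. Each mv is a single atomic rename(2), but there is still
#    a short window between the two renames where current/ does not exist,
#    which is exactly the residual non-atomicity I'm trying to minimise.
mv current current.old
mv current.tmp current
rm -rf current.old
```

With `set -e`, a failure in either rsync aborts the script before the swap, which gives the single pass/fail exit status I'm after and leaves the original `current` untouched.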
Both solutions seem to work, but I have no confidence in using either of them in a real production use case; my worry is that creating 100K hard links might be very slow, or somehow costly or useless.
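If it helps to quantify that concern, the link-creation cost can be measured directly on the real tree before committing to either approach (again with hypothetical /data paths):

```sh
# Link-only mirror: no file data is copied, only ~100K directory entries and
# inode link counts are written.
time cp -al /data/current /data/current.tmp
find /data/current.tmp -type f | wc -l   # sanity check, should match current
rm -rf /data/current.tmp
```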
I also know that a very consistent solution would be to use snapshots, and there are plenty of options for that, but it's not acceptable in my case because of the disk size (the disk is ~70GB and the current folder already takes ~60GB).
I have run out of options within my knowledge; is there any (better) way to achieve the expected goal?
`current` is 60GB, how large is `update`? Is it possible to simply use a larger drive?