19

I cloned several 32 GB pendrives with a Linux system installed, using dd. Later I shrank a partition and made a few more changes (on the "master"). Is there any tool for transferring only the changed blocks, to avoid the slow full re-cloning with dd?

I thought of rsync, but it only seems to work with files.

1
  • 2
    block devices are files too. Did you try? Commented Sep 6, 2023 at 14:35

8 Answers

23

Some versions of rsync have this capability (it depends on your distro). There are 2 patches that distros commonly apply to rsync. One for reading from block devices (provides the --copy-devices flag), and another for writing to block devices (provides the --write-devices flag). However even with these, there are a few other flags & caveats needed to use rsync in this manner.

Let's look at the command and then break it down:

rsync -I --copy-devices --write-devices --no-whole-file --inplace \
  "$(readlink -f "/dev/vg_src/lv_src")" "$(readlink -f "/dev/vg_dst/lv_dst")"

The -I is because rsync will look at the timestamp & size of the block device file (filesystem entry representing the device) rather than the block device contents, and potentially skip the sync. This flag forces rsync to evaluate the content of the block devices.

The --copy-devices tells rsync to synchronize from the contents of the source block device rather than the /dev file.

The --write-devices tells rsync to synchronize to the contents of the destination block device rather than replacing the /dev file.

The --no-whole-file tells rsync to only transfer the blocks that have changed. The block size can be controlled by --block-size if necessary.

The --inplace tells rsync to update the destination block device rather than create a temporary file and rename it into place.

The $(readlink ...) calls are because rsync will normally look at the paths and see they're symlinks, and not recognize them as block devices. So the readlink dereferences the symlinks.
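As a rough illustration of the breakdown above, the same command line can be assembled in Python. This is only a sketch: build_rsync_command is a hypothetical helper name, os.path.realpath stands in for readlink -f, and it still assumes an rsync build that carries the --copy-devices and --write-devices options. It prints the command rather than running it.

```python
import os
import shlex


def build_rsync_command(src, dst):
    """Return the rsync argument list for a block-level device sync.

    Hypothetical helper: symlinks such as /dev/vg/lv are resolved first,
    which is what the readlink -f calls do in the shell version.
    """
    return [
        "rsync",
        "-I",               # do not skip based on size and timestamp
        "--copy-devices",   # read the contents of the source device
        "--write-devices",  # write into the destination device
        "--no-whole-file",  # transfer only the blocks that changed
        "--inplace",        # update the destination in place
        os.path.realpath(src),
        os.path.realpath(dst),
    ]


if __name__ == "__main__":
    cmd = build_rsync_command("/dev/vg_src/lv_src", "/dev/vg_dst/lv_dst")
    # Print only; run the command yourself once the paths look right.
    print(shlex.join(cmd))
```

Returning a list instead of a single string keeps the arguments safe to hand to subprocess.run later without shell quoting problems.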

1
  • You could skip the $(readlink) wrapper by adding the --copy-links flag--at least when the source is a block device. I don't know how well this works when the destination is a block device though... Commented Jun 20, 2024 at 18:42
14

Nowadays, rsync supports copying devices, with:

rsync --copy-devices --write-devices <FROM> <TO>

Beware that it only writes to a device if it finds an actual device node. So don't use /dev/mapper/<VG>-<LV> as the destination since it will just (try to) replace the symlink with a file containing the content of the source device.

[ One of its --*link* options can probably tell it to follow those symlinks, but I just use /dev/dm-<NN> instead. ]

11

According to the description: Bscp copies a single file or block device over an SSH connection, transferring only the parts that have changed.

2
  • This just saved me! Thank you. I wonder why Rsync couldn't do it, since it clearly has a similar algorithm, where it only copies the blocks that changed in a large file. Commented May 29, 2017 at 7:31
  • Can you provide better help on how this works? I'm too stupid to understand "bscp SRC HOST:DEST" - which is what the help amounts to. Is the SRC an IP? How are block devices specified? Commented Feb 21, 2020 at 6:32
3

blocksync provides all the options of bscp and many more.

Local usage:

# python blocksync.py /dev/source/file localhost /path/to/destination/file

Network usage:

# python blocksync.py /dev/source/file [email protected] /path/to/destination/file
3

A similar alternative is diskrsync, a tool written in Go that does block deduplication (and optional compression for backups), using ssh as transport. It requires a stable source image (due to the Merkle tree and hashing), which is usually ensured by using a snapshot.

Basic usage is fairly simple:

diskrsync --verbose --no-compress /dev/vg0/lv_snap ruser@otherhost:/dev/sdd7

It works well without installation: just put diskrsync into the home directory of both users and run it as

./diskrsync …

The compression could be used to either keep a compressed backup on the target or use spgz to uncompress it into a local device.

1

With regret, it seems the answer at present is "No": I have not found any tool that brings the rsync approach to block-device mirroring.

I'm surprised at this, as I'd expect there to be sufficient demand for such a tool, for maintaining regularly-updated clones of large-capacity, low-change drives efficiently. It's not like write speeds (especially for external hard drives) have reached such speeds as to make it moot yet.

Your thumbdrives are actually quite small in modern terms: imagine you wanted to keep a "hot spare" of a 1TB laptop HDD (not uncommon today). Cloning the whole drive (to another HDD) would take several hours even over USB3, and the amount of gratuitous writing when only perhaps a couple of GB are actually changed, is just crazy.

Sequentially checksum-comparing the drives would save hugely on time and wear in these circumstances. Even better would be a stateful tool, that retained the checksums from the last run: it would assume that the destination drive is unchanged since then, so would only need to checksum the source drive.

If only such a tool existed!
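The stateful idea described above could be sketched roughly as follows. This is not an existing tool, only a minimal illustration: block_digests and changed_blocks are hypothetical names, the state is a plain JSON file, and the sketch assumes (as the answer does) that the destination has not changed since the previous run. It only reports which source blocks changed; copying them is left out.

```python
import hashlib
import json
import os


def block_digests(path, block_size=1024 * 1024):
    """Return one SHA-256 hex digest per block of the file or device."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).hexdigest())
    return digests


def changed_blocks(src_path, state_path, block_size=1024 * 1024):
    """Return indices of source blocks that differ from the saved state.

    The state file is rewritten with the current digests afterwards, so a
    real tool would only do that once the copy has actually succeeded.
    """
    current = block_digests(src_path, block_size)
    previous = []
    if os.path.exists(state_path):
        with open(state_path) as f:
            previous = json.load(f)
    changed = [
        i for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]
    with open(state_path, "w") as f:
        json.dump(current, f)
    return changed
```

On the first run there is no state file, so every block is reported as changed; later runs read only the source and compare against the stored digests.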

2
  • I have used ZFS for this successfully in the past. Simply add a vdev, wait for resilver, remove the vdev and then take it off-site. The only drawback is that the pool seems to be degraded in its nominal state. Commented Dec 20, 2022 at 2:55
  • The use case is quite small on this one. One needs to have fast-read devices separated by a slower network, so each side would read through the device, calculate checksums per block, compare checksums and then transfer the needed blocks. If, for example, both USB devices were in the same host, it would not be useful: without a catalog you'd need to read through both devices, which is as good as dd. Commented Dec 20, 2022 at 4:26
-1

Just out of curiosity, what happens if you rsync /dev/sdb to /dev/sdc? You might want to try it out with virtual disks like this:

sudo qemu-img create -f raw disk1.raw 40G
sudo qemu-img create -f raw disk2.raw 40G

Then attach the disks as block devices so you can copy to them and check whether they synced:

sudo modprobe nbd
sudo qemu-nbd --connect=/dev/nbd0 --format=raw --cache=unsafe --discard=unmap disk1.raw
sudo qemu-nbd --connect=/dev/nbd1 --format=raw --cache=unsafe --discard=unmap disk2.raw

sudo gparted 

Create an ext4 partition on the first disk and copy some files to it ...

Link /dev/nbd1 into a folder under the name nbd0, then try to rsync /dev/nbd0 to linkedFolder/

Do not forget that it is just an idea.

But a more valid answer to your question is:

The easy way is to use virtual disks: take them offline, sync them, then bring them back online ...

You can do this over real or virtual Ethernet adapters with iSCSI, iSER or NBD.

wnbd-client.exe is established in ceph.

iSER requires a CNA network card (cheap Intel 10GbE cards are all around).

iSCSI requires many connections to work properly; there are workarounds and cache strategies that work flawlessly, but the details are out of scope here ...

1
  • 2
    "What happens if you rsync /dev/sdb to /dev/sdc ?" - it depends entirely on what flags you use Commented Oct 30, 2022 at 20:09
-4

No, and there can't be. Rsync uses file timestamps to determine what to copy and what to skip. There's nothing similar at a lower level than files. The data on the disk doesn't remember that there used to be a different partition arrangement.

In order to make sure that two disks are identical, a tool working underneath the filesystem would have to read every block on both sides and copy the source block onto the target block if they differ. That's usually slower than unconditionally doing the copy. It might be faster if writes to the target disk are a lot slower than reads (but still a far cry from what rsync gains by completely skipping unchanged files); I think I've seen a tool that did this but I can't find it now.
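The read-both-sides approach described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool mentioned above: sync_blocks is a hypothetical name, the destination is assumed to be at least as large as the source, and there is no error handling or progress reporting.

```python
def sync_blocks(src_path, dst_path, block_size=1024 * 1024):
    """Copy only the blocks that differ from src to dst.

    Both sides are read in full; a block is written to the destination
    only when its contents differ. Returns the number of blocks written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        offset = 0
        while True:
            src_block = src.read(block_size)
            if not src_block:
                break
            dst.seek(offset)
            dst_block = dst.read(len(src_block))
            if src_block != dst_block:
                # Rewind to the start of this block before overwriting it.
                dst.seek(offset)
                dst.write(src_block)
                written += 1
            offset += len(src_block)
    return written
```

Called as sync_blocks("/dev/sdb", "/dev/sdc") (device names taken from another answer, purely as an example), it trades a full read of both devices for fewer writes, which only pays off when writes are much slower than reads.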

If you made a change to the partition setup on one side, make the same change on the other side, then call rsync on the individual filesystems.

11
  • 1
    It takes 10 minutes to read 32 GB, but 3 hours to write the full pendrive with dd. The changes will not total more than 10 MB. A utility for reading and comparing two block devices on the fly would be great. Commented Feb 14, 2017 at 9:35
  • 1
    Oh, --devices is only for copying the device node file. Commented Jul 1, 2018 at 18:51
  • 4
    No, rsync may use timestamps but doesn't need to. You can tell rsync to ignore any timestamps which is often needed if files or whole directory trees are moved around without changing their timestamps. And according to arstechnica.com/civis/viewtopic.php?t=1173708 there exist patches to add the proper commandline switches to tell rsync to sync contents of block devices block by block if the blocks are different. Commented Nov 13, 2018 at 0:32
  • 2
    There is a case where rsync on the raw block devices is a lot faster and uses fewer resources than rsyncing the whole file system: if the file system contains many (millions) of hardlinks. This is common with e.g. BackupPC pools (at least with BackupPC 3.x). Also common with BackupPC pools is that for quite a while only additional blocks are used while most of the already used blocks stay the same. Depending on the network speed, it can be much quicker to just compare block hashsums instead of transferring TB of data again. Commented Nov 13, 2018 at 0:36
  • 1
    "That's usually slower than unconditionally doing the copy" - for local copies, yes sure. But for network copies it can still be faster to use rsync than cp/cat/dd, etc. Commented Oct 30, 2022 at 20:14
