unfa

I'm using du to continuously monitor the amount of data written to USB drives that I'm duplicating.

I compare disk usage of source and target drives and display copying progress to the user.

The problem is that du reports 100% data present on the target drive, even though I see lots of data is still in the system cache, the drive's LED is blinking, and the drives are not ready to be removed.

I run rsync, sync and umount in sequence to ensure the data is really there before letting the user remove the target drive. However, I can't monitor the progress of sync, so the user sees 100% long before the drives are really synced.

I'd love to be able to monitor the "real" copying progress, as that's what really matters. There's no use in seeing rsync finish copying a 1 GB file in 25 seconds if I then have to wait another 5 minutes while sync flushes it to the drive (I'm exaggerating, but you get the idea).

This is how I monitor rsync progress in a loop for each drive:

PROGRESS="$(echo "$(du -s "/MEDIA/TARGET" 2>/dev/null  | cut -f 1) / $(du -s "/MEDIA/SOURCE" 2>/dev/null | cut -f 1) " | bc -l)"

$PROGRESS is a float between 0 and 1: the ratio of target drive usage to source drive usage.
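For readability, the same calculation can be split into small functions. This is only a sketch: the helper names `ratio`, `usage_kb` and `progress` are made up for illustration, awk stands in for bc, and the result is clamped to 1 and guarded against a zero or missing source size. It measures exactly what the one-liner above measures, so it still counts data that is waiting in the cache.

```shell
#!/bin/sh
# Hypothetical helper: ratio of two block counts, clamped to the range [0, 1].
# Prints 0.0000 when the second argument is zero, empty or negative.
ratio() {
    awk -v t="$1" -v s="$2" 'BEGIN {
        t += 0; s += 0                 # force numeric interpretation
        if (s <= 0) { print "0.0000"; exit }
        r = t / s
        if (r > 1) r = 1               # target can briefly exceed source
        printf "%.4f\n", r
    }'
}

# Hypothetical helper: disk usage of a directory in 1 KiB blocks.
# Prints nothing if du fails; ratio() treats that as 0.
usage_kb() {
    du -sk "$1" 2>/dev/null | cut -f 1
}

# progress TARGET SOURCE: same idea as the PROGRESS one-liner above.
progress() {
    ratio "$(usage_kb "$1")" "$(usage_kb "$2")"
}
```

With the paths from the question, this would be called as `PROGRESS="$(progress /MEDIA/TARGET /MEDIA/SOURCE)"`.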

How can I modify this so it'll consider only data that is already synced to drive, and not just waiting in system cache?

Edit:

I found that dd can perform writes that bypass the system cache. I ran a test, and copying a file this way does make du report the actual values, so my progress indication would finally be accurate:

dd if=/media/SOURCE/file of=/media/TARGET/file bs=4M oflag=direct

This uses the read cache but disables the write cache, making the progress easier to track without performing excessive reads. The problem is that to use dd instead of rsync I need to recreate the directory structure manually. I don't need to preserve file attributes or modification dates.

I guess I could use a combination of find, mkdir and dd to first recreate the directory tree and then copy the files one by one. Are there any downsides to this approach?
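A rough sketch of what that find, mkdir and dd combination might look like. The function name `copy_tree_direct` is made up, and the fallback to `conv=fsync` is my own assumption for filesystems where opening a file with `oflag=direct` fails; it is not something I have tested on the actual USB drives.

```shell
#!/bin/sh
# copy_tree_direct SRC DST (hypothetical helper)
# Recreates the directory tree of SRC under DST, then copies regular files
# one by one with dd. File names containing newlines are not handled.
copy_tree_direct() {
    src=$1
    dst=$2

    # Step 1: recreate the directory tree.
    (cd "$src" && find . -type d -print) | while IFS= read -r dir; do
        mkdir -p "$dst/$dir" || exit 1
    done || return 1

    # Step 2: copy regular files, bypassing the write cache where possible.
    # Assumption: if O_DIRECT is rejected, retry with a flush after the copy.
    (cd "$src" && find . -type f -print) | while IFS= read -r file; do
        dd if="$src/$file" of="$dst/$file" bs=4M oflag=direct 2>/dev/null ||
        dd if="$src/$file" of="$dst/$file" bs=4M conv=fsync 2>/dev/null ||
        exit 1
    done || return 1
}
```

Some downsides are visible even in the sketch: `find -type f` skips symlinks and special files, nothing is preserved apart from file contents and directory names, and one dd process is started per file, which may be slow for trees with many small files.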

How to get physical (synced) disk usage, ignoring system cache?
