
My aging computer will need a replacement. That will happen in a couple of months, but in the meantime I would like to learn techniques for improving the performance and resilience of my data.

I also have an empty 2 TB USB external HDD, which is about 3-4 times the size of my current laptop's HDD. So, I would like to do two things:

  • use the external HDD to mirror the data from my current laptop (assume /home is on a separate partition, which is what I would like to experiment with; I could set up a partition on the external HDD to hold a copy before I start the experiment) and use it alongside the laptop's HDD to improve performance when both HDDs are connected. I should be able to recover from disconnecting the external HDD and to sync up when I reconnect it.

  • when I get my new computer, connect the external HDD and mirror the data onto the new computer, use it to improve performance while the external HDD is connected, continue working if it is disconnected, and sync up when I connect it again.

We could assume that once I start using the new computer, the external HDD won't be connected to the old computer, if that makes the scenario a little simpler to handle.

What would be possible ways to achieve this? If you need details about the current setup of the /home partition or about the drives involved, let me know, though I think the question can be answered in the abstract.

As a tip, the partitions on my laptop's HDD are set up using LVM (of which I know some things, but I want to take advantage of this experiment to learn more).


1 Answer


Unfortunately, this problem can be solved in literally millions of ways.

The KISS approach would be:

  • format the new drive as a completely new device (you don't really need to bother with LVM here)
  • use some "manual" syncing mechanism like cp -a, tar|tar or rsync to sync the old /home to the new drive by hand, or something like lsyncd to sync in near real time while the external drive is connected (a minimal sketch follows this list)
  • once the new computer arrives, ensure the uid:gid of your user is the same as on the old computer, and cp -a/rsync/tar|tar from the external drive back into the home directory.
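
A minimal rsync sketch of that flow (the mount point /mnt/external and the backup directory name are my assumptions):

    # Initial copy / periodic sync from the laptop to the external drive:
    # -a preserves permissions and timestamps, -H hard links, -A/-X ACLs
    # and xattrs; --delete makes the copy a true mirror of the source
    rsync -aHAX --delete /home/ /mnt/external/home-backup/

    # On the new computer, after confirming the uid:gid matches, copy back
    rsync -aHAX /mnt/external/home-backup/ /home/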

All the user-level application data will get transferred this way, but don't forget to also back up a list of installed application packages so you can recreate the original application "loadout".
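
How you capture that list depends on the distribution; a sketch for a Debian/Ubuntu system (on Arch the equivalent would be pacman -Qqe):

    # Save the list of manually installed packages alongside the backup
    apt-mark showmanual > /mnt/external/packages.list

    # On the new machine, reinstall them
    xargs apt-get install --yes < /mnt/external/packages.list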

A more complex solution is to go at it at the block level:

  • dmraid/mdraid (depending on whom you ask - the userland control program is mdadm)
  • LVM

Because you already have an LVM setup, I believe mdadm RAID is inapplicable to your situation. But dmraid would be a simple "no crap" setup.
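
For reference, this is roughly what the mdadm route would look like on blank partitions (creating the array destroys existing data, so this is a sketch for a fresh setup only; device names are assumptions):

    # Two-way RAID1: the USB member is marked write-mostly so reads
    # prefer the faster internal disk; the internal write-intent bitmap
    # makes resyncs after an unplug fast
    mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
          /dev/sda3 --write-mostly /dev/sdb1
    mkfs.ext4 /dev/md0

    # After reconnecting the external drive, re-add it; thanks to the
    # bitmap only blocks changed since the disconnect are resynced
    mdadm /dev/md0 --re-add /dev/sdb1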

The most complex setup would be to mess with LVM, where you can go about it in two ways (I think, if you count LVM snapshots): mirrors and snapshots. In my experience both LVM ways are sub-optimal for you; the saner one would probably be an LVM mirror. Unfortunately, due to my dislike of LVM, I will refrain from any advice in that matter.
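
For completeness, the LVM mirror route would look roughly like this, assuming a volume group vg0 containing a home logical volume (both names are my assumptions):

    # Add the external drive's partition to the existing volume group
    pvcreate /dev/sdb1
    vgextend vg0 /dev/sdb1

    # Convert the home LV into a two-way RAID1 mirror
    lvconvert --type raid1 --mirrors 1 vg0/home /dev/sdb1

    # Drop the mirror leg cleanly before disconnecting the drive
    lvconvert --mirrors 0 vg0/home /dev/sdb1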

Keep in mind that the KISS options listed have an added benefit: they will defragment the files during the transfer (not much of an issue on SSDs, but still effective). Files are recreated on the target volume in a linear fashion, getting defragmented in the process.

On the other hand, you might lose the "holes" in sparse files, so they will occupy more space on the target (though if I am not mistaken, modern rsync has an option even for that). If you know that you are using a lot of sparse files, you can use du --apparent-size to calculate the worst-case size for the volume in question.
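
For example (paths are illustrative):

    # Compare allocated vs. apparent size; a large gap suggests sparse files
    du -sh /home
    du -sh --apparent-size /home

    # rsync's --sparse (-S) tries to recreate the holes on the target
    rsync -aS /home/ /mnt/external/home-backup/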

Unfortunately, these days it's very hard to say, just as with "virtual memory", how much space your files will effectively consume on the new volume (the size actually used depends on too many factors).

Sadly, block-level transfers like an LVM mirror are suboptimal, as the "new" volume will inherit any "bad state" from the mirrored volume, so in this case I would advise against it.

It's always safer and saner to recreate a filesystem than to drag it around.

If you are feeling paranoid, you should ensure /home is not accessed during the transfer process (you can even go into single-user mode), and if you are really mad, you can calculate sha256sum hashes of all the files (beware: this can take days, and the files must not be modified in the meantime!) and compare them after the transfer.
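
A sketch of that verification, with relative paths so the same list works on both sides (the target mount point is an assumption):

    # On the source
    cd /home && find . -type f -print0 | xargs -0 sha256sum > /tmp/home.sha256

    # On the copy; --quiet prints only files that fail the check
    cd /mnt/external/home-backup && sha256sum --quiet -c /tmp/home.sha256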

As always, the Arch Linux wiki has pretty good introductory articles on rsync, mdadm and LVM.

I would suggest against experimenting with redundancy on your own crucial data; use VMs for such experimentation instead. It is much simpler, and you can break things and learn much more.

  • Great feedback. Will need some time to digest. Commented Jan 28, 2023 at 7:16
