
Currently I have a file server running CentOS 9 Stream. The operating system is on its own disk, in its own logical volume group with its own logical volumes. There are a further three disks attached to the motherboard which are used as storage for the network shares. I don't have any spare SATA ports.

# df -h
Filesystem                          Size  Used Avail Use% Mounted on
devtmpfs                            4.0M     0  4.0M   0% /dev
tmpfs                               3.7G     0  3.7G   0% /dev/shm
tmpfs                               1.5G   38M  1.5G   3% /run
/dev/mapper/cs-root                  25G  5.9G   20G  24% /
/dev/mapper/cs-home                  10G  104M  9.9G   2% /home
/dev/mapper/cs-var_log               14G  191M   14G   2% /var/log
/dev/sda1                           974M  366M  541M  41% /boot
/dev/mapper/vg_storage-lv_database   98G   23G   71G  25% /storage/database
/dev/mapper/vg_storage-lv_data      1.7T  1.6T   39G  98% /storage/data
tmpfs                               756M     0  756M   0% /run/user/0
# pvs -o pv_name,pv_size,pv_free -O pv_free
  PV         PSize   PFree
  /dev/sdb   931.51g    0 
  /dev/sdc   931.51g    0 
  /dev/sdd   931.51g    0 
  /dev/sda2  <54.60g 4.00m 
# lvs -o lv_name,vg_name,lv_size,lv_layout
  LV          VG         LSize   Layout             
  home        cs          10.00g linear             
  root        cs          25.00g linear             
  swap        cs           5.59g linear             
  var_log     cs          14.00g linear             
  lv_data     vg_storage   1.72t raid,raid5,raid5_ls
  lv_database vg_storage 100.00g raid,raid5,raid5_ls

From the outputs above, the logical volumes lv_data and lv_database in the vg_storage volume group are RAID 5.

I read the Red Hat documentation about replacing disks in a RAID array. Currently the disks are 1TB, which I would like to replace with 8TB disks. After reading the documentation I have the following questions:

  1. Can I replace the disks in the LVM array with larger ones?
  2. The documentation talks about raid1. Is this also valid for raid5?
  3. How should I physically remove one of the disks from the array before adding the new disk?
  4. In the examples it shows just one logical volume. In my case I have two logical volumes. Is it just a question of repeating the operation for each one?

2 Answers


Yes, this is possible, but you need to follow the instructions in the chapter on Replacing a failed RAID device in a logical volume:

  1. Back up your data.

  2. Power down the system, remove one disk, add a new larger disk, power the system back up; if the system doesn’t come back up correctly, you can revert the replacement (and ignore the rest of these instructions…).

  3. Create a PV on the new disk (pvcreate) and add it to the volume group (vgextend); the full per-disk cycle is sketched after this list.

  4. Repair the logical volumes and the volume group:

    lvconvert --repair vg_storage/lv_data
    lvconvert --repair vg_storage/lv_database
    vgreduce --removemissing vg_storage
    
  5. Repeat with the remaining disks until the entire array has been migrated.
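
A rough sketch of the per-disk cycle from steps 3 and 4, assuming the new disk shows up as /dev/sdb (check with lsblk and substitute the real device name on each pass):

    pvcreate /dev/sdb                          # the new, larger disk
    vgextend vg_storage /dev/sdb
    lvconvert --repair vg_storage/lv_data      # rebuild the missing leg onto the new PV
    lvconvert --repair vg_storage/lv_database
    vgreduce --removemissing vg_storage        # forget the disk that was pulled

Let each rebuild finish (watch the Cpy%Sync column in lvs -a vg_storage) before powering down to swap the next disk.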

The documentation you looked at assumes the old and new disks are connected simultaneously.
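
For reference, if the old and new disks could be attached at the same time, the usual route is lvconvert --replace rather than a repair of a missing device; roughly, with hypothetical device names (new disk /dev/sde replacing /dev/sdb):

    pvcreate /dev/sde
    vgextend vg_storage /dev/sde
    lvconvert --replace /dev/sdb vg_storage/lv_data /dev/sde
    lvconvert --replace /dev/sdb vg_storage/lv_database /dev/sde
    vgreduce vg_storage /dev/sdb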


While Red Hat/IBM may have published some recipes, real-world procedures need not follow those recipes directly, beyond whatever is required by a Red Hat support contract or your personal trust in that company.

In general, the procedure would go like this. It is written so that you always have a full set of redundant data, and sometimes even more.

  1. Having up-to-date backups is generally good, but that is not the subject of this question.

  2. Before making major changes to a RAID array with no failing disks, I would suggest running an "integrity scan" (generic term): a process that reads all logical blocks of the RAID array in a special way and checks that the redundant blocks (in this case the parity blocks spread across the drives) match the data on the other drives, and if not, alerts the sysadmin (you) and optionally takes a chosen corrective action (such as trusting /dev/sdb less than the others, and thus overwriting the /dev/sdb sector in case of inconsistency). RAID 6 or above might be able to choose a corrective action based on which fix would require distrusting the affected block on only one drive, but in your setup there is only single redundancy. An integrity scan can run in the background, taking only a portion of the drive bandwidth while normal use continues at a somewhat reduced speed (the heads will need to seek back and forth between the track being scanned and the track where the file system needs to read or write).

There are persistent rumors (such as on ServeTheHome.com) that Linux RAID doesn't do integrity checks automatically by default, so you probably have to do it manually with some utility.
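
For what it's worth, LVM RAID does have a built-in scrub of exactly this kind, though as far as I know nothing runs it automatically by default. A minimal sketch using the LV names from the question; the options are those of current lvm2, so check lvchange(8) on your system:

    # read-only consistency check of each RAID LV (runs in the background)
    lvchange --syncaction check vg_storage/lv_data
    lvchange --syncaction check vg_storage/lv_database

    # watch progress and the mismatch counter
    lvs -a -o name,sync_percent,raid_sync_action,raid_mismatch_count vg_storage

lvchange --syncaction repair would additionally rewrite any inconsistent parity it finds.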

  3. Temporarily take the server offline.

  4. Use a different computer (or a different boot CD on the server computer) to dd the LVM partition metadata and content from an old 1TB drive to a chosen location on an 8TB drive, probably the beginning. If time allows, compare the copy to make sure it is identical. Finally, adjust the partition tables and LVM tables on the new 8TB drive to be consistent with the larger drive size but still valid as part of the existing 3x1TB array, as required for step 6. Copying 1TB from one mechanical drive to another will take a significant amount of time (1TB divided by the "sustained sequential R/W speed" of the slower of the two drives). Comparing will take about the same amount of time.

  5. Put the new 8TB drive into the server where the old 1TB drive was.

  6. Use whatever incantations to make the LVM administration daemons accept the filled part of the new 8TB drive as part of the original 3x1TB array, thus giving you a still valid 2TB (usable) fully redundant array and an online server.

  7. Repeat steps 3 to 6 for the other two drives.

  8. Now you have a 3x1TB array occupying only part of the space on three physical 8TB drives.

  9. Somehow modify the drive partition tables and LVM metadata to enlarge the array to the full size of the drives, giving you (at block device level) a 3x8TB RAID 5 array with 16TB usable. I have no information on how to do this with LVM or other tools. Many GUI tools or other "idiot-proof" tools from 15 years ago lacked this ability, so research and hex editing may be needed, plus of course unmounting and remounting the array. (A sketch of how steps 9 and 10 might look with the standard LVM tools follows this list.)

  10. Unmount the logical drive, use resize2fs to grow the file system from 2TB to 16TB, fsck, and remount the logical drive.

  11. Enjoy your new 16TB logical drive.
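
For steps 9 and 10, LVM can do the growing itself, with no hex editing, because the PVs in the question are whole disks rather than partitions. A rough sketch, assuming the new drives end up as /dev/sdb, /dev/sdc and /dev/sdd like the old ones, and with the sizes given to lvextend only as examples:

    # let LVM use the full size of each now-larger PV
    pvresize /dev/sdb
    pvresize /dev/sdc
    pvresize /dev/sdd

    # grow the LVs; -r also grows the file system (ext4 or XFS) in the same step
    lvextend -r -L +200G vg_storage/lv_database
    lvextend -r -l +100%FREE vg_storage/lv_data

Both ext4 and XFS can be grown while mounted, so the unmount/remount in step 10 may not even be needed.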

  • There are a few gotchas in your procedure… In step 4, you need to make sure the (partial) VGs aren’t activated. In steps 2, 6, 9, and 10, you should provide the incantations; as it is, your answer is largely useless for someone like the OP who doesn’t already know how to do all this. If you check the question in detail, you’ll also see that the partition manipulation is unnecessary, which is just as well in your case, since going from a 1TiB drive to an 8TiB drive might involve converting partition tables from MBR style to GPT. Commented Aug 29 at 9:19
  • Unfortunately, I don't know the specifics of the "LVM" toolset commands (let alone whatever version the OP has available), so I provided the information that I do have based on experience with other RAID stacks, allowing other contributors to expand on missing details. The main point of my answer was to provide a procedure that doesn't work by throwing away the contents of one physical disk at a time, merely hoping it would be recreated. Commented Sep 2 at 12:56
  • Also note that for step 4, the OS that looks for LVM volume groups is NOT running: either the server is rebooted into a utility OS, or the server is shut down and not running at all. So in neither case can the non-running server OS do anything right or wrong to the LVM volume groups. Commented Sep 4 at 15:46
  • Most distros activate any volume groups they find; there’s no notion of a VG being reserved for a specific OS installation. So it is necessary to make sure the VGs on the disks discussed here are not activated. This is why it is dangerous to offer advice based on other RAID setups… Commented Sep 4 at 17:50
  • The point of step 4 was to boot something primitive like gparted live or a rescue disk. While RAID partitions are obviously not restricted to a single OS, the aspect of an OS knowing or detecting the partition layers IS OS specific. A basic disk manipulation boot disk would normally not do anything to the disks unless/until told to do so, while a big CentOS/IBMHat setup might. Commented Sep 8 at 10:28
