
I replaced the disks in my RAID 1 array, going from 2 x 4TB disks to 2 x 10TB disks, using mdadm. The process can be summarized as follows: add the two new disks to the array, wait for the sync to finish, remove the two old disks from the array, then grow the array and extend the file system. Everything worked fine. However, I did not unplug or wipe the old disks.
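
Roughly, the commands I used looked like this (a sketch, not my exact shell history; the file-system grow step is shown for ext4 as an example and would be xfs_growfs for XFS):

# add the two new disks, grow the mirror onto them, then wait for the sync in /proc/mdstat
mdadm --add /dev/md125 /dev/sde1 /dev/sdf1
mdadm --grow /dev/md125 --raid-devices=4
# once synced, fail and remove the two old disks and shrink back to a two-disk mirror
mdadm --fail /dev/md125 /dev/sdc1
mdadm --remove /dev/md125 /dev/sdc1
mdadm --fail /dev/md125 /dev/sdd1
mdadm --remove /dev/md125 /dev/sdd1
mdadm --grow /dev/md125 --raid-devices=2
# grow the array to the size of the new disks and extend the file system
# (some mdadm versions want the internal bitmap removed around the --size=max step)
mdadm --grow /dev/md125 --size=max
resize2fs /dev/md125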

A few days later, one of the new disks failed. I removed it from the array (mdadm --remove /dev/md125 /dev/sdf1) and rebooted the server, but the RAID reverted to the old configuration: the array now consists of the two old disks, with the data and mount points back to their state from before the disk replacement. Can I re-create md125 to fix this?

Summary:

Old: /dev/md125 (sdc1 + sdd1)

New: /dev/md125 (sdf1 + sde1)

Then I removed sdf1 and rebooted the server.

After reboot: /dev/md125 (sdc1 + sdd1)

OS: CentOS 7

Disks (lsblk) after replacing the RAID array:

sdc         8:32   0   3.7T  0 disk
└─sdc1      8:33   0   3.7T  0 part
sdd         8:48   0   3.7T  0 disk
└─sdd1      8:49   0   3.7T  0 part
sde         8:64   0  10.9T  0 disk
└─sde1      8:65   0  10.9T  0 part
  └─md125   9:125  0   3.7T  0 raid1 /data
sdf         8:80   0  10.9T  0 disk
└─sdf1      8:81   0  10.9T  0 part
  └─md125   9:125  0   3.7T  0 raid1 /data

I expected the configuration to remain the same after the reboot, but it turned out like this:

sdc         8:32   0   3.7T  0 disk
└─sdc1      8:33   0   3.7T  0 part
  └─md125   9:125  0   3.7T  0 raid1 /data
sdd         8:48   0   3.7T  0 disk
└─sdd1      8:49   0   3.7T  0 part
  └─md125   9:125  0   3.7T  0 raid1 /data
sde         8:64   0  10.9T  0 disk
└─sde1      8:65   0  10.9T  0 part
sdf         8:80   0  10.9T  0 disk
└─sdf1      8:81   0  10.9T  0 part

mdstat before reboot:

Personalities : [raid1]  
md125 : active raid1 sde1[2]  
      11718752256 blocks super 1.2 [2/1] [_U]  
      bitmap: 22/22 pages [88KB], 262144KB chunk  

blkid and mdadm.conf now:

# grep a5c2d1ec blkid.txt 
/dev/sdc1: UUID="a5c2d1ec-fa7f-bba4-4c83-bfb2027ab635" UUID_SUB="a411ac50-ac7c-3210-c7f9-1d6ab27926eb" LABEL="localhost:data" TYPE="linux_raid_member" PARTUUID="270b5cba-f8f4-4863-9f4a-f1c35c8088bf"
/dev/sdf1: UUID="a5c2d1ec-fa7f-bba4-4c83-bfb2027ab635" UUID_SUB="4ac242ae-d6a2-0021-cd71-a9a7a357a3bb" LABEL="localhost:data" TYPE="linux_raid_member" PARTLABEL="Linux RAID" PARTUUID="4ca94453-20e7-4bde-ad3c-9afedbfa8cdb"
/dev/sde1: UUID="a5c2d1ec-fa7f-bba4-4c83-bfb2027ab635" UUID_SUB="1095abff-9bbf-c705-839d-c0e9e8f68624" LABEL="localhost:data" TYPE="linux_raid_member" PARTLABEL="primary" PARTUUID="7bd23a9e-3c6f-464c-aeda-690a93716465"
/dev/sdd1: UUID="a5c2d1ec-fa7f-bba4-4c83-bfb2027ab635" UUID_SUB="0d6ef39a-03af-ffd4-beb8-7c396ddf4489" LABEL="localhost:data" TYPE="linux_raid_member" PARTUUID="be2ddc1a-14bc-410f-ae16-ada38d861eb3"
# cat /etc/mdadm.conf
ARRAY /dev/md/boot metadata=1.2 name=localhost:boot UUID=ad0e75a7:f80bd9a7:6fea9e4d:7cf9db57
ARRAY /dev/md/root metadata=1.2 name=localhost:root UUID=266e79a9:224eb9ed:f4a11322:025564be
ARRAY /dev/md/data metadata=1.2 spares=1 name=localhost:storage UUID=a5c2d1ec:fa7fbba4:4c83bfb2:027ab635

Thanks

2 Answers


Once a drive is removed from an array, its metadata is no longer updated. Its last recorded state is therefore the one from just before it was removed, so it still "looks good" unless another drive claims a newer update time, higher event count, and the like.

So removing RAID 1 drives and expecting the array to continue on two other drives can lead to exactly this kind of confusion. Inherently, this is a weakness of the mdadm metadata format.
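
You can see this for yourself by comparing the superblocks of all four members; a sketch, where the fields to compare are Update Time and Events:

mdadm --examine /dev/sd[cdef]1 | grep -E '^/dev/|Update Time|Events'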

Furthermore, your mdadm.conf is too verbose. Remove metadata=, name=, and spares= in particular. Such entries can cause an array to be ignored when they no longer match reality (in this example, when no spare is present).
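
For example, the line for the data array can be trimmed down to the following (UUID taken from your mdadm.conf above; the boot and root lines can be shortened the same way):

ARRAY /dev/md/data UUID=a5c2d1ec:fa7fbba4:4c83bfb2:027ab635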

If you wish to keep both arrays around, at minimum you should change the UUID of one of them (mdadm --assemble --update=uuid) and also change the UUIDs of whatever is on that array. You should also consider --update=super-minor and/or --update=name.
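
A sketch of re-UUID-ing the old pair, if that is the one you want to keep around (/dev/md126 is only an example name; sdc1 and sdd1 are the old members from your question):

mdadm --stop /dev/md125
mdadm --assemble --update=uuid /dev/md126 /dev/sdc1 /dev/sdd1
# the file system on it then needs a new UUID as well,
# e.g. tune2fs -U random (ext4) or xfs_admin -U generate (XFS)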

If you do not wish to keep the array for the old drives around, --stop the wrong array then --zero-superblock its drives.
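
A minimal sketch of that path, using the device names from your question (double-check with blkid or mdadm --examine before wiping anything, since --zero-superblock is irreversible):

mdadm --stop /dev/md125                        # stop the array assembled from the old drives
mdadm --zero-superblock /dev/sdc1 /dev/sdd1    # wipe the RAID metadata on the old members
mdadm --assemble --run /dev/md125 /dev/sde1    # start the new array degraded from the surviving new drive
# if /dev/sdf1 is really dead, zero its superblock too; otherwise re-add it after testing the disk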

You can also re-create the array; however, do not expect the data on those drives to be available afterwards. That only works under the right conditions, see Should I use mdadm --create to recover my RAID?

  • Thank you for your response. I made the mistake of leaving the two old drives in the server without doing anything to them. Can I erase the data on the old drives, then reboot, so that the RAID automatically reads the metadata from the drives that still have the information and returns to normal? Or, if I go the update-the-UUID route, how do I update /dev/md125 so that it points to my new drives? I want my /dev/md125 to revert to the new drives and have no connection to the old ones. Once it is restored, I will remove the two old drives from the server. Commented Nov 27, 2024 at 13:20

In addition to the excellent answer @frostschutz gave, you may also have an issue if your initrd image is out of date. Any time mdadm.conf is modified, it's usually a good idea to rebuild the initrd image, since it contains an embedded copy of that file so the RAID arrays can be brought up before the root file system is mounted. For Red Hat derived systems like CentOS, you'll want to run:

dracut -f

And for Debian derived distributions like Ubuntu, you'd want to run:

update-initramfs -u -k all

Also, you should remove unnecessary parameters from mdadm.conf such as spares=, but I would keep metadata=, as it tells mdadm where to find the metadata block, which can be in one of three different locations on a partition. Finally, any time a disk is removed from a RAID array, it's generally a good idea to run mdadm --zero-superblock on it so it won't be re-detected later with out-of-date information.
