Recently, I replaced a device in each of two Btrfs RAID-1 filesystems because the underlying disks needed to be replaced.
I did it like this:
mount -o noatime,degraded /dev/sda3 /mnt/tmp
btrfs fi show /dev/sda3
btrfs replace start -B 1 /dev/nvme0n1p3 /mnt/tmp
btrfs replace status /mnt/tmp
btrfs fi show /dev/sda3
btrfs fi show /mnt/tmp
btrfs scrub start -B /mnt/tmp
mount -o noatime /dev/sda3 /mnt/tmp
ls /mnt/tmp
btrfs fi show /mnt/tmp
umount /mnt/tmp
That means:
- I mounted the device because btrfs replace only seems to support replacing a device on a mounted filesystem
- I checked with btrfs fi show what device id is missing (1 in this example)
- I thus replaced the missing device 1 with the nvme0n1p3 device on the new disk
- the final show looked good and the scrub didn't complain
- and umount and mount (without degraded) after the procedure worked fine
- no errors were reported by these commands or in the kernel log
However, now that I've removed the old leg (i.e. /dev/sda3), the filesystem can't be mounted anymore:
mount -o noatime,degraded /dev/nvme0n1p3 /mnt/tmp
mount: /mnt/tmp: wrong fs type, bad option, bad superblock on /dev/nvme0n1p3, missing codepage or helper program, or other error.
The mount fails, and the kernel log shows:
Dec 22 09:41:34 BTRFS info (device nvme0n1p3): allowing degraded mounts
Dec 22 09:41:34 BTRFS info (device nvme0n1p3): disk space caching is enabled
Dec 22 09:41:34 BTRFS info (device nvme0n1p3): has skinny extents
Dec 22 09:41:34 BTRFS warning (device nvme0n1p3): devid 2 uuid f9c9c081-0fdc-4b61-8329-c1addb51e3fe is missing
Dec 22 09:41:34 BTRFS error (device nvme0n1p3): failed to read chunk root
Dec 22 09:41:34 BTRFS error (device nvme0n1p3): open_ctree failed
So my expectations were that:
- a btrfs replace command is sufficient to replace a missing device on a RAID-1 btrfs filesystem
- especially, btrfs replace would copy all data/metadata from the remaining to the newly added device
Since this didn't work, I'm no longer sure whether a btrfs replace is always sufficient.
Does a btrfs replace perhaps need to be followed by an explicit balance?
For example by something like this?
btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft /mnt/tmp
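Whether a soft convert like the one above has anything to do depends on whether any block group currently uses a profile other than RAID1. One rough way to check is to filter the output of btrfs filesystem df. The following is a sketch under assumptions, not a verified diagnosis of this case: the helper name non_raid1_chunks is hypothetical, and it assumes the usual "Type, PROFILE: total=..., used=..." line format. GlobalReserve is skipped because it is always reported as single.

```shell
# Hypothetical helper (not a btrfs-progs command): reads the output of
# `btrfs filesystem df` on stdin and prints "<type> <profile>" for every
# block group type whose profile is not RAID1.
non_raid1_chunks() {
    awk -F': ' '
        # Only consider lines shaped like "Data, RAID1: total=..., used=..."
        /^[A-Za-z]+, [A-Za-z0-9]+: / {
            split($1, parts, ", ")   # parts[1] = type, parts[2] = profile
            if (parts[1] != "GlobalReserve" && parts[2] != "RAID1")
                print parts[1], parts[2]
        }'
}

# Possible usage on a mounted filesystem:
#   btrfs filesystem df /mnt/tmp | non_raid1_chunks
```

Empty output would mean that every listed block group type is already RAID1, in which case a convert with the soft filter should find nothing to rewrite.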
Additional info:
- The overall objective was to replace both legs of the RAID-1 Btrfs filesystems, i.e. to do the replacements in two steps: first the left leg, then the right leg
- Replacements done on Fedora 33 (kernel 5.8.18-300.fc33.x86_64 and btrfs-progs-5.7-5.fc33.x86_64)
- Btrfs filesystems were created on Ubuntu 20.04
- btrfs rescue chunk-recover didn't help
Now btrfs fi show reports:
btrfs fi show /dev/nvme0n1p3
warning, device 2 is missing
warning, device 2 is missing
bad tree block 934674432, bytenr mismatch, want=934674432, have=0
ERROR: cannot read chunk root
Label: none uuid: 1c1a03db-38c2-4b08-a2ec-47d200f98b0a
Total devices 2 FS bytes used 196.62MiB
devid 1 size 1.00GiB used 758.38MiB path /dev/nvme0n1p3
*** Some devices missing
I don't know why the warning is printed twice.
Second Example
On the same system, replacing a missing disk in another Btrfs RAID-1 filesystem failed in a similar way.
Replacement procedure:
mount -o noatime,degraded /dev/mapper/new-root-1 /mnt/tmp
btrfs fi show /mnt/tmp
btrfs replace start -B 1 /dev/mapper/new-root-0 /mnt/tmp
journalctl -fk
btrfs fi show /mnt/tmp
btrfs scrub start -B /mnt/tmp
umount /mnt/tmp
Mount fails after the other leg is removed:
mount -o noatime,degraded /dev/mapper/new-root-0 /mnt/tmp
mount: /mnt/tmp: wrong fs type, bad option, bad superblock on /dev/mapper/new-root-0, missing codepage or helper program, or other error.
Logged errors during mount:
Dec 22 09:57:12 BTRFS info (device dm-1): allowing degraded mounts
Dec 22 09:57:12 sos.lru.li kernel: BTRFS info (device dm-1): disk space caching is enabled
Dec 22 09:57:12 BTRFS info (device dm-1): has skinny extents
Dec 22 09:57:12 BTRFS warning (device dm-1): devid 2 uuid 3093e508-17e0-4f5c-af13-642954e6fd9b is missing
Dec 22 09:57:12 BTRFS warning (device dm-1): devid 2 uuid 3093e508-17e0-4f5c-af13-642954e6fd9b is missing
Dec 22 09:57:12 BTRFS info (device dm-1): bdev (efault) errs: wr 0, rd 8, flush 0, corrupt 0, gen 0
Dec 22 09:57:12 BTRFS warning (device dm-1): chunk 69823627264 missing 1 devices, max tolerance is 0 for writable mount
Dec 22 09:57:12 BTRFS warning (device dm-1): writable mount is not allowed due to too many missing devices
Dec 22 09:57:12 BTRFS error (device dm-1): open_ctree failed
Here, the missing-device warning is even repeated four times:
btrfs fi show /dev/mapper/new-root-0
warning, device 2 is missing
warning, device 2 is missing
warning, device 2 is missing
warning, device 2 is missing
Label: none uuid: 3e861d70-9a98-402d-8bbc-ddec6f869433
Total devices 2 FS bytes used 62.81GiB
devid 1 size 231.67GiB used 65.01GiB path /dev/mapper/new-root-0
*** Some devices missing
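The output above declares 2 total devices but lists only one devid line. A small sketch that derives the number of missing devices from such output could look like this. The function name missing_device_count is hypothetical, and the sketch assumes that the input describes a single filesystem in the format shown above.

```shell
# Hypothetical helper (not a btrfs-progs command): reads the output of
# `btrfs filesystem show` for ONE filesystem on stdin and prints how many
# devices are missing, i.e. "Total devices N" minus the number of devid lines.
missing_device_count() {
    awk '
        $1 == "Total" && $2 == "devices" { total = $3 }
        $1 == "devid"                    { present++ }
        END { print total - present }'
}

# Possible usage after a replace has finished:
#   btrfs filesystem show /mnt/tmp | missing_device_count
```

A result other than 0 right after a replace would suggest that the filesystem still references a device that is not attached.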
mkfs.btrfs --data raid1 --metadata raid1 /dev/... /dev/.... In contrast to the other commands, I don't have the mkfs commands recorded in a history file anymore. However, I have some notes that feature that form, and I even created one filesystem with the --mixed switch, which makes --metadata mandatory. (If it is not specified, the command fails.) Since I wasn't sure whether metadata also defaults to raid1 if one only specifies --data raid1, I explicitly specified --metadata raid1 as well.
sudo btrfs filesystem usage /mnt/tmp. It should show you how the data/metadata is distributed on each device.