I received a warning during bootup that a hard drive has failed.

My first thought was it's probably one of the two older HDDs I have configured as a software RAID1 array.

So as soon as the boot finished I opened a terminal window and checked /proc/mdstat — sure enough it's only showing one drive.

But when I opened the "Disks" GUI utility (on Linux Mint - MATE) to find out which drive had failed, it reports both RAID members are OK and healthy.

When I run mdadm --examine on one of the drives it indicates that both drives are active in the array, but when I run it on the other drive it reports the array as "A." (i.e., one drive active, one missing).

What's extra odd is that /proc/mdstat only shows /dev/sdc1, so my assumption is that /dev/sdb1 has failed(?). But when I run --examine on each drive, the array shows as "AA" (both drives active) when I examine /dev/sdb1 and as "A." (one drive missing) when I examine /dev/sdc1, so that makes me wonder if maybe /dev/sdc1 is actually the one that's failed?

Or am I just misinterpreting this output altogether? Is the array actually perfectly fine with two active drives, and just showing one drive in /proc/mdstat for some reason?

Here's the output from /proc/mdstat and from --examine:

me@myhost:~$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sdc1[0]
      976630336 blocks super 1.2 [2/1] [U_]

unused devices: <none>

me@myhost:~$ sudo mdadm --examine /dev/sdb1
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 77c520b4:61c7fbfc:c6747e05:39d9497e
           Name : myhost:0
  Creation Time : Tue Apr 26 15:28:37 2016
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953260976 sectors (931.39 GiB 1000.07 GB)
     Array Size : 976630336 KiB (931.39 GiB 1000.07 GB)
  Used Dev Size : 1953260672 sectors (931.39 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=261864 sectors, after=304 sectors
          State : clean
    Device UUID : c2a9f4c7:a5d1e0a1:b194b9b5:f4f5e15d

    Update Time : Tue Feb 21 20:14:46 2023
  Bad Block Log : 512 entries available at offset 264 sectors
       Checksum : a5274f16 - correct
         Events : 2622


   Device Role : Active device 1
   Array State : AA ('A' == active, '.' == missing, 'R' == replacing)

me@myhost:~$ sudo mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 77c520b4:61c7fbfc:c6747e05:39d9497e
           Name : myhost:0
  Creation Time : Tue Apr 26 15:28:37 2016
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1953260976 sectors (931.39 GiB 1000.07 GB)
     Array Size : 976630336 KiB (931.39 GiB 1000.07 GB)
  Used Dev Size : 1953260672 sectors (931.39 GiB 1000.07 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=262064 sectors, after=304 sectors
          State : clean
    Device UUID : 9a573ffb:6749773a:966affd9:f3415a64

    Update Time : Sun Jul  7 14:23:08 2024
       Checksum : 9e59e6d9 - correct
         Events : 322313


   Device Role : Active device 0
   Array State : A. ('A' == active, '.' == missing, 'R' == replacing)

me@myhost:~$ sudo mdadm --examine --scan
ARRAY /dev/md/0  metadata=1.2 UUID=77c520b4:61c7fbfc:c6747e05:39d9497e name=myhost:0

2 Answers

/dev/sdb1's data is out of date: its metadata was last updated over a year ago.

You can see this from the Update Time:

    Update Time : Tue Feb 21 20:14:46 2023
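
A quick way to put the relevant fields from both members side by side (just a sketch; the device names are the ones from this question):

    # The member with the newer Update Time and the much higher Events count
    # holds the current data; here that is /dev/sdc1, so /dev/sdb1 is the stale one.
    sudo mdadm --examine /dev/sdb1 /dev/sdc1 | grep -E '/dev/|Update Time|Events|Array State'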
  • Thanks, so just to confirm — that seems to indicate /dev/sdb has failed? Commented Jul 9, 2024 at 3:28
  • @1337ingDisorder failed or kicked from the array for some other reason (cable problem, timeout, etc.). If you have system logs... from that long ago... might be worth checking. Commented Jul 9, 2024 at 11:57
  • Might also be worth doing a smartctl -a /dev/sdb and seeing if anything looks bad (eg reallocated sectors, or errors in the log). Commented Jul 9, 2024 at 14:24
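
For the record, a hedged sketch of those two checks (smartctl comes from the smartmontools package; the exact attribute names vary by drive, and journalctl -k only covers the current boot):

    # SMART overall health plus the attributes that usually betray a dying disk
    sudo smartctl -a /dev/sdb | grep -iE 'overall-health|Reallocated_Sector|Current_Pending|Offline_Uncorrectable|UDMA_CRC'

    # The drive's own error log, if it recorded any
    sudo smartctl -l error /dev/sdb

    # Kernel messages about the array and the drive from this boot
    sudo journalctl -k | grep -iE 'md0|sdb'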

Unfortunately mdadm does not update metadata for kicked drives anymore. That's how you get this confusing situation where mdadm --examine of the bad drive somehow looks good...

Even mdadm itself does not know whether a drive failed in the past. If no other drive declares it as missing in its own metadata, mdadm might start the array from a previously failed drive, and suddenly your filesystem travels back in time.
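
For what the kernel currently believes about the running array (as opposed to what each member's superblock claims), a quick check is:

    # Shows the array state (clean vs. degraded), the active members,
    # and which slot is reported as removed/missing.
    sudo mdadm --detail /dev/md0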

When looking at mdadm --examine, you have to consider the output as a whole across all members, not just a single drive. Check Update Time, Events, and Array State as reported by the other drives.

Since comparing mdadm --examine output across multiple drives by eye is tedious, you can kind of cheat and use mdadm --examine /dev/... | sort to see more directly whether Array UUID, Update Time, State, and other values match across drives.
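
For example, a hypothetical variant of that trick for the two members in this question (uniq -c just makes mismatches stand out):

    # Lines that are identical on both members get a count of 2;
    # anything with a count of 1 (Update Time, Events, Array State, ...) differs.
    sudo mdadm --examine /dev/sdb1 /dev/sdc1 | sort | uniq -c | sort -rn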

When a drive fails, you are supposed to react in a timely manner. If you are running mdadm in monitor mode and have a working mail system, mdadm should have sent you a failure notification mail as well.
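
A minimal monitoring setup along those lines might look like this (the address is illustrative; Debian-based systems such as Mint keep the config in /etc/mdadm/mdadm.conf):

    # In /etc/mdadm/mdadm.conf: where failure alerts should be mailed
    MAILADDR you@example.com

    # Send a TestMessage alert for every array to confirm mail delivery works
    sudo mdadm --monitor --scan --oneshot --test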
