
I have several Linux VMs on VMware + SAN.

What happened

A problem occurred on the SAN (a failed path), so for some time there were I/O errors on the Linux VMs' drives. By the time the path failover was done, it was too late: every Linux machine had decided that most of its drives were no longer "trustworthy" and had set them as read-only devices. The drives backing the root filesystems were affected as well.

What I tried

  • mount -o rw,remount / without success,
  • echo running > /sys/block/sda/device/state without success,
  • dug into /sys to find a solution without success.

What I may have not tried

  • blockdev --setrw /dev/sda
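For what it's worth, a sketch of how that attempt would look (device name assumed; I can't confirm it clears whatever state this particular failure leaves behind):

    blockdev --getro /dev/sda     # prints 1 if the kernel has marked the whole device read-only
    blockdev --setrw /dev/sda     # try to clear the read-only flag at the block layer
    mount -o remount,rw /         # then retry the remount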

Finally...

I had to reboot all my Linux VMs. The Windows VMs were fine...

Some more info from VMware...

The problem is described here. VMware suggests increasing the Linux SCSI timeout to prevent this problem from happening.
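For reference, the guest-side knob is the per-device SCSI command timeout exposed under /sys. A sketch of raising it to the commonly recommended 180 seconds (the device name, the udev rule file name and the vendor match are assumptions, not something from this page):

    # one-off, not persistent across reboots
    echo 180 > /sys/block/sda/device/timeout

    # persistent variant via a udev rule, e.g. /etc/udev/rules.d/99-scsi-timeout.rules
    ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="VMware*", RUN+="/bin/sh -c 'echo 180 > /sys$DEVPATH/device/timeout'"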

The question!

However, when the problem does eventually happen, is there a way to get the drives back into read-write mode once the SAN is back to normal?

    I have been able to get a disk back to read-write with mount -o remount /mountpoint on a real system. Perhaps that would work inside a VM too. Commented Nov 2, 2013 at 17:08
  • Thanks, but I already tried that, and it didn't work... I've edited my question accordingly. Commented Nov 2, 2013 at 17:35

4 Answers


We have had this problem here a couple of times, usually because the network went down for an extended period. The problem is not that the filesystem is read-only, but that the disk device itself is marked read-only; there was no option here other than to reboot. Increasing the SCSI timeout will work for transient glitches such as a path failover, but it won't help with a 15-minute network outage.
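A quick way to see which layer is stuck (device name assumed; a sketch, not a guaranteed diagnosis):

    findmnt -o TARGET,OPTIONS /        # filesystem layer: "ro" among the options means a read-only mount
    cat /sys/block/sda/ro              # block layer: 1 means the disk device itself is marked read-only
    cat /sys/block/sda/device/state    # SCSI layer: normally "running"; may show "offline" after path loss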

  • Hmm, this is a Linux kernel limitation then. It deserves a bug report... Commented Mar 20, 2014 at 14:34
  • Did you try blockdev --setrw /dev/sda? I edited my question accordingly. Commented Apr 8, 2014 at 17:37
  • That's a new one to me. Hopefully I'll never see the problem again but I'll try this when it does happen. Thanks. Commented Apr 10, 2014 at 12:34

From the man page of mount:

   errors={continue|remount-ro|panic}
              Define the behavior  when  an  error  is  encountered.   (Either
              ignore  errors  and  just mark the filesystem erroneous and con‐
              tinue, or remount the filesystem read-only, or  panic  and  halt
              the  system.)   The default is set in the filesystem superblock,
              and can be changed using tune2fs(8). 

So you should mount your VM's filesystems with errors=continue instead of errors=remount-ro.

mount -o remount,errors=continue /
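As the man page excerpt notes, the default error behaviour lives in the filesystem superblock, so it can also be changed persistently with tune2fs (a sketch; the device name is a placeholder):

    tune2fs -l /dev/sda1 | grep -i 'errors behavior'   # show the current default
    tune2fs -e continue /dev/sda1                      # make "continue" the superblock default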

I've had this happen on a RHEL system when rebooting/reconfiguring the attached SAN. What worked for me was to deactivate the volume group and its logical volumes, and then reactivate them.

vgchange -a n vg_group_name
lvchange -a n vg_group_name/lv_name

Then you must reactivate them.

vgchange -a y vg_group_name
lvchange -a y vg_group_name/lv_name

Then just try to remount everything with mount -a.
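To confirm the volumes actually came back active, something like this should do (a sketch):

    lvs -o vg_name,lv_name,lv_attr   # the 5th character of lv_attr is "a" when the LV is active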

  • Probably doesn't work for the root / filesystem, which was the problematic fs in my case... Commented Aug 10, 2017 at 13:55
  • Not sure. Hopefully this will help someone fighting with production SAN issues like I have been. Commented Aug 10, 2017 at 20:18

Having run test cases with a test VM on an NFS datastore that I intentionally disabled, I haven't found anything that worked. The blockdev command didn't work, and the vgchange/lvchange commands refuse to operate on a mounted root filesystem.

At this point, the best option seems to be to set errors=panic in /etc/fstab so the VM just fails hard.
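A sketch of what that could look like (the UUID is a placeholder and the sysctl file name is arbitrary; the kernel.panic setting is an optional extra so the VM reboots itself a few seconds after the panic instead of sitting at a halted console):

    # /etc/fstab
    UUID=xxxx-xxxx  /  ext4  defaults,errors=panic  0 1

    # /etc/sysctl.d/99-panic-reboot.conf
    kernel.panic = 10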
