
One of the cool things about Btrfs on Linux is that its per-block checksumming lets it correct bit rot, provided it has redundant data to repair from. I can get redundant data by setting up a RAID1 with two disks. However, can I also get redundant data to protect against bit rot on a single disk?

I see that btrfs has a DUP option for metadata (-m dup) that stores two copies of the metadata on each drive. However, the documentation says that DUP is not available for data (i.e. -d dup is rejected). Is there a good way around this? Partition a single disk into two equal parts and RAID1 them together?
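To be concrete, here is a sketch of the "two partitions, one disk" idea I have in mind (/dev/sdX is a placeholder; this would destroy any data on the disk):

```shell
# Split one disk into two equal partitions and build a btrfs
# RAID1 across them, so every block is stored twice on the disk.
parted /dev/sdX --script mklabel gpt \
    mkpart half1 0% 50% \
    mkpart half2 50% 100%

# Both data and metadata mirrored across the two partitions:
mkfs.btrfs -d raid1 -m raid1 /dev/sdX1 /dev/sdX2
```

The obvious downsides are halved capacity and double seeks on a spinning disk, which is why I'm asking whether there's a cleaner way.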

Alternatively, is there another simple way to get file-system-level error detection and correction on Linux (something like an automatic Parchive for file systems)?

(I'm not interested in answers suggesting that I use two drives.)

EDIT: I did find this, a FUSE filesystem that mounts files with error correction as normal files. That said, it's a little hack/proof of concept that someone put together in 2009 and hasn't really touched since.

3 Answers


Btrfs supports duplicated data blocks if you enable mixed block groups:

mkfs.btrfs --mixed --metadata dup --data dup /dev/<device>

EDIT: Note that there is a patch that allows this without using mixed mode. Following that thread from Nov. 2015, it appears the change is being added to the mainline btrfs code.
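Assuming that patch has landed in your version of btrfs-progs, the non-mixed invocation would presumably look like this (device name is a placeholder; this reformats the device):

```shell
# DUP for both data and metadata on a single device,
# without mixed block groups (requires a btrfs-progs that
# accepts -d dup, per the patch mentioned above):
mkfs.btrfs -m dup -d dup /dev/<device>

# After mounting, verify the profiles actually in use:
btrfs filesystem df /mnt
# expect lines like "Data, DUP" and "Metadata, DUP"
```

With DUP, a scrub (btrfs scrub start /mnt) can repair a block whose checksum fails by reading the second copy.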

  • --mixed is not recommended above 1GB filesystems. Commented Sep 17, 2019 at 16:04

What about partitioning the drive with, e.g., LVM into, say, 10 partitions and RAID-5-ifying them at the block level? Wouldn't this spend only about 10% of the capacity (one partition's worth of parity) on redundancy for error correction?

  • It would also give you bad performance. Commented Aug 29, 2019 at 21:47
  • For storing files that aren't read/written often, that's a really creative idea! Commented Sep 19, 2022 at 18:51
  • RAID5 is able to: 1. Recreate lost data from one data member. 2. Detect corruption on one member (but not be able to tell you which one is corrupted). ... as such you could argue that it doesn't offer any meaningful error correction in the context that the author is talking about. Commented Mar 27, 2023 at 9:09
  • A big issue with RAID5 these days is that recovering a failed disk actually destroys the other "healthy" disks very often. It's especially prevalent with spinning disks over 2TB. A restore is basically guaranteed to destroy at least one other disk if each disk is 10TB spinning. I've actually seen 3 disks fail in sequence on a RAID5 restore, causing nearly a week of downtime. Source: my own experience (though I have seen some write-ups about it). Unsure how SSD is affected. Commented Apr 25, 2023 at 3:52
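The same idea can be sketched with mdadm instead of LVM (a judgment call on my part; /dev/sdX is a placeholder and this destroys existing data). Note the caveat from the comments: md RAID5 alone cannot tell which member holds silently corrupted data, so you would still want a checksumming filesystem on top to identify bad reads.

```shell
# Split one disk into 10 equal partitions...
parted /dev/sdX --script mklabel gpt
for i in $(seq 1 10); do
    parted /dev/sdX --script mkpart "p$i" "$(( (i - 1) * 10 ))%" "$(( i * 10 ))%"
done

# ...and assemble software RAID5 across them (~10% goes to parity):
mdadm --create /dev/md0 --level=5 --raid-devices=10 \
    /dev/sdX1 /dev/sdX2 /dev/sdX3 /dev/sdX4 /dev/sdX5 \
    /dev/sdX6 /dev/sdX7 /dev/sdX8 /dev/sdX9 /dev/sdX10

# A checksumming filesystem on top detects which blocks are bad:
mkfs.btrfs /dev/md0
```

Every write now hits the same physical disk up to 10 times (data stripes plus parity), which is the performance problem the first comment alludes to.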

Similar to Btrfs's DUP profile for data, ZFS allows you to store multiple copies of each data block via the copies property, set with zfs set:

zfs set copies=2 users/home

See Storing Multiple Copies of ZFS User Data

When a block is accessed, regardless of whether it is data or meta-data, its checksum is calculated and compared with the stored checksum value of what it "should" be. If the checksums match, the data are passed up the programming stack to the process that asked for it; if the values do not match, then ZFS can heal the data if the storage pool provides data redundancy (such as with internal mirroring), assuming that the copy of data is undamaged and with matching checksums. It is optionally possible to provide additional in-pool redundancy by specifying copies=2 (or copies=3 or more), which means that data will be stored twice (or three times) on the disk, effectively halving (or, for copies=3, reducing to one third) the storage capacity of the disk. Additionally some kinds of data used by ZFS to manage the pool are stored multiple times by default for safety, even with the default copies=1 setting.

https://en.wikipedia.org/wiki/ZFS#Data_integrity
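Putting it together for the single-disk case in the question, a minimal walkthrough might look like this (pool name, dataset name, and /dev/sdX are placeholders):

```shell
# Single-disk pool; no mirror, so redundancy comes only from copies=.
zpool create tank /dev/sdX

# Dataset whose data blocks are stored twice on the disk:
zfs create tank/home
zfs set copies=2 tank/home

# Confirm the property took effect:
zfs get copies tank/home

# A scrub reads everything, verifies checksums, and repairs any
# block whose second copy is still intact:
zpool scrub tank
zpool status tank
```

As the quoted passage notes, copies=2 halves the usable capacity of the affected datasets, and it only applies to data written after the property is set.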
