I’ve recently read this post:
tl;dr: it basically says that the error rate of a RAID array is tied to the hardware, and you need disks with a lower URE rate (e.g. 1 in 10^15 rather than 1 in 10^14) to make a RAID more stable.
We run a RAID 5 with these details:
- 6 × 2 TB drives = 10 TB usable
- URE rate < 1 in 10^14 bits (per drive)
Based on my understanding of the article above, I’d like to check my numbers with someone else:
- 10^14 bits gives me about 12,500 GB read, on average, per URE
- 12,500 GB / 10,000 GB (10 TB) = 1.25
Does that mean that if one of my drives fails (I’ve lost 2 in the last 3 months) and I replace it, I have a 1 in 1.25 chance of hitting a URE on one of my other drives during the rebuild?
I’m no expert, so please feel free to tear this one apart and help me learn.
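Here’s the same arithmetic as a quick Python check (the variable names are just mine, for illustration):

```python
bits_per_ure = 1e14                  # spec: < 1 URE per 10^14 bits read
gb_per_ure = bits_per_ure / 8 / 1e9  # bits -> bytes -> GB: 12,500 GB (12.5 TB)
array_gb = 10_000                    # 10 TB usable across the array

print(gb_per_ure)             # 12500.0
print(gb_per_ure / array_gb)  # 1.25
```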
Everything on all of the remaining disks has to be read in order to reconstruct the data on the replacement drive.
So you are reading 5 × 2 TB = 10 TB.
10^14 is 100 trillion bits, or 12.5 trillion bytes.
10/12.5, then, seems to be correct: you expect roughly 0.8 UREs during the rebuild, which is where the often-quoted “approximately 80% chance” comes from. (Strictly speaking, 0.8 is the expected number of UREs, not a probability; the actual chance of at least one URE is somewhat lower, but it’s close enough to illustrate the risk.)
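For the statistics: the standard (simplified) model treats every bit read as an independent trial with error probability 10^-14. The chance of at least one URE in 8×10^13 bits is then 1 − (1 − 10^-14)^(8×10^13) ≈ 1 − e^-0.8 ≈ 55%. A minimal sketch of that calculation, assuming that independence holds (real drives only approximate it):

```python
import math

URE_RATE = 1e-14          # spec: < 1 unrecoverable read error per 10^14 bits
bits_read = 5 * 2e12 * 8  # five surviving 2 TB drives read in full = 8e13 bits

# Naive figure: the expected number of UREs during the rebuild
expected_ures = bits_read * URE_RATE  # 0.8 -- the "80%" usually quoted

# Probability of at least one URE, treating every bit as an independent
# Bernoulli trial at the quoted rate
p_at_least_one = 1 - math.exp(bits_read * math.log1p(-URE_RATE))

print(expected_ures)   # 0.8
print(p_at_least_one)  # ~0.55, i.e. about a 55% chance the rebuild hits a URE
```

So the naive 80% figure overstates the true probability (~55%), but either way the odds of a clean RAID 5 rebuild at this size are not good.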