Michael Graff asked:
When creating a Linux software RAID device as a raid10 device, I am confused about why it must be initialized. The same question applies to raid1 or raid0, really.
Ultimately most people would put a file system of some sort on top of it, and that filesystem should not assume any state of the disk’s data. In a raid1 or raid10 setup, each write goes to all N mirrors. There should be no reason whatsoever for a raid10 to be initialized up front, as the mirrors will come into agreement over time as data is written.
I can understand why for a raid5/6 setup where there is a parity requirement, but even then it seems like this could be done lazily.
Is it just so people feel better about it?
Remember that RAID 1 is a mirror, and that RAID 10 is a stripe of mirrors.
The question is, on which disk in each mirror is the data valid? In a freshly created array, this cannot be known, as the disks may have different data.
Remember also that RAID operates at a very low level; it knows nothing of filesystems or whatever data might be stored on the disk. There might not even be a filesystem in use.
Thus, initialization in these arrays consists of the data from one disk in each mirror being copied as-is to the other disk.
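The copy described above can be sketched as a toy model. This is only an illustration: the disks are plain Python lists, and `initialize_mirror` is a hypothetical helper name, not anything from mdraid or a RAID controller.

```python
# Toy model: each "disk" is a list of block values (hypothetical data).
# Two brand-new disks usually hold different leftover contents.
disk_a = [0x11, 0x22, 0x33, 0x44]
disk_b = [0x11, 0xAA, 0x33, 0xBB]


def count_mismatches(first, second):
    """Count the blocks on which the two disks disagree."""
    return sum(1 for a, b in zip(first, second) if a != b)


def initialize_mirror(source, target):
    """Copy every block of source onto target as-is.

    Nothing here interprets the data; like RAID itself, the copy has no
    idea whether a block belongs to a filesystem. Returns the number of
    blocks that actually had to change.
    """
    changed = 0
    for index, value in enumerate(source):
        if target[index] != value:
            target[index] = value
            changed += 1
    return changed


mismatches_before = count_mismatches(disk_a, disk_b)
blocks_copied = initialize_mirror(disk_a, disk_b)
mismatches_after = count_mismatches(disk_a, disk_b)
```

Before the copy, a read of the same block could return different values depending on which disk served it; afterwards both halves of the mirror agree on every block.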
This also means that the array is safe to use from the moment of creation, because every new write already goes to both disks of a mirror, and the initialization can run in the background; most RAID controllers (and Linux mdraid) have an option for this, or do it automatically.
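A rough sketch of why use during a background initialization is safe, again as a toy model rather than how mdraid is actually implemented: `write` and `resync_step` are hypothetical names, and a real implementation also has to coordinate writes with the copy in progress, which this sketch sidesteps by running the steps one at a time.

```python
def write(mirror, block, value):
    """A normal array write: the value goes to every disk in the mirror."""
    for disk in mirror:
        disk[block] = value


def resync_step(mirror, position):
    """Copy one block from the first disk to the others.

    Returns the position of the next block to copy.
    """
    source = mirror[0]
    for disk in mirror[1:]:
        disk[position] = source[position]
    return position + 1


# Two disks with different leftover contents (hypothetical data).
mirror = [[1, 2, 3, 4], [9, 9, 9, 9]]
position = 0

write(mirror, 3, 70)                      # write ahead of the resync position
position = resync_step(mirror, position)  # background copy handles block 0
write(mirror, 0, 50)                      # write behind the resync position

# Let the background copy run to completion.
while position < len(mirror[0]):
    position = resync_step(mirror, position)

disks_agree = mirror[0] == mirror[1]
```

In this model, blocks written by the user are identical on both disks immediately, whether the copy has reached them or not, and the background pass takes care of every block that was never written.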
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.