A 6-drive RAID0 has only a 32% survival probability.

I don't buy the 32% survival rate.

Of course. Accepting this would be mathematical buffoonery.

Let's say there's about a 1 in 10,000 chance that a single drive will suddenly die in such a way as to affect a RAID0 stripe set prior to or within the manufacturer's stated specifications. I would contend it's actually much lower than that myself, more like 1 in 100,000, since manufacturer specifications are extremely conservative, but we'll say 1:10,000 for now. All you need do is multiply that by the number of drives you're using. So with 6 drives in use (no matter how they are configured) there is a 6 in 10,000 or 1:1666.67 chance your RAID data will be lost due to drive failure. Again, I think it's more like 1:100,000, but even if it's just slightly over half that, say about 1:60,000, then with six drives in the system it would be about a 1 in 10,000 chance of ruin. That seems about right to me.
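For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes drive failures are independent and that every drive has the same failure chance p over the period in question; both are simplifying assumptions, and the function name is made up for illustration. The exact chance that at least one of n drives fails is 1 - (1 - p)^n, and n times p is a close approximation when p is small.

```python
def array_loss_probability(p_drive: float, n_drives: int) -> float:
    """Chance that at least one of n independent drives fails.

    A RAID0 set is lost if any single member fails, so this is
    also the chance of losing the whole stripe set.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


if __name__ == "__main__":
    p = 1 / 10_000  # the per-drive figure used in the post above
    n = 6
    exact = array_loss_probability(p, n)
    approx = n * p  # the "6 in 10,000" shortcut
    print(f"exact:  {exact:.8f}")
    print(f"approx: {approx:.8f}")
```

For small per-drive odds the shortcut and the exact figure differ only slightly; the gap grows as p gets larger.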

And of course, no matter what you actually believe the odds are, they are almost completely eliminated just by having a backup system like Time Machine employed. ;)
 
Of course. Accepting this would be mathematical buffoonery.

As I had qualified, the aforementioned failure rates (and calculator I linked to) are based off of published research by Google - who has captured real world statistics of consumer drives, and the 32% number was also based off of a proposed 3-year life of the array.
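To see what per-drive figure a 32% array survival implies, the same relation can be run backwards. This is only a sketch: it assumes six independent, identical drives and a RAID0 set that dies with its first failed member, and the function name is hypothetical. It is not taken from the linked calculator or the Google paper.

```python
def implied_drive_survival(array_survival: float, n_drives: int) -> float:
    """Per-drive survival s such that s ** n_drives == array_survival."""
    return array_survival ** (1.0 / n_drives)


# 32% array survival over the proposed 3-year life, 6 drives
s = implied_drive_survival(0.32, 6)
print(f"per-drive survival over the period: {s:.3f}")
print(f"per-drive failure over the period:  {1 - s:.3f}")
```

Under those assumptions, a 32% array survival corresponds to each drive surviving the period with a probability of roughly 0.83.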

You should be aware of that research since you pasted their graph data a few posts up (but didn't cite the work).

Failure Trends in a Large Disk Drive Population
 
A fun thing to consider is that disk drives increase in capacity but not in their error rates, which are somewhat constant. This will surprise many people with a large RAID 5 disk set when they must rebuild parity. If a drive fails, every bit in the RAID set must be read without error to recalculate the RAID set parity. If the odds of getting an error from the disk are one in a million million (1 in 10^12), you should calculate the maximum size of a RAID set that one could use and still be assured the RAID set can regenerate parity.
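The calculation suggested above can be sketched like this. It assumes the one-in-a-million-million figure is a per-bit read error rate and that errors are independent, neither of which the post states outright; the function names and the 99% target are made up for illustration.

```python
import math

ERROR_RATE = 1e-12  # "one in a million million", assumed here to be per bit read


def clean_read_probability(total_bytes: float, error_rate: float = ERROR_RATE) -> float:
    """Chance of reading total_bytes with no error at all."""
    bits = total_bytes * 8
    # (1 - error_rate) ** bits, computed via logs to stay accurate for tiny rates
    return math.exp(bits * math.log1p(-error_rate))


def max_bytes_for(target_probability: float, error_rate: float = ERROR_RATE) -> float:
    """Largest read size that still has target_probability of being error free."""
    bits = math.log(target_probability) / math.log1p(-error_rate)
    return bits / 8


if __name__ == "__main__":
    for tb in (1, 5, 10, 60):
        p = clean_read_probability(tb * 1e12)
        print(f"{tb:>2} TB read: {p:.4%} chance of no error")
    gb = max_bytes_for(0.99) / 1e9
    print(f"largest read with a 99% clean chance: {gb:.2f} GB")
```

Pass a different `error_rate` to try other figures, for example one taken from a drive's data sheet.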

Soon there will be 5TB disks, then 10, and using the HAMR, maybe 60TB disks. Of course, if they don't improve error rates, one won't be able to read the bits back without error.

I am switching to ZFS for archiving data. ZFS solves many problems, and the drives are even OS independent.
 
A fun thing to consider is that disk drives increase in capacity but not in their error rates, which are somewhat constant. This will surprise many people with a large RAID 5 disk set when they must rebuild parity. If a drive fails, every bit in the RAID set must be read without error to recalculate the RAID set parity. If the odds of getting an error from the disk are one in a million million (1 in 10^12), you should calculate the maximum size of a RAID set that one could use and still be assured the RAID set can regenerate parity.

Soon there will be 5TB disks, then 10, and using the HAMR, maybe 60TB disks. Of course, if they don't improve error rates, one won't be able to read the bits back without error.

I am switching to ZFS for archiving data. ZFS solves many problems, and the drives are even OS independent.

I considered ZFS.
Since I use Time Machine instead of redundant drives (as in RAID1, 5, 10, 50, etc.), parity is less of an issue.
 
Your data might have more hidden errors than you think.

From the CERN/IT study:

Data integrity
Bernd Panzer-Steindel, CERN/IT Draft 1.3 8. April 2007
We have established that low level data corruptions exist and that they have several origins. The error rates are at the 10^-7 level, but with complicated patterns. To cope with the problem one has to implement a variety of measures on the IT part and also on the experiment side. Checksum mechanisms have to be implemented and deployed everywhere. This will lead to additional operational work and the need for more hardware.
 
A backup is not something that is used instead of RAID. RAID is not a backup.

Some use RAID that way, I don't.
I have no need to "rebuild" my array.
I use RAID0 for the performance qualities of it.
We are somewhat off topic, and have hijacked the OP's thread.
I apologize OP.
 
A fun thing to consider is that disk drives increase in capacity but not in their error rates which are somewhat constant.

But they do increase in their ability to deal with errors and in the ability to self-rectify. This actually makes newer higher density drives offset favorably. ;)

I am switching to ZFS for archiving data. ZFS solves many problems, and the drives are even OS independent.

Ya ZFS and RAID-Z are nice. Too bad OS X doesn't implement it... I mean what the heck, they implement a bunch of other freeware... :p


I considered ZFS.
Since I use Time Machine instead of redundant drives (as in RAID1, 5, 10, 50, etc.), parity is less of an issue.

A backup is not something that is used instead of RAID. RAID is not a backup.

He said instead of redundancy, which is a part of many RAID levels and is a backup-like feature. Redundancy as in RAID5 and 6 is implemented in order to protect against drive failure. A backup scheme like Time Machine includes that and also protection against accidental deletion and file corruption, adds roll-back-like features, and oh so much more. Time Machine is an order of magnitude better than redundancy for small systems (anything that can fit on or under your desk) and itself includes redundancy.




Of course. Accepting this would be mathematical buffoonery.
As I had qualified, the aforementioned failure rates (and calculator I linked to) are based off of published research by Google

It doesn't matter to me who the buffoons work for... ;)


 
The redundancy that RAID provides is in no way a backup-like feature. RAID is designed for speed and/or uptime, not as a substitute for backups.

No. There are two distinct aspects of RAID. One is striping and the other is redundancy. The redundancy aspect of it offers almost no speed increase at all. Even just using common sense with no detailed knowledge at all should tell you that much. There are RAID levels with only redundancy, only striping, and combinations of both.

The redundancy of RAID is just as I said, a backup mechanism. And a very lame and limited backup at that. Because it's a hardware level backup it only ensures hardware level integrity and not data integrity beyond that extent. Thus if file corruption occurs (for example) for almost any reason, it offers no remedy or recourse. Same with accidental deletion, malware affected files, the effects of rogue or poorly coded utilities, and software bugs that may affect file or volume structure, etc. A timed iterative software backup, on the other hand, does cover all those and more. Don't confuse yourself with the terminology. We can correctly call what Time Machine does "redundancy" as well. ;)
 
The redundancy of RAID is just as I said, as a backup mechanism. And a very lame and limited backup at that. Because it's a hardware level backup it only ensures hardware level integrity and not data integrity beyond that extent.

If it's a backup as you say, then recover an earlier version of a file. You can't, because it's not a backup. RAID1 is designed for uptime and not for use as a backup. You can call it a backup but it is not; it is a protection against hardware failure.
 
If it's a backup as you say, then recover an earlier version of a file. You can't, because it's not a backup. RAID1 is designed for uptime and not for use as a backup. You can call it a backup but it is not; it is a protection against hardware failure.

It's a hardware level backup. You can restore a previous version of the drive - just prior to its demise. See how that works? It's a hardware level backup - which includes all the data contained on that drive, or drives in the case of RAID6. The purpose of the redundancy in the case of RAID is to be able to extrapolate and/or maintain a clone (i.e. backup) of any one (or two in RAID6 and all in RAID1) of the member drives.

Maybe it would be more clear if I said: It's a "Drive Backup" system and not a "File Backup" system? You can restore on the Drive Level (which includes all the files and structures).

When you think about it, ya, you can even restore files. Imagine the simple scenario where you have a file in RAID5 which gets partially trashed when one of the drives gets a frozen spindle bearing. The missing file(s) are automatically restored by calculating from the distributed parity on the other member units. Since it's a hardware level backup, however, there's almost no user control. It's not an iterative backup either, where previous versions can be pulled and so on. It's a backup in clone/mirror type terms, where the RAID5 file in question exists both in its original standard form and in a recalculable (from parity) form - but it's still a backup system - it's just not a very useful one - which is my main point here. I guess I should feel vindicated as it's so un-good, you don't even recognize it as being a backup. :)
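The parity recalculation described here can be illustrated with a toy example. This is a simplified sketch, not how any particular controller works: it uses one XOR parity block over three made-up data blocks, whereas a real RAID5 set rotates parity across all members and works on stripes. The function name and the block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: a block on each of three hypothetical data drives
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # what the parity block for this stripe would hold

# Drive 1 dies: rebuild its block from the surviving blocks plus parity
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Note that parity rebuilt this way reproduces whatever was last written, so a file that was deleted or corrupted before the drive died comes back deleted or corrupted.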


C'mon now, not a troll. He is for real, if you don't agree - post up. Calling someone a troll who is a consistent quality contributor is just wrong.

Hey, thanks for that man! That instills a good feeling - as truth usually does.
 