
Loa

Hello,

Last night my Carbon Copy Cloner backup threw up a flag: I have a read error on a single file. I checked with Finder and I can't open that file. I can get that file back from a back-up.

The volume is a software RAID0.

Should I worry about it?

Loa
 
I'd do a surface scan on all the drives and make sure you have backups. Software RAID-0 volumes can also have other errors of their own, so run First Aid and pray to all the gods you believe in and most you don't.

I like this utility, though I'm sure there are some free ones out there:
http://scsc-online.com/Scannerz.html
 
The proper approach is anywhere on the scale from abject panic to mostly ignoring it.

Disk drives do have random sectors that go bad, and almost all drives will do error recovery and remap the bad sector to a spare good sector. If the drive notices that it has trouble with a sector (having to re-read several times to get a correct copy), then once it gets a good read it can copy the data to a spare sector. You won't even see a read error.

If it can't get a good read, it marks the sector as bad, and you see the error. The sector remains bad until that sector is overwritten - at which time it will be revectored and the error will be gone.

The S.M.A.R.T. logs for the drive will show you the counters for these events. Here is a good description of how to interpret the numbers. Basically, a few revectored blocks are not a worry (but should be watched). More than a few, especially if the count is increasing, calls for "abject panic mode". The drive may be slowly failing, or headed for a sudden catastrophic failure.
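
As a rough illustration of pulling those counters out, here is a minimal Python sketch that parses the attribute table printed by `smartctl -A` (from smartmontools). The choice of attributes 5, 197 and 198 as the ones to watch is my assumption, and the `SAMPLE` text is made-up illustrative output, not data from the drive in this thread.

```python
# Attribute IDs commonly associated with remapped or unreadable sectors.
# Picking exactly these three is an assumption for this sketch.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# Illustrative text in the layout of `smartctl -A` output (invented values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   055   055   000    Old_age   Always       -       33012
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in WATCHED and raw.isdigit():
            counts[WATCHED[attr_id]] = int(raw)
    return counts


if __name__ == "__main__":
    for name, value in parse_smart_attributes(SAMPLE).items():
        print(name, value)
```

A non-zero pending or uncorrectable count would match the "one unreadable file" situation described above; what matters more is whether the numbers keep climbing.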

The S.M.A.R.T. "long test" does a complete surface scan and updates the error logs. Scanning the RAID-0 volume won't check any hidden metadata used by the RAID software, whereas the S.M.A.R.T. test hits every sector. A filesystem scan of the files will show you the existing files with bad blocks. Bad blocks that aren't currently inside a file will be corrected when a file is written onto those blocks. So if a raw disk scan shows bad blocks but a file scan doesn't, you don't have any corrupted files, but you need to watch the rate at which bad-block events happen.
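
The "filesystem scan of the files" part can be approximated with a few lines of Python. This is only a sketch: it reads every regular file under a folder and records the ones the OS refuses to read. The function name and the chunk size are my own choices, and permission errors will show up in the list alongside genuine read errors.

```python
import os


def find_unreadable_files(root, chunk_size=1024 * 1024):
    """Read every regular file under `root`; return [(path, error)] for failures."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real file contents matter here.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read to the end in chunks so large files don't fill memory.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                # An I/O error here is what a backup tool reports as a read error.
                bad.append((path, str(exc)))
    return bad


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, error in find_unreadable_files(target):
        print(path, "->", error)
```

Recently read files may be served from the OS cache rather than the disk, so a clean result here is weaker evidence than a clean S.M.A.R.T. long test.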

By the way, if your system has occasional (or frequent) hangs where it does nothing for 20 or 30 seconds and then continues - you might have a failing disk. Those 20-30 second hangs can be the disk doing the re-reads I described above.
 
Hello,

I'll give DriveDX a spin and will report back (probably on Tuesday).

Thanks!
 
Good luck - look at it as "a bad block or two a year isn't worrisome, but a couple a week is panic time".
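
To put a number on that rule of thumb, here is a small sketch that turns two readings of a bad-block counter into a per-week rate. The cut-offs (two per year, two per week) are my own reading of "a bad block or two a year" and "a couple a week", and the dates and counts in the example are invented.

```python
from datetime import date

# Assumed thresholds, expressed in events per week.
CALM_RATE = 2 / 52   # "a bad block or two a year"
PANIC_RATE = 2.0     # "a couple a week"


def events_per_week(earlier, later):
    """Each argument is (date, counter_value); return new events per week."""
    (day0, count0), (day1, count1) = earlier, later
    days = (day1 - day0).days
    if days <= 0:
        raise ValueError("the second reading must be taken on a later day")
    return (count1 - count0) * 7 / days


def classify(rate_per_week):
    """Map a weekly rate onto the scale from 'mostly ignore' to 'panic'."""
    if rate_per_week >= PANIC_RATE:
        return "panic"
    if rate_per_week <= CALM_RATE:
        return "not worrisome"
    return "watch closely"


if __name__ == "__main__":
    # Invented readings: 4 new bad blocks over two weeks.
    rate = events_per_week((date(2024, 1, 1), 0), (date(2024, 1, 15), 4))
    print(rate, classify(rate))
```

Taking a reading of the counters every week or so and keeping the numbers is enough to feed this.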

S.M.A.R.T. is a great thing. My HPE servers' warranties will replace drives based solely on a S.M.A.R.T. "predicted failure" diagnosis - they'll send a replacement drive even before any visible problem. When a RAID controller notices a "predicted failure", it will replace the drive with a hot spare and send me email to get a replacement.
 
The drives are 5 years old (WD blacks), just out of warranty, but they have been used a lot. If they die and I have to replace them, I won't be pissed: they've served their purposes.
 
Well, DriveDX won't even let me do the extended test. It finds the read error and stops. I have no way of knowing whether there is only one bad spot or more. A real-life copy shows only one, but I'd like to do a complete test...
 
I think that you need to get to the raw S.M.A.R.T. tests. Run the hardware level diagnostics without any confusion from software RAID drivers.

When I need to diagnose disk issues like this, I'll boot a standalone Linux from CD or a USB thumb drive. "Hiren's Boot CD" is one good one, or any recent Ubuntu installation disk. (Ubuntu will boot to a menu where you can choose "run from the CD" or "destroy everything and install Ubuntu". Take the first option.) I like Hiren, but it needs to be booted in legacy BIOS mode. Not sure how Apple's bastard old implementation of EFI works for this.
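
Once booted into a standalone Linux, the raw tests are usually driven with `smartctl` from smartmontools. A minimal sketch of the commands, wrapped in Python so they can be repeated for each member disk of the RAID: the device path `/dev/sdX` is a placeholder (your disks will have different names), the helper names are my own, and `smartctl` normally needs root.

```python
import subprocess


def smart_commands(device):
    """Build the smartctl invocations for one physical disk (not the RAID volume)."""
    return {
        # Start the extended (long) self-test; it runs inside the drive itself.
        "start_long_test": ["smartctl", "-t", "long", device],
        # Show the self-test log, including the first failing LBA if any.
        "self_test_log": ["smartctl", "-l", "selftest", device],
        # Show the attribute table with the reallocation counters.
        "attributes": ["smartctl", "-A", device],
    }


def run(command):
    """Run one command and return its standard output as text."""
    return subprocess.run(command, capture_output=True, text=True).stdout


if __name__ == "__main__":
    # Placeholder device name; replace it with each RAID member in turn.
    for label, command in smart_commands("/dev/sdX").items():
        print(label, ":", " ".join(command))
```

The long test can take hours on a multi-terabyte drive, so start it, leave the machine alone, and read the self-test log afterwards.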
 
In my experience, once you start getting errors, you should start looking to replace the drive. Then again, I'd prefer to replace early and ensure no data loss, instead of trying to recover what I can.
 
I booted in OS X 10.8 and ran DriveDX, which surprisingly worked a lot better on that OS. Something is definitely off with the drive, but I can't get a good idea of the severity of the problem. I'm backing up more regularly and planning on buying a 6TB Black to replace my 4TB RAID.

Loa
 