mattmower

macrumors regular
Original poster
Aug 12, 2010
119
19
Berkshire, UK
TLDR; How can I find out if I've just been unlucky and corrupted my disk somehow (e.g. we've had several storms recently, perhaps a bad write), or if there's something seriously wrong with it and it's going to fail totally.

I've run out of disk space in my Mac Pro. I decided to buy a new 3TB disk and move Time Machine to that, freeing up the 1.5TB disk it was using.

I installed the new disk, formatted it, then tried to copy the Backups.backupdb folder across. About halfway through, it died with an error in Finder. I confess I wasn't paying too much attention at the time.

So I started looking into how to do this without using Finder and came up with the suggestion to use Disk Utility to 'restore' the old disk to the new one. That failed with an error 254 "Could not validate source".

At that point I tried to use Disk Utility to verify the old disk. It came up with some "invalid node" errors in the catalog, said the disk was unrecoverable, and wouldn't remount it.

This morning I restarted the MP and it mounted the disk read-only, so I was able to copy off some files I wanted. I'm going to jettison the Time Machine backup and start again (coincidentally Backblaze just finished my first full backup).

But before I reformat the disk and start using it, I wondered what I could do that would actually test it, e.g. a full surface scan, to see if there's something more seriously wrong with it and I should be thinking about getting it replaced.

FWIW the SMART status is verified.

Any suggestions?

Kind regards,

Matt
 
TLDR; How can I find out if I've just been unlucky and corrupted my disk somehow (e.g. we've had several storms recently, perhaps a bad write), or if there's something seriously wrong with it and it's going to fail totally.
...
FWIW the SMART status is verified.
First, the SMART status isn't as smart as it should be. There appear to be issues that aren't covered by the indicator.

Did you just use Verify in Disk Utility? Or did you actually try the Repair option?
If you just did verify, try the Repair option.

What brand and model is the disk and how old is it?
 
Interesting. I installed smartmontools via Homebrew and it reports that SMART is supported but not turned on, for all 3 of my disks. I'm a little puzzled by that, and I'm not sure what to make of this:

Code:
Shaddam:~ matt$ smartctl -a /dev/disk2
smartctl 5.42 2011-10-20 r3458 [x86_64-apple-darwin11.3.0] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (Adv. Format)
Device Model:     WDC WD15EARS-00S8B1
Serial Number:    WD-WCAVY3675819
LU WWN Device Id: 5 0014ee 259cb0a99
Firmware Version: 80.00A80
User Capacity:    1,500,301,910,016 bytes [1.50 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Apr 28 14:16:36 2012 BST
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

SMART Disabled. Use option -s with argument 'on' to enable it.
Shaddam:~ matt$ man smartctl
Shaddam:~ matt$ smartctl -H /dev/disk0
smartctl 5.42 2011-10-20 r3458 [x86_64-apple-darwin11.3.0] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

SMART Disabled. Use option -s with argument 'on' to enable it.
Shaddam:~ matt$

I wonder why SMART is turned off, and whether there is any reason not to turn it on.
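
Going by the hint in the smartctl output above, switching SMART on and then asking the drive for a health verdict would look roughly like the sketch below. The `-s on`, `-H`, `-t long` and `-l selftest` options are standard smartctl flags; the `run` wrapper and the `DISK` / `DRY_RUN` variables are made-up scaffolding so that nothing touches the drive until DRY_RUN is set to 0. /dev/disk2 is simply the device node from the transcript and may differ on another machine.

```shell
#!/bin/sh
# Sketch only: DISK, DRY_RUN and run() are illustrative, not part of smartmontools.
DISK="${DISK:-/dev/disk2}"   # device node from the transcript above; adjust as needed
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run smartctl -s on "$DISK"         # enable SMART on the drive
run smartctl -H "$DISK"            # overall health self-assessment
run smartctl -t long "$DISK"       # start an extended self-test in the background
run smartctl -l selftest "$DISK"   # later on: read the self-test log for the result
```

The long self-test runs inside the drive itself and can take hours on a 1.5TB disk, so the last command is only useful once it has finished.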

Disk Utility was not able to repair the disk. It reported that there were invalid nodes in the catalog that could not be repaired. After that it wouldn't mount the disk either, although, as I mentioned, OS X mounted it read-only (with a warning about disk problems) after a restart.

I've since reformatted the disk, so at this point I am really trying to ascertain whether there's a fault with the disk itself (it's 18 months old and should still be under warranty, I think), or whether I just got unlucky with a corrupted file system.

Kind regards,

Matt
 
TLDR; How can I find out if I've just been unlucky and corrupted my disk somehow (e.g. we've had several storms recently, perhaps a bad write), or if there's something seriously wrong with it and it's going to fail totally.

Just assume that it is going to fail and replace it. Replacing a drive before it fails is quite cheap. If you wait until it fails it gets a lot more expensive. If you replace it now, you can copy all your data overnight when it is most convenient. If it fails, it might happen just after you started work on some really urgent project.
 
Just assume that it is going to fail and replace it. Replacing a drive before it fails is quite cheap.

Thanks, I do appreciate my options. It is worth it to me, if I can find a tool that will do it, to scan the disk before going through removing, RMA'ing, and replacing it.

That's the question: is there such a tool available?

Kind regards,

Matt
 
Not really. You can't scan in software for hardware issues if they are not present or not being reported. I usually use DiskWarrior and just rebuild the directory; if DW starts having slowdowns and malfunctions, I replace the drive. Other than that, you replace the drive when it stops working :(
Gnasher is right. Assume failure and use it for (extra) backups or media files until it breaks and you can RMA it, or forget the RMA and put it in a closet. Not worth it.
 
...That's the question: is there such a tool available?
...
You could always try a 3-pass erase on the drive and see whether the writes produce any errors.
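
From Terminal, an erase pass like that can be done with diskutil's secureErase verb. A rough sketch, assuming the disk is /dev/disk2 (a placeholder; check `diskutil list` first, because this wipes the whole disk). As far as I know level 0 is a single zero-fill pass and level 4 is the 3-pass erase, but running `diskutil secureErase` with no arguments should list the levels for your OS version, so confirm before trusting my numbers. The `run` wrapper and the `DISK`, `LEVEL` and `DRY_RUN` variables are made up, so the script only prints what it would do by default.

```shell
#!/bin/sh
# Sketch only: DISK, LEVEL, DRY_RUN and run() are illustrative scaffolding.
DISK="${DISK:-/dev/disk2}"   # placeholder device node; verify with `diskutil list`
LEVEL="${LEVEL:-4}"          # assumed: 0 = single zero pass, 4 = 3-pass erase
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run diskutil list                           # double-check which disk is which
run diskutil secureErase "$LEVEL" "$DISK"   # DESTROYS everything on $DISK
```

If the erase aborts part way with an I/O error, that is a fairly strong hint the drive itself is at fault.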

Just assume that it is going to fail and replace it. Replacing a drive before it fails is quite cheap. If you wait until it fails it gets a lot more expensive. If you replace it now, you can copy all your data overnight when it is most convenient. If it fails, it might happen just after you started work on some really urgent project.
All things considered, this is rather good advice.
 
Not really. You can't scan in software for hardware issues if they are not present or not being reported. I usually use DiskWarrior and just rebuild the directory; if DW starts having slowdowns and malfunctions, I replace the drive. Other than that, you replace the drive when it stops working :(
Gnasher is right. Assume failure and use it for (extra) backups or media files until it breaks and you can RMA it, or forget the RMA and put it in a closet. Not worth it.
Medium-length story about RMA:

I have eight Western Digital RE-4 disks in a RAID, and last week one of the drives failed upon startup. I was able to get it to rejoin the RAID after three or four attempts to pull and reinsert it, and it finally rebuilt and showed up as normal again. I shut down later, and restarted it the next day... and it failed again. I pulled that drive and put it in an external drive caddy (Voyager Q) where I could listen to it closely, and I heard a sort of steady scratching sound in the drive. After starting and stopping it a few times, I noticed the scratching sound would only last a couple of seconds, then go silent as the drive came up to speed. At that point, it booted normally. It seems like it doesn't like being cold or something, since it rejoined the RAID again, and I've left the RAID on ever since with no problems. I can reboot the system and it works fine... it just fails when it's cold after being shut down for a while.

I submitted an online RMA request with Western Digital, and it was accepted for advance RMA since the drive has a 5-year warranty. It should arrive this Monday, when I'll swap the drives and send the scratchy one back.

If your disk is from WD, they make it really simple. You just enter the serial number, and it tells you when the warranty runs out and everything. Took about 60 seconds to fill out the data fields, and I didn't have to talk to anybody or provide any receipts... no hassle at all. I just typed "drive fails to join RAID upon cold start / spin-up" and it was accepted for RMA. Just have to pay for return shipping, but that beats buying a new one.
 
Just for reference, it seems TechToolDeluxe (if you can find your AppleCare CD, I can't) has a surface scan option. Presumably TechToolPro also has one, but I wasn't willing to pay $40 to find out.

However, another option is to use the badblocks command from the e2fsprogs package (the ext2 filesystem utilities). You can find out more here.

I ran an erase pass on the disk last night writing zeros and there was no failure. I'm currently running a non-destructive read using badblocks to see what it has to say.

If the drive can't complete it or has significant bad blocks I'll RMA it. Otherwise I think I just got unlucky and a bad write messed up the file system.
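
For anyone wanting to try the same thing, a read-only badblocks pass looks roughly like this. It's a sketch under assumptions: /dev/disk2 is a placeholder for the disk under test, the `sudo` is there because reading the device node most likely needs root, and the `run` wrapper plus the `DISK` / `DRY_RUN` variables are invented so the script only prints the commands by default. The `-s` (progress) and `-v` (verbose) flags are standard badblocks options, and with neither `-n` nor `-w` given it only reads.

```shell
#!/bin/sh
# Sketch only: DISK, DRY_RUN and run() are illustrative scaffolding.
DISK="${DISK:-/dev/disk2}"   # placeholder device node; verify with `diskutil list`
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run diskutil unmountDisk "$DISK"   # safest with the disk's volumes unmounted
run sudo badblocks -sv "$DISK"     # read-only scan of every block, with progress

# Slower alternative: non-destructive read-write test that restores each block.
# Avoid -w unless the contents are expendable; it overwrites the whole disk.
# run sudo badblocks -nsv "$DISK"
```

A full pass over 1.5TB takes many hours, so it's something to leave running overnight.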
 
Sounds like a plan. Personally I don't trust TechTool (any version) even for .99, unless they rewrote the thing. A surface scan is nice but may not show HW issues even if they are present. I own a copy of Prosoft DriveGenius 3.x but really never use it. Their DataRescue product is top notch, so maybe that is a better alternative. At least the company has a good rep.
 
I own a copy of Prosoft DriveGenius 3.x but really never use it. Their DataRescue product is top notch. So maybe that is a better alternative. At least the company has a good rep.

Hrmm... I got DriveGenius as part of a bundle deal a couple of years back. The one time I tried to use it, the disk got screwed up. Perhaps coincidence, but I've never trusted it since.

After doing a full erase of the disk with zeros, I ran badblocks and it reported that the disk has no bad blocks. I'm keeping an eye on it, but I think the disk is okay and I got unlucky with the corrupted file system.

Matt
 
Uh oh. I never really run anything other than DiskWarrior. As I said, Prosoft is a great company, but anyone can produce a turd. I am not neurotic enough to just run drive utilities whenever, so testing this stuff takes me some time. I have to have some failures first, and then I want to use what I know works. Thanks for the DG experience; I got mine in a bundle as well. FWIW I have never had TechTool find anything wrong with a drive, even when it was in the process of screeching death. Nice blinking lights though :)
 
I'd replace the drive. You are assuming a drive is either good or bad, and that you can simply do a test and move on based on the results. But drives can be intermittently bad, or bad in ways that tests cannot detect.

I've had a drive in the past with intermittent problems. Everything would seem okay for a long time then it would go to hell. Format, restore from backup, thorough testing, everything seems fine again for a couple of weeks, then BLAMMO again. Maybe your time is cheap, but it was definitely not worth my time.
 
A tool can't detect a drive failure until the drive actually fails. And if the drive fails, it's too late.

Best to heed the above advice and just replace the drive if it's suspicious. And keep backups.
 