
cnstoll

Hi

I've got a 2010 Mac Pro and way too much data that I need a better way to store. It's basically all photos (RAW) and videos (compressed). Currently I'm using two internal 2TB mirror RAID arrays, but those are filling up so I need a new system.

I'm considering several options: an external RAID enclosure via eSATA, an external NAS enclosure, a Linux file server, or an internal solution like the transintl DX4 that lets you mount SSDs below your drive sleds.

Part of the reason I'm considering the DX4, and I suppose why some of this is so complicated, is that I also want to upgrade my primary drive to SSDs in a RAID-0 stripe configuration. I already back up my main volume to both a SuperDuper clone and a Time Machine drive.

Out of all the storage options I've listed, the only one I've actually tried is the Linux file server. I built a server a while ago to attempt to use for this purpose. It works pretty well, and it can certainly grow to the desired capacity, but I don't think it's the right way to go. The server's motherboard died about a month ago, and while I didn't lose any data, I did learn that I don't like playing sysadmin while my data is on the line. Currently everything on the server is mirrored elsewhere, so maybe this is the right solution to use as a backup volume for whatever my primary storage device is.

So basically I am looking for advice from anyone who has used some of these mass storage devices to help me figure out what would best suit my needs. So what are my needs?

I'm a developer by day and a photographer by night. I need a fast volume for doing current work (scratch) with Aperture and Xcode, and a massive storage volume for keeping my older projects and pictures. I use Aperture in a referenced library configuration, so the library itself lives on my main drive (hopefully an SSD soon) and the RAW files live on the RAID. If I get a large enough striped SSD then I would probably keep working files there and move them off to the RAID after editing. So performance on the storage system should be good, but it doesn't need to be outstanding. Anything over 90MB/s is probably fine.

I'm also looking for any advice on SSDs, on the DX4, and even any stories from people using Aperture in situations like these. This is a pretty complicated problem I think, because it will also need to include a backup or archival system. We all know that RAID != backup.

Anyways, thanks for reading. Merry Christmas and let me know if you can help.
 
Synology NAS. That is what I use for my file system storage. Fast enough for me on my wired network. I imagine there are some fw800/eSATA enclosures that may be suitable and faster as well.

I am not sold on SSD at this point -- the technology is too young and I think a better file system needs to be developed for it (especially for the Windows platform). Traditional magnetic disk is not the speed king, but it is fast and reliable (though because of the floods in Thailand, not as cheap as it once was). I now wish I had purchased more 3TB WD Green drives... Still, I have enough for now, just waiting for the prices to come down.

Happy Holidays!
 
I've got a 2010 Mac Pro and way too much data that I need a better way to store. It's basically all photos (RAW) and videos (compressed). Currently I'm using two internal 2TB mirror RAID arrays, but those are filling up so I need a new system.
Is this primary working data, or your backups?

I ask, as backups don't require speed (would allow you to use less expensive options). But if it's your primary data volume, it seems you could benefit from additional speed as well (where you may benefit from a hardware RAID controller running a parity based level, which will give you a nice balance of speed, available capacity, and redundancy).

I'm considering several options: an external RAID enclosure via eSATA, an external NAS enclosure, a Linux file server, or an internal solution like the transintl DX4 that lets you mount SSDs below your drive sleds.
Not enough information yet...

You have to be careful with these types of options, as most are software implementations of RAID, which isn't suitable for parity levels due to the lack of a solution for the write hole issue (solution is hardware).

Software wise, you can get around the write hole issue by using ZFS's RAID-Z or RAID-Z2 (may want to create this yourself on an inexpensive system and cheap SATA controller/s).

Part of the reason I'm considering the DX4, and I suppose why some of this is so complicated, is that I also want to upgrade my primary drive to SSDs in a RAID-0 stripe configuration. I already back up my main volume to both a SuperDuper clone and a Time Machine drive.
Stripe sets (RAID 0) only increase sequential throughput, not random access performance, which is what's needed for booting and loading applications.

So you'd be better off getting a single SSD that has fast random access performance (stripe sets can be a way of getting more capacity at a lower price, but that's not as much of an issue as it was just a year or two ago).

I need a fast volume for doing current work (scratch) with Aperture and Xcode, and a massive storage volume for keeping my older projects and pictures. I use Aperture in a referenced library configuration, so the library itself lives on my main drive (hopefully an SSD soon) and the RAW files live on the RAID. If I get a large enough striped SSD then I would probably keep working files there and move them off to the RAID after editing. So performance on the storage system should be good, but it doesn't need to be outstanding. Anything over 90MB/s is probably fine.
Use an inexpensive SSD for scratch (example).
 
Is this primary working data, or your backups?

I ask, as backups don't require speed (would allow you to use less expensive options). But if it's your primary data volume, it seems you could benefit from additional speed as well (where you may benefit from a hardware RAID controller running a parity based level, which will give you a nice balance of speed, available capacity, and redundancy).

I need a solution for both I suppose. My current solution to backups for my 2TB mirrors is to clone them to an external 2TB single drive. That obviously won't work for backing up a larger volume.

Is there any benefit to archiving data onto an offline volume of some sort? I'd still need to back it up even if it was "archived", so I'd still need the same overall capacity. I think I'd prefer to have all my data available at all times though, so that would involve one primary storage system and one backup storage system. Like you said, speed wouldn't be as big of a concern for the backup system.


Not enough information yet...

You have to be careful with these types of options, as most are software implementations of RAID, which isn't suitable for parity levels due to the lack of a solution for the write hole issue (solution is hardware).

Software wise, you can get around the write hole issue by using ZFS's RAID-Z or RAID-Z2 (may want to create this yourself on an inexpensive system and cheap SATA controller/s).

So I guess the main question I have here is about the reliability of hardware RAID solutions and recovery options in the event of a controller failure (not a drive failure). I'm currently leaning more in favor of software options such as Linux software RAID (RAID 10, for example, since that can be implemented in software) and the various "fake hardware" NAS devices running Linux, because that seems more stable in the event of device failure. If the hardware controller dies, and its RAID implementation was proprietary, what can you do to recover your data?

RAID 5 performance is better than RAID 10 though, and you lose less space to redundancy than with RAID 10, so it seems like the obvious choice. I'd just like to know more about the hardware.

Stripe sets (RAID 0) only increase sequential throughput, not random access performance, which is what's needed for booting and loading applications.

So you'd be better off getting a single SSD that has fast random access performance (stripe sets can be a way of getting more capacity at a lower price, but that's not as much of an issue as it was just a year or two ago).

Use an inexpensive SSD for scratch (example).

Right now my system drive with apps (130GB) and Aperture library (just metadata/thumbs, 100GB) is about 260GB. I've got several games installed so that increases the size of "apps" significantly. The main reason I wanted to stripe was cost, like you mentioned. But maybe just putting apps/system on one drive, and the Aperture library on its own drive, would make more sense. I'm open to suggestions here too.

Thanks.
 
I need a solution for both I suppose. My current solution to backups for my 2TB mirrors is to clone them to an external 2TB single drive. That obviously won't work for backing up a larger volume.
So long as a single disk backup is large enough to contain the data, it's fine. But once you get to the point the data is larger than any single drive, another solution needs to be used (whether it's multiple single disks, or something else <JBOD, aka concatenation, to RAID>).

But there are relatively inexpensive ways of doing this, so it's not that bad.

Cost wise, a simple eSATA controller and a Port Multiplier based enclosure running Green disks works fine (have the option of concatenation, or a software RAID level). Example kit (includes the eSATA card and enclosure), that only needs drives. Sans Digital also makes a 4 bay version if you don't need one this large.

Please note, that although they claim they're capable of RAID5, it's a software implementation, so skip that particular level (Disk Utility is fine, and suited to software implementations that don't have the write hole issue = 0/1/10).

So I guess the main question I have here is about the reliability of hardware RAID solutions and recovery options in the event of a controller failure (not a drive failure). I'm currently leaning more in favor of software options such as Linux software RAID (RAID 10, for example, since that can be implemented in software) and the various "fake hardware" NAS devices running Linux, because that seems more stable in the event of device failure. If the hardware controller dies, and its RAID implementation was proprietary, what can you do to recover your data?
If you replace it with the same model, or at least brand of RAID controller, it can usually pick it up (can read the partition tables on the drives). And I've seen this occur with Areca's in the past (upgrade, not DOA controller).

It's very rare for a hardware RAID controller to die though. Disk failures are what you'd run into most often.

And in an absolute disaster, that's what the backup is for (dead RAID card, dead drives <more than the fault tolerance limit of the level implemented>, or both).

RAID 5 performance is better than RAID 10 though, and you lose less space to redundancy than with RAID 10, so it seems like the obvious choice. I'd just like to know more about the hardware.
RAID 6 is also a better alternative to 10 these days in terms of performance (tested on the ARC 1880 series as faster than a 10 with the same member count), and capacity as well once you're over the minimum member count for both of those levels.
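
To put rough numbers on the capacity side of that, here's a quick sketch (2TB members are just an example, and real usable space is a bit lower after formatting):

# Usable capacity for common RAID levels, ignoring filesystem overhead.
def usable_tb(level, members, size_tb=2.0):
    if level == "RAID5":
        return (members - 1) * size_tb   # one member's worth of parity
    if level == "RAID6":
        return (members - 2) * size_tb   # two members' worth of parity
    if level == "RAID10":
        return (members // 2) * size_tb  # half the members are mirrors
    raise ValueError(level)

for n in (4, 6, 8):
    print(f"{n} members: RAID5 {usable_tb('RAID5', n)}TB | "
          f"RAID6 {usable_tb('RAID6', n)}TB | RAID10 {usable_tb('RAID10', n)}TB")

At 4 members, RAID 6 and 10 tie on usable space; from 6 members up, 6 pulls ahead of 10.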

In terms of the hardware, they're designed for server environments, including SAN (as mission critical as it gets). So they're designed quite well (minimum lifespan of 5 years running 24/7 is typical for server environments). As I said before, I've never had an Areca fail on me since I've used them (a little over 5 years now, and no signs of problems so far). I can't say that for other makes...

Right now my system drive with apps (130GB) and Aperture library (just metadata/thumbs, 100GB) is about 260GB. I've got several games installed so that increases the size of "apps" significantly. The main reason I wanted to stripe was cost, like you mentioned. But maybe just putting apps/system on one drive, and the Aperture library on its own drive, would make more sense. I'm open to suggestions here too.
I assume you mean SSD's here, going by the capacities listed. ;)

If however, this is a single mechanical drive that's been partitioned, SSD's would make a lot of sense here. Whether or not you stripe is up to you, but there's no advantage to it in terms of performance unless you're working with large files, which I don't think is the case (not sure how large each library file is under Aperture).

Partitioning SSD's makes me nervous due to the impact it has on wear leveling (fewer cells available per partition = increased write cycle frequency per partition).
 
Fellow Aperture User

I'm using a Mac Pro with two 3TB Hitachis in RAID 0 and two 2TB drives in the four internal drive bays, plus a 128GB boot drive in the second 5.25" drive bay. The Aperture library resides on the 6TB RAID. The application runs off the SSD along with the Adobe Creative Suite.

After using an SSD for a whole year I will never go back to a computer that doesn't have one as the primary boot drive. It saves so much time booting Photoshop. Aperture isn't greatly improved by the SSD.

I have 10 years' worth of Nikon RAW files on the 6TB RAID, with a 6TB Drobo backup (2 x 2TB, 2 x 1TB). I would only recommend the NAS as backup and never as a working drive.

The best purchase I've made to improve Aperture's performance is 32GB (4 x 8GB) of RAM. I've had no crashes, and larger libraries that used to brick the system no longer stall it.
 
I don't think putting Aperture on an SSD will make any difference. If I remember all the benchmark results correctly, IO is never a bottleneck.
 

I second the drobo for a backup system! Not a great working drive though.
 
So long as a single disk backup is large enough to contain the data, it's fine. But once you get to the point the data is larger than any single drive, another solution needs to be used (whether it's multiple single disks, or something else <JBOD, aka concatenation, to RAID>).

But there are relatively inexpensive ways of doing this, so it's not that bad.

Cost wise, a simple eSATA controller and a Port Multiplier based enclosure running Green disks works fine (have the option of concatenation, or a software RAID level). Example kit (includes the eSATA card and enclosure), that only needs drives. Sans Digital also makes a 4 bay version if you don't need one this large.

Please note, that although they claim they're capable of RAID5, it's a software implementation, so skip that particular level (Disk Utility is fine, and suited to software implementations that don't have the write hole issue = 0/1/10).

I like that idea for the backup system. A $200-300 enclosure for backup is definitely within my price range, and the ability to use a software solution for spanning or RAID 10 sounds great. Two questions about this setup:

1) With a port multiplier enclosure do all the drives show up as individual volumes to Disk Utility?

2) Can you expand a spanning (JBOD) array later if you want to add new disks for additional capacity, or would that involve rebuilding the array completely?

If you replace it with the same model, or at least brand of RAID controller, it can usually pick it up (can read the partition tables on the drives). And I've seen this occur with Areca's in the past (upgrade, not DOA controller).

It's very rare for a hardware RAID controller to die though. Disk failures are what you'd run into most often.

And in an absolute disaster, that's what the backup is for (dead RAID card, dead drives <more than the fault tolerance limit of the level implemented>, or both).

Ok, that makes sense. I agree that it's unlikely the controller would die, I just wanted to make sure I understood the risks and options if it were to happen.

RAID 6 is also a better alternative to 10 these days in terms of performance (tested on the ARC 1880 series as faster than a 10 with the same member count), and capacity as well once you're over the minimum member count for both of those levels.

In terms of the hardware, they're designed for server environments, including SAN (as mission critical as it gets). So they're designed quite well (minimum lifespan of 5 years running 24/7 is typical for server environments). As I said before, I've never had an Areca fail on me since I've used them (a little over 5 years now, and no signs of problems so far). I can't say that for other makes...

You're definitely turning me around to be more in favor of a hardware RAID 5 or 6 system. I looked at the Areca ARC 1880 and it looks good. I think this system (card + enclosure + drives) could be a little on the pricey side, but at least it would last as long as my Mac Pro will and have great performance. I do have two questions about this system though.

1) What sort of enclosure would you recommend to go along with the Areca RAID card?

2) Same question as above in terms of expansion. Can you start with 4 or 5 drives and add more later without rebuilding the array completely? That would definitely help in terms of cost/budget if I could add more capacity later (especially with the price of HDDs right now).

I assume you mean SSD's here, going by the capacities listed. ;)

If however, this is a single mechanical drive that's been partitioned, SSD's would make a lot of sense here. Whether or not you stripe is up to you, but there's no advantage to it in terms of performance unless you're working with large files, which I don't think is the case (not sure how large each library file is under Aperture).

Partitioning SSD's makes me nervous due to the impact it has on wear leveling (fewer cells available per partition = increased write cycle frequency per partition).

Right now my primary drive is actually just an unpartitioned 1TB volume where I'm only using 260GB. The logic there is trying to keep the data in the fastest sectors of the disk to improve performance slightly. I definitely want to move this data to an SSD in some form, but I'm not sure of the configuration of that yet.


On another note, one reason I was considering the DX4 was to stripe smaller SSDs for boot/apps/scratch on that, wired to the motherboard's controller, and use an internal RAID card to put 5 or 6 internal drives (using the optical bays) into a hardware RAID 5 array. The problem this could solve is the need for an external storage device for primary storage, relying solely on the RAID card instead. What do you think of this approach?

Thanks again!
 
The best purchase I've made to improve Aperture's performance is 32GB (4 x 8GB) of RAM. I've had no crashes, and larger libraries that used to brick the system no longer stall it.

I've got 13GB of RAM and so far Aperture hasn't maxed this out, even on 2000 picture projects with several hundred thousand images in my library. I agree that adding RAM to that did dramatically improve performance though.

I don't think putting Aperture on an SSD will make any difference. If I remember all the benchmark results correctly, IO is never a bottleneck.

I really think it would make a difference, because honestly I/O is the only bottleneck these days, especially on a Mac Pro. Whenever a project is loaded in Aperture it needs to grab a ton of metadata: small bits in its database for image names, edits and versions, thumbnails, previews, etc. None of these files is even a MB in size, but there can be thousands of them for a project. This is completely ignoring the actual master image files. To load the project from a library stored on a physical disk, the drive head would need to move back and forth across a large portion of the disk to track down all the small files. This is exactly the scenario where HDDs suck compared to SSDs, so I think putting the library on an SSD would probably help a lot. Having scratch there for the working projects' master images seems like it would also help.
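
Back-of-the-envelope, the seek-time side of that argument looks like this (a rough sketch only; the file count and latency figures are assumed typical values, not measurements):

# Rough cost of touching thousands of small library files, dominated by access latency.
files = 5000            # small database/thumbnail/preview files for one project (assumed)
hdd_latency = 0.012     # ~12 ms average seek + rotational latency for a 7200 RPM drive (assumed)
ssd_latency = 0.0001    # ~0.1 ms random read latency for a SATA SSD (assumed)
print(f"HDD: ~{files * hdd_latency:.0f} s spent on access latency alone")   # ~60 s
print(f"SSD: ~{files * ssd_latency:.1f} s")                                  # ~0.5 s

Caching and how Aperture actually batches its reads would shrink that gap in practice, so treat it as an upper bound.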
 
I really think it would make a difference, because honestly I/O is the only bottleneck these days, especially on a Mac Pro. Whenever a project is loaded in Aperture it needs to grab a ton of metadata: small bits in its database for image names, edits and versions, thumbnails, previews, etc. None of these files is even a MB in size, but there can be thousands of them for a project. This is completely ignoring the actual master image files. To load the project from a library stored on a physical disk, the drive head would need to move back and forth across a large portion of the disk to track down all the small files. This is exactly the scenario where HDDs suck compared to SSDs, so I think putting the library on an SSD would probably help a lot. Having scratch there for the working projects' master images seems like it would also help.

You might think that, but that may not actually be how it works, and it may not be random enough for an SSD to produce a significant difference. Export is CPU limited, for example. Obviously it's more sequential than what you're talking about, but a lot of people assume it's IO limited when it's not.

I'm just saying it's not as obvious as it might seem intuitively, and if there are benchmarks out there beyond import/export, that would be pretty helpful, otherwise you should do your own testing to prove to yourself that the SSD is doing something besides loading Aperture two seconds faster.

There was another big long thread about someone else having issues with his workflow, everyone pushed him to buy SSDs, and it made absolutely no difference. He worked through PS though, not Aperture.

If you decide to get an SSD, I believe most of your files will be incompressible, so you should be looking at non-SF-2200-based drives. That rules out OWC and most OCZ drives. Most people here recommend Samsung or Intel. Crucial is also a good option.
 
I like that idea for the backup system. A $200-300 enclosure for backup is definitely within my price range, and the ability to use a software solution for spanning or RAID 10 sounds great. Two questions about this setup:

1) With a port multiplier enclosure do all the drives show up as individual volumes to Disk Utility?

2) Can you expand a spanning (JBOD) array later if you want to add new disks for additional capacity, or would that involve rebuilding the array completely?
1. At first, Yes (before you've used Disk Utility to create a 10 or concatenated/spanned volume). Once you've done this however, it will appear as a single volume. :)

Keep in mind, that Lion still has some issues with RAID, and a few members have reported arrays suddenly showing up as broken (Disk Utility shows individual volumes). And others haven't. So if you're running Lion, it could be a mixed bag.

The safe solution is to roll back to 10.6.7 IMO (10.6.8 has instabilities with RAID as well). And pay careful attention to OS updates before you install them. Particularly when the OS is new.

2. In a software implementation, unfortunately you have to backup the data first, break the set, add the new disk, create a new one that includes the new disk/s, and finally restore the data from the backup.

In the case the volume is the backup, the data will remain on the primary RAID volume, so this won't be an issue unless the backup contains data that's previously been removed from the primary volume (i.e. archived data that has no other backup). In such a case, that data would need to be backed up before proceeding with expanding the JBOD volume.

Ok, that makes sense. I agree that it's unlikely the controller would die, I just wanted to make sure I understood the risks and options if it were to happen.
Not a problem at all.

In mission critical systems, duplicates are installed for fail-over (i.e. 2x RAID cards + identical array configurations in the system and mirrored together), including entire duplicate systems (i.e. 3 systems, 1 = primary, 2 = fail-over, and each system may contain duplicated arrays :eek:).

This is beyond nutz for what you're doing, but if that system is used for credit card transactions for example, it's necessary as they can't afford to have the system go down (HUGE mess if it did).

You're definitely turning me around to be more in favor of a hardware RAID 5 or 6 system. I looked at the Areca ARC 1880 and it looks good. I think this system (card + enclosure + drives) could be a little on the pricey side, but at least it would last as long as my Mac Pro will and have great performance. I do have two questions about this system though.

1) What sort of enclosure would you recommend to go along with the Areca RAID card?

2) Same question as above in terms of expansion. Can you start with 4 or 5 drives and add more later without rebuilding the array completely? That would definitely help in terms of cost/budget if I could add more capacity later (especially with the price of HDDs right now).
Keep in mind, the RAID system can be transferred from one system to another (just get the OS sorted, then load the drivers), and the array will be visible. Cards can last a long time if you get the correct one (= don't out-grow them too quickly, which is why getting the port count right is very important).

1. Basic type is a necessity; MiniSAS connections on the back (1x MiniSAS, aka SFF-8088, per 4x bays for a non-SAS Expander based enclosure). As per a specific brand and model, I recommend the Sans Digital TR8X (link has a photo of the rear), as it's a great value (includes the external cables needed to connect to SFF-8088 ports on the card).

If you're using an internal port, you'll need to get a special cable (here).

Pay attention to the cable length, as SATA disks are limited to 1.0 meters (specification). If you exceed this, you will have stability problems if you can even get the volume initialized. I can't stress how critical this is.

2. In the case of a hardware RAID card such as an Areca, Yes. :D Add the disks, launch ARCHTTP (uses the browser to access the card), and you tell it what to do. Once you've given it the correct information, it does it on its own once you check "Confirm Operation", then press the "Submit" button.

You can do it manually, and for large sets, this can be faster. But at least you have the option.

Another note, if you're using Online Expansion or Online Migration, you should only do one disk at a time, as attempting this with multiple disks can stress the existing HDD's out, and cause a total failure.

Right now my primary drive is actually just an unpartitioned 1TB volume where I'm only using 260GB. The logic there is trying to keep the data in the fastest sectors of the disk to improve performance slightly. I definitely want to move this data to an SSD in some form, but I'm not sure of the configuration of that yet.
Understandable. :) It's called short stroking BTW. ;)

But if you're trying to access more than a single partition simultaneously, you'll slow down. Not only is it trying to read more information, but the heads are moving a lot more = slows you down due to the partitioning (moving more sectors between reads).

So there are limitations.

This is why having separate disks is the first step in reducing bottlenecks (each request has its own disk and SATA port to transfer data). And since requests can be handled simultaneously (parallel), the overall performance is increased in these cases (performance of a single disk remains the same).

On another note, one reason I was considering the DX4 was to stripe smaller SSDs for boot/apps/scratch on that, wired to the motherboard's controller, and use an internal RAID card to put 5 or 6 internal drives (using the optical bays) into a hardware RAID 5 array. The problem this could solve is the need for an external storage device for primary storage, relying solely on the RAID card instead. What do you think of this approach?
Not the best approach IMO.
  1. Scratch can increase the wear on the unused cells of the entire set.
  2. No increased performance figures for random access performance.
  3. The on-board SATA ports have a bandwidth limit set by Intel in the ICH (I/O Controller Hub, which is the actual semiconductor that contains those SATA ports as a means of keeping some bandwidth available for other controllers, such as USB and Ethernet). BTW, the bandwidth cap for all SATA ports in the ICH is ~660MB/s, and SSD's haven't performed as well in real-world testing by other MR members.
Better to put the OS/applications and scratch on separate SSD's (an inexpensive unit is fine for scratch = keep it at $100, and consider it a disposable disk). At least the scratch data won't cause any wear on the OS/applications volume, which is mostly read rather than written. ;)

It should also be cheaper these days, and that's probably the most important reason of all. :D :p
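
For a sense of scale on the ICH cap mentioned in point 3, quick arithmetic (a sketch; the per-SSD sequential figure is an assumed typical number for 3.0Gb/s drives):

ich_cap = 660   # MB/s, approximate total shared across all ICH SATA ports
ssd_seq = 250   # MB/s, assumed sequential throughput of a single 3.0Gb/s SSD
for n in (2, 3, 4):
    demand = n * ssd_seq
    print(f"{n} striped SSDs want ~{demand} MB/s ->",
          "hits the ICH ceiling" if demand > ich_cap else "fits under the cap")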

BTW, the Areca's you're looking at can also boot in a MP once you've flashed the firmware's BOOT.BIN to EBC.BIN (located on the disk as well as their site). Bit more work, but it will run the faster 6.0Gb/s SSD's (ICH is only 3.0Gb/s, which can matter with more recent SSD's).

This is where cheaper comes in if you bear with me...

40GB 3.0Gb/s SSD is ~$100
240GB 6.0Gb/s SSD is $359

Option 1: 1x of each, you're at $459 for 280GB (total capacity).
Option 2: say 4x of the 40GB's, you're at $400 for 160GB of total capacity.

Now ask yourself...
  • Which one is cheaper in terms of cost per GB?
  • Which way is faster?
I'll leave you to figure it out... :eek: :p
 
You might think that, but that may not actually be how it works, and it may not be random enough for an SSD to produce a significant difference. Export is CPU limited, for example. Obviously it's more sequential than what you're talking about, but a lot of people assume it's IO limited when it's not.
I'm only aware of this with Photoshop, as I seem to recall Aperture is threaded.
 
If you decide to get an SSD, I believe most of your files will be incompressible, so you should be looking at non-SF-2200-based drives. That rules out OWC and most OCZ drives. Most people here recommend Samsung or Intel. Crucial is also a good option.

You're right, I should do a bit more research to see where the performance benefit would actually be impactful. Obviously import/export wouldn't be affected, but I think there could be other improvements to be had elsewhere. I'll post a review of it with stats if I actually move forward with this.

That's a great point about the incompressible files too. I'll look into that as well.
 
1. At first, Yes (before you've used Disk Utility to create a 10 or concatenated/spanned volume). Once you've done this however, it will appear as a single volume. :)

Keep in mind, that Lion still has some issues with RAID, and a few members have reported arrays suddenly showing up as broken (Disk Utility shows individual volumes). And others haven't. So if you're running Lion, it could be a mixed bag.

The safe solution is to roll back to 10.6.7 IMO (10.6.8 has instabilities with RAID as well). And pay careful attention to OS updates before you install them. Particularly when the OS is new.

Hrm, that's certainly troubling. I haven't been thrilled with Lion, but I don't really want to downgrade either. I guess I'll do some more research here.

2. In a software implementation, unfortunately you have to backup the data first, break the set, add the new disk, create a new one that includes the new disk/s, and finally restore the data from the backup.

In the case the volume is the backup, the data will remain on the primary RAID volume, so this won't be an issue unless the backup contains data that's previously been removed from the primary volume (i.e. archived data that has no other backup). In such a case, that data would need to be backed up before proceeding with expanding the JBOD volume.

Well, that kind of stinks, but at least that's not a problem for the primary RAID (I read ahead) ;)

Not a problem at all.

In mission critical systems, duplicates are installed for fail-over (i.e. 2x RAID cards + identical array configurations in the system and mirrored together), including entire duplicate systems (i.e. 3 systems, 1 = primary, 2 = fail-over, and each system may contain duplicated arrays :eek:).

This is beyond nutz for what you're doing, but if that system is used for credit card transactions for example, it's necessary as they can't afford to have the system go down (HUGE mess if it did).

Yeah totally, primary RAID plus backup volume should be fine for my needs. This does bring up offsite backup though. My current system is just another copy of the volume at the folks' house. That gets complicated when the offsite copy won't fit on one drive, obviously. Any thoughts on cloud backup solutions like Backblaze/Crashplan?

Keep in mind, the RAID system can be transferred from one system to another (just get the OS sorted, then load the drivers), and the array will be visible. Cards can last a long time if you get the correct one (= don't out-grow them too quickly, which is why getting the port count right is very important).

1. Basic type is a necessity; MiniSAS connections on the back (1x MiniSAS, aka SFF-8088, per 4x bays for a non-SAS Expander based enclosure). As per a specific brand and model, I recommend the Sans Digital TR8X (link has a photo of the rear), as it's a great value (includes the external cables needed to connect to SFF-8088 ports on the card).

If you're using an internal port, you'll need to get a special cable (here).

Pay attention to the cable length, as SATA disks are limited to 1.0 meters (specification). If you exceed this, you will have stability problems if you can even get the volume initialized. I can't stress how critical this is.

2. In the case of a hardware RAID card such as an Areca, Yes. :D Add the disks, launch ARCHTTP (uses the browser to access the card), and you tell it what to do. Once you've given it the correct information, it does it on its own once you check "Confirm Operation", then press the "Submit" button.

You can do it manually, and for large sets, this can be faster. But at least you have the option.

Another note, if you're using Online Expansion or Online Migration, you should only do one disk at a time, as attempting this with multiple disks can stress the existing HDD's out, and cause a total failure.

Well, that's pretty awesome. I would just add more drives as the prices drop. I guess the only downside here is it pretty effectively ties you to a Mac Pro, unless they make mini-SAS cards for laptops.

Understandable. :) It's called short stroking BTW. ;)

Thanks ;)

But if you're trying to access more than a single partition simultaneously, you'll slow down. Not only is it trying to read more information, but the heads are moving a lot more = slows you down due to the partitioning (moving more sectors between reads).

So there are limitations.

This is why having separate disks is the first step in reducing bottlenecks (each request has its own disk and SATA port to transfer data). And since requests can be handled simultaneously (parallel), the overall performance is increased in these cases (performance of a single disk remains the same).

To be clear, I'm not partitioning anything ;) But yeah, separate SSDs or a single unpartitioned SSD sounds like the way to go here.

Not the best approach IMO.
  1. Scratch can increase the wear on the unused cells of the entire set.
  2. No increased performance figures for random access performance.
  3. The on-board SATA ports have a bandwidth limit set by Intel in the ICH (I/O Controller Hub, which is the actual semiconductor that contains those SATA ports as a means of keeping some bandwidth available for other controllers, such as USB and Ethernet). BTW, the bandwidth cap for all SATA ports in the ICH is ~660MB/s, and SSD's haven't performed as well in real-world testing by other MR members.
Better to put the OS/applications and scratch on separate SSD's (an inexpensive unit is fine for scratch = keep it at $100, and consider it a disposable disk). At least the scratch data won't cause any wear on the OS/applications volume, which is mostly read rather than written. ;)

It should also be cheaper these days, and that's probably the most important reason of all. :D :p

BTW, the Areca's you're looking at can also boot in a MP once you've flashed the firmware's BOOT.BIN to EBC.BIN (located on the disk as well as their site). Bit more work, but it will run the faster 6.0Gb/s SSD's (ICH is only 3.0Gb/s, which can matter with more recent SSD's).

This is where cheaper comes in if you bear with me...

40GB 3.0Gb/s SSD is ~$100
240GB 6.0Gb/s SSD is $359

Option 1: 1x of each, you're at $459 for 280GB (total capacity).
Option 2: say 4x of the 40GB's, you're at $400 for 160GB of total capacity.

Now ask yourself...
  • Which one is cheaper in terms of cost per GB?
  • Which way is faster?
I'll leave you to figure it out... :eek: :p
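
Working out the arithmetic on those two options (using the example prices quoted above):

# Cost per GB for the two example SSD options quoted above.
options = {
    "1x 240GB 6.0Gb/s + 1x 40GB 3.0Gb/s": (459, 280),   # ($ total, GB total)
    "4x 40GB 3.0Gb/s striped":            (400, 160),
}
for name, (cost, gb) in options.items():
    print(f"{name}: ${cost} for {gb}GB = ${cost / gb:.2f}/GB")

So the mixed option comes out cheaper per GB, and per your earlier point about stripe sets, the single 240GB 6.0Gb/s drive is also the better bet for random access, which is what matters for boot/apps.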

Ok, we are getting closer but I still have some questions here. Keeping system and scratch on separate SSDs makes sense. But now we are talking about not using the internal SATA ports at all. For non-striped volumes it seems like the performance would be OK, even though it's 3.0Gb/s and not 6.0Gb/s. The reason I was thinking about the DX4 was to try to keep all the primary storage internal, not performance on the SSDs. I am leaning more towards the external system though, so I could just use the normal bays for the SSDs. Unless there's anything wrong with that?
 
Any thoughts on cloud backup solutions like Backblaze/Crashplan?
Do so if possible, as that will protect your data in circumstances where both the primary and backup locations are destroyed, such as natural disasters (fire and flood for example).

Well, that's pretty awesome. I would just add more drives as the prices drop. I guess the only downside here is it pretty effectively ties you to a Mac Pro, unless they make mini-SAS cards for laptops.
You'd be able to stuff the card in any system with PCIe slots though. The only issue you'd run into in that instance, would be the file system. And there are ways around that (use some form of FAT, such as exFAT, or try a file system utility).

As per laptops, not so much (*might* be possible via Thunderbolt in the future if Areca makes such a product).

But now we are talking about not using the internal SATA ports at all. For non-striped volumes it seems like the performance would be OK, even though it's 3.0Gb/s and not 6.0Gb/s. The reason I was thinking about the DX4 was to try to keep all the primary storage internal, not performance on the SSDs. I am leaning more towards the external system though, so I could just use the normal bays for the SSDs. Unless there's anything wrong with that?
  1. You can still use the SATA ports in the ICH for at least the scratch disk, and potentially the OS/applications disk as well (6.0Gb/s SSD's can't saturate 3.0Gb/s for random access performance).
  2. Another thing, is there's a way to use the HDD bays with an internal port on a RAID card (adapter kit).
 
As per laptops, not so much (*might* be possible via Thunderbolt in the future if Areca makes such a product).

I think I'll just not worry about it for now. If I ever move to a laptop I'll just keep my pro as a very well designed file server :)

  1. You can still use the SATA ports in the ICH for at least the scratch disk, and potentially the OS/applications disk as well (6.0Gb/s SSD's can't saturate 3.0Gb/s for random access performance).
  2. Another thing, is there's a way to use the HDD bays with an internal port on a RAID card (adapter kit).

So here's the real question: is this even worth it? With all the other costs going into this I'm already looking at close to $2000 for a SAS card and enclosure, eSATA enclosure, and drives. How much do I stand to gain by switching the backplane over to 6Gb/s versus just using 3Gb/s SSDs on the existing ports?


Ok, here's my summary for the proposed system pieced together from our conversation.

Fast and Small Storage
Boot/Apps: 120-240GB 3.0Gb/s SSD
Aperture Library: 120-240GB 3.0Gb/s uncompressed SSD (or perhaps a regular HDD if testing shows the SSD doesn't help much)
Scratch: 40GB 3.0Gb/s SSD

Internal Backup
1TB Cloned Backup for apps/system/aperture
1TB Time Machine volume

Primary Storage
Areca ARC-1880x
Sans Digital 8 Bay SAS SATA Enclosure
4-5 7200RPM 2TB SATA drives in a RAID 5/6 array

Backup Storage
eSATA JBOD Enclosure
eSATA card
3-4 5900RPM 2TB SATA drives in a spanning configuration


Thoughts? Comments?

Remaining questions:

1) What drives do you recommend using with the SAS/SATA enclosure? Can they be any old 7200 RPM SATA drives or do we need to get fancy here?

2) I read your explanation of why the Areca 1880 cards are superior to basic ROC cards and I found it to be very good. I was particularly concerned with how dire the situation sounded if things went south while using lesser cards and what the recovery options are then. I was wondering if you could go into some more detail on what's involved with those cards and explicitly how the 1880 makes things better. I'll understand if you prefer to respond to this question in that thread, but I didn't want to make the conversation too disjointed by asking this there.

3) Finally, noise and heat. I'm not too picky here but I'd like some info on what to expect. Currently I still have my old PowerMac G5 sitting right next to my Mac Pro and I don't mind that noise level at all. How much worse than an additional tower is two enclosures half full of SATA drives? What about temperature?

4) UPS. I have a UPS right now but it sucks so I think I should get a better one. Any recommendations for one given the above system?

Thanks so much by the way. I've been thinking about this problem for like a year now and I think I'm really close to having a solution that I am comfortable with :)
 
Wow that was a fun read! I think your proposed system sounds like a good idea.

As an aside, you asked about cloud solutions. My work has us all connected to Crashplan, and my personal opinion is that unless you have a high speed internet connection at your desk, it's not worth it. I'm pushing 3 TB to my Crashplan account and it seems that it took weeks to do the initial push, and then days to make subsequent backups. Plus the java-based daemon that handles it makes my machine (Mac Pro quad core) grind to a halt.

The 3TB on that Mac Pro is a Promise DS4600 (if I remember correctly) using 4 1TB drives in RAID5. This thing is kinda meh. Over FW400/FW800 it's pretty slow. Using the eSATA port attached to a si3132 SATA card, it doesn't mount when booted. I need to unplug it and plug it back in every time I reboot. :( But in 2+ years, it has been running fine as the primary storage for a workgroup file server.

At home, I'm using a PC running OpenSolaris for a personal data storage server.
 
I think I'll just not worry about it for now. If I ever move to a laptop I'll just keep my pro as a very well designed file server :)
Don't worry about the laptop... Worst case, use the 1G Ethernet ports to network the laptop and the desktop together as a means of transferring/sharing files.

So here's the real question: is this even worth it? With all the other costs going into this I'm already looking at close to $2000 for a SAS card and enclosure, eSATA enclosure, and drives. How much do I stand to gain by switching the backplane over to 6Gb/s versus just using 3Gb/s SSDs on the existing ports?
I was just trying to show you there are options as to how things can be done.

For example, you can opt to use an internal version of the 1880 (they have up to 24 ports this way), and use special cables to connect to external enclosures like the TR8X, and use the HDD kit linked to connect the 4x HDD bays to the card as well. It's a bit cheaper, but is a bit more involved to install initially.

Using 6.0Gb/s drives isn't critical either, but some are faster at random access than 3.0Gb/s models (you'll need to put in the time to research these drives). Personally, my primary recommendation would be Intel SSD's, followed by Toshiba. But OWC's disks have done well in MP's according to other users (40GB model linked is perfect for scratch, and others are currently using that exact model for that exact purpose, so it's proven in the field).

Ok, here's my summary for the proposed system pieced together from our conversation.

Fast and Small Storage
Boot/Apps: 120-240GB 3.0Gb/s SSD
Aperture Library: 120-240GB 3.0Gb/s uncompressed SSD (or perhaps a regular HDD if testing shows the SSD doesn't help much)
Scratch: 40GB 3.0Gb/s SSD
This is fine.

I actually prefer the OS/applications disk on the ICH, as it leaves a port open on the RAID card for expansion. The reason I mentioned using one of those card ports for a boot disk is that I've found a lot of members want a 6.0Gb/s drive on a 6.0Gb/s port, particularly SSD's.

Internal Backup
1TB Cloned Backup for apps/system/aperture
1TB Time Machine volume
I get the clone, and recommend this, as it makes repairing OS issues much faster and more convenient.

But I'm not sure what the Time Machine volume is backing up here. :confused:

Could you elaborate?

Primary Storage
Areca ARC-1880x
Sans Digital 8 Bay SAS SATA Enclosure
4-5 7200RPM 2TB SATA drives in a RAID 5/6 array
For this few disks, you do not need to run a RAID 6. Particularly as you'll be at the system daily if there is a problem (6 is suited for larger member counts, remote systems, ... where the additional redundancy is a necessity).

A level 5 implementation would give you more capacity and additional speed than a 6 for the same member count. Now if you continue to add disks in the future, it would eventually be necessary to switch it over (migrate) due to the member count on large capacity drives (this is a bit complicated, but it comes down to the additional stress and member count increasing the odds of enough failures during a rebuild that you lose the array = all data is gone from the primary array :eek: :().
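
As a rough illustration of why member count and drive size raise that rebuild risk, here's a sketch. It assumes the commonly quoted unrecoverable-read-error specs of 1 in 10^14 bits for consumer drives and 1 in 10^15 for enterprise drives like the RE4; real-world failure behavior is messier than this:

import math

def p_ure_during_rebuild(members, size_tb, ure_per_bit):
    # Bits that must be read from the surviving members to rebuild one failed disk.
    bits_read = (members - 1) * size_tb * 1e12 * 8
    # Poisson approximation of 1 - (1 - ure_per_bit)^bits_read.
    return -math.expm1(-bits_read * ure_per_bit)

for members in (5, 8):
    for label, ure in (("consumer 1e-14", 1e-14), ("enterprise 1e-15", 1e-15)):
        p = p_ure_during_rebuild(members, 2.0, ure)
        print(f"{members}x 2TB RAID 5, {label}: ~{p:.0%} chance of a read error during rebuild")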

Backup Storage
eSATA JBOD Enclosure
eSATA card
3-4 5900RPM 2TB SATA drives in a spanning configuration
This is just fine, and the most cost effective way to go about it.

Remaining questions:

1) What drives do you recommend using with the SAS/SATA enclosure? Can they be any old 7200 RPM SATA drives or do we need to get fancy here?
Brand wise, I'd recommend you stick with Western Digital. For SATA, the RE4 (RE = RAID Edition, 4th generation).

You need to use enterprise grade HDD's. Preferably ones on Areca's HDD Compatibility List (.pdf file), as incompatible drives will cause you massive headaches.

The reason you cannot use consumer grade disks on a card such as this is due to the recovery timings programmed into the drive's firmware (recovery is now controlled by the card, and is done a little differently than by the OS/system with its built-in ports).

2) I read your explanation of why the Areca 1880 cards are superior to basic ROC cards and I found it to be very good. I was particularly concerned with how dire the situation sounded if things went south while using lesser cards and what the recovery options are then. I was wondering if you could go into some more detail on what's involved with those cards and explicitly how the 1880 makes things better. I'll understand if you prefer to respond to this question in that thread, but I didn't want to make the conversation too disjointed by asking this there.
First off, it has the features previously mentioned that neither software nor RoC implementations offer.

As per what it offers in terms of recovery, it keeps a copy of the Partition Tables on the drives (which is what all hardware RAID cards do), but Areca also keeps a copy stored in its firmware.

This is what allows it to recover arrays that would be totally lost on other hardware RAID controllers (they're not the only company to do this, but not all do). This is done by hidden commands that you have to type in via the CLI (Command Line Interface utility that's included on the disc, or can be downloaded from the support site).

I won't go further, but the simple fact is that you can recover things that would ordinarily be lost on other cards, and certainly so if the RAID was created under software or a RoC. And this is invaluable if you ever find yourself in this predicament. Please understand this doesn't make a card totally fail-proof, but it can save you time and effort when it works. ;)

3) Finally, noise and heat. I'm not too picky here but I'd like some info on what to expect. Currently I still have my old PowerMac G5 sitting right next to my Mac Pro and I don't mind that noise level at all. How much worse than an additional tower is two enclosures half full of SATA drives? What about temperature?
They're not loud at all. As per heat, they get warm, but have sufficient cooling to handle it.

4) UPS. I have a UPS right now but it sucks so I think I should get a better one. Any recommendations for one given the above system?
The CP1500PFCLCD is the least expensive unit that will do the job (pure sine wave = will not damage the MP's PSU; this has to do with the PSU using an Active Power Factor Correction design in order to meet Energy Star requirements).

Previously, I stuck to APC (i.e. SUA1500 and SMT1500), but the price was too attractive to ignore, and I bought 2 recently. So far, so good, and I'm not the only member here that's running them.

The APC's can be had refurbished, but their prices have been rising lately, and the Cyber Power unit linked above is still cheaper. Hence why I picked up two of them.

Thanks so much by the way. I've been thinking about this problem for like a year now and I think I'm really close to having a solution that I am comfortable with :)
:cool: NP. :)

RAID isn't something that can be figured out in a few minutes. There's a lot to digest, and we've only scratched the surface. Seriously. :eek: ;)

As an aside, you asked about cloud solutions. My work has us all connected to Crashplan, and my personal opinion is that unless you have a high speed internet connection at your desk, it's not worth it. I'm pushing 3 TB to my Crashplan account and it seems that it took weeks to do the initial push, and then days to make subsequent backups. Plus the java-based daemon that handles it makes my machine (Mac Pro quad core) grind to a halt.
It will be slow, except on the fastest of uplink speeds, but it will keep the data safe in the event of things like floods and fire where the system is 100% trashed and unrecoverable.

So it's in a business's best interest to make sure they have at least enough uplink speed for the data they're transmitting to such an account.

For home users however, or when budgets are too tight, this is a problem when the total capacity is in the TB range.

The 3TB on that Mac Pro is a Promise DS4600 (if I remember correctly) using 4 1TB drives in RAID5. This thing is kinda meh. Over FW400/FW800 it's pretty slow. Using the eSATA port attached to a si3132 SATA card, it doesn't mount when booted. I need to unplug it and plug it back in every time I reboot. :( But in 2+ years, it has been running fine as the primary storage for a workgroup file server.
The DS4600 is just an RoC box meant for home use. Not a good solution for a business, as a failure will probably translate into too much down-time.
 
I'm only aware of this with Photoshop, as I seem to recall Aperture is threaded.

Aperture is multithreaded, but IO is still usually not a bottleneck.

"DOES HARD DISK TYPE and/or SPEED MATTER?
We tested the Mac Pro by booting from an HDD, importing from an HDD to an HDD. Then we exported from an HDD to an HDD. Then we tested by booting from an SSD, importing from an SSD to an SSD. Then we exported from an SSD to an SSD. The times were the same. So for those two scenarios, storage speed is not a factor.

We did see a 16% gain in import/processing speed on the MacBook Pro 2.3 when we imported from a Thunderbolt RAID 0 enclosure (four 6G SSDs). But there was no gain in export speed.

DOES GPU MATTER?
Using tools like OpenGL Driver Monitor and atMonitor, we determined that the Mac Pro's GPU was a factor in the import processing (CPU waiting on GPU) and that 410MB of video memory was in use. In the export test, 573MB of VRAM was in use. That implies that the MacBook Pro (13") and MacBook Air with integrated GPU are both going to "rob" from main memory leaving less for Aperture and the OS. That's another reason to choose a "muscular" Mac with a dedicated GPU (and at least 1GB of VRAM) for running Pro Apps like Aperture."


http://www.barefeats.com/aper313.html
 
For example, you can opt to use an internal version of the 1880 (they have up to 24 ports this way), and use special cables to connect to external enclosures like the TR8X, and use the HDD kit linked to connect the 4x HDD bays to the card as well. It's a bit cheaper, but is a bit more involved to install initially.

So what about this internal/external Areca card? Would that be able to drive both the external RAID enclosure and connect to the backplane via an adapter to enable 6.0Gb/s for some of the internal drives?

I actually prefer the OS/applications disk on the ICH, as it leaves a port open on the RAID card for expansion. The mention of using one of those for a boot disk, is that I've found a lot of members want a 6.0Gb/s drive on a 6.0Gb/s port, particularly SSD's.

So you just connect any internal drives (basically just boot, scratch, and a backup clone?) to the ICH and use one SAS port for the primary storage enclosure, and leave the other free for possible future use?

I've also heard that if your SATA ports are 3.0Gb/s, it's best to use 3.0Gb/s drives and not 6.0Gb/s drives. Is that correct?

But I'm not sure what the Time Machine volume is backing up here. :confused:

Could you elaborate?

I've actually been keeping my home directory on the same drive as OS and apps, and just moving the large portions of it (iMovie, iTunes, RAW files) off to my RAID mirrors separately. I'm not sure if I'll keep things that way going forward, but I would like to use time machine for at least whatever volume has ~/Library and ~/Documents on it.

The APC's can be had refurbished, but their prices have been rising lately, and the Cyber Power unit linked above is still cheaper. Hence why I picked up two of them.

Awesome, that Cyber Power unit looks great. Will one be enough for a Mac Pro and two storage enclosures?

RAID isn't something that can be figured out in a few minutes. There's a lot to digest, and we've only scratched the surface. Seriously. :eek: ;)

Totally, which is why I really appreciate that you're willing to explain it so well. The funny part is that I have a pretty good understanding of how the technology works at a low level, but using it in the field is obviously completely different from reading about it in a book ;)

It will be slow, except on the fastest of uplink speeds, but it will keep the data safe in the event of things like floods and fire where the system is 100% trashed and unrecoverable.

For home users however, or when budgets are too tight, this is a problem when the total capacity is in the TB range.

I went ahead and upgraded my connection to 5Mb/s up a while ago. It wasn't too much more and it should allow me to upload everything in about a month or so. I think the peace of mind will probably be worth the effort (and 5 bucks a month is about as cheap as you can get for unlimited backup).
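
For reference, the math behind "about a month or so" (a quick sketch; the sizes are assumed round figures, and sustained throughput will run a bit below the 5Mb/s line rate):

uplink_mbps = 5
for data_tb in (1.0, 2.0):     # assumed sizes, for illustration
    seconds = data_tb * 1e12 * 8 / (uplink_mbps * 1e6)
    print(f"{data_tb} TB at {uplink_mbps} Mb/s is roughly {seconds / 86400:.0f} days")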

----------

Aperture is multithreaded, but IO is still usually not a bottleneck.

http://www.barefeats.com/aper313.html

That was a good article. I haven't really thought of import or export as major sore points in my workflow, but since they are the easiest to analyze it's interesting to see that an SSD doesn't really matter there. I think I'll try to time things like loading a project, batch edit operations, and metadata changes, and see if things like that improve with an SSD.

It might be worth throwing another 8 gigs of RAM in there while I'm at it, since that's about the price of a tank of gas these days.
 
I went ahead and upgraded my connection to 5Mb/s up a while ago. It wasn't too much more and it should allow me to upload everything in about a month or so. I think the peace of mind will probably be worth the effort (and 5 bucks a month is about as cheap as you can get for unlimited backup).

What happens if you upload 1TB or 2TB and they raise your rate?
 
What happens if you upload 1TB or 2TB and they raise your rate?

Well, that better not happen :p

I'm pretty sure Time Warner doesn't have quotas, unlike other providers like Comcast. I think my Internet plan is kind of for small business too, so I really don't foresee a problem.
 