
Magrathea

macrumors regular
Original poster
Aug 21, 2008
201
0
A buddy is looking to get a raid, as they are now using separate drives connected to a linux box. I'm relatively tech savvy and know how they basically work, but could use a little help deciding what's best. The application that will be accessing the data is a stats package, so we don't need super speedy access; redundancy and size are the main objectives. I was looking at Raid 5 but I read some comments about how it's not good to use with very large drives; something along the lines of: if one drive goes bad, then during the rebuild you're bound to get a write error because of the nature of large disks, and then the whole array will go tits up.

The sysadmin suggested raid6 but the hardware seems to be astronomical for this. This is what he suggested:

The smallest RAID that makes sense for purchase today is a 12-bay
Infortrend with 12 1TB drives for a total of 12TB. However, because of
formatting and the RAID-6 technology, you really only end up with about
9TB of usable space in this type of RAID.

Infortrend is a great RAID vendor. I have attached 4 quotes
I just recently obtained to give you an idea of pricing for a variety
of RAID sizes from ~9TB usable to ~26TB usable. These quotes are as
follows:

2U 12-Bay 1TB drives (12TB/~9TB usable): $4750.00
2U 12-Bay 2TB drives (24TB/~18TB usable): $6406.00
3U 16-Bay 1TB drives (16TB/~13TB usable): $6305.00
3U 16-Bay 2TB drives (32TB/~26TB usable): $8513.00
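
As a rough sanity check on the "usable" figures in these quotes, here is a minimal Python sketch. It assumes RAID-6 gives up two drives' worth of capacity to parity, and that the rest of the gap comes from decimal TB being reported as binary TiB by the OS; the vendor's actual accounting is not stated in the quote, so treat both as assumptions. The function names are made up for illustration.

```python
def raid_usable_tb(num_drives, drive_tb, parity_drives):
    """Capacity in decimal TB left after setting aside parity drives.

    parity_drives is 1 for RAID-5 and 2 for RAID-6.
    """
    if num_drives <= parity_drives:
        raise ValueError("need more drives than parity drives")
    return (num_drives - parity_drives) * drive_tb


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


# (description, number of drives, drive size in TB, quoted price in USD)
quotes = [
    ("2U 12-Bay 1TB", 12, 1, 4750.00),
    ("2U 12-Bay 2TB", 12, 2, 6406.00),
    ("3U 16-Bay 1TB", 16, 1, 6305.00),
    ("3U 16-Bay 2TB", 16, 2, 8513.00),
]

for name, drives, size_tb, price in quotes:
    usable = raid_usable_tb(drives, size_tb, parity_drives=2)
    print(
        f"{name}: {usable} TB after RAID-6 parity, "
        f"about {tb_to_tib(usable):.1f} TiB as an OS would report it, "
        f"${price / usable:.0f} per TB after parity"
    )
```

Under those assumptions the results land close to the quoted ~9, ~18, ~13 and ~26 figures; filesystem overhead would shave off a little more.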


Seems like these things are pretty pricey. What's wrong with this?

http://eshop.macsales.com/item/Other World Computing/RP936QE8.0T/

So in summary, do we really need to go with Raid6 rather than Raid5, and why is the hardware so pricey?
 

JGruber

macrumors 6502
Feb 13, 2006
348
2
Have your buddy check out the Drobo.

I'm currently using the Drobo Elite plugged into a Gig switch, maxed out with 2TB drives.

For under 3K, you can get a DroboPro with 8 2TB drives, for a total of 16TB. Closer to 10TB if you use dual disk redundancy.
 

Mord

macrumors G4
Aug 24, 2003
10,091
23
UK
haha,

We have a Lian-li 2100 filled with 8x 2TB drives in RAID 5 connected to a 9550SX-8lp hardware raid card on an old dual socket 940 E-ATX motherboard.

The whole thing cost us ~£650; everything but the hard drives was second hand or free.

You can get this kind of setup a whole bunch cheaper, even if you don't have our connections.
 

logandzwon

macrumors 6502a
Jan 9, 2007
575
9
wait... what? You need 9TB attached to a single linux box?

As far as RAID goes, there are a bunch of different levels, and each has different practical applications. However, you need to understand that RAID is not the same as backups. It won't do anything about data corruption, viruses, etc...
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
A buddy is looking to get a raid, as they are now using separate drives connected to a linux box. I'm relatively tech savvy and know how they basically work, but could use a little help deciding what's best. The application that will be accessing the data is a stats package, so we don't need super speedy access; redundancy and size are the main objectives.

What size? 3TB of raw data to analyze in a stats package is huge. If you currently only have 800GB of stats data, then 3TB is more than enough to live off for a while.

I think the sysadmin may be looking at rack size issues rather than disk size. There is an 8 disk version from the same vendor which I'm sure is cheaper.

http://www.infortrend.com/americas/main/2_product/es_a08(12)s-g2130.asp
(There is even a desktop unit if the linux box is deskside:
http://www.infortrend.com/americas/main/2_product/es_a08s-c2132(4).asp)

If you are looking to conserve rack space, then 12 disks in 2U makes more sense than 8 disks in 2U. If you are looking to conserve money, it does not, provided the amount of data you need to store fits on 8 disks. Especially if both the linux box and the storage are going to be mated and serve a single purpose. Even more so if this is a lab and 1-2U of rack space doesn't cost tons of money to "rent".



I was looking at Raid 5 but I read some comments about how it's not good to use with very large drives; something along the lines of if one drive goes bad and then during the rebuild you're bound to get a write error because of the nature of large disks then the whole array will go tits up.

Bound to get an error is perhaps a bit extreme. Your risk of failure goes up.
Your service levels go down too. It all depends on just how slow the rebuild is. If the RAID array is mostly full and is being hammered by user/application read/write requests, there won't be much bandwidth left to do a rebuild. Copying 2TB at a couple of MB/s takes much longer than copying 1TB at a couple of MB/s.

If you can take a large service outage and devote all the resources to the rebuild, then it will take less time. Likewise, if the array is gross overkill and holds only a relatively small amount of data, the rebuild will go a bit quicker.
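
To put rough numbers on both points, here is a small Python sketch. The rebuild rates are made-up examples, and the error rate of one unrecoverable read error per 10^14 bits is the figure commonly printed on consumer SATA spec sheets, not something measured on these drives; the model also assumes errors are independent, which real disks do not strictly obey. The usual form of the "large drives" argument is about read errors on the surviving disks during the rebuild, so that is what is modelled.

```python
import math


def rebuild_hours(drive_tb, rate_mb_per_s):
    """Hours needed to rewrite one replaced drive at a sustained rebuild rate."""
    total_bytes = drive_tb * 10**12
    return total_bytes / (rate_mb_per_s * 10**6) / 3600


def p_read_error(data_read_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error while reading data_read_tb.

    ure_per_bit is an ASSUMED spec-sheet style rate; errors are treated as
    independent, which is a simplification.
    """
    bits = data_read_tb * 10**12 * 8
    return 1 - math.exp(bits * math.log1p(-ure_per_bit))


# Rebuild time for 1TB and 2TB drives on a busy array versus an idle one.
for drive_tb in (1, 2):
    for rate in (5, 50):  # MB/s, hypothetical busy and idle rebuild rates
        hours = rebuild_hours(drive_tb, rate)
        print(f"{drive_tb}TB drive at {rate} MB/s: {hours:.1f} hours")

# A RAID-5 rebuild has to read every surviving drive in full.
for drives, drive_tb in ((12, 1), (12, 2)):
    read_tb = (drives - 1) * drive_tb
    print(f"{drives} x {drive_tb}TB RAID-5: read {read_tb}TB, "
          f"P(read error) ~ {p_read_error(read_tb):.2f}")
```

With those worst-case inputs the risk is well above zero but not a certainty, and drives in practice often do better than the spec-sheet rate. RAID-6 matters here because a single read error during the rebuild can still be corrected from the second parity.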



The sysadmin suggested raid6

RAID 6 is easier to justify once you have an overabundance of disk slots to fill up. "We have 12 disk slots so gotta fill them up".

but the hardware seems to be astronomical for this. This is what he suggested:

Not too astronomical given what you are getting. It appears this includes the RAID controller. 12 disk boxes don't come cheap. SAS is usually set up in groups of 4. This is 3 groups of 4 rather than just 2 or 1.
You are also paying for the bang-for-the-buck of dense packing the 2U space.

Seems like these things are pretty pricey. What's wrong with this?

http://eshop.macsales.com/item/Other World Computing/RP936QE8.0T/

1. it is only 4 drives in 1U space.
2. it doesn't include a RAID card that can do some of the higher end functions.

RAID 5 (and 6) make your system susceptible to power losses. You'll need to put both the linux box and the storage boxes on a UPS. If the linux box sucks lots of power, that usually presents a problem. The "hack" that folks sometimes use is to put a battery on the RAID card itself so that it can hold the incompletely written state until someone can turn the power back on. If power never goes off you don't necessarily need that.


That solution works great if you really only need around 1-1.5 TB of space.
Just go RAID-1 with 4 1TB drives. It is even easier if you can split up the space (you don't need a single 1TB directory, which hardly anyone does; two 1TB RAID 1 volumes are plenty if the data is less than 800GB). You would then have money left over to buy a second unit with which to do 1TB backups.

In linux you can mount the two volumes in two different subdirectories. Create

stat_data

with subdirectories

year_one year_two

then mount each 1TB volume at one of those two mount points.
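
The layout above could be written down as `/etc/fstab` entries. Here is a minimal Python sketch that just generates the lines; the device names `/dev/md0` and `/dev/md1`, the `/stat_data` root and the `ext4` filesystem are all placeholders picked for illustration, not anything from the actual setup.

```python
from pathlib import PurePosixPath


def fstab_entries(root, volumes, fstype="ext4"):
    """Build one fstab line per volume, mounted under root/<subdirectory>.

    volumes maps a subdirectory name to a block device (hypothetical names).
    Fields: device, mount point, filesystem type, options, dump, fsck pass.
    """
    entries = []
    for subdir, device in volumes.items():
        mount_point = PurePosixPath(root) / subdir
        entries.append(f"{device}\t{mount_point}\t{fstype}\tdefaults\t0\t2")
    return entries


# Two 1TB RAID-1 volumes, one per subdirectory of stat_data.
volumes = {"year_one": "/dev/md0", "year_two": "/dev/md1"}

for line in fstab_entries("/stat_data", volumes):
    print(line)
```

The stats package then sees one `stat_data` tree, while each subdirectory sits on its own mirror and can be backed up or restored separately.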



So in summary, do we really need to go with Raid6 rather than Raid5, and why is the hardware so pricey?

It is more a matter of your data being "too large" to regularly and frequently back up to an offline/offsite location. Again, it is one of those self-fulfilling things: when you have 9+TB it is too big to move. It is also similar to the "too big to fail" thing with the banks. Once the data gets to be too big, then even if you have a backup, it would take an eon to do a restore. So folks spend gobs of money to avoid it.

A more prudent solution is not to let the data grow "too big" if you don't really need it. Archiving old data that is never used anymore out of the RAID array helps, if possible. One problem these days is that folks don't use the "delete" option anymore. They just collect everything on the Tier 1 disk system.
 

mrbash

macrumors 6502
Aug 10, 2008
251
1
Just did this myself

I recently outgrew my Mac mini as a file server for my media files. I had about 6 TB of disks that I was backing up one at a time in a manual way. I needed something better, so this is what I did:

I had a Linux server, and an unused case that could fit in a 4U slot by attaching rails.

I used several arrays of software mirrors to create 3 logical drives: 1 for my movies, 1 for my TV shows, and 1 for my general storage. Each of the logical drives is growable, redundant (1 failure) and hot-swappable. The arrays are not hardware dependent, meaning I can move them to another system. Plus, I use Netatalk to offer AFP access to the shares and Bonjour advertising of the shares.

The total cost for me was $150 because I just had to buy the SAS controller and the break-out cables. If you didn't have anything, it would be about $130/drive + the enclosure and power supply + the SAS controller + cables and breakout. It would be less than $1000 for an external 4U solution that you can expand.

If you are interested, PM me and I can give you specifics of my setup and experience. It was fairly easy to set up.
 

Magrathea

macrumors regular
Original poster
Aug 21, 2008
201
0
More info

The application is in a University environment and will be rack mounted in a server room - the data is from various radars so yes, there is a lot of it!!

Currently individual drives are attached haphazardly to the linux machine in a room that gets to be 90F or more. Drives don't like this and I'm surprised that they haven't all died off (supposedly it would cost $35k to fix the aircon in the room, but a new little data center is in the works for some other department and it has space for the rack)!!!

This is the lowest option that the sysadmin suggested. Infortrend SAS-to-SATA, 2U/12-bay RAID with RAID6, Single controller with 256MB, ASIC 400, 2x SAS(wide) host ports - $2,701.00

Seagate 1TB, 7,200RPM SATA Disk Drive in Hot-Swappable Drive Carrier - $1,932.00

I guess these prices are not too bad for a professional setup. I was also looking at the Drobo but the Elite was around $3k or so.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
The application is in a University environment and will be rack mounted in a server room - the data is from various radars so yes, there is a lot of it!!

If you are running a stats package on radar data, then I suspect the actual valuable output data is much smaller than the input data.
If the radar data can be restored from offline backup and is not on super critical deadlines, then RAID-5 should be more than sufficient for it. Unless the system is directly hooked up to the radars, the data is arriving on data storage units anyway.

I'd divide whatever you get into two systems (or two subsystems if the RAID card can deal with subsets). One for the bulky raw data and another for the various answers and/or pre-digested input data that the stats package produces. The second is where you possibly have cause to need RAID-6. If answers take a long time to recalculate, you don't want to lose them.

I wouldn't blow RAID-6 on completely static data that isn't changing unless there were some critical time constraints on a restore. RAID 5 is also close to being overkill; again, it is mainly saving potential restore time along with some read speed increases (and the latter weren't a priority). Frankly, just a bunch of disks (JBOD) would be sufficient to store fixed historic data, along with an offsite backup in case you need to do a restore. You should have a backup set regardless of RAID "no", 5, or 6. In the "RAID no" situation, you don't even hook the disks in the JBOD to each other (spanning or RAID 0 or 5). That way, if one disk fails, all you have to restore is just that one disk. Not the whole array. Not recalculating the data... just copy from backup to that specific disk. It is fast, simple, and works. The money being sunk into the redundant parity could be put into paying for backup space.


It is easier to split 12 disks into two different subsystems.
 