It's worth mentioning that the database is an OLTP db.
It all comes down to the IOPS.
Take the following into consideration (from Wikipedia's IOPS page - yes, I'm being lazy tonight):
Some hard drives will improve in performance as the number of outstanding IO's (i.e. queue depth) increases. This is usually the result of more advanced controller logic on the drive performing command queuing and reordering commonly called either Tagged Command Queuing (TCQ) or Native Command Queuing (NCQ). Most commodity (consumer grade) SATA drives either cannot do this, or their implementation is so poor that no performance benefit can be seen. Enterprise class SATA drives, ... will improve by nearly 100% with deep queues.
This is where a controller with the best drives suited for the task makes a significant difference.
Another little tidbit from the same page (specifically on the Intel X25-E):
Newer flash SSD drives such as the Intel X25-E have much higher IOPS than traditional hard disk drives. In a test done by Xssist, using IOmeter, 4KB RANDOM 70/30 RW, queue depth 4, the IOPS delivered by the Intel X25-E 64GB G1 started around 10000 IOPs, and dropped sharply after 8 minutes to 4000 IOPS, and continued to decrease gradually for the next 42 minutes. IOPS vary between 3000 to 4000 from around the 50th minutes onwards for the rest of the 8+ hours test run. Even with the drop in random IOPS after the 50th minute, the X25-E still has much higher IOPS compared to traditional hard disk drives.
This is rather nice, but you also pay for it.
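If you want to see how your own disks behave as queue depth climbs (or roughly replicate the 4KB 70/30 test above), something like the sketch below will do it. Keep in mind fio is a separate tool you'd have to install, and the flags and JSON field names here are from memory, so treat it as a starting point rather than gospel:

```python
# Rough sketch: drive fio at several queue depths to see how IOPS scale.
# Assumes fio is installed; parameters mirror the 4KB random 70/30 R/W test quoted above.
import json
import subprocess

TEST_FILE = "/path/to/testfile"   # put this on the disk/array you want to measure

def run_fio(iodepth: int) -> tuple:
    """Run a short 4KB random 70/30 read/write test and return (read IOPS, write IOPS)."""
    cmd = [
        "fio",
        "--name=qd_test",
        f"--filename={TEST_FILE}",
        "--size=1g",
        "--rw=randrw", "--rwmixread=70",
        "--bs=4k",
        f"--iodepth={iodepth}",
        "--direct=1",
        "--ioengine=posixaio",      # libaio would be the usual choice on Linux
        "--runtime=60", "--time_based",
        "--output-format=json",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    job = json.loads(out)["jobs"][0]
    return job["read"]["iops"], job["write"]["iops"]

if __name__ == "__main__":
    for qd in (1, 4, 16, 32):
        read_iops, write_iops = run_fio(qd)
        print(f"QD {qd:2d}: {read_iops:8.0f} read IOPS, {write_iops:8.0f} write IOPS")
```

On a drive with working NCQ/TCQ you'd expect the numbers to climb noticeably as the queue depth goes up; on a cheap consumer SATA disk, not so much.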
Another thing to keep in mind in terms of write wear:
- Enterprise mechanical (SATA or SAS) = 1 unrecoverable error per 1E15 bits read
- SLC Flash = ~1E5 program/erase cycles per cell
Granted, wear leveling can help with SLC. But if the drives are filled, there's less spare capacity for it, which increases the wear (write amplification goes up, and that decreases the disk's lifespan). This can be particularly important in the enterprise realm, as the typical planning cycle for storage systems is 3 years, but budgetary restrictions/trepidation over major tech expenditures can push that to 5 years or so. So SLC may not be able to make it if there's insufficient unused capacity over the disks' lifespan (meaning you have to make sure there are enough disks that unused capacity remains over the entire intended service life, and figure on 5 years even if they swear on the most precious thing to them that it will only be 3). This is exceptionally important IMO, as data corruption is a major problem to fix. It also translates into an expensive undertaking, due to the man hours spent recovering the storage system as well as the revenue lost while the system is down.

Jobs tend to be lost over this sort of thing....
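To put some rough numbers on the wear point, here's the back-of-the-envelope arithmetic I'm talking about; the cycle rating, write amplification factors, and daily write volume below are placeholders (guesses), so swap in your own figures:

```python
# Back-of-the-envelope SSD endurance estimate.
# All input figures are assumptions/placeholders - substitute your own drive and workload numbers.

capacity_gb = 60            # usable capacity of one SSD
pe_cycles = 100_000         # ~1E5 program/erase cycles per cell for SLC (far lower for MLC)
daily_writes_gb = 40 * 11   # e.g. eleven 40GB backup runs per business day

def years_of_life(write_amplification: float) -> float:
    """Total NAND writes the drive can absorb, divided by what the workload actually writes."""
    total_endurance_gb = capacity_gb * pe_cycles          # ideal, perfectly wear-leveled case
    nand_writes_per_day = daily_writes_gb * write_amplification
    return total_endurance_gb / nand_writes_per_day / 365

# A nearly-full drive has little spare area for wear leveling, so write amplification climbs.
for wa in (1.1, 3.0, 10.0):
    print(f"write amplification {wa:4.1f}x -> ~{years_of_life(wa):6.1f} years")
```

The point of the exercise: the same drive can look fine on paper and still fall short of a 5 year service life once write amplification gets ugly from running it full.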
In the case you're asking about, OLTP or not, I've no idea what kind of load will be presented to the storage pool or how critical it is (I had to make some assumptions @ 100 users). I wouldn't expect it to be that high in terms of IOPS (thinking that even though it's an OLTP database, it's still more internal; users = humans on a keyboard).
But if each "user" is another server for example (think scaling up as you would with say an ATM network, each "user" being the main server for a state/province, you get the idea), the IOPS requirement will be significantly higher than if each user is an individual person entering data via a keyboard.
So I've kept to generalizations ATM.
The performance of SAS is quite remarkable compared to SATA, I'll agree there; the price is nearly even justified.
SAS disks are not only faster in terms of IOPS, they also tend to be more robust in terms of reliability (they usually have better motors and servos running the mechanics than SATA models; for SATA, the increased MTBF ratings are a result of the sensors and firmware, not improved components - that is, they tend to have the same motors and servos as the consumer models of the same RPM rating in order to reduce costs).
I'm trying to find a good comparison of these two, but am unable to find one - is it in that AnandTech article?
You can start there, but you'd also want to look up SATA and SAS enterprise disk reviews, as well as RAID card reviews (they show what the IOPS are for the tested configurations - it gives a good idea of what the card can do, as well as the disk tech/models).
Tom's Hardware, Hardware Secrets, and Storage Review are good sites to look for this sort of information. But search out specific model numbers as well, and see if anything comes up.
What would you like to know? I'm not really sure what information to provide. The basics are:
- 100+ users during the day
- FileMaker 11 running on the latest 2x Mac Pros. The database is split onto 2 Mac Pros because of its size; the 2nd Mac Pro is in an identical config
- 2x60GB SSDs in RAID0
- hourly backups of the entire database during business hours copied onto 2x60GB SSDs (see below for another inquiry)
- network usage at any given time on both machines is 20KB/s up to 500KB/s - fluctuations of 10MB/s in busy periods.
- disk activity follows a similar structure to network activity.
- read to write ratio is roughly 80% (read)-20% (write)
- anything else?
First off, you ought to be smacked for using a stripe set.
Get off of those ASAP, as it's not only dangerous, it's also not suited for small files (think random access performance). You'd be better off running either a level 10 or at least a RAID 5 on the right controller for your primary data (i.e. either an Areca or ATTO 6.0Gb/s model). In the past, level 10 was the way to go with relational databases, but that's changing due to newer RAID cards; parity-based levels are now viable alternatives (not exactly cheap by consumer standards, but this doesn't seem to be a consumer system, and the cost is warranted in this case).
Now, given an n = 1 configuration for the servers you're working with, you really need to be running some form of redundancy for the OS as well as the data (it would be totally stupid not to IMO, as the stripe set is an extremely weak spot as it's configured right now - backed up or not).
So use a RAID 1 for the OS/applications (attached to the ICH), and get the data on 10 if you're going to skip a RAID controller. 10 or parity based if you do get one, which I'd recommend for the IOPS performance that can be generated with higher queue depths.
I can't say this enough: I'd strongly recommend a RAID controller (see the backup discussion below as to why, but it's also relevant to expanding the primary array, in terms of both capacity and the IOPS load on the primary data).
The OCZ disks should be fine for an OS/applications array, as they're primarily read.
BTW, based on ~90 IOPS per SATA disk, 4 disks in a level 10 configuration will be enough from what you've posted (consider this the bare minimum). But it will be near the limit (meaning you won't really be able to add users without bottlenecking). Either placing it on a RAID controller, or using said controller to implement a different level (i.e. RAID 5), will improve matters. So will adding disks, which is possible this way (the ICH is limited, and tops out at a lower disk count). Splitting the RAID 1 to the ICH, and the primary and backup duties to a RAID controller, will improve dataflow as well (significantly once you go past a few disks).
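If you want to sanity check that "4 disks is the bare minimum" figure against the 80/20 mix you posted, the quick arithmetic looks something like this (the ~90 IOPS per disk and the per-user load are assumptions on my part):

```python
# Rough RAID 10 IOPS estimate - standard back-of-envelope formula, inputs are assumptions.
disk_iops = 90        # ~90 IOPS for a 7200rpm SATA mechanical disk
n_disks = 4           # 4-disk RAID 10 (2 mirrored pairs)
read_frac, write_frac = 0.80, 0.20   # the 80/20 mix mentioned above
raid10_write_penalty = 2             # each write hits both members of a mirror

# Effective IOPS the array can deliver for this mix:
raw_iops = disk_iops * n_disks
effective_iops = raw_iops / (read_frac + write_frac * raid10_write_penalty)
print(f"~{effective_iops:.0f} effective IOPS for a {n_disks}-disk RAID 10 at 80/20 R/W")

# Divide by a guessed per-user rate to see how many concurrent users that supports:
iops_per_user = 3     # pure assumption for a keyboard-driven OLTP user
print(f"roughly {effective_iops / iops_per_user:.0f} users before the array saturates")
```

With those assumed inputs you land right around 100 users, which is why I call 4 disks the bare minimum rather than a comfortable fit.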
- The capacity requirements seem low, but what kind of capacity growth is expected?
- How long is it to be in service?
- How many users will be added over this time frame?
This will allow you to figure out both the size of the card and disks to be used for the intended service life (allows you to handle expected growth without having to reconfigure the entire system).
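Once you have answers to those, the sizing exercise is mostly mechanical. Here's a sketch of how I'd run the numbers; every growth figure in it is a made-up placeholder, so don't take the output literally:

```python
# Sketch of sizing the array for the intended service life - every input here is a placeholder.
current_users = 100
user_growth_per_year = 0.15       # assumed 15%/yr
service_life_years = 5            # plan on 5 even if they promise 3
iops_per_user = 3                 # same assumption as before
usable_iops_per_disk = 90 / (0.80 + 0.20 * 2)   # one SATA disk at the 80/20 mix, RAID 10 write penalty

users_at_eol = current_users * (1 + user_growth_per_year) ** service_life_years
iops_needed = users_at_eol * iops_per_user
disks_needed = -(-iops_needed // usable_iops_per_disk)       # ceiling division
disks_needed += disks_needed % 2                             # RAID 10 needs an even disk count

ports_needed = int(disks_needed) * 2 + 4   # primary + matching backup array, plus spare ports
print(f"~{users_at_eol:.0f} users at end of life -> {int(disks_needed)} disks per array, "
      f"card with at least {ports_needed} ports")
```

That's the reasoning behind picking the card's port count up front: reconfiguring a production array three years in is far more painful than buying a few extra ports now.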
Onto my 2nd question, now that the main storage is sorted - the hourly backups are also very important, obviously, for backing up. The speed at which this backup runs is of utmost importance, because the entire database is taken offline by FileMaker for the few minutes that it runs. Currently it is 2x60GB OCZ Agility SSDs in RAID 0. I have monitored their performance, and it's quite terrible. Is it possible for these SLC drives to be affected by write amplification? Or is that just MLC?
Write Amplification is relevant to SLC as well as MLC (present with any form of NAND Flash).
BTW, the OCZ Agility disks are MLC based, not SLC (I'm assuming these disks are for the primary data as well as backup use = you picked up 4x and built a pair of stripe sets). Not only is this the wrong type of NAND Flash to be using for this purpose (enterprise use), they're also not designed for high IOPS.
In the real world I observed 155.5MB/s peak data transmission overall. It fluctuates between 40MB/s and 150MB/s read, and 30MB/s and 160MB/s write, with an average of roughly 100MB/s read/write. Large fluctuations towards the end indicate a high amount of fragmentation (write overhead?). The backup took 3 minutes and 11 seconds.
I'm looking at the 155.5MB/s as the worst case combined throughput (see above for configurations). If this isn't correct, let me know, as it will mean changing the number of disks, and perhaps the level used. It may also have a bearing on the port count needed on a RAID card.
So this indicates 3 minutes of downtime for each hourly backup. Over the period of the day (11 backups), that is basically 33 minutes of downtime. The backup size is roughly 40GB each. I think I require the best and most optimal sequential write speeds onto this volume. Ideas?
As mentioned, a stripe set isn't the best way to go about this. Given it's a database, the records are almost certainly small, so this is more along the lines of random access, not sequential (betting you've got a lot of unbalanced stripes on those disks, because the files fit into a single block; it also wastes space).
The idea with the RAID card is to have a primary and a backup array on the same card (they don't have to be identical, but it's not a bad idea; you want the backup as fast as the primary in this case, as it seems you need to minimize your downtime).
So you're looking at 8 ports minimum @ 2x 4-disk level 10 arrays (i.e. an ARC-1880i up to an ARC-1880ix24; see Areca's 1880 page). For increased users (IOPS), capacity, or both, you'll want more ports than that (I like at least an additional 4 for expansion purposes). This will mean getting an external SAS enclosure and using the proper cable to connect it to the card.
This will get your backup at the same speed as the primary array, so backups will be as quick as possible (based on the write time of the configuration).
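And as a final sanity check on the downtime figures, here's the simple scaling arithmetic; the baseline comes from the run you posted, and the target write speeds are illustrative guesses, not benchmark results:

```python
# How the daily offline window scales with the backup array's sustained write speed.
# Baseline comes from the observed run above; the speed-up targets are assumptions.
observed_backup_s = 3 * 60 + 11     # 3 min 11 s per backup, as measured
backups_per_day = 11
baseline_write_mb_s = 100           # rough average observed on the current stripe set

for target_mb_s in (100, 200, 400):   # illustrative targets for a better array/controller
    backup_s = observed_backup_s * baseline_write_mb_s / target_mb_s
    daily_min = backup_s * backups_per_day / 60
    print(f"~{target_mb_s} MB/s sustained: {backup_s/60:4.1f} min per backup, "
          f"~{daily_min:4.1f} min offline per day")
```

In other words, doubling the sustained write rate of the backup target roughly halves the daily offline window, which is why putting the backup array on the same RAID card as the primary is worth the port count.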