These machines will be used as both workstations and render nodes.
Our budget is $75000 to buy hardware. The maximum we can spend on storage is $15000. I hope we can do something good with that.

It seemed as though at least one of those machines was not going to be used as a workstation or render node. It seems like the intention was to make one Mac Pro that would serve as the facade to the storage. Or is the $2-4K for that Mac Pro already woven into the $15K?

If not, one way to grow the $15K larger is to dump that Mac Pro's allocation into the "storage" budget and put the disks and "the brains" into one box. Something along the lines of:

http://www.ixsystems.com/storage/ix/performance/truenas-pro.html#specs

or

https://shop.oracle.com/pls/ostore/f?p=700:6:0::::P6_LPI:424445158091311922637762

(although the standard config on one of those starts at $19K )

There are ways to "roll your own" but still get support with something like

http://www.nexenta.com/corp/products/what-is-openstorage/nexentastor

[ NexentaStor is a bit more commercially supported than FreeNAS, but likely also much more expensive if you need to store tens of TB.

32TB of raw storage is about $4.8K with 'silver' support. Coupled to a $6K 2U server, that is about $11K. Hook up a pair of these and that would be about $22K. Whether that second system is an HA setup (costs extra) or a backup depends on how big a pile of data you are going to manage. ]

Fleshed out with a large amount of RAM and decent-sized SSD-based read and write caches (e.g., 400GB http://www.anandtech.com/show/6124/the-intel-ssd-910-review ), it shouldn't need to go into spindle overkill to keep up. If the file access patterns are relatively regular and you set the system up to cache the rolling subset being worked on, you probably don't need a 1:1 HDD-spindle-to-workstation ratio.
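To put rough numbers on the spindle-vs-cache point, here is a quick back-of-the-envelope in Python. Every figure in it (per-workstation demand, spindle throughput, working set size) is an assumption for illustration, not a measurement:

# Back-of-the-envelope cache vs. spindle math. All numbers are assumptions
# for illustration only; plug in real figures from your own workload.

workstations        = 15      # pool size (13 + 2 Mac Pros)
per_station_mbps    = 800     # assumed per-workstation streaming demand, megabits/s
hdd_seq_mbps        = 1000    # assumed sequential throughput of one 7200rpm spindle, megabits/s
concurrency_penalty = 0.4     # assumed fraction of sequential speed left once heads start seeking
ssd_cache_gb        = 400     # e.g., an Intel SSD 910 class device
working_set_gb      = 300     # assumed "rolling subset" the team touches in a session

aggregate_demand  = workstations * per_station_mbps
spindles_no_cache = aggregate_demand / (hdd_seq_mbps * concurrency_penalty)

print(f"Aggregate demand: {aggregate_demand / 1000:.1f} Gb/s")
print(f"Spindles needed with no cache (worst case): {spindles_no_cache:.0f}")
print(f"Working set fits in the SSD cache: {working_set_gb <= ssd_cache_gb}")

If the rolling working set fits in the cache, the spindles mostly see prefetch and streaming writes, which is the whole reason you don't need a 1:1 spindle-to-workstation ratio.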

If you put file serving into one box you don't need an expensive network between the "file serving" brains and the disks.
 
If you're looking for DAS (direct attached storage), then solutions are more along these lines:

http://www.sansdigital.com/towerraid-/index.php

One of the 8-bay models. You could have two three-disk RAID 5 volumes with two hot spares hooked to a Mac Pro.
Cost-wise, for that many machines it's going to far exceed their storage budget. You're likely looking at $4k per machine at least for a sufficient DAS per system (which would put it at $54k, and if the necessary capacity pushes it to $5k per system, the OP would be looking at $70k just for storage). So a SAN would be the better alternative in this case, as it's more cost effective (it's going to be difficult to get the necessary throughput + sufficient capacity for that to begin with, particularly after adding the necessary 10GbE or FC HBAs per machine).
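For anyone following the arithmetic, the comparison looks roughly like this. The machine count and all prices below are placeholders, not quotes:

# Rough DAS-per-machine vs. shared-SAN cost comparison.
# All prices and counts are placeholder assumptions, not vendor quotes.

machines            = 15      # workstation/render pool size (assumed)
das_per_machine_usd = 4000    # assumed cost of an adequate DAS box per system
san_head_usd        = 12000   # assumed storage server / NAS head
san_shelves_usd     = 10000   # assumed disk shelves for the shared capacity
hba_per_machine_usd = 800     # assumed 2-port 10GbE or FC HBA per workstation, if needed

das_total = machines * das_per_machine_usd
san_total = san_head_usd + san_shelves_usd + machines * hba_per_machine_usd

print(f"DAS everywhere: ${das_total:,}")
print(f"Shared SAN/NAS: ${san_total:,}")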

There have been a lot of consumer solutions presented for what really is an intensive professional requirement. Pegasus, Drobo, etc. will absolutely not be able to handle this; they are best used attached to single workstations or as nearline backups.
Precisely.

It seemed as though at least one of those machines was not going to be used as a workstation or render node. It seems like the intention was to make one Mac Pro that would serve as the facade to the storage. Or is the $2-4K for that Mac Pro already woven into the $15K?
This is what I'm wondering, but I suspect the $75k figure is for everything (MPs, storage system, and high-speed networking equipment).

If not, one way to grow the $15K larger is to dump that Mac Pro's allocation into the "storage" budget and put the disks and "the brains" into one box.
I suspect this will be a necessity in order to fit the budget, assuming that $75k figure is for everything. Even then, it might be a little short without rolling the SAN from parts, assuming the OP or someone else in the company has the IT skills necessary to manage it.
 
Cost-wise, for that many machines it's going to far exceed their storage budget. You're likely looking at $4k per machine at least for a sufficient DAS per system (which would put it at $54k, and if the necessary capacity pushes it to $5k per system, the OP would be looking at $70k just for storage).

In post #11 he said what he was looking at was something like:

( pool of 13 + 2 Mac Pros ) <--- GbE network ---> ( Mac Pro ) <--- SAN/DAS ---> [ storage box ]


That can work as long as the Mac Pro (and/or the GbE network) doesn't become a choke point. The tower box I was pointing to would scale to only as many Mac Pros as are needed to "front" the storage. (That kind of assumes this is all being done via AFP file sharing, because if it is just CIFS or NFS, the Mac Pro at that chokepoint becomes questionable.)

If the $15K storage budget covers everything from the "GbE network" to the right, it is likely way off base. What I was getting at later was that instead of grouping the Mac Pros (the stuff in parentheses) separately from the storage box (the stuff in brackets), it is probably more beneficial to group the money by wrapping it around the whole storage system (and if you need a new network and a "file server" node to distribute bits, that should be in the "storage" budget).


So a SAN would be the better alternative in this case, as it's more cost effective (it's going to be difficult to get the necessary throughput + sufficient capacity for that to begin with, particularly after adding the necessary 10GbE or FC HBAs per machine).

If the machines in the pool each only need to stream 1-2 4K streams, then a GbE link may work. You would just need to get rid of the switch in the middle and use a storage server that can scale to 15 links. One of the newer E5 boxes with 3 x16 PCIe slots and at least one x8 slot would work (e.g., the HP 820 dual E5 with 3 x16, 1 x8, 2 x4).

Two x8 RAID cards out to two storage expansion boxes
Two x8 6-port GbE cards
One x4 4-port GbE card
[ Internal storage sleds used purely for the storage-serving OS (RAID 1), and also for SSD caching if available in the OS. ]

The box has two integrated ports that can be used for admin and general LAN connectivity. The add-in cards give 16 ports total, which could all direct-link to each box in the pool (15 pool links and one potential "interpool" link). One GbE port on the pool Mac Pros would go to this dedicated storage-only network and the other to general LAN work.
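A quick sanity check on the port and bandwidth budget of that layout. The card mix is the one listed above; the pool size and link speed are the obvious assumptions:

# Port and bandwidth budget for the "no switch, direct GbE links" layout.
# The card mix is from the proposal above; the demand figures are assumptions.

six_port_cards   = 2
four_port_cards  = 1
integrated_ports = 2        # kept for admin / general LAN, not counted as pool links

pool_links = six_port_cards * 6 + four_port_cards * 4   # add-in card ports
pool_size  = 15                                          # 13 + 2 Mac Pros
spare      = pool_links - pool_size                      # the potential "interpool" link

per_link_gbps = 1.0          # GbE, dedicated per workstation rather than shared
print(f"Direct storage links from the add-in cards: {pool_links}")
print(f"Links left over after {pool_size} workstations: {spare}")
print(f"Integrated ports for admin/LAN: {integrated_ports}")
print(f"Aggregate the server must sustain if everyone streams at once: "
      f"{pool_size * per_link_gbps:.0f} Gb/s")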


If you come up with another project that is independent, then create another "pool + storage" pod and use the "interpool" link to push data between the corresponding storage nodes.

10GbE and FC become necessary when you are going to hide the file server behind a switch and everyone has to share some fixed bandwidth coming out of the server.

Sometimes that is driven by the "need to store everything in one big humongous pile" syndrome. In some cases though, the storage work can be split up (reasonably colocated, so pool machines are in relatively close proximity to the storage) and you don't really need that.
 
Two x8 RAID cards out to two storage expansion boxes
Two x8 6-port GbE cards
One x4 4-port GbE card
[ Internal storage sleds used purely for the storage-serving OS (RAID 1), and also for SSD caching if available in the OS. ]

The box has two integrated ports that can be used for admin and general LAN connectivity. The add-in cards give 16 ports total, which could all direct-link to each box in the pool (15 pool links and one potential "interpool" link). One GbE port on the pool Mac Pros would go to this dedicated storage-only network and the other to general LAN work.
That is a very neat solution. Out of interest, what OS and file system would be best suited for this set up (SSD caching etc)? If the individual workstations are connected directly to the file server, which is then connected to a LAN/Internet, I assume that the file server would be able to share the Internet/LAN connectivity to the workstations?
 
I am looking at the Synology DS3612xs personally. 12x 4TB gets you ~24TB usable in RAID 10 and you can drop a dual 10GbE in it for like $8k. Would need 10GbE switch and cards though.

----------

In post #11 he said what he was looking at was something like:

( pool of 13 + 2 Mac Pros ) <--- GbE network ---> ( Mac Pro ) <--- SAN/DAS ---> [ storage box ]


That can work as long as the Mac Pro (and/or the GbE network) doesn't become a choke point. The tower box I was pointing to would scale to only as many Mac Pros as are needed to "front" the storage. (That kind of assumes this is all being done via AFP file sharing, because if it is just CIFS or NFS, the Mac Pro at that chokepoint becomes questionable.)

If the $15K storage budget covers everything from the "GbE network" to the right, it is likely way off base. What I was getting at later was that instead of grouping the Mac Pros (the stuff in parentheses) separately from the storage box (the stuff in brackets), it is probably more beneficial to group the money by wrapping it around the whole storage system (and if you need a new network and a "file server" node to distribute bits, that should be in the "storage" budget).




If the machines in the pool each only need to stream 1-2 4K streams, then a GbE link may work. You would just need to get rid of the switch in the middle and use a storage server that can scale to 15 links. One of the newer E5 boxes with 3 x16 PCIe slots and at least one x8 slot would work (e.g., the HP 820 dual E5 with 3 x16, 1 x8, 2 x4).

Two x8 RAID cards out to two storage expansion boxes
Two x8 6-port GbE cards
One x4 4-port GbE card
[ Internal storage sleds used purely for the storage-serving OS (RAID 1), and also for SSD caching if available in the OS. ]

The box has two integrated ports that can be used for admin and general LAN connectivity. The add-in cards give 16 ports total, which could all direct-link to each box in the pool (15 pool links and one potential "interpool" link). One GbE port on the pool Mac Pros would go to this dedicated storage-only network and the other to general LAN work.


If you come up with another project that is independent, then create another "pool + storage" pod and use the "interpool" link to push data between the corresponding storage nodes.

10GbE and FC become necessary when you are going to hide the file server behind a switch and everyone has to share some fixed bandwidth coming out of the server.

Sometimes that is driven by the "need to store everything in one big humongous pile" syndrome. In some cases though, the storage work can be split up (reasonably colocated, so pool machines are in relatively close proximity to the storage) and you don't really need that.

You could just get a switch with a 10Gb uplink or two. Problem solved.
 
That is a very neat solution. Out of interest, what OS and file system would be best suited for this set up (SSD caching etc)?

ZFS has built-in support for SSD caching, both for logging updates (ZIL) and as a read cache (L2ARC) (http://constantin.glez.de/blog/2011...tions-about-flash-memory-ssds-and-zfs#benefit ). The NAS protocols sent out to the workstations would be CIFS or NFS, both with relatively native ZFS support. So the Solaris variants (Oracle's or Nexenta's) and FreeBSD will work.


Some RAID cards do it now also. For example, LSI's Nytro cards:

http://www.lsi.com/products/storagecomponents/Pages/NytroMegaRaid.aspx

But those cards are typically restricted to Windows and Linux driver support. They are also limited in this context because you will likely need hundreds of GB to have an effective cache, since the individual files are large and a high number of their blocks are accessed at a time.


Since the solution dumps all of the GPU cards from the box, you will need a user interface that is supplied through the web. You don't really want that confined to the set of workstations in the pool. Most likely the server admin machine isn't in the pool. :)



If the individual workstations are connected directly to the file server, which is then connected to a LAN/Internet,

Both workstations and servers are connected to the LAN. I'm not trying to use this server as a router.

[ LAN network cloud ] <---> workstation pool <---> [ storage network cloud ]
          |                                                    ^
          +-------------------> file server <------------------+


The "LAN cloud" and "storage cloud" can be (even should be ) on different subnet groupings.

Most likely you don't want everybody to get access to these storage servers. If only a limited group has access, that can be a distinct, separate subnetwork. The "LAN cloud" is for all the mundane Internet and internal traffic (email, chat, web, calendar, etc.). None of that stuff should be bogging down file transfers.

Very similar to keeping the "lights-out management" subnet completely separate from the normal LAN service network. It can be run through separate switches for additional redundancy.
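If it helps to see the separation concretely, here is a minimal sketch using Python's ipaddress module. The address ranges are made up; the point is only that the storage links and the general LAN live in different subnets that never route into each other:

import ipaddress

# Hypothetical addressing plan: general LAN vs. dedicated storage links.
lan_subnet     = ipaddress.ip_network("192.168.1.0/24")   # email, web, chat, etc.
storage_subnet = ipaddress.ip_network("10.10.10.0/27")    # 30 usable hosts: the 15-box pool plus the server

server_lan     = ipaddress.ip_address("192.168.1.10")     # file server's admin/LAN port
server_storage = ipaddress.ip_address("10.10.10.1")       # file server's storage-facing port

# A workstation talks to the server over the storage subnet only.
workstation_storage = ipaddress.ip_address("10.10.10.2")

print(server_lan in lan_subnet)             # True  -- admin port lives on the general LAN
print(server_storage in storage_subnet)     # True  -- storage port lives on the storage subnet
print(workstation_storage in lan_subnet)    # False -- LAN gear can't reach the storage links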


I assume that the file server would be able to share the Internet/LAN connectivity to the workstations?

I guess it could, but you don't want that since it kills bandwidth. The only "routing" it needs to do is from the logical server port for the file sharing service to the individual workstations.

You also want to keep them distinct because the speed requirements are likely going to drift apart too. At some point the storage cloud subnet could move to 10GbE or FCoE and onto a PCIe card in the pool Mac Pros.
 
ZFS has built-in support for SSD caching, both for logging updates (ZIL) and as a read cache (L2ARC) (http://constantin.glez.de/blog/2011...tions-about-flash-memory-ssds-and-zfs#benefit ). The NAS protocols sent out to the workstations would be CIFS or NFS, both with relatively native ZFS support. So the Solaris variants (Oracle's or Nexenta's) and FreeBSD will work.


Some RAID cards do it now also. For example, LSI's Nytro cards:

http://www.lsi.com/products/storagecomponents/Pages/NytroMegaRaid.aspx

But those cards are typically restricted to Windows and Linux driver support. They are also limited in this context because you will likely need hundreds of GB to have an effective cache, since the individual files are large and a high number of their blocks are accessed at a time.


Since the solution dumps all of the GPU cards from the box, you will need a user interface that is supplied through the web. You don't really want that confined to the set of workstations in the pool. Most likely the server admin machine isn't in the pool. :)
I was wondering whether ZFS would be suitable for a high demand workload such as this, but that's really my ignorance about ZFS coming out. Since OS X is completely unnecessary and, frankly, not needed for the file server, what kind of hardware in terms of CPU power would be required to support this kind of set up? Would workstation grade components be needed?

Both workstations and servers are connected to the LAN. I'm not trying to use this server as a router.

[ LAN network cloud ] <---> workstation pool <---> [ storage network cloud ]
          |                                                    ^
          +-------------------> file server <------------------+


The "LAN cloud" and "storage cloud" can be (even should be ) on different subnet groupings.

Most likely you don't want everybody to get access to these storage servers. If only a limited group has access, that can be a distinct, separate subnetwork. The "LAN cloud" is for all the mundane Internet and internal traffic (email, chat, web, calendar, etc.). None of that stuff should be bogging down file transfers.

Very similar to keeping the "lights-out management" subnet completely separate from the normal LAN service network. It can be run through separate switches for additional redundancy.




I guess it could, but you don't want that since it kills bandwidth. The only "routing" it needs to do is from the logical server port for the file sharing service to the individual workstations.

You also want to keep them distinct because the speed requirements are likely going to drift apart too. At some point the storage cloud subnet could move to 10GbE or FCoE and onto a PCIe card in the pool Mac Pros.
I assume this would be done via the two GbE ports on the Mac Pro? I can see your points, but would it not be more efficient to use a smart switch with link aggregated GbE connections to all of the machines (a 48 port smart switch with 802.3ad is relatively inexpensive), instead of keeping the LAN and storage components separate, considering the relatively modest bandwidth requirements of Internet and other LAN connectivity?
 
I am looking at the Synology DS3612xs personally. 12x 4TB gets you ~24TB usable in RAID 10 and you can drop a dual 10GbE in it for like $8k. Would need 10GbE switch and cards though.

Small-Tree's 24-port 10GbE switch is bigger than the whole storage budget here: $15,385. Never mind the ~$400/port 10GbE cards (the smallest is a 2-port card). So another $12K for the workstation pool. [ I thought 10GbE would come down out of the nosebleed zone now that Intel has an "on motherboard" chipset solution and 40GbE and 100GbE have arrived. Looks like it is going to drop much more slowly than I thought. ]


As long as the concurrent bandwidth is far below 10GbE, that will work. That means 50-75% of the workstations not doing file server accesses while the others do. Or it means a substantially smaller pool of workstations.

The problem with very large disks and effectively just 6 spindles is that 15 workstations all grabbing different files can easily overwhelm them. Never mind that there is now a choke point at the server boundary. 10Gb/s sounds like a lot up until there are 12 concurrent requests that all want 800Mb/s from storage.

If for the most part the users are "timesharing" access to the storage, this will work quite smoothly. Even an aggregated 4x GbE 3612xs wouldn't be too bad if there were only 2-4 concurrent accesses.

The only missing piece with the Synology is that it doesn't do virtual hybrid volumes. While the 10GbE network may hold up, concurrent high-sequential-bandwidth access only gets solved by throwing "lots of spindles" at the problem.
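The oversubscription point is easy to put numbers on. A sketch with assumed per-stream and per-spindle figures:

# How quickly a single 10GbE server link and ~6 effective spindles get oversubscribed.
# Per-stream and per-spindle figures are assumptions for illustration.

link_gbps          = 10.0
per_request_mbps   = 800        # each workstation pulling a heavy stream
concurrent_clients = 12

demand_gbps = concurrent_clients * per_request_mbps / 1000
print(f"Demand at the server link: {demand_gbps:.1f} Gb/s vs {link_gbps} Gb/s available")

# 12 x 4TB in RAID 10 behaves like ~6 spindles, per the discussion above.
effective_spindles = 6
spindle_seq_mbps   = 1200       # assumed best-case sequential rate per spindle
random_penalty     = 0.3        # assumed loss once 12 clients force the heads to seek

supply_gbps = effective_spindles * spindle_seq_mbps * random_penalty / 1000
print(f"What the spindles can plausibly supply under contention: {supply_gbps:.1f} Gb/s")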


You could just get a switch with a 10Gb uplink or two. Problem solved.

A switch with a 10GbE uplink is fine for moving traffic between switches, but it isn't really going to help with storage pool access between the "worker nodes" and the "storage head server".

If there is low concurrent bandwidth, then you can drop down to just one 6-port GbE card. Use one aggregating switch between the workstation pool and the file server. The cost saved by dropping the other 6-port card and the 4-port card is likely pretty close to covering the switch, so there isn't a huge rise in costs.


It somewhat depends upon how synchronized the workflow is between individual workstations. If everyone hits "File > Open" or "File > Save" within a small timespan every day, it can make that 3612xs groan under the load.
 
In post #11 he said what he was looking at was something like:

( pool of 13 + 2 Mac Pros ) <--- GbE network ---> ( Mac Pro ) <--- SAN/DAS ---> [ storage box ]


That can work as long as the Mac Pro (and/or the GbE network) doesn't become a choke point. The tower box I was pointing to would scale to only as many Mac Pros as are needed to "front" the storage. (That kind of assumes this is all being done via AFP file sharing, because if it is just CIFS or NFS, the Mac Pro at that chokepoint becomes questionable.)

If the $15K storage budget covers everything from the "GbE network" to the right, it is likely way off base. What I was getting at later was that instead of grouping the Mac Pros (the stuff in parentheses) separately from the storage box (the stuff in brackets), it is probably more beneficial to group the money by wrapping it around the whole storage system (and if you need a new network and a "file server" node to distribute bits, that should be in the "storage" budget).
I went by the basic concept of DAS @ 1 per system.

As per the OP's desires, your description seems more accurate, and is similar to mine.

But given the budget, it's a problem, as the potential for choking seems quite high with a simple configuration. Keep in mind, I'm assuming worst case without further information on usage, and in such a case, the equipment to accomplish this won't be cheap (i.e. each user saturating a 10GbE link at the same time).

If the machines in the pool each only need to stream 1-2 4K streams, then a GbE link may work. You would just need to get rid of the switch in the middle and use a storage server that can scale to 15 links. One of the newer E5 boxes with 3 x16 PCIe slots and at least one x8 slot would work (e.g., the HP 820 dual E5 with 3 x16, 1 x8, 2 x4).

Two x8 RAID cards out to two storage expansion boxes
Two x8 6-port GbE cards
One x4 4-port GbE card
[ Internal storage sleds used purely for the storage-serving OS (RAID 1), and also for SSD caching if available in the OS. ]
Assuming the bandwidth per user would fit this scenario, it could work, and it would be a much easier pill to swallow cost-wise.

But we need more information to know one way or the other, so I'm thinking worst case ATM in regard to user bandwidth requirements.
 
I was wondering whether ZFS would be suitable for a high demand workload such as this, but that's really my ignorance about ZFS coming out. Since OS X is completely unnecessary and, frankly, not needed for the file server, what kind of hardware in terms of CPU power would be required to support this kind of set up? Would workstation grade components be needed?

The workstation components are necessary if you want this high a throughput. Mainstream CPUs pragmatically only have 16-20 PCIe lanes' worth of throughput. Only with the 80 PCIe lanes you get with two Xeon E5s are the slots here not highly oversubscribed. However, two E5-2620s would be fine as far as "CPU" goes. RAM is probably a more pressing issue. If ZFS doesn't cache smartly and effectively here, it isn't going to perform well. If you stick to the equivalent of "RAID 10" in ZFS, there shouldn't be a huge amount of parity work to do. Even if you do some RAIDZ vdevs, the 12 cores match ZFS's general philosophy of "lots of cores and threads available".
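Adding up the lanes for the card mix proposed earlier makes the point. The per-card lane counts are the ones specified above; the platform totals are round numbers:

# PCIe lane budget: card mix from the earlier proposal vs. what the platform offers.

cards = {
    "RAID card #1 (to shelf 1)": 8,
    "RAID card #2 (to shelf 2)": 8,
    "6-port GbE card #1":        8,
    "6-port GbE card #2":        8,
    "4-port GbE card":           4,
}

lanes_needed     = sum(cards.values())   # 36
mainstream_lanes = 20                    # roughly what a single mainstream CPU offers
dual_e5_lanes    = 80                    # two Xeon E5s, 40 lanes each

print(f"Lanes the cards want: {lanes_needed}")
print(f"Mainstream platform:  {mainstream_lanes} -> oversubscribed: {lanes_needed > mainstream_lanes}")
print(f"Dual Xeon E5:         {dual_e5_lanes} -> oversubscribed: {lanes_needed > dual_e5_lanes}")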

If you don't need that much concurrent throughput then yeah you can scale down.


I assume this would be done via the two GbE ports on the Mac Pro? I can see your points, but would it not be more efficient to use a smart switch with link aggregated GbE connections to all of the machines (a 48 port smart switch with 802.3ad is relatively inexpensive),

No for a couple of reasons.

First, because link aggregating gets rid of the security separation. One thing that Fibre Channel and iSCSI have is a notion that the client and server should be authenticated to each other. It is more "efficient" from a security checking standpoint to know that the wifi phone in the conference room can't connect to the storage server(s) at all.


If the switch has VLAN support in addition to 802.3ad, you can use it to implement both subnets with one device. And if you don't need the very high concurrency bandwidth to the server, you can drop down to just one 6-port card feeding the single Mac Pro port on each workstation.

You don't get the dual link off the Mac Pro, but if the Mac Pro is not getting more than 1GbE worth of feed, that is a bandwidth cost issue, not an efficiency issue. A separate 2-port GbE card per Mac Pro solves that if it gets to be an issue (or a low-cost 1-port GbE card if there are expected empty slots... you don't really need an aggregate link to send mundane traffic). In this context, with the budget constraints, the storage system most likely isn't going to be able to pump out more than 1GbE to more than 2-3 systems anyway at these prices.

But yes, if it got to the point where the server could pump out data faster than the Mac Pros could consume it and there wasn't more budget... bounding the "inbound" on the Mac Pros would be a reasonable stopgap.



However, it isn't going to be more efficient to mix "Apple notification server" traffic, IM, and other chirpy traffic onto the same Ethernet connection. None of that is going to improve bandwidth to the storage server if the Mac Pro connections are switching over to go do those things. If there is one thing that sends Ethernet overhead climbing, it is contention from multiple systems on the same "line" trying to jump in and get access at the same time.


instead of keeping the LAN and storage components separate, considering the relatively modest bandwidth requirements of Internet and other LAN connectivity?

It isn't the GB/s transferred, it is the switching back and forth that is the issue. If the server has to resend jumbo frames because they got lost in UDP, you'll lose more data than with normal-sized packets.
 
But given the budget, it's a problem, as the potential for choking seems quite high with a simple configuration. Keep in mind, I'm assuming worst case without further information on usage, and in such a case, the equipment to accomplish this won't be cheap (i.e. each user saturating a 10GbE link at the same time).

Humans don't really work all that fast compared to computers. It would also be more than extreme to be shipping around uncompressed 4K video as the primary activity to a large number of workstations. A losslessly compressed 4K video file is significantly easier because you're not shipping around a huge amount of largely redundant data. That would be a core issue of framing the solution around moving something that doesn't really need to be moved.
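To see why uncompressed 4K is the extreme case, here is the raw arithmetic for a single stream. The compression ratio is an assumption; real ratios depend on the codec and the footage:

# Raw bit rate of a single 4K stream, uncompressed vs. losslessly compressed.
# The compression ratio is an assumption; real ratios depend on codec and footage.

width, height  = 4096, 2160     # DCI 4K frame
bits_per_pixel = 20             # 10-bit 4:2:2
fps            = 24
lossless_ratio = 2.5            # assumed ~2.5:1 for lossless compression

uncompressed_gbps = width * height * bits_per_pixel * fps / 1e9
compressed_gbps   = uncompressed_gbps / lossless_ratio

print(f"Uncompressed 4K stream:          {uncompressed_gbps:.2f} Gb/s")
print(f"Losslessly compressed (~{lossless_ratio}:1): {compressed_gbps:.2f} Gb/s")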



But we need more information to know one way or the other, so I'm thinking worst case ATM in regard to user bandwidth requirements.

I think the point, more so, would be how to easily go about measuring the "canonical bandwidth" for this problem. Logging Activity Monitor info or other instrumentation.
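One low-effort way to get at that "canonical bandwidth" is to sample the NIC counters on a few workstations over a working day. A minimal sketch in Python, assuming the psutil package is installed ("en0" is a placeholder interface name):

import time
import psutil

# Sample per-NIC throughput once a second and track the peaks.
# "en0" is a placeholder; point it at the link actually carrying storage traffic.
IFACE    = "en0"
INTERVAL = 1.0        # seconds between samples
SAMPLES  = 8 * 3600   # roughly one working day at 1s resolution

peak_rx = peak_tx = 0.0
prev = psutil.net_io_counters(pernic=True)[IFACE]

for _ in range(SAMPLES):
    time.sleep(INTERVAL)
    cur = psutil.net_io_counters(pernic=True)[IFACE]
    rx_mbps = (cur.bytes_recv - prev.bytes_recv) * 8 / 1e6 / INTERVAL
    tx_mbps = (cur.bytes_sent - prev.bytes_sent) * 8 / 1e6 / INTERVAL
    peak_rx = max(peak_rx, rx_mbps)
    peak_tx = max(peak_tx, tx_mbps)
    print(f"{time.strftime('%H:%M:%S')}  rx {rx_mbps:8.1f} Mb/s  tx {tx_mbps:8.1f} Mb/s")
    prev = cur

print(f"Peaks over the run: rx {peak_rx:.0f} Mb/s, tx {peak_tx:.0f} Mb/s")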
 
Humans don't really work all that fast compared to computers. It would also be more than extreme to be shipping around uncompressed 4K video as the primary activity to a large number of workstations. A losslessly compressed 4K video file is significantly easier because you're not shipping around a huge amount of largely redundant data. That would be a core issue of framing the solution around moving something that doesn't really need to be moved.
Compressed information may not be possible though, assuming users will be creating the content from scratch initially, as the application may not be able to compress the initial data (need to save an uncompressed copy, then run a compression tool).

I certainly understand the point you're trying to make though, and it would be usable at some point, if not initially.

I think the point, more so, would be how to easily go about measuring the "canonical bandwidth" for this problem. Logging Activity Monitor info or other instrumentation.
Metrics are definitely needed, but I suspect their current setup isn't offering the performance they need per system/user anyway (no mention of a significant DAS on either existing workstation mentioned, so I presume just a few drives internal to the MP is the current configuration).

Thus the desire for more precise data as to what the avg. user on these systems is doing (how many 4K streams on avg., whether it's compressed or not, concurrent use, ...) in order to determine the performance they're actually going to need. Otherwise, a solution is as likely to fall short as to be in excess of what's actually needed. Granted, it's not as precise as metrics provided by monitoring software, but enough of it is based on best guesses from available information to begin with (i.e. growth rate and capacity consumption rate, as in reality, either could change from what a system was designed around).

Once something is implemented, proper metric/monitoring software tools can be implemented in order to keep up with it, and maintain efficiency as time progresses.
 
Thanks a lot people.
Now I know a lot more about storage on server :)

As soon as we do the setup, I'll let you know our solution. Thanks again
 
Thanks a lot people.
Now I know a lot more about storage on server :)

As soon as we do the setup, I'll let you know our solution. Thanks again

I am setting up something very similar for compositing and 3D work.

I am going to go with a Pegasus R6 and a Mac mini server set up with dual-link aggregation, and two Drobos for backup of the whole system.

I'd love to hear how your solution works out.
 