
alphaod (original poster)
So I recall Phil mentioned that the new Mac Pro supports channel bonding on Thunderbolt 2.

I'm thinking: could Apple possibly allow users to interconnect, say, three Mac Pros using Thunderbolt to communicate with one another and act as one coherent unit?

I think this would greatly increase the versatility of the new Mac Pro. Can you imagine the render performance of combining 6 powerful graphics cards, 3 processors, and terabytes of very fast solid-state storage? Or maybe even more?! All these computers, even at maximum load, would still use only a fraction of the power of a regular server, with less noise than the older Mac Pro.

Perhaps I'm just dreaming, but this seems possible with the increased bandwidth of six Thunderbolt 2 ports and the new Mac Pro's modular design.
 
All these computers, even at maximum load, would still use only a fraction of the power of a regular server, with less noise than the older Mac Pro.

What is the decibel level of the nMP at load? Apple has not yet published that information.
 
I'm thinking: could Apple possibly allow users to interconnect, say, three Mac Pros using Thunderbolt to communicate with one another and act as one coherent unit?

Not possible.

This is quite possibly the ultimate example of "easier said than done". Nobody seems to realize what kind of silicon you need to do this, or the extent of the software modifications required on top of that. Neither Apple's hardware nor Mac OS X will ever support anything even remotely like this.

I think this would greatly increase the versatility of the new Mac Pro. Can you imagine the render performance of combining 6 powerful graphics cards, 3 processors, and terabytes of very fast solid-state storage? Or maybe even more?! All these computers, even at maximum load, would still use only a fraction of the power of a regular server, with less noise than the older Mac Pro.

I keep reading this and I don't get it.

Nearly every renderer out there worth the asking price already supports this. It's called distributed network rendering, and what puzzles me is that if you've ever used a medium- to high-end renderer, you should already know this.

V-Ray supports it, Maxwell supports it, Octane supports it, Cinema 4D's AR3 and XMB support it (in R15 via Team Render), Luxology's Modo supports it, Mental Ray supports it, and most RenderMan-compliant renderers support it too. After Effects and Nuke support network rendering for large jobs as well.

The nMP will automatically do a 10Gb/s interlink between machines via a Thunderbolt cable (this is the same thing as having a 10GbE network adapter). What more do you need? Again, if you're using a recent renderer, you should have absolutely no problems rendering across more than one machine. We've had this technology for years; no, decades now.

So again, you don't need to be able to "combine" Mac Pros into a single machine to take advantage of their power. You don't even need a fast network connection to do it, though in general the faster the better, especially for large scenes. You should know this if you're working in the CG industry.
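
To make it concrete, here is a rough sketch of what a render-farm dispatcher boils down to. This is only an illustration; the "render" command, scene file, and hostnames are hypothetical placeholders, and in practice V-Ray, Team Render, and the rest handle the dispatch (plus asset syncing and frame collection) for you:

# Minimal sketch of distributed frame rendering: split a frame range
# across render nodes and launch each chunk over SSH.
# The hostnames and the "render" CLI are hypothetical placeholders.
import subprocess

HOSTS = ["macpro1.local", "macpro2.local", "macpro3.local"]  # render nodes
FIRST, LAST = 1, 300                                         # frame range

def chunks(first, last, n):
    """Split the inclusive frame range into n near-equal (start, end) chunks."""
    total = last - first + 1
    step, rem = divmod(total, n)
    start = first
    for i in range(n):
        end = start + step - 1 + (1 if i < rem else 0)
        yield start, end
        start = end + 1

procs = []
for host, (start, end) in zip(HOSTS, chunks(FIRST, LAST, len(HOSTS))):
    # Each node renders its own slice of the sequence independently.
    cmd = ["ssh", host, "render", "--scene", "shot.scn",
           "--start", str(start), "--end", str(end)]
    procs.append(subprocess.Popen(cmd))

for p in procs:
    p.wait()  # frames land on shared storage; no single-system image needed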

-SC
 
Not possible.

This is quite possibly the ultimate example of "easier said than done". Nobody seems to realize what kind of silicon you need to do this, or the extent of the software modifications required on top of that. Neither Apple's hardware nor Mac OS X will ever support anything even remotely like this.

This is so complicated and so much "easier said than done"? Here is an example in OS X 10.1:
http://www.stat.ucla.edu/computing/clusters/deployment.php

That was back on G3s.

Or here where you pick a scripting/programming language to do it in...
http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm

What exactly is it that you think needs to be there in terms of silicon? The new Mac Pro is capable of 7 teraflops. Do YOU understand what that means? Really, I am asking: do YOU understand what 7 teraflops is?

What part of this is so impossible?
 
network rendering w/ Ethernet vs. TB/2

This is so complicated and so much "easier said than done"? Here is an example in OS X 10.1:
http://www.stat.ucla.edu/computing/clusters/deployment.php

That was back on G3s.

Or here where you pick a scripting/programming language to do it in...
http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm

What exactly is it that you think needs to be there in terms of silicon? The new Mac Pro is capable of 7 teraflops. Do YOU understand what that means? Really, I am asking: do YOU understand what 7 teraflops is?

What part of this is so impossible?

Note the difference between network rendering via Ethernet or the Web vs. a direct connection via Thunderbolt/2 …

Unless you have some Token Ring or FireWire-networking-style controller, it won't work out of the box…

Is the issue clearer now?
 
So I recall Phil mentioned that the new Mac Pro supports channel bonding on Thunderbolt 2.

It is far closer to Thunderbolt 2 itself being the channel bonding. The same underlying channels that are present in Thunderbolt v1 are being bonded together to get to the higher speed.

Primarily it is a trade-off: instead of largely separate channels for encoded PCIe data and encoded DisplayPort data, there is a single virtual channel that carries both. As long as it is only carrying one kind of traffic you can see a net increase in many cases, but if it is carrying a healthy mix of both, pragmatically it is going to be the exact same real-world performance. TB v2 brings no aggregate bandwidth increase; it just changes how the sharing happens.
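
Rough numbers to illustrate: Thunderbolt v1 carries two independent 10 Gb/s channels per port, so any single stream tops out at 10 Gb/s even though 2 x 10 = 20 Gb/s is physically present. Thunderbolt v2 bonds those into one 20 Gb/s channel, so a single stream (say, a large PCIe storage transfer) can now use the full 20 Gb/s, while a healthy mix of PCIe and DisplayPort traffic still shares the exact same 20 Gb/s aggregate.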



I'm thinking: could Apple possibly allow users to interconnect, say, three Mac Pros using Thunderbolt to communicate with one another and act as one coherent unit?

They do seem to allow running TCP/IP over Thunderbolt in Mavericks.

" ... High-speed collaboration.
With Thunderbolt 2 you can connect to your workgroup via SAN or NAS shared storage with up to 20Gb/s of bandwidth. Want to hook up for a simple and fast local transfer? IP over Thunderbolt support in OS X Mavericks gives you a 10Gb connection to another Mac for the cost of a mere cable. Connect several Mac Pro systems together with Thunderbolt cables and you have an instant network render cluster. ... "
http://www.apple.com/mac-pro/performance/

That will NOT get you a single coherent system. Thunderbolt is too slow for a single-system coherent interconnect; about an order of magnitude too slow. (There are previous threads on this, ad nauseam.)

As outlined in the snippet above, this is far more targeted at setting up a "cheap" SAN network than a coherent system image.


I think this would greatly increase the versatility of the new Mac Pro. Can you imagine the render performance of combining 6 powerful graphics cards, 3 processors, and terabytes of very fast solid-state storage?

None of that particularly requires a single coherent system image.


and the new Mac Pro's modular design.

In the sense that the Mac Pro has moved to enabling external expansion, it is modular. But this is nowhere near the "snap together" component variation of modular design.

Multiple computers sitting on a LAN are a network, far more so than a modular design.
 
Not possible.

It could be done (it is possible). The performance, unless the applications were carefully crafted, would suck; Thunderbolt is too slow to do this effectively. However, running the Thunderbolt TCP/IP interconnect as the backbone of a PVM ( http://en.wikipedia.org/wiki/Parallel_Virtual_Machine ) or MPI ( http://en.wikipedia.org/wiki/Message_Passing_Interface ) set-up would probably work pretty well. Take some embarrassingly parallel apps, partition the data correctly, and it should turn in decent performance numbers for the amount of money invested in infrastructure.
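
For the embarrassingly parallel case, here is a minimal sketch of what such a job looks like, using the mpi4py bindings for MPI; the hostnames in the run command are hypothetical stand-ins for Mac Pros linked over the Thunderbolt TCP/IP interconnect:

# Embarrassingly parallel MPI sketch (mpi4py). Run with, e.g.:
#   mpiexec -n 4 -host macpro1,macpro2 python sum_squares.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this process's id across all nodes
size = comm.Get_size()   # total number of processes

N = 10**7
# Partition the work: each rank takes every size-th element.
local = sum(i * i for i in range(rank, N, size))

# Combine the partial sums on rank 0; this is the only communication step,
# so the interconnect speed barely matters for this kind of workload.
total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("sum of squares:", total)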


Neither Apple's hardware nor Mac OS X will ever support anything even remotely like this.

The hardware is agnostic to this, but OS X can't even get NUMA (inside the same box, with multiple CPU packages) quite right, let alone across boxes. Apps transparently leveraging this? No. But for folks who already have PVM/MPI apps it would not be a big problem.


The nMP will automatically do a 10Gb/s interlink between machines via a Thunderbolt cable (this is the same thing as having a 10GbE network adapter). What more do you need?

Probably something more than just a point-to-point connection. Not sure, but I don't think you get an arbitrary network topology. I think TCP/IP over TB is going to make for a nice hub-and-spoke small-workgroup SAN network. It is an open question to me whether this actually scales past small-workgroup-sized LAN setups. That means the clustering is also capped.

That isn't a bad thing. The individual nodes (with dual GPUs) are no slouches when it comes to TFLOPS horsepower. Just 2-3 nodes are enough for a sizable number of possible jobs.
 
Probably something more than just a point-to-point connection. Not sure, but I don't think you get an arbitrary network topology. I think TCP/IP over TB is going to make for a nice hub-and-spoke small-workgroup SAN network. It is an open question to me whether this actually scales past small-workgroup-sized LAN setups. That means the clustering is also capped.

Also, it will be interesting to see how much CPU this uses, doing the TCP/IP stack in OS X instead of offloading it to the NIC.
 
Also, it will be interesting to see how much CPU this uses, doing the TCP/IP stack in OS X instead of offloading it to the NIC.

Technically, it is just an IP stack. If most stack consumers are UDP or something like FCoE users, then there isn't much TCP to offload in most cases. :) TCP makes sure your data arrives correctly and in the right order. If you're just pumping packets onto a network, the overhead isn't that high.

Likewise, if folks are running their own custom "right order and arrives safely" protocol using UDP tools, it is all running on the CPU anyway (no TCP, no TCP offload).
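
To make "right order and arrives safely over UDP" concrete, here is a toy stop-and-wait sender; the peer address, the 4-byte sequence header, and the timeout are all hypothetical choices, and a real protocol would keep many packets in flight:

# Toy stop-and-wait reliability over UDP: number each packet, wait for an
# ACK echoing that number, and retransmit on timeout. All of this runs on
# the CPU, which is exactly the work TCP offload hardware would do for you.
import socket
import struct

PEER = ("10.0.0.2", 9999)  # hypothetical address of the other Mac Pro
TIMEOUT = 0.2              # seconds to wait before retransmitting

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.settimeout(TIMEOUT)

def send_reliable(seq, payload):
    packet = struct.pack("!I", seq) + payload  # 4-byte sequence header
    while True:
        sock.sendto(packet, PEER)
        try:
            ack, _ = sock.recvfrom(4)
            if struct.unpack("!I", ack)[0] == seq:
                return                 # receiver confirmed this packet
        except socket.timeout:
            pass                       # lost packet or lost ACK: resend

for seq, chunk in enumerate([b"hello", b"world"]):
    send_reliable(seq, chunk)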

It would make sense if Apple hooked this into Xsan usage somehow, so that folks didn't have to set up a physical Fibre Channel network to get it working. For example, 2-3 Mac Pros as clients hooked to 1 Mac Pro as a server (with the expensive disks hanging off it) would be a simple way for a small shop to set up a decent SAN without a lot of drama or consultants.

I don't think this is a general alternative for folks who need 10GbE connections that aren't entirely local to a small workgroup (e.g., that need to make 1-2 hops through switches to get to where the TCP/IP packets are going). It is good for being able to "skip" the expensive switch and get a very limited cluster/network going.
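
And if you want to sanity-check what the link actually delivers once it is up, nothing beyond standard sockets is needed. A rough sketch, assuming the Thunderbolt bridge shows up as an ordinary network interface with a placeholder address like the one below:

# Crude point-to-point throughput check over the IP-over-Thunderbolt link.
# Run receive() on one Mac, then send() on the other. The address is a
# placeholder for whatever the Thunderbolt bridge interface is assigned.
import socket
import time

ADDR = ("10.0.0.2", 5001)    # hypothetical TB-bridge address of the receiver
CHUNK = b"\x00" * (1 << 20)  # 1 MiB buffer
TOTAL = 1 << 30              # push 1 GiB total

def receive():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", ADDR[1]))
    srv.listen(1)
    conn, _ = srv.accept()
    while conn.recv(1 << 20):  # drain until the sender closes
        pass

def send():
    s = socket.create_connection(ADDR)
    start, sent = time.time(), 0
    while sent < TOTAL:
        s.sendall(CHUNK)
        sent += len(CHUNK)
    s.close()
    print("%.2f Gb/s" % (sent * 8 / (time.time() - start) / 1e9))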
 
This is so complicated and so much "easier said than done"? Here is an example in OS X 10.1:
http://www.stat.ucla.edu/computing/clusters/deployment.php

That was back on G3s.

Or here where you pick a scripting/programming language to do it in...
http://bonsai.hgc.jp/~mdehoon/software/cluster/software.htm

What exactly is it that you think needs to be there in terms of silicon? The new Mac Pro is capable of 7 teraflops. Do YOU understand what that means? Really, I am asking: do YOU understand what 7 teraflops is?

What part of this is so impossible?

You should really re-read the OP's original post, and I quote:

I'm thinking: could Apple possibly allow users to interconnect, say, three Mac Pros using Thunderbolt to communicate with one another and act as one coherent unit?

What you're talking about is software-based clustering. The application needs to be written to handle operation across a network. That's fine and dandy; like I said in my previous post, most 3D renderers already do that, so there's no need to be able to turn multiple Mac Pros into a single computer.

What I'm talking about, and what I think the OP was thinking about, was something more akin to SGI's Altix NUMA bus. Each of those nodes had two big fat high-density data connectors on the back, which could be used either to daisy-chain systems together or to connect them to a NUMA router. The entire system was then configured to boot a single operating system image, and once everything was running you essentially had one big giant computer with all the processors accessible locally. No networking required.

So, once again:

1) Hardware clustering (making N+1 Mac Pros appear as a single computer) is highly improbable, since it requires deep hardware support that extends all the way down to the processors.

2) Software clustering already exists, but your applications need to be written to support it. As I have already stated, most 3D renderers already support this, so there is no need for #1.

PS: I am well aware of what a teraflop is, but I'm not sure why you're bringing that into this discussion. Are you sure YOU know what it means? Because teraflops have nothing to do with clustering.

-SC
 
Yes they have!

My bad, they didn't have it under their "specs" section, just the idle dBA.

Seems kind of unbelievable that they can cool 2 GPUs and a 12-core CPU running at "load" at 17 dBA, which is less noise than a 200mm fan... I wonder if Apple is fudging numbers again.

Edit: Hey, wait a minute. The old Mac Pro was only 29 dBA at high load? My ass. When transcoding big files it was louder than the auxiliary fan I installed, which is also rated at 29 dBA. I could hear that thing in my garage. How are they measuring? From 10 feet away?

Edit 2: My MP is also louder than my Windforce, which is supposedly 32 dBA.
 
Technically, it is just an IP stack. If most stack consumers are UDP or something like FCoE users, then there isn't much TCP to offload in most cases. :) TCP makes sure your data arrives correctly and in the right order. If you're just pumping packets onto a network, the overhead isn't that high.

Likewise, if folks are running their own custom "right order and arrives safely" protocol using UDP tools, it is all running on the CPU anyway (no TCP, no TCP offload).

You obviously understand networking better than the Apple marketing droids who fed copy to the website.

They didn't know how to spin "it's fast, but your data gets scrambled and corrupted during transit" or "it behaves like you expect a network to behave, but uses a ton of CPU"....

Those of us who understand the difference between "IP" and "TCP/IP" and deal with FCoE should be very skeptical of Apple's claims in the ad copy.
 
FWIW, my 2008 Mac Pro never ramps/ramped the fan up, even when I ran BOINC on it for a few days. Under full tilt it remains at the same speed/noise level.
 
They didn't know how to spin "it's fast, but your data gets scrambled and corrupted during transit" or "it behaves like you expect a network to behave, but uses a ton of CPU"....

If this is primarily a single point-to-point link, the kind of corruption and scrambling problems that TCP is primarily designed to handle aren't going to happen all that often. If you're only going from point A to point B over a single wire, it is kind of hard for the packets to arrive out of order. Thunderbolt has its own error-correction mechanisms, so the data loss will be low too.

The Mac Pro has two 1GbE sockets, so by no means should folks give up using those for normal Internet data access. "IP over Thunderbolt" is going to be more useful for some usages than others; I don't think the ad copy is trying to make it out to be universally useful in all situations, just that it is there as an option.

If the "threat" of "IP over Thunderbolt" finally gets the network vendors to get serious about pricing 10GbE cards/switches at the maturity price levels they should be at then great. For now thought there is the trade off of the extra $2-4K you need to spend to deploy a 2-4 node 10GbE network versus 4 thunderbolt cables to do this one. ( I guess for now if using long distance fiber TB cables can still be up in the same price range. ). For a small cluster an additional 3-4K can also by more x86 CPU cores too.
 