
Umbongo

http://blogs.intel.com/technology/2...ocessors-accelerate-discovery-and-innovation/
http://www.intel.com/newsroom/kits/isc/2012/pdfs/Intel_ISC_2012-Presentation.pdf

Made with Intel's innovative 22nm, 3-D tri-gate transistors, the Intel Xeon Phi coprocessor, available in a PCIe form factor, contains more than 50 cores and a minimum of 8GB of GDDR5 memory. It also features 512-bit wide SIMD support that improves performance by enabling multiple data elements to be processed with a single instruction. Last year Intel showed a live demonstration of a single Knights Corner coprocessor delivering over 1 teraflops (1 trillion floating point operations per second) of double precision real-life performance, as measured by DGEMM.
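For anyone wondering what DGEMM actually measures: it's just a dense double-precision matrix multiply. A naive, unblocked C sketch of the operation being benchmarked (real BLAS libraries tile for cache and lean on those 512-bit SIMD units, but the math being counted is the same):

    #include <stddef.h>

    /* Naive DGEMM core: C = alpha*A*B + beta*C, all n x n, row-major.
       Tuned BLAS implementations block for cache and vectorize; this
       just shows the operation the TFLOPS figure is counting. */
    void dgemm_naive(size_t n, double alpha, const double *A,
                     const double *B, double beta, double *C)
    {
        for (size_t i = 0; i < n; i++) {
            for (size_t j = 0; j < n; j++) {
                double sum = 0.0;
                for (size_t k = 0; k < n; k++)
                    sum += A[i * n + k] * B[k * n + j];
                C[i * n + j] = alpha * sum + beta * C[i * n + j];
            }
        }
    }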

Not really Mac-related news, but still of interest to some on this forum I'm sure. They do want to push this to workstations in the future.
 
It looks great, but it's going to mean we need more OpenCL support.
 
This isn't a GPU; it doesn't support OpenCL.
-SC

OpenCL isn't just bound to GPUs.

Here is a list of CPUs & GPUs that conform to the OpenCL specs. Although being x86-based might mean that it isn't limited to OpenCL code.

http://www.khronos.org/conformance/adopters/conformant-products/

Probably not Mac bound, but it would be great to be able to put one of these in a PCIe slot and run an instance of a 3D render engine or anything that scales well to multiple cores. Being x86-based and double-precision capable, it will be far more useful in everyday creative apps than a multi-core GPU at this point in time.

Cheers
 
Probably not Mac bound, but it would be great to be able to put one of these in a PCIe slot and run an instance of a 3D render engine or anything that scales well to multiple cores. Being x86-based and double-precision capable, it will be far more useful in everyday creative apps than a multi-core GPU at this point in time.

I suppose it depends on whether the next Mac Pros are worth providing add-in card compatibility for.
 
Well, apparently I was wrong.

http://software.intel.com/en-us/blogs/2012/06/05/knights-corner-open-source-software-stack/

The cards themselves boot a variant of Linux. Nearly everything is open source, so writing drivers for Mac OS X would just require someone with previous experience in doing so (but apparently the drivers are pretty simple from a programming standpoint), assuming Intel doesn't just do it for us.

So it won't run OpenCL (unless you ported OpenCL to it), but who cares? If this thing runs as clean as they say it does (software-wise, no nasty GPU drivers to contend with), then they might actually be onto something here. It's been a long time since I've seen any sort of coprocessor worth programming for and taking advantage of.

I wouldn't be surprised if someone has this thing running under OS X within months of release. Since OpenCL is such a pain in the ass to port to, Ape_Man might be right: we might see fancy things like 3D raytracers and physics/volumetric simulators running on Phi faster than we're seeing them crop up in OpenCL/CUDA.

I just hope the price is reasonable, and nothing obscene like $8K.

-SC
 
This isn't a GPU; it doesn't support OpenCL.

It is unlikely that this card will ever operate under OS X.

-SC

It does actually. Intel is releasing OpenCL drivers for it.

No telling if they are doing Mac drivers, but it would be nice. If Thunderbolt got up to speed, this would be great for an external box.

(The OpenCL bit is detailed online, but I've also had conversations with engineers who were on this project. OpenCL != GPU also. OpenCL can even run on your normal CPU. It doesn't even have graphical output routines. NVidia already makes headless cards that are meant for CUDA/OpenCL.)
 
It does actually. Intel is releasing OpenCL drivers for it.

No telling if they are doing Mac drivers, but it would be nice. If Thunderbolt got up to speed, this would be great for an external box.

(The OpenCL bit is detailed online, but I've also had conversations with engineers who were on this project. OpenCL != GPU also. OpenCL can even run on your normal CPU. It doesn't even have graphical output routines. NVidia already makes headless cards that are meant for CUDA/OpenCL.)

Having read some discussion of it, I don't believe anything special needs to be done to get one working in a Mac. It runs Linux natively and is designed to work in any system.
 
http://blogs.intel.com/technology/2...ocessors-accelerate-discovery-and-innovation/
http://www.intel.com/newsroom/kits/isc/2012/pdfs/Intel_ISC_2012-Presentation.pdf



Not really Mac-related news, but still of interest to some on this forum I'm sure. They do want to push this to workstations in the future.
I am a bit behind this stuff these days so please be gentle. Is this similar in functionality to an Nvidia Tesla, without the GPU video output bit?

Edit: Now that I've read the Anandtech article, I see that it is. Unless I've lost my reading comprehension.
 
No telling if they are doing Mac drivers, but it would be nice. If Thunderbolt got up to speed, this would be great for an external box.

What???? This would be horrible in an external box. Are you talking about one of those x4-lane PCIe v2.0 TB break-out boxes? Horrible.

What the drivers do is set up a virtual network connection between the host and the PCIe card. When the card leverages an x16-lane PCIe v3.0 interface, it is almost like getting a low-cost 40Gb/s InfiniBand connection between two separate computers that provides a virtual Ethernet link. Only there is no expensive InfiniBand switch or cards involved. Just a software driver to provide a virtual network between the Linux running on the card and the host system.
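To make the "virtual network" point concrete: once that driver is loaded, the card's Linux is reachable like any other machine on a network, so plain socket code works. A hypothetical host-side sketch (the card's address, port, and the one-line job protocol are all made up for illustration):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        /* Hypothetical: the driver exposes the card as a network
           interface; 192.168.1.100:5000 is a made-up endpoint. */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in card = {0};
        card.sin_family = AF_INET;
        card.sin_port = htons(5000);
        inet_pton(AF_INET, "192.168.1.100", &card.sin_addr);

        if (connect(fd, (struct sockaddr *)&card, sizeof(card)) < 0) {
            perror("connect to coprocessor");
            return 1;
        }
        const char *job = "run_job input.dat\n"; /* made-up protocol */
        write(fd, job, strlen(job));
        close(fd);
        return 0;
    }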

With a card that maxes out at 1 TFLOPS, if you don't get the data to the card to be crunched, those TFLOPS are useless. Sure, the card will have 8GB of memory, but if the pipe moving the input and output results is relatively slow, the card will stall out. Thunderbolt is about equivalent to x4 v2.0 (roughly 2GB/s). An x16 v3.0 link (roughly 16GB/s) is about 8 times faster.

No. An alternative-universe Mac Pro that had three x16 PCIe v3.0 slots and the power to run two Xeon Phi cards would be much better than jumping through throttling hoops to loop in Thunderbolt. For example, dual E5-2630s, an embedded GPU to "solve" the Thunderbolt issue, and the two open x16 PCIe slots used for the Phi cards would kill as a computational box. That's 12 "full-sized" x86 cores and another 100 "vector-oriented" x86 ones.

That's a far more productive box than trying to "choke" 50 vector x86 cores with Thunderbolt.
 
I am a bit behind this stuff these days so please be gentle. Is this similar in functionality to an Nvidia Tesla, without the GPU video output bit?

Edit: Now that I've read the Anandtech article, I see that it is. Unless I've lost my reading comprehension.

Basically, yes, but I think it's x86. It's kind of like a bunch of 50+ Atom processors, so it should be a little easier to port stuff to it.

I'd expect it to be fairly easy for Apple to put in a Mac Pro (possibly even old ones), and offer support in their Pro apps, as long as they still give a ****. If not, there might be third party support (Adobe).
 
I just hope the price is reasonable, and nothing obscene like $8K.

Probably between $2,600 and $4,000.

Highly likely more than an entry Mac Pro will cost.

The current Nvidia Tesla C2075 is going for around $2,100-2,400, and it only has 6GB of memory and is substantially slower. I doubt Nvidia, Intel (and maybe AMD) are going to get into a price war on these things and drive the costs much lower. However, it is doubtful that Intel could get away with an $8,000 price if Nvidia is pricing their Kepler solutions in the $2-4K range.

The K10's are going to be competitive to these.

http://www.theregister.co.uk/2012/06/18/nvidia_isc_tesla_k10_performance/

The only major upside the Xeon Phi has is that the porting time for current Linux supercomputer apps is probably a bit shorter. Not quite just a simple recompile, but better than custom code that needs to be reworked to CUDA and then matched to the new Nvidia architecture.
 
Basically, yes, but I think it's x86. It's kind of like a bunch of 50+ Atom processors, so it should be a little easier to port stuff to it.

It should be easy to port OpenMP and OpenMPI stuff to it. Not so sure about Mac apps ported to the other side. Unless there is a "pure" computation engine that is POSIX/Unix neutral, that's a problem.
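To illustrate how little changes for OpenMP code, a trivial sketch in plain C. On the Phi, Intel's compiler layers an offload pragma on top; the "#pragma offload target(mic)" spelling in the comment is from Intel's published materials, so treat it as illustrative rather than verified:

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000
    static float a[N], b[N], c[N];

    int main(void)
    {
        /* Plain OpenMP: scales across however many cores the runtime
           sees. Per Intel's docs, the same loop can be pushed to the
           card with, e.g.:  #pragma offload target(mic) in(a,b) out(c)
           (illustrative spelling, not verified here). */
    #pragma omp parallel for
        for (int i = 0; i < N; i++)
            c[i] = a[i] + b[i];

        printf("ran across %d threads\n", omp_get_max_threads());
        return 0;
    }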


I'd expect it to be fairly easy for Apple to put in a MacPro (possibly even old ones), and offer support in their Pro apps, as long as they still give a ****.

Except there are no InfiniBand cards for the Mac Pro (unless things have changed). There are no Tesla cards for the Mac Pro. Currently no official Quadro support. The last two mean it is no surprise there is no Maximus (http://www.nvidia.com/object/maximus.html ) support.


And no, it won't be easy for "old" Mac Pros, because these cards eat power. Unless you're going to chuck the power-eating GPU card already present, it is unlikely the "old" Mac Pro can power two cards with the appropriate power budgets. [Minus the hackery of external power supplies, which is a nice hack but not a commercial solution.]





If not, there might be third party support (Adobe).

Doubtful. Unless someone ports the low-level drivers for Intel's Symmetric Communications Interface (SCIF) ( http://software.intel.com/en-us/blogs/2012/06/05/knights-corner-open-source-software-stack/ ), this card is a no-go. There are some parts of SCIF that get called from kernel space ( ".... The SCIF APIs are callable from both user space (uSCIF) and kernel-space (kSCIF). .... " ). This low-level card support is likely going to be very similar to graphics cards, where the vendor needs to do some work and Apple also needs to do some level of work.
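For the curious, the user-space half (uSCIF) looks socket-like. A rough sketch using the names from Intel's stack write-up (Linux-only today; the exact signatures and the port number here are approximate, and this is precisely the layer an OS X port would have to supply underneath):

    #include <scif.h>  /* from Intel's MPSS stack; Linux-only right now */

    /* Approximate uSCIF usage, host side: open an endpoint and
       connect to a service on the card (node 1). Port is made up. */
    int scif_hello(void)
    {
        scif_epd_t epd = scif_open();
        struct scif_portID card = { 1, 2050 };

        if (scif_connect(epd, &card) < 0)
            return -1;
        const char msg[] = "hello";
        scif_send(epd, (void *)msg, sizeof(msg), SCIF_SEND_BLOCK);
        scif_close(epd);
        return 0;
    }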

The other problem is that so far it appears Intel is not going to sell these at retail. I think AnandTech's title is a bit deceptive on that front. Most articles so far suggest Intel is going to sell these to system vendors, who will offer the cards as "configure to order" options. You'll be able to buy systems with them inside, but you won't be buying the individual cards at retail.

So on both the kernel and system configuration fronts, if Apple is not involved, this is likely a no-go.
 

If you look at the matrix of current Quadro cards

http://www.nvidia.com/object/workstation-solutions.html

There are currently the 5000, 6000, and 7000 (**) series that are not supported on the Mac. Yes, there was Quadro support from about 2005-2010, but there really hasn't been much movement since then.

The catch seems to be that if Apple doesn't put the card into the CTO options, there isn't a large enough market to support it. Likewise, even when it was in the CTO list, there likely was not enough growth to justify keeping it as a CTO option in 2010.

Quadros typically lag behind the mainstream cards in architecture, so it isn't like Apple's "behind the bleeding edge" stance on GPU cards is at odds here. It is likely mainly a matter of getting the drivers done and being able to sell enough cards to justify the effort of marketing separate products.

(**) The 7000 is a dual-GPU offering, and since Apple generally doesn't do dual GPUs, it really isn't three whole offerings behind. Just not doing the 6000 basically means not getting the 7000.
 
The pricing could be interesting. If you combined a Xeon Phi with a single quad-core or hex-core CPU running at a relatively high clock speed, it might be a rather more useful option than a dual 6-core or 8-core Xeon configuration without the Xeon Phi, albeit at a similar price point. At the very least it would give more flexibility as to where to spend the workstation budget!

I read somewhere that the HPC (high performance computing) market is growing at 20% per annum, which is why Intel is interested. 3D/VFX is also growing fast.

To my way of thinking, unless Apple targets the 2013 Mac Pro, or whatever it turns out to be, at these markets where users need far more processing power than is available in an iMac or MacBook Pro, they will continue to be stuck with the situation where more and more users discover that a quad-core i7 iMac or MacBook Pro is a more cost-effective solution.


BTW the Apple Store has the NVidia Quadro 4000 in stock. I assume that means they are happy for you to buy one and install it in your Mac Pro, which means they support it.

http://store.apple.com/us/product/H3314LL/A/nvidia-quadro-4000-for-mac

Edit: Fixed link.
 
If you look at the matrix of current Quadro cards

http://www.nvidia.com/object/workstation-solutions.html

There are currently the 5000, 6000, and 7000 (**) series that are not supported on the Mac. Yes, there was Quadro support from about 2005-2010, but there really hasn't been much movement since then.

The catch seems to be that if Apple doesn't put the card into the CTO options, there isn't a large enough market to support it. Likewise, even when it was in the CTO list, there likely was not enough growth to justify keeping it as a CTO option in 2010.

Quadros typically lag behind the mainstream cards in architecture, so it isn't like Apple's "behind the bleeding edge" stance on GPU cards is at odds here. It is likely mainly a matter of getting the drivers done and being able to sell enough cards to justify the effort of marketing separate products.

(**) The 7000 is a dual-GPU offering, and since Apple generally doesn't do dual GPUs, it really isn't three whole offerings behind. Just not doing the 6000 basically means not getting the 7000.
Oh, I see. Thanks for the explanation.
 

They're not that well implemented. The Quadro FX 4800 was complete ass in its implementation. It got bad reviews. They did not sell many of a very expensive card, and NVidia never looked at it again. Keep in mind this is a slightly different power class than the Quadro 4000 available now; it's the predecessor to the Quadro 5000. The current Quadro 4000 is considered a mid-range card. You can pick one up for $700 as long as you don't buy directly from Apple. In some things it's been proven to be better, but it's missing a lot of features compared to Windows. One of the guys on here mentioned a comparison with a 5870 under Windows where the 5870 completely choked. On OS X it's a little different, in that companies that would use workstation cards on Windows are testing with what is available on OS X. Given the relatively small number of Quadro deployments on OS X, they're sort of forced to work with what they're given.

On the issue of Teslas, some things require a minimum amount of RAM to run. The Teslas are loaded with RAM, so apart from raw speed, they can handle problems with a larger memory footprint.
 
They're not that well implemented. The Quadro FX 4800 was complete ass in its implementation. It got bad reviews. They did not sell many of a very expensive card, and NVidia never looked at it again. Keep in mind this is a slightly different power class than the Quadro 4000 available now; it's the predecessor to the Quadro 5000. The current Quadro 4000 is considered a mid-range card. You can pick one up for $700 as long as you don't buy directly from Apple. In some things it's been proven to be better, but it's missing a lot of features compared to Windows. One of the guys on here mentioned a comparison with a 5870 under Windows where the 5870 completely choked. On OS X it's a little different, in that companies that would use workstation cards on Windows are testing with what is available on OS X. Given the relatively small number of Quadro deployments on OS X, they're sort of forced to work with what they're given.

On the issue of Teslas, some things require a minimum amount of RAM to run. The Teslas are loaded with RAM, so apart from raw speed, they can handle problems with a larger memory footprint.

So for something like FCP X, is the 4800 completely pointless or would it perform better than a 5870? Same question for a viewport in a rendering app.
 
So for something like FCP X, is the 4800 completely pointless or would it perform better than a 5870? Same question for a viewport in a rendering app.


http://barefeats.com/nehal10.html
http://arstechnica.com/apple/2009/12/a-second-look-at-the-nvidia-quadro-fx-4800-mac-edition/
It's possible I may eat that statement. There is a feature I like there; take a guess what it is. I thought the 4800 was the one NVidia abandoned. The 4000 came up later. It's weaker in some areas, but it's the one you hear about these days. I can't find a lot of info on whether the 4800 made it to Lion support. PNY has the 4000 for around $700. The problem is that it doesn't seem to have OpenCL support. It's faster in some applications. I don't think I'd personally touch it. I can't find any specific FCP X testing under Lion. I'd want to see that. If it's lacking a Lion-compliant driver, you're dooming it to a short service life.

It's annoying that we haven't seen a 7970. LuxMark is a benchmark for an open-source rendering engine, although it may not be perfect data. I'm thoroughly pissed off with Apple at the moment.

http://barefeats.com/wst10g14.html
 
Having read some discussion of it, I don't believe anything special needs to be done to get one working in a Mac. It runs Linux natively and is designed to work in any system.

I'm sure you can get the card in a machine and powered on, but to get Mac OS X to actually submit instructions to the card you need some sort of driver layer. That's why Intel is working on OpenCL drivers, at least for Windows, so you can use OpenCL to submit instructions to the card. Otherwise, it's just a card running Linux, not doing much else.
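To be concrete about what that driver layer buys you: once a vendor OpenCL driver is installed, the card just shows up as another compute device and ordinary host code finds it. A minimal sketch in C (a Phi would presumably enumerate as an accelerator-type device rather than a GPU):

    #include <stdio.h>
    #include <CL/cl.h>  /* <OpenCL/opencl.h> on Mac OS X */

    int main(void)
    {
        cl_platform_id plat;
        cl_device_id dev;
        char name[256];

        clGetPlatformIDs(1, &plat, NULL);
        /* Coprocessors are exposed as CL_DEVICE_TYPE_ACCELERATOR. */
        if (clGetDeviceIDs(plat, CL_DEVICE_TYPE_ACCELERATOR,
                           1, &dev, NULL) != CL_SUCCESS) {
            printf("no accelerator found\n");
            return 1;
        }
        clGetDeviceInfo(dev, CL_DEVICE_NAME, sizeof(name), name, NULL);
        printf("accelerator: %s\n", name);
        return 0;
    }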

What???? This would be horrible in an external box. Are you talking about one of those x4-lane PCIe v2.0 TB break-out boxes? Horrible.

I'll quote myself on this one...

No telling if they are doing Mac drivers, but it would be nice. If Thunderbolt got up to speed, this would be great for an external box.

This sort of thing would be great for speeding up render/compile times on a laptop when sitting down at my desk.
 
Doubtful. Unless someone ports the low-level drivers for Intel's Symmetric Communications Interface (SCIF) ( http://software.intel.com/en-us/blog...oftware-stack/ ), this card is a no-go. There are some parts of SCIF that get called from kernel space ( ".... The SCIF APIs are callable from both user space (uSCIF) and kernel-space (kSCIF). .... " ). This low-level card support is likely going to be very similar to graphics cards, where the vendor needs to do some work and Apple also needs to do some level of work.

Is there some reason why XNU/IOKit wouldn't support SCIF as-is today?

I'm sure you can get the card in a machine and powered on, but to get Mac OS X to actually submit instructions to the card you need some sort of driver layer. That's why Intel is working on OpenCL drivers, at least for Windows, so you can use OpenCL to submit instructions to the card. Otherwise, it's just a card running Linux, not doing much else.

People keep talking about OpenCL on the Phi here and elsewhere, but I can't for the life of me figure out why. Is there some advantage OpenCL has as a language that C or C++ or even Objective C doesn't?

It seems to me like running native x86 code on the Phi would be infinitely easier to work with (and port to!) than attempting to re-write whatever existing code you have to run under OpenCL instead. I understand the need for a standard like OpenCL to act as a HAL so code (generally) runs uniformly across GPUs, but the Phi isn't a GPU; it runs x86 code, and other Phi cards would presumably do the same.

-SC
 
It seems to me like running native x86 code on the Phi would be infinitely easier to work with (and port to!) than attempting to re-write whatever existing code you have to run under OpenCL instead. I understand the need for a standard like OpenCL to act as a HAL so code (generally) runs uniformly across GPUs, but the Phi isn't a GPU; it runs x86 code, and other Phi cards would presumably do the same.

-SC

Puzzles me too. The whole *point* of the card, as Intel is pushing it, is that you don't have to learn Yet Another Thing in order to use the Phi for massively parallel problems, versus CUDA/OpenCL.
 
People keep talking about OpenCL on the Phi here and elsewhere, but I can't for the life of me figure out why. Is there some advantage OpenCL has as a language that C or C++ or even Objective C doesn't?

Yes, OpenCL is platform agnostic. With a suitable driver, the Xeon Phi should be able to accelerate any past, present or future OpenCL code (code that has already been written with parallelism in mind!). However, straight C/C++/Objective C will likely require some form of rewriting and recompilation to utilise the additional cores and to target the coprocessor.

In short: OpenCL opens the door to a pre-established software library.
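As a toy example, an OpenCL kernel is device-neutral source; the same string below could be compiled by a CPU driver, a GPU driver, or (with Intel's driver) a Phi, untouched:

    /* Device-neutral OpenCL C: nothing here names a GPU or a CPU.
       Whatever driver is present compiles it for its own hardware. */
    __kernel void saxpy(const float a,
                        __global const float *x,
                        __global float *y)
    {
        size_t i = get_global_id(0);
        y[i] = a * x[i] + y[i];
    }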

Adam
 
People keep talking about OpenCL on the Phi here and elsewhere, but I can't for the life of me figure out why. Is there some advantage OpenCL has as a language that C or C++ or even Objective C doesn't?

It seems to me like running native x86 code on the Phi would be infinitely easier to work with (and port to!) than attempting to re-write whatever existing code you have to run under OpenCL instead. I understand the need for a standard like OpenCL to act as a HAL so code (generally) runs uniformly across GPUs, but the Phi isn't a GPU; it runs x86 code, and other Phi cards would presumably do the same.

Well, a few reasons...

As mentioned above, OpenCL is compiled on the fly. That way it doesn't matter whether you are running on ARM, on a GPU, on a Xeon Phi, or on an i7. The code is going to run. In fact, you could mix and match. You could have a GPU and a Xeon Phi both in the same machine, and run the same code over both at once. OpenCL can already do this on Mac OS X with multiple graphics cards, no SLI or Crossfire required.
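A minimal host-side sketch of the on-the-fly part (standard OpenCL C API, error handling omitted): the kernel is a plain string, and clBuildProgram compiles it at run time for every device in the context, whatever mix of hardware that happens to be.

    #include <stdio.h>
    #include <CL/cl.h>  /* <OpenCL/opencl.h> on Mac OS X */

    static const char *src =
        "__kernel void twice(__global float *d) {"
        "  size_t i = get_global_id(0); d[i] *= 2.0f; }";

    int main(void)
    {
        cl_platform_id plat;
        cl_device_id devs[4];
        cl_uint ndev = 0;

        clGetPlatformIDs(1, &plat, NULL);
        clGetDeviceIDs(plat, CL_DEVICE_TYPE_ALL, 4, devs, &ndev);

        /* One context over every device found: could be a GPU and a
           Phi side by side. The same source is built for each one. */
        cl_context ctx = clCreateContext(NULL, ndev, devs, NULL, NULL, NULL);
        cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
        clBuildProgram(prog, ndev, devs, NULL, NULL, NULL);

        printf("built for %u device(s)\n", ndev);
        return 0;
    }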

The other good reason is that C and Objective C are really not optimized at all for these sorts of processors, while OpenCL is. Imagine if you put 4 of these in a machine. Traditional languages like C and Objective C aren't really meant to run over 4 entirely different "computers" at once, while OpenCL is.

This thing is really a lot like a GPU that uses x86, not really a CPU. If you look at the project history, that's actually how it started. Intel was trying to build an x86 GPU.
 