Well, the Cray-1 was rated at 160 million floating point operations per second, and I would expect a W-2140B Xeon handily tops that.
Yes, that's true. Today the Cray XK7 is #4 in the Top500 list.
bxs, what is the dock you have on the front of your iMac Pro?
I have tried a couple with my iMac but nothing will attach firmly to the tapered sides. I was lucky and retired on 29 June 1999 and took advantage buying Apple shares that day with a full year franchising. They were US $35 when the US dollar was less than the Aussie dollar and still have them.
At the time Bill Gates invested some money in Apple as they were going bad and were calling Steve Jobs back.
...While I don't have the numbers for the W-2140B, the old i7-5960x hit about 360 GFLOPS on Linpack. The low-end iMac Pro should be a bit higher. So, somewhere around 2,500 to 6,000 times faster than a Cray 1....
Yea.... most of that performance is coming from the Cray having 1000s of cores. Now if the iMP had 1000s of cores how would it compare?
According to these tests, the Linpack numbers are about 380 GFLOPS for 8 cores, 460 GFLOPS for 10 cores, and 686 GFLOPS for 18 cores: http://hrtapps.com/blogs/20180202/
I don't know if he used AVX512 instructions or not. According to this page using AVX could make a 2:1 or so difference: https://software.intel.com/en-us/articles/how-intel-avx2-improves-performance-on-server-applications
Regardless, if we use the Cray-1 number of 160 MFLOPS vs the above benchmark, an 18-core iMac Pro is 4,287 times faster.
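For anyone who wants to check the arithmetic, here is a minimal sketch using only the figures quoted above (160 MFLOPS for the Cray-1 and the Linpack results from the linked benchmark). The function and variable names are just illustrative.

```python
# Figures taken from the posts above; names are illustrative only.
CRAY1_MFLOPS = 160                          # rated Cray-1 performance
IMAC_PRO_GFLOPS = {8: 380, 10: 460, 18: 686}  # Linpack results by core count


def speedup_over_cray1(gflops, cray1_mflops=CRAY1_MFLOPS):
    """Return how many times faster a machine is than the Cray-1."""
    return gflops * 1000 / cray1_mflops     # convert GFLOPS to MFLOPS first


for cores, gflops in IMAC_PRO_GFLOPS.items():
    print(f"{cores}-core iMac Pro: {speedup_over_cray1(gflops):.1f}x a Cray-1")
```

The 18-core case works out to 4,287.5, which is where the "4,287 times" figure comes from.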
For years supercomputer people criticized microprocessors as having fast "paper" performance but poor memory bandwidth. That was true for a period but not any longer (if comparing to older supercomputers). Without adequate memory bandwidth, a CPU will stall when processing a large array that outstrips the cache.
The Cray-1 memory bandwidth was 640 megabytes/sec, and the X-MP from 1982 was 10 gigabytes/sec. By contrast the Pentium 4 from 2000 was only about 2.4 gigabytes/sec.
However, by 2014 the i7-4790K was at 27.2 gigabytes/sec, the W-2195 used in the iMac Pro is about 85 gigabytes/sec, the Xeon 8176 is about 160 gigabytes/sec, and the IBM Power8 is at 238 gigabytes/sec.
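To put those bandwidth numbers side by side, here is a rough sketch that expresses each one as a multiple of the Cray-1's 640 megabytes/sec. It only uses the figures from this post; the dictionary keys and function name are made up for illustration.

```python
# Memory bandwidth in gigabytes/sec, as quoted in the post above.
BANDWIDTH_GB_S = {
    "Cray-1": 0.64,
    "Cray X-MP (1982)": 10,
    "Pentium 4 (2000)": 2.4,
    "i7-4790K (2014)": 27.2,
    "Xeon W-2195 (iMac Pro)": 85,
    "Xeon 8176": 160,
    "IBM Power8": 238,
}


def relative_bandwidth(name, baseline="Cray-1"):
    """Return the bandwidth of `name` as a multiple of the baseline machine."""
    return BANDWIDTH_GB_S[name] / BANDWIDTH_GB_S[baseline]


for name in BANDWIDTH_GB_S:
    print(f"{name}: {relative_bandwidth(name):.1f}x Cray-1")
```

Note this compares raw bandwidth only. It says nothing about bandwidth relative to each machine's FLOPS.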
Even though people commonly complain about how "hot" iMacs run, the Cray-1 consumed 115,000 watts and required a separate refrigeration plant. The total power consumption for both the Cray and the cooling system was about 250,000 watts: https://cs.lbl.gov/assets/Images/Ne...g-Month-Throwback-Thursday-Gallery/Cray-1.jpg
The Cray-1 used freon tubes to carry the heat away from the copper-clad logic boards. Each Cray-2 logic module consumed about 300-500 watts, so they were all immersed in liquid fluorinert. There were 320 modules: https://lh3.googleusercontent.com/-...Jnvh7w32qHkdeGcjCHTQCJoC/w3907-h2394/sc_3.jpg
However the fastest supercomputers today are farther beyond the fastest desktop computers than the Cray-1 was beyond the original IBM PC. This year the US Oak Ridge Summit supercomputer is expected to reach 200 peak petaflops (or 200,000 TFLOPS, or 200,000,000 GFLOPS): https://www.top500.org/news/oak-ridge-readies-summit-supercomputer-for-2018-debut/
That is about 291,000 times faster than an 18-core iMac Pro.
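The same kind of back-of-the-envelope check for Summit, assuming the 200 peak petaflops figure from the linked article and the 686 GFLOPS Linpack result for the 18-core iMac Pro. Keep in mind this mixes a peak number with a measured one, so treat it as a rough ratio; the names below are illustrative.

```python
# Figures from the posts above; names are illustrative only.
SUMMIT_PEAK_PFLOPS = 200        # expected peak, per the linked article
IMAC_PRO_18_CORE_GFLOPS = 686   # measured Linpack result

GFLOPS_PER_PFLOPS = 1_000_000   # 1 PFLOPS = 1,000 TFLOPS = 1,000,000 GFLOPS


def summit_vs_imac_pro():
    """Return Summit's peak as a multiple of the 18-core iMac Pro result."""
    summit_gflops = SUMMIT_PEAK_PFLOPS * GFLOPS_PER_PFLOPS
    return summit_gflops / IMAC_PRO_18_CORE_GFLOPS


print(f"Summit is roughly {summit_vs_imac_pro():,.0f}x an 18-core iMac Pro")
```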
...The Cray XK7 gets....~0.5 Tflop/s per core. For the 10 core iMP each core gets around 0.023 Tflop/s. Just using this we see that the Cray XK7 is but 0.5/0.023 faster or about 22 times faster on a core basis.
I think that's probably an artifact of the test method and what % of the load was on the Cray's CPUs vs GPUs. The XK7 simply uses the AMD Opteron 6200 16-core CPU -- nothing magic about that. The XK7 is a hybrid design, so it also uses the nVidia Tesla K20. For hybrid supercomputers they commonly use the CPUs to feed the GPUs which do much of the work. This of course assumes the task has been written to efficiently harness those.
If so, the XK7 numbers might have little to do with the CPU. Certainly the Opteron 6200 is not 22 times faster than a Xeon W-2195 from an iMac Pro.
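For reference, the per-core comparison quoted above can be reproduced like this. The 0.5 and 0.023 Tflop/s figures are the quoted poster's. As a cross-check I also divided the 460 GFLOPS 10-core Linpack result from earlier in the thread by 10, which gives a different per-core number, so the ratio depends heavily on which figure you start from. Names are illustrative.

```python
# Per-core figures as quoted in the post being replied to.
XK7_TFLOPS_PER_CORE = 0.5
IMP_TFLOPS_PER_CORE_QUOTED = 0.023

# Cross-check from the Linpack benchmark cited earlier in the thread.
IMP_10_CORE_GFLOPS = 460
IMP_TFLOPS_PER_CORE_LINPACK = IMP_10_CORE_GFLOPS / 10 / 1000


def per_core_ratio(xk7, imp):
    """Return how many times faster one XK7 'core' is than one iMac Pro core."""
    return xk7 / imp


quoted = per_core_ratio(XK7_TFLOPS_PER_CORE, IMP_TFLOPS_PER_CORE_QUOTED)
linpack = per_core_ratio(XK7_TFLOPS_PER_CORE, IMP_TFLOPS_PER_CORE_LINPACK)
print(f"Using the quoted per-core figure: {quoted:.1f}x")
print(f"Using 460 GFLOPS / 10 cores:      {linpack:.1f}x")
```

Either way, the XK7 side of the ratio likely includes GPU work, so it is not a like-for-like CPU core comparison.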
Interesting history on the Cray-1.
These are certainly cases where one is looking at GPU power but comparing to CPU power.
The 200 petaflops achieved by Oak Ridge Summit is GPU power, measured in double-precision math. At least they didn't fudge the numbers with single-precision math or, worse yet, half-precision math (AMD Vega).
When comparing those results of Summit with a desktop, I'd compare it to a desktop GPU, in this case the nVidia Titan V, which uses the same V100 GPU and does 6.14 TFLOPS at base clock and 7.45 TFLOPS at max boost clock in double-precision math. With good cooling it should be possible to maintain that 7.45 TFLOPS, or higher if you overclock it. All that they really plan is something about 27,000 times faster than what is available in professional desktops: the speed of 25,000 Titan V cards with a modest overclock, which they are installing in the system.
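A quick sketch of where the "about 27,000" and "25,000 with a modest overclock" numbers come from, assuming the 200 petaflops figure and the Titan V double-precision rates given in this post. Names are illustrative.

```python
# Figures from the post above; names are illustrative only.
SUMMIT_TFLOPS = 200_000          # 200 petaflops expressed in TFLOPS
TITAN_V_BASE_TFLOPS = 6.14       # double precision, base clock
TITAN_V_BOOST_TFLOPS = 7.45      # double precision, max boost clock


def cards_needed(tflops_per_card, target_tflops=SUMMIT_TFLOPS):
    """Return how many cards at a given rate would sum to the target."""
    return target_tflops / tflops_per_card


def tflops_per_card_needed(card_count, target_tflops=SUMMIT_TFLOPS):
    """Return the per-card rate required to hit the target with card_count cards."""
    return target_tflops / card_count


print(f"Cards at boost clock: {cards_needed(TITAN_V_BOOST_TFLOPS):,.0f}")
print(f"Cards at base clock:  {cards_needed(TITAN_V_BASE_TFLOPS):,.0f}")
print(f"Rate needed with 25,000 cards: {tflops_per_card_needed(25_000):.2f} TFLOPS")
```

This simply adds up per-card rates and ignores interconnect and scaling losses, so it is an upper-bound style estimate.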
When comparing to a Xeon, I'd compare the fastest CPU-based supercomputer to the fastest desktop CPU, for an accurate gauge of how much faster a supercomputer is in raw general-purpose computing power. I'd be interested to see what speeds those kinds have achieved.
It would be nice if Top 500 maintained separate lists: one for general-purpose multi-CPU supercomputers, one for GPU-focused systems, and one for highly specialized systems using ASICs. Google's TPU for deep learning comes to mind. It would be too easy to list absurdly high numbers when only a highly specialized function can be performed at that speed.
....These are certainly cases where one is looking at GPU power but comparing to CPU power...The 200 Petaflops achieved by Oak Ridge Summit is in GPU power. Which is measured based on double precision math....
...When comparing those results of Summit with a desktop. I'd compare it to a desktop GPU. In which case the nVidia Titan V. Which uses the same V100 GPU and does 6.14 TFLOPS base clock and 7.45 TFLOPS max boost clock in double precision math...
...When comparing to a Xeon. I'd compare the fastest CPU based supercomputer to the fastest Desktop CPU...