
Philip Turner

macrumors regular
Dec 7, 2021
170
111
Which Mac should the CERN Data Center buy to process and store the petabyte of data it creates every day?
M2 Ultra Mac Pro with expandable storage. It's cheaper than an A100.

How long would it take you to train Stable Diffusion on an M1 GPU, if it took Stability 200,000 A100-hours?
24 million GPU-hours at 10 times the hardware cost. Unless you harnessed the neural engine for FP8 training: 5.7 million ANE-hours at 2.4x the hardware cost and ~1/5 the power.
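To make the arithmetic explicit, here is a quick sketch of the per-device ratios implied by the figures above. The 200,000, 24 million, and 5.7 million numbers are the ones quoted in this post, not measurements:

```swift
// Ratios implied by the hour counts quoted above (a sketch, not a benchmark).
let a100Hours = 200_000.0      // Stability's reported A100-hours
let m1GpuHours = 24_000_000.0  // claimed M1 GPU-hour equivalent
let aneHours = 5_700_000.0     // claimed ANE-hour equivalent at FP8

print(m1GpuHours / a100Hours)  // 120.0 -> one A100 ~ 120 M1 GPUs for this workload
print(aneHours / a100Hours)    // 28.5  -> one A100 ~ 28.5 neural engines at FP8
```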

My point wasn't about whether the hardware is powerful enough for HPC. We know it is. It's the software that's lacking. Nobody wants to switch to Metal except people already producing Apple-centric software. Apple's OpenCL driver is severely lacking, and even that's a dying industry standard. M1 would probably take ∞ hours to train Stable Diffusion because it doesn't run CUDA or even oneAPI.

A more appropriate question: which existing library would run FP64 eigendecomposition on the GPU with maximal performance?
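For context on why this is the right question: the only mature FP64 eigensolver on a Mac today is LAPACK via Accelerate, and it runs on the CPU. A minimal sketch, assuming a small symmetric matrix (dsyev_ is a real Accelerate entry point; nothing comparable currently ships for the GPU in Metal):

```swift
import Accelerate

// FP64 symmetric eigendecomposition on the CPU via LAPACK's dsyev_.
var jobz = CChar(UInt8(ascii: "V"))  // "V" = also compute eigenvectors
var uplo = CChar(UInt8(ascii: "U"))  // upper triangle of A is stored
var n = __CLPK_integer(3)
var lda = n
var a: [__CLPK_doublereal] = [2, 1, 0,   // column-major 3x3 symmetric matrix
                              1, 2, 1,
                              0, 1, 2]
var w = [__CLPK_doublereal](repeating: 0, count: Int(n))  // eigenvalues out
var info = __CLPK_integer(0)

// Workspace query: lwork = -1 asks LAPACK for the optimal workspace size.
var lwork = __CLPK_integer(-1)
var workSize = __CLPK_doublereal(0)
dsyev_(&jobz, &uplo, &n, &a, &lda, &w, &workSize, &lwork, &info)
lwork = __CLPK_integer(workSize)
var work = [__CLPK_doublereal](repeating: 0, count: Int(lwork))
dsyev_(&jobz, &uplo, &n, &a, &lda, &w, &work, &lwork, &info)

precondition(info == 0, "dsyev_ failed")
print("eigenvalues:", w)  // ~ [0.586, 2.0, 3.414] for this matrix
```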
 
  • Like
Reactions: Xiao_Xi

Joe Dohn

macrumors 6502a
Jul 6, 2020
840
748
Now that Apple has proven its capability in designing and producing Silicon for their own products and with Intel struggling on the innovation front, wouldn’t it make sense for Apple to enter the enterprise data center chip market for AI and HPC?

They could. But the enterprise market likes to know technologies will be supported for a long time. And historically, Apple has a tradition of killing off technologies it considers old or obsolete. Sometimes they are killed a bit too early (for example, the HomePod speakers). And considering that some enterprise companies still run mainframes and COBOL and need magnetic tapes, I don't see Apple being successful here anytime soon – unless they somehow change their mindset to be more like Microsoft.
 
  • Like
Reactions: Oculus Mentis

Joe Dohn

macrumors 6502a
Jul 6, 2020
840
748
Now please don't compare that to the rubbish that is Windows on ARM. Not only has Windows not made the transition yet, they'll likely never make it. Just like they failed to establish a mobile OS, despite having one long before the iPhone and buying Nokia of all the cellphone makers! Microsoft failed in the phone-OS market, in the tablet-OS market, in the laptop market, and now they're clinging to their horrible copy of the Macintosh. If you see a bright future for them, I don't.

I'm not sure what you're talking about. Windows on ARM runs decently – and the performance of its compatibility layer is comparable to Rosetta's (Rosetta's compatibility is better). And all that on Apple Silicon hardware through virtualization, no less.
 

HobeSoundDarryl

macrumors G5
My point is the current architecture does not give you flexibility in expansion - everything is on the silicon. I know Apple could re-architect an M-series processor and logic board to allow for upgradeable components, but right now they don't have such an architecture to allow expansion of components.
That's by Apple's choice though. They've built the ability to flexibly "burn in" various configurations of ONLY their own RAM and ONLY their own storage offerings.

You asked a question without that filter: what would a Mac Pro offer that the Studio doesn't? Those are the things the rumored Silicon Mac Pro could offer. If it did, that becomes an attractive Mac to those who want/need the kinds of capabilities they can't get in any way with the Mac Studio... or any other Silicon Mac.

And basically if some Thunderbolt "lanes" were built inside a tower case, why can't they connect as standard slots? Then it's a matter of making software support "internal" Thunderbolt/slots vs. only external Thunderbolt/USB.

Thinking about that:
Those who need more RAM could simply have 2 tiers of RAM: the fastest RAM is the RAM block in Silicon, and slightly slower RAM could be added in the traditional way. macOS could manage what is in fastest RAM vs. fast RAM much like Grand Central Dispatch manages tasks.

"We" let on like fastest RAM is much faster than traditional slotted RAM, but we didn't seem to have much complaint about RAM speeds in the Intel Mac decade. As I shift back & forth between a Studio Ultra with Apple RAM and an older Intel Mac with third-party RAM, RAM-intensive tasks don't "feel" substantially hindered. In fact, for most usage, I don't notice a difference. I'm sure there is one, but it's not like a second tier of slotted RAM would be a terrible hindrance to how fast processing can occur.

As is, when Silicon RAM demand is greater than supply, macOS already starts doing swaps with the SSD. As I understand it, traditional RAM would be FASTER than swapping in and out with the SSD. If so, traditional RAM as tier-2 RAM would deliver a faster experience for those needing more RAM than is hard-capped in Silicon now.

Those who need more storage INSIDE could add it inside. Especially for those concerned with how their Macs look, "hiding" expansion options INSIDE may be preferable to hubs/docks & cables running to enclosures all over the desk. Modern Macs are "messier" if hardware needs are greater than what one can get Apple to include inside. A pro case can put all that INSIDE for the cleaner look some seem to greatly desire (see the ongoing calls for a "bigger" iMac).

Those with the need for specialized cards would have slots for specialized cards. That's a real market, and not just 5 or 10 scientists around the world. No-slot Macs mean all those buyers must look to PC options.

Those who want to evolve to better graphics horsepower would have the option, instead of forever being stuck with the peak graphics capability their Silicon Mac had on delivery.

That shared, if you put the filter onto your question to pinch the possibilities into Apple's choices of how to present Silicon, there is no Mac Pro to build... only perhaps a faster Mac Studio, or one with even more Apple-only RAM and/or Apple-only SSD storage. The Mac Studio Ultra has a second "empty" SSD slot already. If that could be populated and made functional, that becomes "up to 16TB of fastest Apple storage" (maybe RAIDed?) with only software support. That seems like a small step towards a Studio-like Pro. Then how about letting that be activated in existing Studios for those who would like to add more storage inside? Then how about letting third-party SSDs drive prices for that add-on down? Etc. Something is better than nothing.
 
Last edited:
  • Love
Reactions: Shirasaki

maflynn

macrumors Haswell
May 3, 2009
73,682
43,740
That's by Apple's choice though.
Correct.

And basically if some Thunderbolt "lanes"

Remember the trash can Mac? People complained early on about the lack of upgradeability, and many Apple fans defended Apple, saying they should just use Thunderbolt. Then we had Apple just a couple of years ago offering their mea culpa, saying yes, that Mac was a mistake; it lacked expandability and failed their customers.

Now you're saying the same thing - just use Thunderbolt. Do you want a computer that can offer up to 128GB/s of internal bandwidth, or one limited to Thunderbolt's 40Gb/s?
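Worth noting the two numbers above are in different units. A quick sketch, assuming the 128GB/s figure is aggregate internal (PCIe-class) bandwidth in gigabytes per second, while Thunderbolt's 40 is gigabits:

```swift
// Unit check: internal figures are quoted in GB/s (bytes); Thunderbolt 4 is 40 Gb/s (bits).
let internalGBps = 128.0            // the 128GB/s figure above (assumed aggregate)
let thunderboltGBps = 40.0 / 8.0    // 40 Gb/s ~ 5 GB/s
print(internalGBps / thunderboltGBps)  // 25.6 -> roughly 25x the external link
```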

 
  • Like
Reactions: Oculus Mentis

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
My point wasn't about whether the hardware is powerful enough for HPC. We know it is.
What metrics make you think that Apple's hardware is suitable for HPC?

M2 Ultra Mac Pro with expandable storage. It's cheaper than an A100.
Your suggestion is non-existent hardware of unknown price with immature software?

By the way, the A100 ($25,000) is five times as expensive as the high-end M1 Ultra ($5,000). Do you know of any benchmark that shows that 5 M1 Ultras could beat an A100?
 

HobeSoundDarryl

macrumors G5
Remember the trash can Mac? People complained early on about the lack of upgradeability, and many Apple fans defended Apple, saying they should just use Thunderbolt. Then we had Apple just a couple of years ago offering their mea culpa, saying yes, that Mac was a mistake; it lacked expandability and failed their customers.

Now you're saying the same thing - just use Thunderbolt. Do you want a computer that can offer up to 128GB/s of internal bandwidth, or one limited to Thunderbolt's 40Gb/s?


No, I'm not saying that. I'm suggesting reallocating some Thunderbolt lanes (based on PCIe) INSIDE a tower or xMac case to offer standard PCIe slots. That's the tower Mac Pro people expect: slots and Silicon... hopefully the fastest possible slots and the fastest possible Silicon. I'd certainly prefer PCIe 5 (or 6, if it takes that long for Apple to finally deliver), but the bigger point is a tower or xMac case with slots so that Mac Pro buyers can expand/evolve their Mac Pro INSIDE.

And we essentially have the new Trashcan Mac Pro in our Mac Studios now: thoroughly locked down, no slots, "use thunderbolt." When it needs to evolve, throw the whole Studio out and buy a new one.

That's not trying to put the Studio down; I paid up big for one myself. I'm still answering the original question you asked: why Mac Pro vs. Studio? Mac Pro people want the stuff the Mac Pro has now, paired with the goodies of the Studio's tech guts (if not better).

If you were really not asking a question but instead saying there is no need for a Mac Pro because Apple created the Studio... OK, I can respect that opinion. I simply suspect traditional Mac Pro buyers would like the traditional (differentiating) Mac Pro features that seem impossible to realize in a Mac Studio.
 
Last edited:

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
You betcha they don't care. They will discontinue Rosetta 2 in not too long a time and a lot of software, especially custom software, won't work anymore. That's just what they did with the x86 transition.

Pretty unlikely this will turn out to be very close to the same time schedule as the PPC -> x86 transition. The situation is substantially different, for several reasons.

1. Apple owns Rosetta 2. Apple did not own Rosetta. The first 'Rosetta' was an Apple-named veneer they slapped on top of Transitive's 'QuickTransit'.


Which meant each Mac sold meant that Apple owed a royalty payment to someone (i.e., an ongoing cost center). In contrast, Apple built its own technology for Rosetta 2. Yes, they paid relatively a lot up front, but once it is working, the costs don't have to be high if they stop putting in new features.


Furthermore, in 2009 IBM bought Transitive. So Apple trying to 'out-leverage' a small company like Transitive for low royalty rates wasn't going to work the same way against IBM. Transitive was 'shopping' themselves and their tech to other big tech firms in 2007-2008. That shouldn't have been a secret to Apple.


2. Apple is adding usages for Rosetta beyond just macOS. Apple has extended it to running Linux binaries as well. Apple can't mandate that x86 Linux apps simply vanish with the same leverage it can use on macOS apps.

[ If Apple were deeply committed to shrinking the costs of Rosetta 2 as rapidly as possible, they would not be adding more incremental usages to the software. ]

Rosetta 2 is unlikely to ever get AVX added, or virtualization, or other parts that would make it harder to maintain on a small fixed budget over the long term. But if the investment is low, then the motivation to quickly kill it is likely low as well. (If it works and it is cheap... why kill it?)



3. During the PPC -> Intel transition, Apple had three ports to deal with: the Mach kernel on PPC, Intel, and ARM. What often seems to be lost in the context is that Apple was throwing tons of money and resources at growing the ARM OS port at the same time as it was getting onto x86. It wasn't the entire macOS stack, but macOS releases were being delayed to get iOS upgrades out the door. Once Apple decided to move past web apps on the iPhone and rolled out the iPad, there was lots more work to do on the ARM OS port side (i.e., more money, more people, etc.).

This context is almost a complete 180-degree opposite of that. The Intel and ARM OS kernels are relatively mature (besides the new changes Apple makes each year). There are close to 100M Intel Mac users out there: higher user-base inertia and a stable software base. The ARM OS version actually has the larger user base, which pays more money into OS development without any macOS-on-ARM revenue input at all. Apple is moving toward what was already paying the bills; not some 'new' money but the established 'old' money.

Apple finished the PPC -> x86 transition in 18 months on a two-year schedule. Here, Apple is just as far over its time budget as the previous transition was under it: still selling Intel Macs in transition year three, possibly even a cancelled 'M2 Extreme' (probably going to have to 'lean' on the MP 2019 for most of 2023, if not limp into 2024 on it, so a pretty good chance they won't really finish the transition for four years). The 2018 Mini is still being sold over 4 years later as well. Is Apple in a desperate hurry to stop selling x86 Macs? Doesn't look like it. An M1 Pro Mini would have been how hard to do? Not very hard at all.



Will Apple keep Rosetta 2 around for 20 years? Probably not. When the last Intel Mac goes on the Vintage/Obsolete list and macOS on Intel stops, keeping Rosetta 2 around doesn't make much sense. Dumping it before the Intel Macs get de-supported, to try to 'whip' developers/users across to Apple Silicon, is very dubious. That kind of heavy-handed tactic would more than likely chase as many folks off the platform as it moved over.

Is Apple going to dump Intel Macs faster than the Vintage/Obsolete countdown clock? On average, probably not. Apple is tossing Intel systems onto the list on the 'minimal' 5-year countdown after end of sales (the extreme corner case of the Mac Pro 2013 aside, given its Rip van Winkle upgrade problems). Apple needs to get to the point where they stop selling 'new' Intel Macs, though, so that the countdown clock is running on all of the Intel Macs. At that point, Rosetta 2's days are 'numbered'. [ The 2018 Mini and 2019 Mac Pro likely will not stop the 'clock' from starting past 2023, even if they are sold through the end of the year. Protection by 'Rip van Winkle'-paced updates isn't going to do much to extend their respective Vintage/Obsolete dates. ]
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
That seems to be the consensus. What's the Mac Pro's market share among similarly powerful PCs/workstations? Close to 0%. If Apple stopped producing them, nobody would notice.

It isn't 0%, but it isn't 'top 4', prime-time player status either. Apple was firmly in the 'others' category on the top market-share charts. They don't compete with HP/Dell/Lenovo for highly visible player status. But they did 'OK' against the small boutique vendors who are also in the 'other' category.

Apple having moved the rest of the Mac lineup off Intel makes them less than a boutique workstation vendor in terms of 'volume buying power' with Intel. But Apple sold more than just a few. Probably in the 55-100K units/yr run-rate zone (> $440-800M/yr). Some folks would notice, but it wouldn't be as though they didn't have lots of options to shift to.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
That's definitely true, but all I meant by saying that is that there aren't any Windows servers with ARM processors (at least sold to the public!). There's no technical reason there couldn't be, though, and I think there are Azure ARM servers (VM servers).

Classic, backend-workload-only, headless Windows servers? Yes. Any type of Windows server? No.

" ... To support their work, we’ve made Insider Preview releases of Windows 11 Pro and Enterprise available to run on Arm-based Azure VMs. Client application developers can now take advantage of Azure’s highly available, scalable, and secure platform to run cloud-based software and build and test workflows that help them increase their agility. ..."


" ...
Applies to: ✔️ Linux VMs ✔️ Windows VMs ✔️ Flexible scale sets ✔️ Uniform scale sets ...
....
These virtual machines feature the Ampere® Altra® Arm-based processor operating at 3.0 GHz, which provides an entire physical core for each virtual machine vCPU. ..."

That is probably more just Hyper-V on ARM with client-OS VM instances running on top. But it is a server that is aggregating and servicing client workloads. It also demonstrates that Windows on Arm is not limited to just Qualcomm SoCs.


I think what Microsoft is still trying to shy away from is people buying an Arm server and Windows Server on Arm and then throwing heavyweight x86 binary apps at it for the workload. That's the bigger issue. But it's also a chicken-and-egg thing, as a hefty chunk of the Windows server market are "buy only Dell or HP or Lenovo" shops, and those vendors really might only have a narrow range of Ampere boxes that might be useful to a few hyperscalers. Mostly the same crowd that used to operate under the "nobody ever got fired for buying IBM" philosophy from last century.

Microsoft has made forays. As mentioned in this thread, Server version identifiers show up in various places in the code, and there were tech releases a couple of versions back, but it isn't 'officially external'.


The other 'problem' is that Windows Server is not particularly good at non-Microsoft-framework, scalable workloads. Linux scales more cost-effectively where not hooked to proprietary solutions. Windows operates much better in niches where it is presumed to be the best solution and folks cobble together solutions skewed toward it. Once deeply committed to dragging around a legacy Windows application stack, you're almost guaranteed to snag some 32-bit x86 boat anchor along the way if dealing with folks who are mostly resistant to change.


The bulk of Microsoft's/Azure's VM instances, like those of Apple's cloud services, are Linux, not the respective proprietary OSes.
 

Philip Turner

macrumors regular
Dec 7, 2021
170
111
What metrics make you think that Apple's hardware is suitable for HPC?
2.6 trillion FLOPS, 5.3 trillion FLOPS, 10.6 trillion FLOPS, 21.2 trillion FLOPS (FP32, for the M1, M1 Pro, M1 Max, and M1 Ultra GPUs respectively).
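For what it's worth, those four numbers are just the GPU core counts scaled by ALU math. A sketch of where they come from, assuming the commonly reported figures of 128 FP32 ALUs per core and a ~1.296 GHz M1-family GPU clock (both assumptions from public die analyses, not Apple specs):

```swift
import Foundation

// FP32 TFLOPS = cores x 128 ALUs x 2 FLOPs/FMA x clock (GHz) / 1000
let clockGHz = 1.296  // assumed M1-family GPU clock
for (name, cores) in [("M1", 8), ("M1 Pro", 16), ("M1 Max", 32), ("M1 Ultra", 64)] {
    let tflops = Double(cores) * 128 * 2 * clockGHz / 1000
    print(name, String(format: "%.1f TFLOPS", tflops))
}
// Prints ~2.7, 5.3, 10.6, 21.2 (the figures above, within rounding).
```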

Your suggestion is non-existent hardware of unknown price with immature software?
You were asking a ridiculous question, so I gave a ridiculous answer.

Do you know of any benchmark that shows that 5 M1 Ultras could beat an A100?
Forget 5; just one M1 Ultra beats an A100 in general-purpose FP32 processing power (21 TFLOPS vs 19 TFLOPS), and at 1/3 the power. Although that's clearly not what the A100 is designed for.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Forget 5; just one M1 Ultra beats an A100 in general-purpose FP32 processing power (21 TFLOPS vs 19 TFLOPS), and at 1/3 the power. Although that's clearly not what the A100 is designed for.
It is better to use benchmarks, because computations may use different parts of the GPU. Peak figures like these don't fully reflect the true potential of the GPUs.

which existing library would run FP64 eigendecomposition on the GPU with maximal performance?
Eigendecomposition seems to be a very difficult numerical problem, so I am not sure such a library exists. Therefore, we can compare hardware using another library for an easier problem: finding the solution to large, dense sets of linear equations.
- HPL for double precision (traditional scientific computations) https://netlib.org/benchmark/hpl/
- HPL-AI for mixed precision (machine learning computations) https://hpl-mxp.org
For the HPL-AI benchmark, the A100 uses its 32-bit (TF32) Tensor Cores (156 TFLOPS).
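For readers who haven't run HPL: the operation it times is an FP64 LU solve of a dense system Ax = b. Here is the same operation in miniature through Accelerate's LAPACK (dgesv_), just to show what is being measured; HPL itself distributes this across processes and reports sustained TFLOPS:

```swift
import Accelerate

// Toy version of the HPL kernel: solve dense Ax = b in FP64 by LU factorization.
var n = __CLPK_integer(3)
var nrhs = __CLPK_integer(1)
var a: [__CLPK_doublereal] = [4, 2, 1,   // column-major 3x3 matrix A
                              2, 5, 3,
                              1, 3, 6]
var b: [__CLPK_doublereal] = [1, 2, 3]   // right-hand side; overwritten with x
var ipiv = [__CLPK_integer](repeating: 0, count: Int(n))
var lda = n
var ldb = n
var info = __CLPK_integer(0)

dgesv_(&n, &nrhs, &a, &lda, &ipiv, &b, &ldb, &info)
precondition(info == 0, "dgesv_ failed")
print("x =", b)
// HPL's reported score is roughly (2/3 n^3 + 2 n^2) FLOPs divided by the solve time.
```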

[Chart: HPL (FP64) benchmark results]

Edited for better visualization.

[Chart: HPL-AI (mixed precision) benchmark results]

There is no result for an A100 because the author of the post was not able to pass the residual error accuracy test for an A100.


Do you think two M1 Ultras could perform better in HPL-AI than two A100s? You can choose another benchmark, but keep in mind that the A100 will use 32-bit Tensor Cores for matrix multiplication in single precision.
 

Gudi

Suspended
May 3, 2013
4,590
3,267
Berlin, Berlin
I agree, and from what rumors I've seen, I'm guessing M2 Extreme yields were so low that it didn't make financial sense. If history is an indicator, the M2 Extreme was going to be even larger in size than the M1 Ultra; if that is the case, then each wafer produces fewer M2s, driving the unit costs up.
Nah! By far the likeliest "explanation" is that the entire rumor was fabricated by idiots who thought that just because two M1 Max can be fused together to work as one M1 Ultra, you can also glue four of them together. It was a ridiculous idea to begin with.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
Nah! By far the likeliest "explanation" is that the entire rumor was fabricated by idiots who thought that just because two M1 Max can be fused together to work as one M1 Ultra, you can also glue four of them together. It was a ridiculous idea to begin with.

There are literally Apple patents describing fusion of multiple chips into one SoC. There is no doubt that Apple is working on this technology. It’s key to competing in the high-end desktop market.

BTW, the next gen is likely to offer a new interconnect protocol that’s optimized for inter-chip communication. They explicitly talk about accessing caches and memory controllers that exist on a different device and how this can be achieved in a very fast and performance-efficient manner.
 

maflynn

macrumors Haswell
May 3, 2009
73,682
43,740
There are literally Apple patents describing fusion of multiple chips into one SoC. There is no doubt that Apple is working on this technology. It’s key to competing in the high-end desktop market.
Yep, that's my opinion as well, though it does look like things were not as smooth as Apple had hoped, and/or the yields were such that it's currently not financially feasible.
 

maflynn

macrumors Haswell
May 3, 2009
73,682
43,740
It's pointless competing against the giants in this space.
Apple has done a great job throughout its history of identifying markets where it could make a profit. They purposely chose to keep their laptops priced high instead of joining the race to the bottom with Dell and HP years ago. Did it hurt them? Sure, their sales were a lot lower than Dell's and HP's, but their profit margins were healthy.

I admit I know next to nothing about the enterprise market, but the investment that would be needed for Apple to enter the market and succeed is significant, and I don't see any strategy that has them coming out positively. Many points were made about how Apple's philosophy and corporate culture run against the grain of what would need to be done in the enterprise, i.e., support Linux, be open and less secretive, etc.

Funnily enough, the automotive market is another market that seems incredibly difficult to break into and one that doesn't offer high profit margins, yet it's rumored that Apple is indeed working on something for that sector. So go figure; nothing is really out of reach for Apple if they choose to do it.
 

Gudi

Suspended
May 3, 2013
4,590
3,267
Berlin, Berlin
There are literally Apple patents describing fusion of multiple chips into one SoC. There is no doubt that Apple is working on this technology. It’s key to competing in the high-end desktop market.
And these patents are literally already in use for fusing two M1 Max into one M1 Ultra. But that doesn't mean you can fuse four chips together without creating the need for conflict-resolution and dispatching mechanisms, which far outweigh any theoretical benefits. The Mac Studio is doing exactly that: competing in the high-end desktop market.
BTW, the next gen is likely to offer a new interconnect protocol that’s optimized for inter-chip communication. They explicitly talk about accessing caches and memory controllers that exist on a different device and how this can be achieved in a very fast and performance-efficient manner.
The M1 already has the SSD controller on the chip, separate from the flash storage module.

 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
And these patents are literally already in use for fusing two M1 Max into one M1 Ultra. But that doesn't mean you can fuse four chips together without creating the need for conflict-resolution and dispatching mechanisms, which far outweigh any theoretical benefits.

Your claims have no basis in reality. Again: Apple has published patents that deal exactly with this. Ultra interconnect is covered in a separate patent.

The M1 already has the SSD controller on the chip, separate from the flash storage module.


Which has zero relation to what I wrote.
 
  • Like
  • Haha
Reactions: Gudi and maflynn

Oculus Mentis

macrumors regular
Original poster
Sep 26, 2018
144
163
UK
There are literally Apple patents describing fusion of multiple chips into one SoC. There is no doubt that Apple is working on this technology. It’s key to competing in the high-end desktop market.
Nvidia also has its own chip-to-chip interconnect technology (NVLink-C2C), due to appear in its Grace and Grace Hopper chips.


Nvidia, same as Apple, uses TSMC for its manufacturing. That's why, in my opinion, Apple could join the market if it had the will and the appetite to expand the roots of its technology into the wider chip industry.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
Nvidia, same as Apple, uses TSMC for its manufacturing. That's why, in my opinion, Apple could join the market if it had the will and the appetite to expand the roots of its technology into the wider chip industry.

Oh, I have no doubt about that. For example, an Apple datacenter CPU would be miles ahead of anyone else. With their performance per core and low power usage per core, they could easily build a huge chip that would outperform everyone else by a factor of two while using half the power. The difficulties often quoted (lack of PCIe lanes) are easily remedied; after all, adding more PCIe controllers is not exactly rocket science for them at this point.

What I doubt, however, is whether this kind of business makes sense for them. As mentioned before, Apple's technology is very expensive, more expensive than the competitors'. Even if they can offer more performance and lower energy costs, the initial cost (and long-term maintenance!) is going to be a problem. Besides, as some posters already pointed out, enterprise is all about reliability and software. That is somewhere Apple would need to put in significant effort.

Basically, there are two options if they want to go this route. Either they sell datacenter hardware to partners who then take care of the ecosystem (this hardware would need to run Linux and offer modularity/maintainability far beyond the current models), or they keep the hardware private and sell services. In the latter case, they'd need to compete against low-priced giants like Amazon and Google, who already have their own ARM hardware, and I very much doubt they could offer a competitive price here. I mean, Xcode Cloud costs $99 for 250 hours of compute time; I can have a reasonably sized dedicated AWS instance 24/7 for under $30 per month. Nobody is going to use a general-purpose Apple cloud service at those prices.
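A back-of-the-envelope comparison of the effective hourly rates, using only the figures above and assuming roughly 730 hours in a month (the AWS number is the $30/month instance mentioned, not a published price sheet):

```swift
// Effective hourly rates implied by the prices quoted above (a sketch, not a quote sheet).
let xcodeCloudPerHour = 99.0 / 250.0   // $99 for 250 compute-hours ~ $0.40/hr
let awsPerHour = 30.0 / 730.0          // ~$30/month, always on ~ $0.04/hr
print(xcodeCloudPerHour / awsPerHour)  // ~9.6x the hourly price
```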

Where I do see potential is specialized services, like Xcode Cloud. It's expensive, but it also offers convenience that a simple cloud compute service won't. Or maybe they could run their iMessage etc. servers on specialized in-house Apple Silicon hardware. Basically, things where they can either maintain a high margin or reduce operational costs, but not where they sell the baseline service directly.
 

Philip Turner

macrumors regular
Dec 7, 2021
170
111
Do you think two M1 Ultras could perform better in HPL-AI than two A100s? You can choose another benchmark, but keep in mind that the A100 will use 32-bit Tensor Cores for matrix multiplication in single precision.
Probably not, although I do have benchmark data for SGEMM. This is going off on a tangent, because my original point was that the hardware could run things like OpenMM and JAX, but the software ecosystem prevents you from fully utilizing an M1 Mac you already own. It's not the optimal solution for a computer you don't yet own but want to buy for HPC/AI.

For my best comparison, see the table below. The first column is the M1 Ultra extrapolated from the M1 Max GPU (e = emulated). Next is using the CPU, GPU, and ANE simultaneously, extrapolated from the M1 Max or M1. Then the RTX 3090 Ti, with a purpose more similar to the M1 Ultra's and from the same generation as the A100. Finally, a single A100, not using sparsity. The data does reflect shortfalls of real-world performance, as hardware isn't fully utilized. Regarding cost: the M1 Ultra is a hybrid between consumer hardware and overpriced special-purpose hardware. The GPU's design and lower clock speed also require more silicon for the same performance.

Utilized TFLOPS     M1 Ultra (GPU)   M1 Ultra (System)   RTX 3090 Ti   A100 80 GB
Vector FP64         ~0.26-0.53e      0.62                0.61          9.57
Matrix FP64         ~0.26-0.53e      1.40                0.61          19.14
Vector FP32         20.80            22.03               39.29         19.14
Matrix FP32         16.87            22.36               157.16        153.23
Vector FP16         20.80            23.26               39.29         76.59
Matrix FP16         16.28            43.06               314.32        306.47
Power (W)           96               184                 450           250
Bandwidth (GB/s)    750              750                 1008          2039
Cost (USD)          5000             5000                2000          16300

Sources:

https://github.com/pytorch/pytorch/files/10250248/Matrix.Multiplication.Performance.xlsx

https://github.com/philipturner/metal-benchmarks

https://web.eece.maine.edu/~vweaver/group/green_machines.html

https://www.anandtech.com/show/17024/apple-m1-max-performance-review/3

https://github.com/philipturner/met...BlitEncoderAlternative/MainFile.swift#L27-L36

https://www.nvidia.com/en-us/geforce/news/geforce-rtx-3090-ti-out-now/

https://www.servethehome.com/nvidia-geforce-rtx-3090-review-a-compute-powerhouse/3/

https://www.ebay.com/itm/333991727955
 