A bunch of new mac studio benchmarks dropped

crazy dave · Mar 17, 2022

leman said:
Intel and AMD also use same cores - and even same dies - for different products, by allowing different configurations at various performance and power levels. Apples approach is obviously a design choice, but it’s different enough from the mainstream practice to ask why they do it this way. Is it an arbitrary restriction (the chips can run faster but Apple limits them), is it a physical restriction of the design, is it a statistical restriction? At any rate, it’s more sane and transparent than what happens elsewhere.

To some extent yes, but I remember in the M1 Max review how Andrei noted that when Apple duplicated the performance cores they basically duplicated everything around them as well which is less typical in an Intel or AMD solution. To me from a business perspective it reflects the different business models. Apple builds and sells the full widget. They aren't trying to sell to a hundred different vendors at a hundred different price points so they don't build a hundred different variants. They may increase variants as chiplet tech becomes more commonplace but they can afford to keep the silicon part (in terms of frequency) simple and extend performance by going wide. So that's what they do. Maybe if the Mac Pro really is a firestorm/icestorm based processor but with a non-M die they'll do something different, but so far Apple's model is that "we give everyone up and down the M-based product stack the best balance of single threaded performance and efficiency and then use that efficiency to go wide for greater throughput while still being super efficient". As you say: its simply more sane and transparent. They don't have to push the clocks to really good performance, so ... they don't.

leman · Mar 17, 2022

crazy dave said:
To some extent yes, but I remember in the M1 Max review how Andrei noted that when Apple duplicated the performance cores they basically duplicated everything around them as well which is less typical in an Intel or AMD solution. To me from a business perspective it reflects the different business models. Apple builds and sells the full widget. They aren't trying to sell to a hundred different vendors at a hundred different price points so they don't build a hundred different variants. They may increase variants as chiplet tech becomes more commonplace but they can afford to keep the silicon part (in terms of frequency) simple and extend performance by going wide. So that's what they do. Maybe if the Mac Pro really is a firestorm/icestorm based processor but with a non-M die they'll do something different, but so far Apple's model is that "we give everyone up and down the M-based product stack the best balance of single threaded performance and efficiency and then use that efficiency to go wide for greater throughput while still being super efficient". As you say: its simply more sane and transparent. They don't have to push the clocks to really good performance, so ... they don't.

Fully agreed. That is also what I mean with horizontal scaling: they scale both the computational backend and the resources needed (cache, RAM bandwidth etc.) to drive that backend. This is indeed a refreshingly honest take on computing.

Pressure · Mar 17, 2022

Homy said:
@JimmyjamesEU posted this in another thread but it seems that M1 Ultra will be outperforming 3090 as Apple claimed. Here are some results from rendering Disney's Moana in Redshift:

2x 2080ti = 34m:17s
M1 Max = 28m:27s
Single 3090 = 21m:45s
2x 3090 = 12m:44s

My guess for M1 Ultra 48c is 18m:58s and for 64c 14m:13s.

Peek Performance: March 8th Event [Event discussion begins on p6!]

My BTO Mac Studio has just switched to "preparing to ship" :eek: Maybe I'll get it sooner rather than later? I have a hold on my credit card, but it's still at 'Processing'. No hold for the display yet though. I had to spend an hour on the phone with Apple on Saturday to get mine cleared...

arstechnica.com

Peek Performance: March 8th Event [Event discussion begins on p6!]

My BTO Mac Studio has just switched to "preparing to ship" :eek: Maybe I'll get it sooner rather than later? I have a hold on my credit card, but it's still at 'Processing'. No hold for the display yet though. I had to spend an hour on the phone with Apple on Saturday to get mine cleared...

arstechnica.com

Peek Performance: March 8th Event [Event discussion begins on p6!]

My BTO Mac Studio has just switched to "preparing to ship" :eek: Maybe I'll get it sooner rather than later? I have a hold on my credit card, but it's still at 'Processing'. No hold for the display yet though. I had to spend an hour on the phone with Apple on Saturday to get mine cleared...

arstechnica.com

A Redshift developer have said that the default benchmark for Redshift is showing poor scaling due to using 128x128 blocks.

Please note that the benchmark is using 128x128 blocks which are not ideal to the M1 chips (due to their pretty high latencies). So the benchmark (as it stands today) might not show you particularly good scaling. The same is true for the M1 Pro/Max - but to a lesser extent.

As a stopgap solution I might look at adding a blocksize parameter to the benchmark. Or force it to 256 or larger if apple silicon is detected.

EntropyQ3 · Mar 17, 2022

leman said:
Yes, M1 only scales horizontally, by increasing the number of clusters. The clusters themselves are “locked” and will perform the same in all M1 chips. Some folks familiar with chip design speculated that this is how Apple achieves their superior power efficiency: by optimizing the very layout to max out at a relatively low speed. These chips can scale down very well (I remember seeing that M1 has a power level with something really ridiculous like 50mhz), but it cant be pushed higher even if there is enough thermal headroom. This is very different to x86 designs.

I question this.
It's not that you don't design towards a target frequency, you do, what I take issue with is the assertion that Apples SoCs somehow can't run at higher frequencies.
Because that assertion requires that:
a) Apples SoCs aren't subject to the general stochastic performance spread that all other logic chips are. The distribution function is roughly similar in all examples I have seen, and I have never seen an explanation as to why Apples SoCs should be any different.
b) The properties of a process vary with voltage/frequency/temperature regardless of the specifics of the logic design. Again, this makes the end result behave similarly. How Apples SoC specifically would not be subject to this is also something I've never seen an argument for.

So my impression is that this is an assertion put forth primarily by x86 fans who want to believe that Apple can't do the same things as is done in x86 space, and the only "proof" of that assertion is that Apple isn't doing it - ignoring that the underlying process technology says "of course they can".

Note that Apple isn't doing that anywhere, not with their phone SoCs, not with their Mac SoCs, nowhere. It's just not part of their current business/market segmentation plan.

Also note that there can certainly be a bit of variability in how different logic designs spread due to manufacturing and operating parameters! But we have no way of knowing if Apples SoCs are particularly different in those respects. They seem to power/clock scale like their competitors in mobile space for instance, so why would the same designs in Macs be any different, really?

leman · Mar 17, 2022

EntropyQ3 said:
I question this.
It's not that you don't design towards a target frequency, you do, what I take issue with is the assertion that Apples SoCs somehow can't run at higher frequencies.
Because that assertion requires that:
a) Apples SoCs aren't subject to the general stochastic performance spread that all other logic chips are. The distribution function is roughly similar in all examples I have seen, and I have never seen an explanation as to why Apples SoCs should be any different.
b) The properties of a process vary with voltage/frequency/temperature regardless of the specifics of the logic design. Again, this makes the end result behave similarly. How Apples SoC specifically would not be subject to this is also something I've never seen an argument for.

So my impression is that this is an assertion put forth primarily by x86 fans who want to believe that Apple can't do the same things as is done in x86 space, and the only "proof" of that assertion is that Apple isn't doing it - ignoring that the underlying process technology says "of course they can".

Note that Apple isn't doing that anywhere, not with their phone SoCs, not with their Mac SoCs, nowhere. It's just not part of their current business/market segmentation plan.

Also note that there can certainly be a bit of variability in how different logic designs spread due to manufacturing and operating parameters! But we have no way of knowing if Apples SoCs are particularly different in those respects. They seem to power/clock scale like their competitors in mobile space for instance, so why would the same designs in Macs be any different, really?

I don’t think we are in any disagreement here. There are two potential messages in saying as I did that Apple chips only scale horizontally: one is a basic observation about the product (i.e. all M1 Firestorm cores perform identically and this is a hard fact), another one is an prediction/explanation (i.e. Firestorm can’t go above 3.2ghz). I am only claiming the first part (which again, is an irrefutable fact), I am not at all sure about the second part, although I do find it interesting. Earlier experiments done by Andrei from Anandtech did note very rapid increase of power consumption on A12 chips close to their maximal frequency, and while there is undoubtedly some variation between individual chips, I don’t think we can easily dismiss the idea that Firestorm is designed to top somewhere close to 3.2ghz, but do so reliably and efficiently. After all, Apples incredible power efficiency has to come from somewhere. Their node lead is not enough to explain why they need 3x less power to deliver the same peak performance as the closest competitor.

This is also not about Apple being unable to do the same thing as x86 manufacturers, but more about them not needing/wanting to do it. I do think that it would be nice if M series could have a little “play” in this area, to get a bit more performance in the desktop space at the expense of power efficiency, but there are also undeniable advantages of doing what Apple is doing. If they can continue delivering consistent performance improvements while keeping the current approach it would be amazing.

EntropyQ3 · Mar 17, 2022

leman said:
I don’t think we are in any disagreement here. There are two potential messages in saying as I did that Apple chips only scale horizontally: one is a basic observation about the product (i.e. all M1 Firestorm cores perform identically and this is a hard fact), another one is an prediction/explanation (i.e. Firestorm can’t go above 3.2ghz). I am only claiming the first part (which again, is an irrefutable fact), I am not at all sure about the second part, although I do find it interesting. Earlier experiments done by Andrei from Anandtech did note very rapid increase of power consumption on A12 chips close to their maximal frequency, and while there is undoubtedly some variation between individual chips, I don’t think we can easily dismiss the idea that Firestorm is designed to top somewhere close to 3.2ghz, but do so reliably and efficiently. After all, Apples incredible power efficiency has to come from somewhere. Their node lead is not enough to explain why they need 3x less power to deliver the same peak performance as the closest competitor.

This is also not about Apple being unable to do the same thing as x86 manufacturers, but more about them not needing/wanting to do it. I do think that it would be nice if M series could have a little “play” in this area, to get a bit more performance in the desktop space at the expense of power efficiency, but there are also undeniable advantages of doing what Apple is doing. If they can continue delivering consistent performance improvements while keeping the current approach it would be amazing.

My main point was simply that a major part of the performance spread (process stochastics) and shape of the power vs. frequency curve is due to the lithographic process itself, not the silicon design.

It pretty much stands to reason that Apple sets its processor frequencies to optimise for yield. They don't sell any lower frequency bargain bin versions of their SoCs anywhere, so chips that fail to operate within the defined parameters are pure loss. Ergo, the cut-off point on the distrubution curve is, in all likelyhood, set quite low.

Since Apple doesn't sell higher frequency versions of their SoCs, and users have no option to set higher power targets or clocks, we just don't know either the shape of their power vs. frequency curve beyond Apples predefined limits, or any hard limits where some specific subsystem of the SoC tends to fail. Seemingly, no-one outside Apple does.

Apple chooses other means to achieve sufficient meaningful separation between products. This could change in the future, but I'm not sure that it necessarily ever will. I have the feeling that Apple is quite satisfied to fight the marketing battle with efficiency, ergonomics and product design rather than going into a myriad of processor frequency bins - that's the tactic of a parts supplier, not someone who sells whole systems to end users. And if they ventured higher on the frequency vs power curve, it would directly impact some of their primary system level benefits - battery life, enclosure design flexibility, noise, et cetera.

jeanlain · Mar 17, 2022

Reviews confirm that the M1 Ultra GPU is not twice faster than the M1 Max GPU (and sometimes slower!).
Performance seem to be a bit behind the RTX 3080 in native benchmark tools. This is good, but I'm not sure how Apple found the M1 Ultra to be faster than the 3090. In video editing, colour grading, or rendering huge 3D scenes maybe...

mi7chy · Mar 17, 2022

It doesn't even ship until tomorrow for normal people so needs time for optimization.

JimmyjamesEU · Mar 17, 2022

jeanlain said:
Reviews confirm that the M1 Ultra GPU is not twice faster than the M1 Max GPU (and sometimes slower!).
Performance seem to be a bit behind the RTX 3080 in native benchmark tools. This is good, but I'm not sure how Apple found the M1 Ultra to be faster than the 3090. In video editing, colour grading, or rendering huge 3D scenes maybe...

Regardless of whether the Ultra is near the 3090, a result that shows it slower than the Max is...weird. Surely it has to be a bug or a problem with the test.

ader42 · Mar 17, 2022

jeanlain said:
Reviews confirm that the M1 Ultra GPU is not twice faster than the M1 Max GPU (and sometimes slower!).
Performance seem to be a bit behind the RTX 3080 in native benchmark tools. This is good, but I'm not sure how Apple found the M1 Ultra to be faster than the 3090. In video editing, colour grading, or rendering huge 3D scenes maybe...

I seem to recall reading somewhere that the M1 Ultra performed better with higher resolutions - are there any similar benchmarks that go higher than 1440p?

Homy · Mar 17, 2022

jeanlain said:
Reviews confirm that the M1 Ultra GPU is not twice faster than the M1 Max GPU (and sometimes slower!).
Performance seem to be a bit behind the RTX 3080 in native benchmark tools. This is good, but I'm not sure how Apple found the M1 Ultra to be faster than the 3090. In video editing, colour grading, or rendering huge 3D scenes maybe...

If you go to gfxbench.com you get other results. There M1 Ultra is faster than RTX 3080 and close to 3090. Also Anandtech explained last time reviewing M1 Max/Pro that M1 seems to be CPU bound in games and can't use all the memory bandwidth to feed the GPU.

Skärmavbild 2022-03-18 kl. 01.41.22.png

Gerdi · Mar 17, 2022

leman said:
I have no idea how Apple does it. The 3.2ghz limit could be a physical property of this CPU design, or it could merely be a statistical common ground most manufactured chips can sustain, or it could be a business decision on Apples side (a weird one). Regardless of the technical reason, this is what we have, and this is what I meant in my post: M1 relies exclusively on horizontal scaling, the cores themselves do not exhibit any performance scaling/binning across products. This is true for the CPU and the GPU equally.

My point is, that every circuit scales with the same parameters, be it an Intel CPU or an Apple CPU or any other electronic circuit. And given that an A15 core essentially runs with almost same frequency in an iPhone and in the M1Ultra just shows that Apple is not remotely is using the frequency scaling potential - but this potential must be there. Let me repeat it, it is technically impossible that a circuits does not scale if you change the parameters - unless you assume that an iPhone is already designed using highest leaking cells possible running at the highest possible overdrive voltage. And it is safe to assume that the Intel/AMD desktop CPUs maxing out all parameters.

So it is a deliberate decision to not sacrifice efficiency, which would be the case if you try to scale frequency one way or another - it does not mean that the M1 potentially would scale worse than Intel or AMD CPUs.

Binning is an orthogonal issue. If you do not bin, then the timing sign-off has to be done at the worst corner, which limits the frequency on top of any physical design choices.

donth8 · Mar 17, 2022

JimmyjamesEU said:
Regardless of whether the Ultra is near the 3090, a result that shows it slower than the Max is...weird. Surely it has to be a bug or a problem with the test.

Agreed, something is definitely wrong… No way it should be slower with twice the cores and GFXBench usually scales linearly for Apple chips.

Tomsguide tested with wildlife extreme unlimited and it got 210.3 fps compared to 121 fps for the M1 max. That is almost a 74% increase which is better than the 55% in GFXBench.

Homy · Mar 17, 2022

JimmyjamesEU said:
Regardless of whether the Ultra is near the 3090, a result that shows it slower than the Max is...weird. Surely it has to be a bug or a problem with the test.

donth8 said:
Agreed, something is definitely wrong… No way it should be slower with twice the cores and GFXBench usually scales linearly for Apple chips.

Tomsguide tested with wildlife extreme unlimited and it got 210.3 fps compared to 121 fps for the M1 max. That is almost a 74% increase which is better than the 55% in GFXBench.

Maybe GFXbench needs an update to recognize M1 Ultra, because Ultra is slower in many GFXbench tests compared to Max.

Homy · Mar 17, 2022

Gaming is supposed to be Mac's Achilles' heel. Yet M1 Ultra is only 18 fps slower than 3090 and faster than Mac Pro with dual Radeon Pro Vega II 32 GB in Shadow of the Tomb Raider at 1440p. A game that is not coded for ARM running through Rosetta, not intended for iGPUs but dGPUs and neither written for Apple GPUs but AMD/Nvidia, all according to Brad Oliver one of the lead Mac game developers previously at Westlake Interactive and Aspyr. Impressive iGPU indeed running at 200W less power! (source the Verge)

Skärmavbild 2022-03-18 kl. 03.07.10.png

leman · Mar 17, 2022

Gerdi said:
My point is, that every circuit scales with the same parameters, be it an Intel CPU or an Apple CPU or any other electronic circuit. And given that an A15 core essentially runs with almost same frequency in an iPhone and in the M1Ultra just shows that Apple is not remotely is using the frequency scaling potential - but this potential must be there. Let me repeat it, it is technically impossible that a circuits does not scale if you change the parameters - unless you assume that an iPhone is already designed using highest leaking cells possible running at the highest possible overdrive voltage. And it is safe to assume that the Intel/AMD desktop CPUs maxing out all parameters.

If I remember correctly, folks familiar with circuit design claimed that it can be approached differently, resulting in different properties. Maybe someone like @cmaier can comment on this.

leman · Mar 17, 2022

Homy said:
Gaming is supposed to be Mac's Achilles' heel.

Only in the imagination of people who have limited understanding of GPU performance. The preconceived notions of “Apple is bad for gaming” is so strong that people simply don’t stop to look at the facts. These GPUs are tremendously good at rasterization and gaming is one of their design strongest point. Apple has been pouring a lot of R&D resources over the last years to make the worlds most capable perf/watt gaming GPU. The problem with Apple and gaming is just like the problem with Apple and rendering - there is not much software that takes the platform seriously and actually tries to achieve good performance there. But take a well-optimized game (so far only BG3), and even the base M1 makes a good figure next to dedicated gaming laptops.

jeanlain · Mar 17, 2022

Homy said:
Also Anandtech explained last time reviewing M1 Max/Pro that M1 seems to be CPU bound in games and can't use all the memory bandwidth to feed the GPU.

But GFXBench barely uses the CPU.

Maybe GFXbench needs an update to recognize M1 Ultra, because Ultra is slower in many GFXbench tests compared to Max.

The trend seems to be that at low resolution (1080p), there is no gain.
GFXBench needs to update their test suit to better account for non-mobile devices. 1080 or 1440p is not adequate for offscreen tests.

jeanlain · Mar 18, 2022

Homy said:
Gaming is supposed to be Mac's Achilles' heel. Yet M1 Ultra is only 18 fps slower than 3090 and faster than Mac Pro with dual Radeon Pro Vega II 32 GB in Shadow of the Tomb Raider at 1440p.

The performance gain is bigger at higher resolution, as expected. The ultra is almost 2X faster than the M1 Max at 4k, not bad.
It's a frustrating that we have no such comparison in a native games. None of these game has an integrated benchmark tool, AFAIK.

jeanlain · Mar 18, 2022

leman said:
These GPUs are tremendously good at rasterization and gaming is one of their design strongest point. Apple has been pouring a lot of R&D resources over the last years to make the worlds most capable perf/watt gaming GPU. The problem with Apple and gaming is just like the problem with Apple and rendering - there is not much software that takes the platform seriously and actually tries to achieve good performance there.

Which doesn't seem to bother Apple...

leman · Mar 18, 2022

jeanlain said:
Which doesn't seem to bother Apple...

Rome was not built in one day. They play the long game.

T'hain Esh Kelch · Mar 18, 2022

leman said:
Rome was not built in one day. They play the long game.

So far it has been a 46 year long game..

jeanlain · Mar 18, 2022

leman said:
Rome was not built in one day. They play the long game.

Well, at least someone put on the task to build Rome. Has Apple ever contacted a PC game developer? I bet this has always been the other way around.

leman · Mar 18, 2022

T'hain Esh Kelch said:
So far it has been a 46 year long game..

How so? Their first mainstream Mac capable of gaming was released in late 2020

Rickroller · Mar 18, 2022

jeanlain said:
Well, at least someone put on the task to build Rome. Has Apple ever contacted a PC game developer? I bet this has always been the other way around.

Why wouldn’t Apple build their own 3D renderer…? When you consider their ‘long game’ approach to most things, it seems logical that they would put some effort into doing something that takes advantage of and shows off what their GPU can really do. Besides it is even more conspicuous in its absence considering their focus on AR/VR as a vital part of Apples future. Heck even AMD have a rendering engine, and they’re coming from a position where they were looking for loose change between the couch cushions to pay rent!

A bunch of new mac studio benchmarks dropped

macrumors 68000

macrumors Core

macrumors 603

macrumors 6502a

macrumors Core

macrumors 6502a

macrumors 68020

Suspended

Suspended

macrumors 6502

macrumors 68030

macrumors 6502

macrumors regular

macrumors 68030

macrumors 68030

macrumors Core

macrumors Core

macrumors 68020

macrumors 68020

macrumors 68020

macrumors Core

macrumors 604

macrumors 68020

macrumors Core

macrumors regular

Our Staff