
jeanlain

macrumors 68020
Mar 14, 2009
2,463
958
The bottom line is that there isn't perfect linear scaling.
Achieving only 1.5x better performance in a pure compute task by doubling the number of cores and the memory bandwidth is very disappointing, especially considering that performance doubles when going from 8 to 16 cores.
Of course it's not "perfect", but here it is *very far* from it.

If this scaling reflects all GPU tasks, I'd say that Apple screwed up somewhere with the M1 Max.
But I believe there is some unknown issue with this particular test, or that performance is constrained by the low-power mode.

EDIT: the scaling from 16 to 32 cores is much more linear in GFXBench, where the M1 Max is about 1.9x faster than the M1 Pro in Aztec Ruins (high tier) and Manhattan 3.1.
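To put numbers on the scaling complaint above, here's a minimal sketch. The absolute scores are placeholders; only the rough ratios discussed in this thread (~2x from 8 to 16 cores, ~1.5x from 16 to 32) are taken as given:

```swift
// Scaling efficiency = observed speed-up / ideal speed-up from the core-count ratio.
// The absolute scores are placeholders; only the rough ratios come from the posts above.
func scalingEfficiency(baseScore: Double, newScore: Double,
                       baseCores: Int, newCores: Int) -> Double {
    let observed = newScore / baseScore
    let ideal = Double(newCores) / Double(baseCores)
    return observed / ideal
}

let eff8to16  = scalingEfficiency(baseScore: 100, newScore: 200, baseCores: 8,  newCores: 16)  // 1.0 (perfect)
let eff16to32 = scalingEfficiency(baseScore: 200, newScore: 300, baseCores: 16, newCores: 32)  // 0.75
print(eff8to16, eff16to32)
```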

 
Last edited:

StoneJack

macrumors 68030
Dec 19, 2009
2,732
1,983
Achieving only 1.5x better performance in a pure compute task by doubling the number of cores and the memory bandwidth is very disappointing, especially considering that performance doubles when going from 8 to 16 cores.
Of course it's not "perfect", but here it is *very far* from it.

If this scaling reflects all GPU tasks, I'd say that Apple screwed up somewhere with the M1 Max.
But I believe there is some unknown issue with this particular test, or that performance is constrained by the low-power mode.
I think you are wrong. The M1X gives desktop-class GPU performance in a tiny notebook, constrained by power envelope, battery and weight. That's good enough on its own. It's as if you don't need an eGPU at all, and those cards cost as much as the notebook itself. So the Apple GPU is very good on its own.
 

jeanlain

macrumors 68020
Mar 14, 2009
2,463
958
I don't know the score of the 5600M in Redshift, but given the 5500M results, it should complete the test in about 21 minutes. 4x faster than that gives 5.3 minutes, about the same as an RTX 3060 with RTX on.
Thinking about it more, I suppose Apple tested a particular scene using more than 8 GB of VRAM in Redshift, which the 5600M could not handle properly. The sheer compute power of the M1 Max is not 4x that of the 5600M; it's closer to 2x.
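For what it's worth, here is the arithmetic behind those figures as a tiny sketch (the 21-minute figure is the estimate above, extrapolated from the 5500M, not a measured result):

```swift
// Back-of-envelope check of the render-time claim above.
// The 21-minute 5600M figure is an estimate extrapolated from the 5500M, not a measurement.
let estimated5600MMinutes = 21.0
let appleClaimedSpeedup = 4.0                      // Apple's "4x faster" Redshift claim
let impliedM1MaxMinutes = estimated5600MMinutes / appleClaimedSpeedup
print(impliedM1MaxMinutes)                         // 5.25, i.e. the ~5.3 minutes mentioned above
```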
 

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,231
I think you are wrong. The M1X gives desktop-class GPU performance in a tiny notebook, constrained by power envelope, battery and weight. That's good enough on its own. It's as if you don't need an eGPU at all, and those cards cost as much as the notebook itself. So the Apple GPU is very good on its own.

He’s not talking about how good the GPU is in an absolute sense, but rather expressing disappointment in how it scales relative to the core count and memory bandwidth of the smaller Apple M-series GPUs.
 
Last edited:
  • Like
Reactions: jeanlain

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,231
Achieving only 1.5x better performance in a pure compute task by doubling the number of cores and the memory bandwidth is very disappointing, especially considering that performance doubles when going from 8 to 16 cores.
Of course it's not "perfect", but here it is *very far* from it.

If this scaling reflects all GPU tasks, I'd say that Apple screwed up somewhere with the M1 Max.
But I believe there is some unknown issue with this particular test, or that performance is constrained by the low-power mode.

EDIT: the scaling from 16 to 32 cores is much more linear in GFXBench, where the M1 Max is about 1.9x faster than the M1 Pro in Aztec Ruins (high tier) and Manhattan 3.1.


That’s just weird. Graphics is scaling but not compute? Does not compute … ? TBDR GPUs are weird if true …

Personally I wouldn’t have said it was disappointing if Apple had said this is what the scaling was, i.e. that they lowered the clocks to keep the GPU power/heat in check for notebooks. That’d be totally fine. But they didn’t.

Edit: you know as I think about it, it still isn’t disappointing to me actually, it’s just … weirdly intriguing …
 
Last edited:

jmho

macrumors 6502a
Jun 11, 2021
502
996
Thinking about it more, I suppose Apple tested a particular scene using more than 8 GB of VRAM in Redshift, which the 5600M could not handle properly. The sheer compute power of the M1 Max is not 4x that of the 5600M; it's closer to 2x.
They mention in the small print that they used a 1.32 GB scene.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
How about waiting to look at benchmarks until Apple releases a macOS update that enables High Power Mode for the 16”?

I believe a new macOS release is coming out in a few days, which hopefully already contains this feature.
 

Macintosh IIcx

macrumors 6502a
Jul 3, 2014
629
615
Denmark
They mention in the small print that they used a 1.32 GB scene.

The 4x faster Redshift render with the 32-core GPU over the 5600M with 8 GB of HBM2 is what really caught my eye as the most impressive performance jump for the M1 Pro/Max GPUs (2.5x with the 16-core GPU, so not perfect linear scaling here either).

3D rendering is often an excellent indicator of compute power (for normal operations) when comparing GPUs, as it tends to scale very nicely.

These Redshift numbers from Apple go very much against the Geekbench Metal results for the M1 Max 32-core GPU, so let's wait and see.

My high hope is Radeon VII-like performance for rendering on the M1 Max 32-core GPU. ;)
 

Macintosh IIcx

macrumors 6502a
Jul 3, 2014
629
615
Denmark
First post, possibly last. Intriguing thread; I’m interested because I’m expecting these new chips to be in the new iMacs next.

Pressure’s comment, along with the High Power Mode in Monterey, would explain some of the discrepancies between Apple’s M1 Max linear (4x) marketing and the numbers from the new MacBook Pro models, if:

- The M1 Pro’s GFX run at full clock, i.e. the benchmarks scale in a linear way.

- The M1 Max’s GFX are down-clocked about 25% for reduced power and heat, i.e. the results are (currently) less than linear (see the sketch at the end of this post).

- Monterey will provide the 16 inch M1 Max model with a full-clock mode + extra cooling (probably while plugged in) that the 14 inch will not get, which possibly explains why there’s a small weight difference between the M1 Max and M1 Pro on the 16 inch model that’s not there on the 14 inch model.

If right, this would work out well for the new 27” iMacs with the M1 Max chips running full clock GFX, a good bump over the existing 27” iMac, while still being cooler, quieter and more power efficient in a new design.

Another piece of the puzzle is that the 16" M1 Pro/Max comes with a 140W power adapter. You only get the 96W power adapter on the 14", even if you order the M1 Max with the 32-core GPU. So High Power Mode might allow a power draw above 96W, who knows (until Monday)?
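To make the down-clock idea concrete, here is a minimal sketch of the arithmetic (the ~25% figure is purely the speculation above, not a confirmed spec):

```swift
// If the M1 Max ran its 32 GPU cores at roughly 75% of the M1 Pro's clock,
// the expected speed-up over the 16-core M1 Pro would be:
let coreRatio  = 32.0 / 16.0   // twice the GPU cores
let clockRatio = 0.75          // hypothesised ~25% down-clock
let expectedSpeedup = coreRatio * clockRatio
print(expectedSpeedup)         // 1.5, in line with the ~1.5x seen in some compute benchmarks
```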
 

Serban55

Suspended
Oct 18, 2020
2,153
4,344
My guess is still the same. The M1 Max will match a 3080 Laptop in rendering tasks, not in compute. And it'll do it on battery.
Even if it only matches a 3070 laptop... we have never had this kind of jump in Macs in this segment...
 

the8thark

macrumors 601
Apr 18, 2011
4,628
1,735
With such a commitment to vertical integration, I now think that Steve Jobs' ethos has been chucked right out by Tim Cook. It's the industrial engineer in charge of the whole operation ... Henry Ford would be proud.
Apple has always been about vertical integration. This has never changed. Sure, Apple has doubled down on it now, but the concept and ethos are the same now as they were in the 128K days. It has not changed.

It's the one and same ethos Apple has always had. Nothing was created or chucked out by Tim.
 
  • Like
Reactions: Melbourne Park

jmho

macrumors 6502a
Jun 11, 2021
502
996
Even if it only matches a 3070 laptop... we have never had this kind of jump in Macs in this segment...
In many ways it's a shame they compared it to a 3080.

Someone mentioned the XDR and how Apple sabotaged themselves with the comparison to the $30k Sony monitor and I totally agree.

It's psychologically like offering someone $100, but then only giving them $80. They should be overjoyed that you gave them $80, but by "anchoring" them at $100 the only thing they're thinking about is the missing $20.
 

Malus120

macrumors 6502a
Jun 28, 2002
696
1,456
Achieving only 1.5x better performance in a pure compute task by doubling the number of cores and the memory bandwidth is very disappointing, especially considering that performance doubles when going from 8 to 16 cores.
Of course it's not "perfect", but here it is *very far* from it.

If this scaling reflects all GPU tasks, I'd say that Apple screwed up somewhere with the M1 Max.
But I believe there is some unknown issue with this particular test, or that performance is constrained by the low-power mode.

(Leaving this here because I believe the point is still relevant, but I missed that you were talking about pure compute tasks, which might be expected to scale better than the kind of workloads I was referencing.)

Not to be rude, but non-linear scaling with GPUs when increasing core count and/or memory bandwidth is not at all unusual.

Look at the top-tier SKUs vs the mid-tier SKUs from AMD or Nvidia. The RX 6900 XT (AMD's flagship) has twice as many compute units (80) as the 6700 XT (40), but performance is only ~60% better. Similarly, Nvidia's RTX 3090 has 10,496 CUDA cores and 936 GB/s of memory bandwidth, roughly double that of the RTX 3060 Ti (4,864 / 448 GB/s) and RTX 3070 (5,888 / 448 GB/s) SKUs, yet it only performs 45-60% better than those parts.

I'm hoping the M1 Max scales well, but expecting close to 2x over the M1 Pro is probably a little optimistic (although Apple's marketing suggests otherwise, so I'd love to be wrong).
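To put the same comparison in per-unit terms, a small sketch (it only uses the rough figures quoted above; the exact percentages vary by benchmark suite):

```swift
// Scaling efficiency when roughly doubling the GPU: observed gain / unit-count ratio.
func scalingEfficiency(perfGain: Double, unitRatio: Double) -> Double {
    (1.0 + perfGain) / unitRatio
}

let rx6900xtVs6700xt = scalingEfficiency(perfGain: 0.60,  unitRatio: 80.0 / 40.0)        // ~0.80
let rtx3090Vs3070    = scalingEfficiency(perfGain: 0.525, unitRatio: 10496.0 / 5888.0)   // ~0.86 (45-60% midpoint)
print(rx6900xtVs6700xt, rtx3090Vs3070)
```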
 
Last edited:
  • Like
Reactions: Roode

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Not to be rude, but non-linear scaling with GPUs when increasing core count and/or memory bandwidth is not at all unusual.

Look at the top-tier SKUs vs the mid-tier SKUs from AMD or Nvidia. The RX 6900 XT (AMD's flagship) has twice as many compute units (80) as the 6700 XT (40), but performance is only ~60% better. Similarly, Nvidia's RTX 3090 has 10,496 CUDA cores and 936 GB/s of memory bandwidth, roughly double that of the RTX 3060 Ti (4,864 / 448 GB/s) and RTX 3070 (5,888 / 448 GB/s) SKUs, yet it only performs 45-60% better than those parts.

I'm hoping the M1 Max scales well, but expecting 2x over the M1 Pro is probably more than a little optimistic (although Apple's marketing suggests otherwise, so I'd love to be wrong).
I would think Apple's approach with UMA plus massive bandwidth will scale better compared to conventional GPUs attached over PCIe.

My simplistic calculation is as follows.

Let's take the RTX 3090 example with 936 GB/s of bandwidth. With PCIe's max bandwidth of 32 GB/s, assuming the CPU is sending data over to the RTX 3090 to be processed, total effective bandwidth will be

(936 + 32) / 2 = 484 GB/s

The above assumes that the transfer is one way, without the CPU getting processed data back (e.g. a rasterised frame to be displayed).

If the CPU needs the results back, the total effective bandwidth diminishes further.

I'm sure this is not the complete picture, but I think it is a good approximation.
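In code form, the same simplistic averaging (it just encodes the approximation above; as noted, a real transfer is more complicated than this):

```swift
// Simplistic "effective bandwidth" from the post above: the average of the GPU's
// local memory bandwidth and the PCIe link feeding it. Not a rigorous model.
func effectiveBandwidth(gpuMemoryGBps: Double, linkGBps: Double) -> Double {
    (gpuMemoryGBps + linkGBps) / 2.0
}

let rtx3090OverPCIe = effectiveBandwidth(gpuMemoryGBps: 936.0, linkGBps: 32.0)
print(rtx3090OverPCIe)   // 484 GB/s, matching the figure above
```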
 
  • Like
Reactions: Zhang

iBug2

macrumors 601
Original poster
Jun 12, 2005
4,540
863
Even if it only matches a 3070 laptop... we have never had this kind of jump in Macs in this segment...
Yeah, in any case, these laptops are now in a league of their own. The 3080 laptops cannot sustain that kind of power on battery anyway, so you should think of them as desktops. When you use them on the go, they will be considerably slower than the M1 Max.
 
Last edited:
  • Like
Reactions: hefeglass

iBug2

macrumors 601
Original poster
Jun 12, 2005
4,540
863
In many ways it's a shame they compared it to a 3080.

Someone mentioned the XDR and how Apple sabotaged themselves with the comparison to the $30k Sony monitor and I totally agree.

It's psychologically like offering someone $100, but then only giving them $80. They should be overjoyed that you gave them $80, but by "anchoring" them at $100 the only thing they're thinking about is the missing $20.
I believe it'll be as good as a 3080 unless you are doing pure compute.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,231
(Leaving this here because I believe the point is still relevant, but I missed that you were talking about pure compute tasks, which might be expected to scale better than the kind of workloads I was referencing.)

Not to be rude, but non-linear scaling with GPUs when increasing core count and/or memory bandwidth is not at all unusual.

Look at the top-tier SKUs vs the mid-tier SKUs from AMD or Nvidia. The RX 6900 XT (AMD's flagship) has twice as many compute units (80) as the 6700 XT (40), but performance is only ~60% better. Similarly, Nvidia's RTX 3090 has 10,496 CUDA cores and 936 GB/s of memory bandwidth, roughly double that of the RTX 3060 Ti (4,864 / 448 GB/s) and RTX 3070 (5,888 / 448 GB/s) SKUs, yet it only performs 45-60% better than those parts.

I'm hoping the M1 Max scales well, but expecting close to 2x over the M1 Pro is probably a little optimistic (although Apple's marketing suggests otherwise, so I'd love to be wrong).


Compute does indeed scale pretty linearly. For instance, you mention the 3090 vs 3070:



As you see above, you’ll find basically double for the 3090 compared to the 3070. I couldn’t be bothered to match CPUs, which is one reason why the 3090 does even better. The other, far more important, reason is that the 3090 is also clocked higher than the 3070 here. But that just hits home that a lot of GPU compute, especially benchmark compute, is basically tied to TFLOPs. If you look at the subtests it’s a more complicated picture, but the bottom line is: embarrassingly parallel compute scales linearly with core count until you hit memory bottlenecks, or until clocks are reduced to save power or cut heat.
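As a rough illustration of the "tied to TFLOPs" point, here's a sketch of the usual peak-FP32 estimate (cores × ALUs per core × 2 FLOPs per FMA × clock). The 128-ALUs-per-core and ~1.3 GHz values are commonly cited estimates for the M1 family, not official Apple specs:

```swift
// Peak FP32 throughput ≈ GPU cores × ALUs per core × 2 (FMA) × clock.
// With the clock in GHz this yields GFLOPS; divide by 1000 for TFLOPS.
func peakTFLOPs(gpuCores: Int, alusPerCore: Int, clockGHz: Double) -> Double {
    Double(gpuCores * alusPerCore) * 2.0 * clockGHz / 1000.0
}

// 128 ALUs per core and ~1.3 GHz are commonly cited estimates, not official figures.
let m1Pro16 = peakTFLOPs(gpuCores: 16, alusPerCore: 128, clockGHz: 1.296)   // ~5.3 TFLOPS
let m1Max32 = peakTFLOPs(gpuCores: 32, alusPerCore: 128, clockGHz: 1.296)   // ~10.6 TFLOPS
print(m1Pro16, m1Max32)
```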
 
Last edited:

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
I wonder if it is because the mobile version of Wild Life isn't running at the same precision as the desktop version (i.e. mobile is FP16 vs FP32).
Do you have a good source for that, or is it personal speculation? I checked UL's benchmark descriptions and the general web and have seen no indication of this. Is my google-fu failing me?
 

Serban55

Suspended
Oct 18, 2020
2,153
4,344
In many ways it's a shame they compared it to a 3080.

Someone mentioned the XDR and how Apple sabotaged themselves with the comparison to the $30k Sony monitor and I totally agree.

It's psychologically like offering someone $100, but then only giving them $80. They should be overjoyed that you gave them $80, but by "anchoring" them at $100 the only thing they're thinking about is the missing $20.
That's a shame for those who believe the company they buy their product from... I don't care; I care about seeing how it performs in my own work. If it's not what I expect/need, I return it.
But remember, these should be on par with the 3080 mobile in the Razer... and that cannot perform well under load; don't mistake it for the 3080 mobile in the MSI. Again, if these MacBook Pros' cooling can sustain the 32-core GPU for long periods of time, after 5 minutes they will be beyond that Razer 3080 mobile, which will thermal throttle a lot.
So again, the cooling system could give them a big edge compared with the compact dGPU 3080 laptops.
 