First M3 Max Benchmark!

name99 · Nov 28, 2023

The techheads reading this thread might enjoy reading
https://eclecticlight.co/2023/11/27/evaluating-m3-pro-cpu-cores-1-general-performance/

As usual, Howard is more interested in understanding the chips than in getting page views, so he looks at things from a few unusual angles.
Points to note include:
- Once genuine frequencies are taken into account, the IPC gains look higher than zero.
(You might ask why the M3 P frequency didn't ramp to full max. Presumably the heuristics for DVFS ramping have changed slightly, though who knows how exactly. Perhaps max frequency is only engaged after a few seconds of lower frequency, on the grounds that there's no point in doubling power if the issue is saving .25 seconds of wait time?)

The "NEON" and "simd" changes are substantial and interesting.
The relevant gating code is essentially a stream of dependent FADDs. If FADD takes 3 cycles (the case for M1), then a 3 GHz M1 can do 1B successive dependent FADDs per second, and that's in fact the case.
So how do we get to ~1.6x as fast? The best-case frequency difference is 4/3, which isn't enough.
My GUESS is that FADD latency (perhaps under some specific conditions, like when a result feeds directly into the same unit so there's no waiting for the register from a common dispatch bus) has dropped to 2.5 cycles (or more precisely, something like 2 dependent FADDs in 5 cycles).
If that's the case, then we expect performance to increase by something like (4/3) * (3/2.5) = 1.6.
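
To make the arithmetic concrete, here is a rough sketch in Python. It assumes the round numbers used above (a 3 GHz M1, a 4 GHz M3) and treats the 2.5-cycle latency as my guess, not a measured value. The function and variable names are made up for illustration.

```python
def dependent_fadd_rate(freq_ghz: float, latency_cycles: float) -> float:
    """Billions of dependent FADDs per second for a purely serial chain.

    Each FADD has to wait for the previous result, so the rate is set by
    latency alone: clock frequency divided by latency in cycles.
    """
    return freq_ghz / latency_cycles


# Round figures from the discussion above (assumptions, not measurements).
M1_FREQ_GHZ = 3.0
M3_FREQ_GHZ = 4.0
M1_FADD_LATENCY = 3.0        # cycles, the known M1 case
M3_FADD_LATENCY_GUESS = 2.5  # cycles, i.e. 2 dependent FADDs in 5 cycles (a guess)

m1_rate = dependent_fadd_rate(M1_FREQ_GHZ, M1_FADD_LATENCY)

# What frequency alone would buy if M3 latency were still 3 cycles.
freq_only_speedup = dependent_fadd_rate(M3_FREQ_GHZ, M1_FADD_LATENCY) / m1_rate

# What frequency plus the guessed shorter latency would buy.
guessed_speedup = dependent_fadd_rate(M3_FREQ_GHZ, M3_FADD_LATENCY_GUESS) / m1_rate

print(f"M1 rate: {m1_rate:.2f}B dependent FADDs/s")
print(f"Frequency-only speedup: {freq_only_speedup:.2f}x")
print(f"Speedup with guessed 2.5-cycle latency: {guessed_speedup:.2f}x")
```

The frequency-only case tops out at 4/3, which is why something beyond clock speed is needed to explain a ~1.6x result.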

throAU · Dec 5, 2023

name99 said:
Evaluating M3 Pro CPU cores: 1 General performance

Using assembly language test loops to understand the differences between M1 Pro and M3 Pro CPU cores casts new light on their differences.

eclecticlight.co

Cheers. I only discovered his site recently through his M1 coverage. It's an invaluable resource for many things, and he's definitely a good source for real-world performance info.

Basic75 · Dec 7, 2023

name99 said:
My GUESS is that FADD latency (perhaps under some specific conditions, like when a result feeds directly into the same unit so there's no waiting for the register from a common dispatch bus) has dropped to 2.5 cycles (or more precisely, something like 2 dependent FADDs in 5 cycles).

RWT Forums - Real World Tech

www.realworldtech.com

Mac Hammer Fan · Dec 7, 2023

CPU of the M3 Max is very good, but GPU could be better. It's a pity the number of GPU cores hasn't significantly increased. Hopefully this will be the case with the M4 Max.

leman · Dec 7, 2023

Mac Hammer Fan said:
CPU of the M3 Max is very good, but GPU could be better. It's a pity the number of GPU cores hasn't significantly increased. Hopefully this will be the case with the M4 Max.

The GPU is the biggest redesign Apple did in a decade, since they first started using shader cores of their own design. It’s a fairly impressive feat of engineering. More shader cores can come later. But even with the same core count M3’s GPU is substantially faster in pretty much every task.

avkills · Dec 7, 2023

Mac Hammer Fan said:
CPU of the M3 Max is very good, but GPU could be better. It's a pity the number of GPU cores hasn't significantly increased. Hopefully this will be the case with the M4 Max.

I disagree and agree. I disagree because this is a laptop CPU we are talking about; and if the M3 Max can even get 1/3rd the performance of a 4090 desktop chip then that is a major win for Apple.

I agree because the M3 Ultra (if there is ever going to be such a thing) needs to be on par with 4080 performance, but closer to 4090.
