Here's a fun table I made:
Rather than keep dealing with IPC averages, I thought it might be fun to look at each individual test and see how it has changed since the M1. I picked one GB 6.3 result for each processor (so run-to-run variation comes into play; I'm not trying to get exact numbers from averages) and compared the change in score with the change in clock speed. Given my results, I can say that IPC increases depend strongly on the workload. For instance, since the M1, IPC for GB's HTML 5 and Background Blur tests has increased roughly 40%; IPC for PDF Render, Photo Library, Object Remover, and Ray Tracer has increased ~20%; and IPC for Clang, HDR, and Photo Filter has increased by 11-15%. IPC gains for Text Processing, Asset Compression, and Structure from Motion are about 7%, File Compression and Navigation are about 3-4%, while Horizon Detection is completely flat. Object Detection, prior to SME, went up by 18% between the M1 and M2 but was flat between the M2 and M3. It's obviously unknown what it would've done on the M4 without SME.
Now, if you want, you could create "an average" of those by taking the geometric mean of the FP tests and of the INT tests and then a weighted arithmetic mean over the two, but that would conceal everything that's interesting, which is why I don't like averages. This shows that Apple is in fact iterating quite strongly in the areas they care about for CPU performance and leaving the ones they don't to clock speed. The average is brought down by the latter. As for the criticism that Apple "studies for the test", I would argue instead that they appear to have their own design priorities for what's most important to improve for their users, and those are different from GB's.
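For anyone who wants to redo this, here is a rough Python sketch of the per-test calculation: divide the score ratio by the clock ratio to get the per-clock (IPC) change. The function names are mine, and the per-test figures are just the rounded gains quoted above (I took 13% and 3.5% as stand-ins for the "11-15%" and "3-4%" ranges), not the spreadsheet values. The single number at the end is a plain unweighted geometric mean, not GB's actual INT/FP-weighted composite, and is only there to show how much spread one average hides.

```python
from math import prod


def ipc_change(score_old, score_new, clock_ratio):
    """Fractional per-clock change: score ratio divided by clock ratio, minus 1.

    clock_ratio is new_clock / old_clock (e.g. 1.38 for a 38% clock increase).
    """
    return (score_new / score_old) / clock_ratio - 1.0


def geomean(values):
    """Geometric mean of an iterable of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# Rounded per-test IPC gains since the M1, as quoted in the post (approximate).
# Object Detection is left out because SME changes what the test measures.
ROUGH_IPC_GAIN = {
    "HTML 5": 0.40, "Background Blur": 0.40,
    "PDF Render": 0.20, "Photo Library": 0.20,
    "Object Remover": 0.20, "Ray Tracer": 0.20,
    "Clang": 0.13, "HDR": 0.13, "Photo Filter": 0.13,   # "11-15%" stand-in
    "Text Processing": 0.07, "Asset Compression": 0.07,
    "Structure from Motion": 0.07,
    "File Compression": 0.035, "Navigation": 0.035,     # "3-4%" stand-in
    "Horizon Detection": 0.0,
}

# One unweighted average over all tests: a single number that hides the spread.
overall = geomean(1.0 + g for g in ROUGH_IPC_GAIN.values()) - 1.0
spread = (min(ROUGH_IPC_GAIN.values()), max(ROUGH_IPC_GAIN.values()))

if __name__ == "__main__":
    print(f"unweighted geomean IPC gain: {overall:.1%}")
    print(f"per-test range: {spread[0]:.0%} to {spread[1]:.0%}")
```

To use it on real results, call `ipc_change(old_score, new_score, new_clock / old_clock)` once per subtest with the scores from two GB 6.3 runs.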
I can't attach the full spreadsheet so others can check it for errors, but I think it's right. It only took me a little while to do this, so anyone could do the same (and please do point out any errors beyond run-to-run variation if you get very different results).
Also, I should point out that clocks have increased by 38% since the M1. Oftentimes, big clock speed increases like the ones we've been getting necessitate microarchitecture changes just to keep up, as otherwise IPC falls as clocks rise (especially if you increase them by nearly 40%). Thus, part of this is that Apple has been so aggressive with clocks: M2->M3 in particular was a huge gain in clocks (I know the M2 Max could clock higher, but this is the base M2), and there was even a pretty sizable jump from M3->M4. These have "eaten" IPC gains, if you will. That they've also increased clocks by this much without spiking power to ungodly amounts is a testament to both their and TSMC's engineering efforts as well.
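To illustrate why IPC tends to fall as clocks rise if nothing else changes, here is a toy textbook-style model in Python. It is not a model of Apple's cores. It assumes memory latency is fixed in nanoseconds, so each cache miss costs more cycles at a higher clock. Every parameter value (`cpi_core`, `misses_per_instr`, `latency_ns`, the 3.2 GHz base clock) is a made-up placeholder chosen only to show the direction of the effect.

```python
def ipc_at_clock(clock_ghz, cpi_core=0.25, misses_per_instr=0.002, latency_ns=100.0):
    """Toy model: CPI = core CPI + memory stall cycles per instruction.

    Memory latency is fixed in ns, so the stall cost in cycles is
    latency_ns * clock_ghz per miss. All defaults are hypothetical.
    """
    stall_cpi = misses_per_instr * latency_ns * clock_ghz
    return 1.0 / (cpi_core + stall_cpi)


base_clock = 3.2                 # placeholder base clock in GHz
fast_clock = base_clock * 1.38   # the 38% clock increase discussed above

ipc_base = ipc_at_clock(base_clock)
ipc_fast = ipc_at_clock(fast_clock)

# Throughput (instructions per ns) still goes up, just by less than 38%.
perf_base = ipc_base * base_clock
perf_fast = ipc_fast * fast_clock

if __name__ == "__main__":
    print(f"IPC change with an unchanged design: {ipc_fast / ipc_base - 1:+.1%}")
    print(f"performance change: {perf_fast / perf_base - 1:+.1%}")
```

In this sketch, a design that does nothing about memory stalls loses IPC purely because of the clock bump, so a flat measured IPC on some test can still reflect real microarchitecture work.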
@name99 @mr_roboto @leman