
Dulcimer

macrumors 6502a
Original poster
Nov 20, 2012
From the announcement of iPad Pros today, the M4 also launched only months after the M3. Most comparisons were with M2 iPads, but a couple of things I teased apart:

- 2nd gen 3nm (N3E?)
- “new display engine enables stunning precision, color, and brightness”
- Neural cores: 18 vs 38 TOPS

On-device AI is no doubt going to be the main focus of this next gen of AS.

Can anyone piece together perf comparisons between M2/M4 and compare with M3?

Given how quickly this chip was iterated and how many existing M3 features it reiterates, it really does seem like M4 = M3 with a newer-generation Neural Engine. This also explains why the A17 Pro had such high TOPS compared to the M3.
 
Well, the M4 will be a new design, as you cannot directly port an N3B design (like the M3) to N3E.

It does make sense that Apple would focus most of their upgrade effort on the Neural Engine, since they are going to be pushing AI really hard at WWDC.

N3B is higher-performing than N3E, but it draws more power to do so. It is also more expensive to fabricate on and has a higher defect rate. So M4 CPU and GPU performance may not be exceptionally better than M3, even if NE performance is. Still, it will draw a fair bit less power (good for MacBooks) and be cheaper to produce so prices should hold the line and we might see more base RAM or storage since Apple would have more margin to work with.
 
They also upped the efficiency cores from 4 to 6. There's also a new binned version of the M4 with 9 CPU cores (3 performance and 6 efficiency), the first time they've binned the base chip's CPU.
Memory bandwidth is up this generation to 120 GB/s.
 
Then does the M3 Ultra still make sense? I am waiting for the new Studio. If Apple skips an M3 Ultra Studio, it could be another year of waiting for me.
 
Then does the M3 Ultra still make sense? I am waiting for the new Studio. If Apple skips an M3 Ultra Studio, it could be another year of waiting for me.

Speculation is that if there is an M3 Ultra, it will be a custom SoC and not two M3 Max dies fused together. It also depends on whether it was designed on N3B like the rest of the M3 family or on N3E like the M4. If the latter, then it would most likely use the same CPU, GPU, and NE cores as the M4, and would therefore probably be called the M4 Ultra.

WWDC would then likely include upgrades for the entire desktop line, with the Mac mini getting the M4 and M4 Pro, the Mac Studio getting the M4 Max and M4 Ultra, and the Mac Pro getting the M4 Ultra (and maybe an M4 "Extreme"). The MacBook Pros would then be updated to the M4 in the fall (when supply is better).

Of course, there is also N3X coming in 2025 that will offer better performance at the expense of higher power draw. If Apple really wants to set the Ultra SoC apart, they might be waiting for N3X which means the Mac Studio and Mac Pro may not be updated until WWDC 2025 when they could get an M4 or M5 Ultra on N3X.
 
Last edited:
The power/thermals-to-performance of the M4 is insane. Can't wait to see what they come up with for the M4 Max/M4 Ultra. Could Apple put an M4 Ultra in the 16-inch MBP?
 
The power/thermals-to-performance of the M4 is insane. Can't wait to see what they come up with for the M4 Max/M4 Ultra. Could Apple put an M4 Ultra in the 16-inch MBP?
The M4's jump looks that big not just because of the chip itself but also because of thermals. I don't expect this kind of jump from M2 Pro/Max to M4 Pro/Max, because those devices had far better cooling than fanless designs like the iPads, iPhones, and MacBook Air.
 
N3B is higher-performing than N3E, but it draws more power to do so. It is also more expensive to fabricate on and has a higher defect rate. So M4 CPU and GPU performance may not be exceptionally better than M3, even if NE performance is. Still, it will draw a fair bit less power (good for MacBooks) and be cheaper to produce so prices should hold the line and we might see more base RAM or storage since Apple would have more margin to work with.
N3B is not higher-performing than N3E. TSMC originally claimed that N3B would have 10-15% better performance than N5 iso-power, while N3E would be 18%. N3E is also a few % more power efficient iso-perf. However, it is ~5% less dense (again according to TSMC). This is greatly complicated, however, by FinFlex, which can make a significant difference in PPA (power, performance, area).
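A naive back-of-envelope comparison of those TSMC figures (treating the quoted iso-power speed gains over N5 as directly composable, which ignores FinFlex and design-specific PPA tradeoffs, as noted above):

```python
# TSMC's public claims (per the post above): speed gain vs. N5 at iso-power.
n3b_gain = (1.10, 1.15)   # N3B: +10-15% over N5
n3e_gain = 1.18           # N3E: +18% over N5

# Implied N3E advantage over N3B at iso-power (naive division of the claims).
low = n3e_gain / n3b_gain[1]    # vs. the optimistic N3B figure
high = n3e_gain / n3b_gain[0]   # vs. the pessimistic N3B figure

print(f"N3E vs N3B: +{100 * (low - 1):.1f}% to +{100 * (high - 1):.1f}%")
```

So on TSMC's own numbers, N3E comes out roughly 3-7% ahead of N3B at iso-power, before any of the FinFlex complications mentioned above.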

As you say, N3B is definitely more complex and expensive.

Do not expect the same performance iso-clock between M3 and M4. Expect improvements.

Then does the M3 Ultra still make sense? I am waiting for the new Studio. If Apple skips an M3 Ultra Studio, it could be another year of waiting for me.
No, now that the M4 is out, everything strange about the M3 is starting to make a lot more sense. They are likely to ship an M4 Studio soon; my guess is an announcement at WWDC in a month.
 
Let's hope that confused user is not confused about a Mac Studio release at WWDC.
 
I hope so too but it's just a guess, based on the early arrival of the M4 and the MIA M3 Ultra, which appears to have been planned from the start. I don't have any special inside info.
 
Hopefully, they have something more intense in the works for the Mac Pro, with better integration with external cards and storage. They seem to be working toward some changes, but until the Mac Pro starts to look as capable as it used to be, they'll sell more of the Mac Studio.
 
The first GB6 result suggests that the CPU is clocked at a lower 3.9 GHz, which makes perfect sense for the 5mm-thin iPad. Very curious to see single-core performance and whether the IPC has changed in any meaningful way.
 
From the announcement of iPad Pros today, the M4 also launched only months after the M3. Most comparisons were with M2 iPads, but a couple of things I teased apart:

- 2nd gen 3nm (N3E?)
- “new display engine enables stunning precision, color, and brightness”
- Neural cores: 18 vs 38 TOPS

On-device AI is no doubt going to be the main focus of this next gen of AS.

Can anyone piece together perf comparisons between M2/M4 and compare with M3?

Given how quickly this chip was iterated and how many existing M3 features it reiterates, it really does seem like M4 = M3 with a newer-generation Neural Engine. This also explains why the A17 Pro had such high TOPS compared to the M3.
So the M4 got the NPU from the A17 Pro. The M4 wasn't Apple's AI chip. Seems this will be the next iPhone's SoC; its NPU eventually comes to the M5. With 32 cores?
 
So the M4 got the NPU from the A17 Pro. The M4 wasn't Apple's AI chip. Seems this will be the next iPhone's SoC; its NPU eventually comes to the M5. With 32 cores?
We don't know.

They measured the TOPS of the M3 with INT16 and the A17 Pro with INT8 ...

So you can't compare the numbers 🙈
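To illustrate why differing measurement precisions make the headline figures incomparable, here is a purely hypothetical rescaling. The 2x rate assumption for halving precision is common on other vendors' hardware but is NOT confirmed for Apple's NPU (a later reply argues Apple's NPU runs 8-bit and 16-bit at the same rate):

```python
# Hypothetical: Apple quoted the M3 NPU at a 16-bit precision and the
# A17 Pro at INT8. IF the hardware processed two 8-bit ops per 16-bit
# op (an unconfirmed assumption), the headline numbers would converge.

m3_tops_int16 = 18.0   # Apple's M3 figure (16-bit precision)
a17_tops_int8 = 35.0   # Apple's A17 Pro figure (INT8)

# Assumed 2x throughput when halving precision: purely illustrative.
m3_tops_int8_equiv = m3_tops_int16 * 2

print(f"M3 rescaled to INT8: ~{m3_tops_int8_equiv:.0f} TOPS "
      f"vs A17 Pro {a17_tops_int8:.0f} TOPS")
```

The point is only that a naive comparison of 18 vs. 35 TOPS tells you nothing unless you know the measurement precision and how the hardware scales with it.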
 
We don't know.

They measured the TOPS of the M3 with INT16 and the A17 Pro with INT8 ...

So you can't compare the numbers 🙈
Yeah, I've seen this claim before, but I've yet to see any actual benchmark numbers, or indeed anything at all that provides actual data about this. Right now as far as I can tell, the relative performance of M2, A17, and M3 NPUs is completely unknown. M4 just adds one more question. Does anyone have any hard data about this at all?
 
Yeah, I've seen this claim before, but I've yet to see any actual benchmark numbers, or indeed anything at all that provides actual data about this. Right now as far as I can tell, the relative performance of M2, A17, and M3 NPUs is completely unknown. M4 just adds one more question. Does anyone have any hard data about this at all?

Not to mention that there is evidence that 16-bit and 8-bit operations have the same performance on the A17. For example, in GB6 ML, the I8 and F16 versions of the same tests produce pretty much the same score on all Apple NPUs. Also, @name99, who spent a lot of time studying Apple patents, says that their NPU implementation does not have different performance for these workloads.

The point is that just because Nvidia and AMD go that route, it does not mean that Apple needs to do the same.
 
8-bit vs. 16-bit is an extremely big difference; most ML is memory-bound, so bringing back higher bandwidth seems like a good idea.
Still, it seems the NPU will be memory-bound: 120 GB/s is just pathetically low for a potential ~38 TOPS, and in most attention-based models it won't be possible to use even 50% of that potential. And of course the software is still extremely bad. I'll be waiting for WWDC to maybe get lower-level access to the NPU, maybe some extension for Metal (something like SYCL?).
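A rough roofline sketch of that bandwidth argument, using my own illustrative numbers (a hypothetical 7B-parameter model at batch size 1, where decoding is dominated by streaming the weights once per token):

```python
# Roofline sketch: batch-1 LLM decoding reads every weight once per
# token, so achievable compute is capped by memory bandwidth.

bandwidth_gb_s = 120      # M4 memory bandwidth (per the thread)
params = 7e9              # hypothetical 7B-parameter model
bytes_per_param = 1       # INT8 weight storage
peak_tops = 38            # M4 NPU headline figure

model_gb = params * bytes_per_param / 1e9
tokens_per_s = bandwidth_gb_s / model_gb     # bandwidth-limited upper bound
ops_per_token = 2 * params                   # one multiply-accumulate (2 ops) per weight
achieved_tops = tokens_per_s * ops_per_token / 1e12

print(f"~{tokens_per_s:.0f} tok/s upper bound, "
      f"~{achieved_tops:.2f} TOPS used "
      f"({100 * achieved_tops / peak_tops:.1f}% of peak)")
```

For this kind of workload the NPU would sit well under 1% of its peak TOPS, so the claim that the compute is stranded behind 120 GB/s looks, if anything, conservative (batched or compute-dense workloads would fare better).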
 
8-bit vs. 16-bit is an extremely big difference; most ML is memory-bound, so bringing back higher bandwidth seems like a good idea.

Weight storage format does not have to be the same as the internal ALU precision. You can still use bandwidth-saving quantized models and unpack the weights in the NPU. It's as you say, the NPU will be bandwidth-limited on larger models. Increasing the ALU rate is probably not very helpful in this context.

The software is still extremely bad. I'll be waiting for WWDC to maybe get lower-level access to the NPU, maybe some extension for Metal (something like SYCL?).

I don't think this is likely. Releasing a stable low-level API would mean that Apple cannot freely experiment with the hardware model. That's also why we still don't have direct access to AMX.
 