The rumors are a little confusing. The best I can make out parsing the oracle of Bloomberg is that there will indeed be an Ultra that is 2x Max (have any 3rd-party die shots showing UltraFusion appeared yet?), but there is also a desktop-specific Hidra chip in the works, maybe destined for the Mac Pro only, maybe by the end of 2025.

It's also all but officially confirmed that Nvidia themselves will be doing an ARM SOC next year (or more than one; the rumors aren't certain), but not what its (or their) specifications will be.



As @Mac_fan75 said, it's true that the improvements from the node alone will be relatively small compared to 2026 and 2027, but that doesn't mean Apple won't make larger architectural changes that may or may not allow immediate performance improvements (and may well improve further with future software).


True, there were a couple of voices that did imply that, but not many. Most kept ragging on N3B being a "failure". It may not have been as successful a node as TSMC had initially hoped, but we should all be so lucky to have such failures in our work.
I am confused, why do folks think Nvidia doesn't already have a SOC? NIO uses one in their ET5/7.
 
I am confused, why do folks think Nvidia doesn't already have a SOC? NIO uses one in their ET5/7.
Perhaps I should specify an ARM-based PC SOC. Nvidia also makes an SOC for the Switch and will be making a new SOC for the Switch 2, and of course there are their giant big-iron Grace Hopper SOCs, but Nvidia is said to be entering the PC market next year to directly compete with Qualcomm, AMD, Intel, and Apple. They are supposedly partnering with MediaTek, though it isn't clear if they will be making SOCs beyond that partnership as well.
 
True, there were a couple of voices that did imply that, but not many. Most kept ragging on N3B being a "failure". It may not have been as successful a node as TSMC had initially hoped, but we should all be so lucky to have such failures in our work.
? M3 is N3B. M4 is N3E.
 
? M3 is N3B. M4 is N3E.
I know. I was saying most voices on these forums were saying the opposite of what @Mac_fan75 was saying - that N3B was such a monumental failure. There were indeed also a few voices saying that N3E relative to N3B would be so disappointing that Apple would skip N3E and wait for N3P, but not many. Neither the claims about N3B nor those about N3E were accurate.

N3B was delayed and more expensive than TSMC had hoped, with less room for future improvements (which is why it seemingly won't have any child nodes), but it still landed major orders and produced good chips. N3E is slightly less dense than N3B but cheaper, with a slightly better power/performance profile and better headroom for further improvements, which is why the future nodes will be based on it. Basically both nodes are fine, and neither is a failure like, say, Intel 10 (famously so, and one of the reasons Intel is in the situation it's in today), or Samsung 3 (yields are supposedly awful and it has few to no customers; even Samsung itself reportedly doesn't want to use it), or back in the day GloFo/TSMC/Samsung 20nm (if memory serves, they had decent yields but absolutely awful properties - e.g. terrible power leakage - one of the generations, one of the last in fact, when Intel was incredibly far ahead in fabrication).
 
Additionally - looking at the Apple line up currently - I'm thinking they're going to kill the Mx Pro line; likely with M5. It has no place in the current Mac Lineup.

Here is what I think:

M4 - 10 CPU Cores - 4 Perf, 6 efficiency, 14 GPU Cores (Cut down config with 12), starting out RAM, 12GB; max is 32GB (128 bit)
M4 Pro - 12 CPU Cores - 6 perf, 6 efficiency, 20 GPU Cores (Cut down config with 18), starting out RAM, 18GB; max is 48GB (192 bit)
M4 Max - 20 CPU Cores - 16 Perf, 4 efficiency, 52 GPU Cores (Cut down configs with 36 and 44 GPU Cores); max is 128GB (512 bit)
M4 Ultra - 2 x M4 Max - max ram is 256GB (2 x 512 bit)
M4 Extreme - 4 x M4 Max - with an IO chip too for more PCIe lanes - max ram is 512GB (4 x 512 bit)

Apple Dedicated GPU based on the 4th Gen Desktop Apple Silicon:
224 GPU Cores: 4 x 64 modules (Cut down to 56 GPU Cores each), and a 256 bit bus (32/48GB) - $2499/$2999
256 GPU Cores: 4 x 64 GPU Modules + 4 x 64MB L3 Cache and a 256 bit bus (32/48GB) - $3299/$3799
336 GPU Cores: 6 x 64 GPU Modules (Cut down to 56 cores each) + 6 x 64MB L3 Cache and a 384 bit bus (48/72GB) - $4999/$5499
384 GPU Cores: 6 x 64 GPU Modules + 6 x 64MB L3 Cache and a 384 bit bus (48/72GB) - $6499/$6999
448 GPU Cores: 8 x 64 GPU Modules (Cut down to 56 cores each) + 8 x 64MB L3 Cache and a 512 bit bus (64/96GB) - $7999/$8499
512 GPU Cores: 8 x 64 GPU Modules + 8 x 64MB L3 Cache and a 512 bit bus (64GB/96GB/128GB/192GB) - $9499/$9999/$10499/$11999



M5 - 12 CPU Cores - 6 Perf, 6 efficiency, 24 GPU Cores (Cut down configs with 14 and 18), starting out RAM, 12GB; max is 64GB (128 bit)
M5 Max - 20 CPU cores - 16 Perf, 4 efficiency, 64 GPU Cores (Cut down configs with 48 and 56), starting out RAM, 48GB; max is 256GB (512 bit)
M5 Ultra - 2 x M5 Max - max ram is 512GB (2 x 512 bit)
M5 Ultra alt design 4 x (8 CPU Cores - 6 perf, 2 efficiency, 40 GPU Cores) - (Cut down configs with 112 GPU Cores and 136 GPU Cores) - max ram is 768GB (4 x 384 bit)
M5 Extreme: 8 x (8 CPU Cores - 6 perf, 2 efficiency, 40 GPU Cores) - (Cut down config with 272 GPU Cores) - max ram is 1.5TB (8 x 384 bit)

Apple Dedicated GPU based on the 5th Gen Desktop Apple Silicon:
432 GPU Cores: 6 x 80 GPU Modules (Cut down to 72 cores each) + 6 x 64MB L3 Cache and a 384 bit bus (48GB/96GB) - $2999/$3499
480 GPU Cores: 6 x 80 GPU Modules + 6 x 64MB L3 Cache and a 384 bit bus (48GB/96GB/192GB) - $3999/$4499/$5499
576 GPU Cores: 8 x 80 GPU Modules (Cut down to 72 cores each) + 8 x 64MB L3 Cache and a 512 bit bus (64GB/128GB/256GB) - $5299/$6099/$7499
640 GPU Cores: 8 x 80 GPU Modules + 8 x 64MB L3 Cache and a 512 bit bus (128GB/256GB) - $7799/$8499
864 GPU Cores: 12 x 80 (Cut down to 72 each) + 8 x 64MB L3 Cache + 4096 bit HBM4 (192/384GB) - $8999/$10499
960 GPU Cores: 12 x 80 + 12 x 64MB L3 Cache + 6144 bit HBM4 (288 or 576GB) - $11999/$13999
Right. Going to update my DGPUs, M5 lineup and M4 Ultra and Extreme Predictions based on what we know about M4, M4 Pro and M4 Max now.

M4 Ultra - 3 x 12P 4E, 4 x 24 GPU Cores (96 GPU Cores) (Cut down models with 76 and 86) (Not using M4 Max; as stated by Mark Gurman in the past), 1024 bit Memory Bus
M4 Extreme - 8 x 12P 4E; dedicated GPU Of choice, 1536 bit for CPU only


Apple DGPU Gen 1 choices:

108 GPU Cores (5 x 24, 10% disabled) (380mm^2), 150W, 256 bit (32GB GDDR7 28Gbps) - $2399
120 GPU Cores (5 x 24) (380mm^2), 180W, 256 bit (48GB GDDR7 28Gbps) - $2999
150 GPU Cores (7 x 24, roughly 10% disabled), (555mm^2), 230W, 384 bit (48GB GDDR7 28Gbps) - $3999
168 GPU Cores (7 x 24) (555mm^2), 275W, 320 bit (60GB GDDR7 36Gbps) - $4999
200 GPU Cores (9 x 24) (728mm^2), 280W, 384 bit (72GB GDDR7 36Gbps) - $5999
216 GPU Cores (9 x 24) (728mm^2), 330W, 384 bit (96GB GDDR7 36Gbps) - $7499
264 GPU Cores (11 x 24) (851mm^2), 400W, 512 bit (128GB GDDR7 36Gbps) - $9999

M5: 4P, 6E, 12 GPU Cores, 128 bit bus
M5 variant 2: 6P, 6E, 16 GPU Cores, 192 bit bus
M5 Pro: 12P, 4E, 24 GPU Cores, 256 bit bus
M5 Max: 16P, 4E, 52 GPU Cores, 512 bit bus
M5 Ultra: 4 x 16P 4E, 5 x 24 GPU Cores (120 GPU Cores), 1024 bit bus
M5 Extreme: 10 x 16P 4E, dedicated GPU of choice, 1536 bit.

And for DGPU Gen 2, just change it from 24 cores per module to 28. I can't be bothered to spell it out right now as it's 1:30am.
 
It looks like the M5 will be a minor performance update over the M4, as the M5 will use N3P next year. The bigger update will need to wait for the N2 in 2026.
You really don't know what the M5 will be, or when it will ship. It won't get a big bump from a node change, though a shift from N3E to N3P seems fairly likely assuming the timing is right, and that's not nothing.

Either way, you don't know what Apple's been working on, but they've shown pretty clearly that they're not resting on their laurels. I would not assume minimal changes.

The rumors are a little confusing. The best I can make out parsing the oracle of Bloomberg is that there will indeed be an Ultra that is 2x Max (have any 3rd-party die shots showing UltraFusion appeared yet?), but there is also a desktop-specific Hidra chip in the works, maybe destined for the Mac Pro only, maybe by the end of 2025.
That's not how I was reading it. (And though I hate to say it, he's been pretty solid on large-picture ASi info since the M1, though I wouldn't trust him as far as I can throw him on most other stuff - which means it may actually be worth paying attention to him for this.)

My read is that - just as he said - they made a base chip, Pro, and Max for mobile (and while we don't know yet, we're likely to see some major similarities between Pro and Max, much like the M1/2 generations). And they're also making a "Hidra" chip for desktop that will be roughly 2x the Max in power. Then they'll be doubling up the Hidra, just like they did the Max in gen 1-2 to get the Ultra. In that case, the desktops have the Max, Hidra (presumably called "Ultra" as it fills the same niche), and 2xHidra (what most people have been calling "Extreme", though I doubt Apple will use that name).

I have no particular faith that the M4 Ultra will be exactly 2x the Max, since it will be a separate design, but I expect it to be close. Probably not 2x the E cores - either a single cluster of 6, or maybe they go big with 2-4 clusters as they're so cheap in space and power, though I wouldn't put money on that. Probably not double the NPU, as they've shown no taste for that so far, though it's also possible that this is where they try something new. Maybe no ISP, as desktops have no integrated cameras? Different configuration of media engine, etc.

And again, all this assumes that Gurman got it all right. That's not a bad bet at this point but it's far from guaranteed.

Right. Going to update my DGPUs, M5 lineup and M4 Ultra and Extreme Predictions based on what we know about M4, M4 Pro and M4 Max now.
Lol. That ridiculous fantasy isn't based on anything. Reminds me of the "LPDDR6 EVERYWHERE THIS YEAR" guy.

Not to say that mix-and-match chiplets aren't a good idea, if Apple can pull them off. Probably a bit too early, though it would be really nice if it weren't.
 
That's not how I was reading it. (And though I hate to say it, he's been pretty solid on large-picture ASi info since the M1, though I wouldn't trust him as far as I can throw him on most other stuff - which means it may actually be worth paying attention to him for this.)

He screwed up pretty badly in the M2 generation. He said there would be no M2 Studio even fairly close to its launch, and even before that he had said the M2 was to be a stopgap before the M3 came out, which turned out to be much more true of the M3 itself. A lot of his mistakes stem from thinking the M1 Extreme was a thing long after it was clear it wasn't (as in, the M1 Max was never designed to go to 4 dies). To be fair, he did say the M1 Extreme wasn't going to be released, but it's also clear that whatever development had happened for it was stopped long before the M1 Max's release, which is not what he had said. This then led him down a highly illogical set of deductions for the M2 Ultra and Mac Studio/Pro line.

Overall though, yes, he's gotten more right than wrong.

My read is that - just as he said - they made a base chip, Pro, and Max for mobile (and while we don't know yet, we're likely to see some major similarities between Pro and Max, much like the M1/2 generations). And they're also making a "Hidra" chip for desktop that will be roughly 2x the Max in power. Then they'll be doubling up the Hidra, just like they did the Max in gen 1-2 to get the Ultra. In that case, the desktops have the Max, Hidra (presumably called "Ultra" as it fills the same niche), and 2xHidra (what most people have been calling "Extreme", though I doubt Apple will use that name).

I have no particular faith that the M4 Ultra will be exactly 2x the Max, since it will be a separate design, but I expect it to be close. Probably not 2x the E cores - either a single cluster of 6, or maybe they go big with 2-4 clusters as they're so cheap in space and power, though I wouldn't put money on that. Probably not double the NPU, as they've shown no taste for that so far, though it's also possible that this is where they try something new. Maybe no ISP, as desktops have no integrated cameras? Different configuration of media engine, etc.

And again, all this assumes that Gurman got it all right. That's not a bad bet at this point but it's far from guaranteed.

Recently I think he said explicitly that there would be an M4 Ultra that was 2x M4 Max or at the very least that the Max was destined for the lower end Studio, but maybe I'm misremembering and Gurman just said something less specific.

EDIT:

The latest I could find was this:


When Apple brings ray tracing to the Mac Pro next year — complete with a chip that probably goes up to 32 CPU cores and 80 graphics cores

Which is pretty anemic as prophecies go - though if we ignore the "probably", the "up to" would just be 2x Max core counts, and while in theory Apple could increase clocks and the rest of the SOC could be different, Hidra would be very similar to the Max die and 2x Hidra would be very similar to a 2x Max Ultra ... if Hidra is even what he's referring to here.
 
It looks like the M5 will be a minor performance update over the M4, as the M5 will use N3P next year. The bigger update will need to wait for the N2 in 2026.

We don't really know anything about the M5 family. It could follow the same design principles as the M-series before it, or it could be something very different. We know, for example, that Apple has been working on stacked chip designs that would allow denser logic arrangements — they have patents for that (whether that will end up in real tech is another question). And there is some low-hanging fruit they can go after to massively boost their GPU performance, for example. There is really no way to predict it.
 
Given that this year the speed bump appears to be more CPU based than GPU, I can’t help but wonder if Apple will move to a more CPU focused increase one year and then a GPU focused one the following year. Kind of like the old iPhone tick/tock cycle.

Or (more likely?) perhaps it will be that the M/MnPro/MnMax chips are more CPU focused and then the Ultra/Extreme variants are more GPU focused.
 
Given that this year the speed bump appears to be more CPU based than GPU, I can’t help but wonder if Apple will move to a more CPU focused increase one year and then a GPU focused one the following year. Kind of like the old iPhone tick/tock cycle.

Or (more likely?) perhaps it will be that the M/MnPro/MnMax chips are more CPU focused and then the Ultra/Extreme variants are more GPU focused.
If I were in charge, I definitely wouldn’t up the CPU core counts in future generations. Let architecture-based improvements handle performance increases on the same number of cores, and use the extra space for more GPU cores/performance. That is clearly where it’s needed.

When the base has the Pro GPU performance of today, Pro has Max and Max has Ultra etc… then we’re talking great balance between CPU/GPU performance. Hopefully we get closer with M5
 
If the Mac Studio & Pro are on 2-3 year cycles - and the NPU from the previous year's iPhone is used - would that hamper their ability to perform AI & ML tasks?

2024 Q4 : 38 TOPS (INT8)
2025 Q4 : ?50 TOPS iPhone : ?38 TOPS Mac Pro
2026 Q4 : ?80 TOPS iPhone : ?38 TOPS Mac Pro
2027 Q4 : ?120 TOPS iPhone : ?38 TOPS Mac Pro

M4 'Ultra' & 'Extreme' : If the quantity of low-QoS OS tasks remains the same as the Mini, and power users are not running more concurrent apps but want more performance from existing apps - would they need any more E cores than the M4 Max (with 12P/4E)? Would doubling the CPU compute for apps require only 24P/4E?

How much faster would a 2025 Mac Pro have to be over a Studio before you seriously considered buying one? Is anyone going to hold off buying a Studio until you see what a Mac Pro can do?
 
If the Mac Studio & Pro are on 2-3 year cycles - and the NPU from the previous year's iPhone is used - would that hamper their ability to perform AI & ML tasks?

2024 Q4 : 38 TOPS (INT8)
2025 Q4 : ?50 TOPS iPhone : ?38 TOPS Mac Pro
2026 Q4 : ?80 TOPS iPhone : ?38 TOPS Mac Pro
2027 Q4 : ?120 TOPS iPhone : ?38 TOPS Mac Pro

M4 'Ultra' & 'Extreme' : If the quantity of low-QoS OS tasks remains the same as the Mini, and power users are not running more concurrent apps but want more performance from existing apps - would they need any more E cores than the M4 Max (with 12P/4E)? If the Studio & Pro design is not based on an M4 Max - could doubling the CPU compute for apps require only 24P/4E?

How much faster would a 2025 Mac Pro have to be over a Studio before you seriously considered buying one? Is anyone going to hold off buying a Studio until you see what a Mac Pro can do?
You are dreaming about a 120 TOPS iPhone in 2027. I would say even 50 TOPS Int8 won't be reached. We will be stuck with a few TOPS slowly added via slightly higher clocks each generation, but nothing dramatic. Maybe Int4 will be 2x Int8 in a few generations, but right now memory is the main limiting factor for like 99% of models, especially on phones.

NPU is wasted potential without good software support. Even 100 TOPS won't do anything if you can only access it via the crappy CoreML framework.
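Rough back-of-the-envelope numbers for why bandwidth, not TOPS, is the ceiling - all figures below are illustrative assumptions, not Apple specs:

Code:
# Sketch: token generation for on-device LLMs is roughly bandwidth-bound,
# so tokens/sec ~ memory bandwidth / bytes of weights read per token.
# All numbers are illustrative assumptions, not measured Apple figures.
params = 3e9            # assume a 3B-parameter on-device model
bytes_per_param = 0.5   # assume 4-bit weight quantization
bandwidth = 60e9        # assume ~60 GB/s usable phone-class memory bandwidth

bytes_per_token = params * bytes_per_param   # weights streamed once per token
print(f"~{bandwidth / bytes_per_token:.0f} tokens/s ceiling")  # ~40 tokens/s

Past that ceiling, extra NPU TOPS mostly sit idle.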
 
If the Mac Studio & Pro are on 2-3 year cycles - and the NPU from the previous year's iPhone is used - would that hamper their ability to perform AI & ML tasks?

If you are getting this type of machine for ML work, you are not really interested in the NPU. Your ML processing will be done on the GPU/CPU.
 
NPU is wasted potential without good software support. Even 100 TOPS won't do anything if you can only access it via the crappy CoreML framework.

I’d say that CoreML and Apple NPU are among the best supported and most popular on-device ML inference frameworks. Software can always get better, but it’s perfectly adequate for its intended use. These are not for general-purpose ML programming.
 
I've seen a few tests where the M4 Max is already 30% faster than the 4070. In Blender's benchmark database the M4 Max's median score is 5209.01 while the 4080 mobile's score on OptiX is 5321.43, already nearly identical. The M4 Ultra will probably be competitive with the 4090 in raster+RT workloads and if the quad-M4 Max Mac Pro chip is real that will blow the 4090 out of the water.
More likely the Ultra/Extreme is M5? Due to the found Mac17 identifiers? No way they’d put M5 on the Air first. Absolutely no way. And if the Ultra/Extreme is M5, it might be even better!
 
I’d say that CoreML and Apple NPU are among the best supported and most popular on-device ML inference frameworks. Software can always get better, but it’s perfectly adequate for its intended use. These are not for general-purpose ML programming.
Apple's Neural Processing Unit (NPU) is undoubtedly a powerful tool, and in certain cases it offers the best on-device machine learning capabilities. However, CoreML, Apple's framework for machine learning, has been a significant hindrance to the potential of AI on Apple devices for the past few years. (I have been working in this field for approximately eight years, and CoreML and all Apple deep learning technologies have been heavily influenced by the graph-based TensorFlow versions from 0.2 onwards. While some improvements have been made, such as the recent addition of dynamic inputs and faster compilation times, many of the fundamental issues remain, including the absence of dynamic inputs until recently, slow compilation times, and random unimplemented operations that can't be handled without extensive workarounds.)

In recent times, Apple has introduced MPSGraph, an MLIR-inspired framework that draws on XLA and LLVM. This framework enables us to occasionally use the ANE and appears to be integrated into all newer CoreML versions. While MPSGraph offers some progress, it still falls significantly behind other leading players like Jax and PyTorch. Additionally, Intel OneAPI and Intel's other NPU projects demonstrate remarkable capabilities, particularly on the software development side (though sadly with the rather odd SYCL for all kernel/compute work, which I'm not a fan of compared to Metal or even CUDA).
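To make the complaint concrete, the conversion path looks roughly like this (the toy model, tensor name, and shapes are made-up placeholders, and whether a given model actually stays on the ANE is not guaranteed):

Code:
# Minimal coremltools sketch; toy model and names are placeholders.
import coremltools as ct
import torch

torch_model = torch.nn.Sequential(torch.nn.Linear(64, 32), torch.nn.ReLU()).eval()
example = torch.randn(1, 64)
traced = torch.jit.trace(torch_model, example)

mlmodel = ct.convert(
    traced,
    convert_to="mlprogram",
    inputs=[ct.TensorType(name="x", shape=(1, 64))],
    # Ask for the Neural Engine; unsupported ops fall back to GPU/CPU,
    # which is exactly the workaround pain described above.
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)
mlmodel.save("tiny.mlpackage")
print(mlmodel.predict({"x": example.numpy()}))

Flexible input shapes do exist (ct.RangeDim / ct.EnumeratedShapes), but the ANE tends to be much happier with fixed or enumerated shapes than with open-ended ranges.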
 
Apple's Neural Processing Unit (NPU) is undoubtedly a powerful tool, and in certain cases it offers the best on-device machine learning capabilities. However, CoreML, Apple's framework for machine learning, has been a significant hindrance to the potential of AI on Apple devices for the past few years. (I have been working in this field for approximately eight years, and CoreML and all Apple deep learning technologies have been heavily influenced by the graph-based TensorFlow versions from 0.2 onwards. While some improvements have been made, such as the recent addition of dynamic inputs and faster compilation times, many of the fundamental issues remain, including the absence of dynamic inputs until recently, slow compilation times, and random unimplemented operations that can't be handled without extensive workarounds.)

In recent times, Apple has introduced MPSGraph, an MLIR-inspired framework that draws on XLA and LLVM. This framework enables us to occasionally use the ANE and appears to be integrated into all newer CoreML versions. While MPSGraph offers some progress, it still falls significantly behind other leading players like Jax and PyTorch. Additionally, Intel OneAPI and Intel's other NPU projects demonstrate remarkable capabilities, particularly on the software development side (though sadly with the rather odd SYCL for all kernel/compute work, which I'm not a fan of compared to Metal or even CUDA).

The main purpose of CoreML is integrating simple ML models into applications. I wouldn't refer to it as a general-purpose framework for machine learning, nor is it designed to be one. Neither is MPSGraph, which targets the GPU only. I fully agree with you that CoreML can be massively improved; however, it does what it is supposed to do — take your pre-trained simple models and run them on the NPU.

If you are looking for a general-purpose ML framework similar to Jax or PyTorch, you might want to look at MLX.
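For a sense of what that looks like in practice, here is a minimal MLX sketch of the PyTorch/Jax-style workflow (layer sizes, data, and hyperparameters are arbitrary placeholders):

Code:
# Minimal MLX training-loop sketch; sizes and hyperparameters are placeholders.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

class Tiny(nn.Module):
    def __init__(self):
        super().__init__()
        self.l1 = nn.Linear(64, 32)
        self.l2 = nn.Linear(32, 1)

    def __call__(self, x):
        return self.l2(nn.relu(self.l1(x)))

def loss_fn(model, x, y):
    return mx.mean((model(x) - y) ** 2)

model = Tiny()
optimizer = optim.SGD(learning_rate=1e-2)
loss_and_grad = nn.value_and_grad(model, loss_fn)

x, y = mx.random.normal((256, 64)), mx.random.normal((256, 1))
for _ in range(10):
    loss, grads = loss_and_grad(model, x, y)
    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)  # MLX is lazy; force evaluation

Note it runs on the CPU/GPU with unified memory rather than the NPU, which is also why it sidesteps the CoreML limitations discussed above.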
 
Which is pretty anemic as prophecies go - though if we ignore the "probably", the "up to" would just be 2x Max core counts, and while in theory Apple could increase clocks and the rest of the SOC could be different, Hidra would be very similar to the Max die and 2x Hidra would be very similar to a 2x Max Ultra ... if Hidra is even what he's referring to here.
Why would they make a Hidra that's like a Max? I think the whole point was that Hidra was a monolithic die close to 2x the Max.

Given that this year the speed bump appears to be more CPU based than GPU, I can’t help but wonder if Apple will move to a more CPU focused increase one year and then a GPU focused one the following year. Kind of like the old iPhone tick/tock cycle.

Or (more likely?) perhaps it will be that the M/MnPro/MnMax chips are more CPU focused and then the Ultra/Extreme variants are more GPU focused.
Maybe, but I think it's far more likely they'll just push out whatever they've got at the time. It doesn't serve them to hold back advances for no reason.

If I were in charge, I definitely wouldn’t up the CPU core counts in future generations. Let architecture-based improvements handle performance increases on the same number of cores, and use the extra space for more GPU cores/performance. That is clearly where it’s needed.

When the base has the Pro GPU performance of today, Pro has Max and Max has Ultra etc… then we’re talking great balance between CPU/GPU performance.
Everything you say is true... *If* you work in certain fields. ML, some types of graphics and video, gaming, some scientific computing, and a few others. For everyone else, big GPU resources are a waste of silicon. This is a huge drawback to Apple's generally successful strategy of integrating GPU and CPU onto one die: You can't mix and match. In x86 world, plug-in GPUs make that the default approach, and that's a big part of why Apple hasn't made as much headway as you'd otherwise assume, given their tech advantage.

They're heading towards better flexibility (at least pre-purchase) in the future, where you can make a reasonable selection of CPU cores and GPU cores (at least; possibly NPU and other resources too eventually) by picking which chiplets are in your system. But it may be a few years yet before we get there. I *hope* it comes sooner than that, and it wouldn't be astonishing if they delivered in the M5 generation, but I'm not counting on it.

If the Mac Studio & Pro are on 2-3 year cycles - and the NPU from the previous years iPhone is used - would that hamper their ability to perform AI & ML tasks? [...]
That's an argument for NOT being on such a long cycle (though not a good one, as the NPU is not relevant to anything except running existing models, for the most part). We don't really know what that cycle will be. I suspect that Apple knows that 2-3 years won't cut it, but I don't know if they actually care. It depends on whether or not they think they can rebuild a user base on the high end.
M4 'Ultra' & 'Extreme' : If the quantity of low-QoS OS tasks remains the same as the Mini, and power users are not running more concurrent apps but want more performance from existing apps - would they need any more E cores than the M4 Max (with 12P/4E)? Would doubling the CPU compute for apps require only 24P/4E?
That's the most likely scenario, 4-6 E cores. I still think we're headed for segregated E cores for OS-only use at some point, and that would be a good machine in which to introduce that feature, but I make no predictions about the upcoming models. I also think there's a low chance that they might decide to try for good coverage of embarrassingly parallel workloads, with many many E cores, but I would be very surprised if they tackled that before they get to mix+match chiplets.

How large would a hypothetical 4x Max / ’Extreme’ chip be? Roughly 12x12 cm?
I don't know why it would be square. Any such device would be two or more chiplets, so much more likely to be rectangular.

No way they’d put M5 on the Air first
Why would you say that?!? They did exactly that with the M2.

I think that ultimately they'd be best served if they could ship high-end chips before low-end, but if there's one thing we've learned about Apple since the M1 shipped, it's that they'll ship what they can when they can. If they think they can't get the larger M5s out as early as the base M5, the base M5 is what we'll see first.
 
The main purpose of CoreML is integrating simple ML models into applications. I wouldn't refer to it as a general-purpose framework for machine learning, nor is it designed to be one. Neither is MPSGraph, which targets the GPU only. I fully agree with you that CoreML can be massively improved; however, it does what it is supposed to do — take your pre-trained simple models and run them on the NPU.

If you are looking for a general-purpose ML framework similar to Jax or PyTorch, you might want to look at MLX.
MPSGraph can run on both the GPU and the NPU (you're probably thinking of MPS, which is pure Metal and indeed can't run on the NPU).
MLX is cool, but who wants to train models on Apple devices? It isn't worth it: the GPU is way too slow and there's no sane way to do multi-GPU training.
 
Why would they make a Hidra that's like a Max? I think the whole point was that Hidra was a monolithic die close to 2x the Max.


Maybe, but I think it's far more likely they'll just push out whatever they've got at the time. It doesn't serve them to hold back advances for no reason.


Everything you say is true... *If* you work in certain fields. ML, some types of graphics and video, gaming, some scientific computing, and a few others. For everyone else, big GPU resources are a waste of silicon. This is a huge drawback to Apple's generally successful strategy of integrating GPU and CPU onto one die: You can't mix and match. In x86 world, plug-in GPUs make that the default approach, and that's a big part of why Apple hasn't made as much headway as you'd otherwise assume, given their tech advantage.

They're heading towards better flexibility (at least pre-purchase) in the future, where you can make a reasonable selection of CPU cores and GPU cores (at least; possibly NPU and other resources too eventually) by picking which chiplets are in your system. But it may be a few years yet before we get there. I *hope* it comes sooner than that, and it wouldn't be astonishing if they delivered in the M5 generation, but I'm not counting on it.


That's an argument for NOT being on such a long cycle (though not a good one, as the NPU is not relevant to anything except running existing models, for the most part). We don't really know what that cycle will be. I suspect that Apple knows that 2-3 years won't cut it, but I don't know if they actually care. It depends on whether or not they think they can rebuild a user base on the high end.

That's the most likely scenario, 4-6 E cores. I still think we're headed for segregated E cores for OS-only use at some point, and that would be a good machine in which to introduce that feature, but I make no predictions about the upcoming models. I also think there's a low chance that they might decide to try for good coverage of embarrassingly parallel workloads, with many many E cores, but I would be very surprised if they tackled that before they get to mix+match chiplets.


I don't know why it would be square. Any such device would be two or more chiplets, so much more likely to be rectangular.


Why would you say that?!? They did exactly that with the M2.

I think that ultimately they'd be best served if they could ship high-end chips before low-end, but if there's one thing we've learned about Apple since the M1 shipped, it's that they'll ship what they can when they can. If they think they can't get the larger M5s out as early as the base M5, the base M5 is what we'll see first.
If they release an M5 MacBook Air next spring, 6 months after the release of the M4 MBPs, that will almost wipe out sales of the base M4 and M4 Pro MBPs, and those who already bought them will be rightfully pissed.
If it were me I’d throw my M4 MBP against the glass windows of an Apple Store screaming NOOOO at the top of my lungs. Like that gif with some climate activist screaming when they’re about to chop down some tree.
 
MPSGraph can run on both GPU and NPU (You probably think about MPS which is pure Metal which indeed can't run on NPU).

My knowledge is possibly outdated. How would you run MPSGraph on the NPU? I am only aware of the Metal device type.
 
Why would they make a Hidra that's like a Max? I think the whole point was that Hidra was a monolithic die close to 2x the Max.

I dunno, I ain't the one saying it. :) It's Gurman who said that the Mac Pro would have a chip that "probably" has "up to" a 2x Max core count, and he's also the one talking about a desktop-specific Hidra chip destined for the Mac Pro.

But that's why I am saying that interpreting Gurman's various musings on the topic is non-trivial, and the last Mac Pro generation was the last time he screwed up. Maybe there is a Hidra; maybe it's M5 instead of M4 and the M4 generation will have Maxes and an Ultra; maybe that's wrong and Apple for some reason is doing an M4 Hidra that otherwise looks like an M4 Max but tweaked; maybe there's an M4 Ultra and an M4 Hidra and he was only talking about the former in the quote above; or maybe none of the above. For the record, I personally doubt a monolithic die too close to 2x the Max's size simply by virtue of die cost, but I'd also concede it is possible. Until the Oracle of Bloomberg gets a little more specific than "probably" and "up to" and starts producing more coherent prophecies with names and core counts (and of course until we get 3rd-party die shots of the M4 Max with or without the UltraFusion connector), I'll remain in wait-and-see mode.
 