@sv8 wondered about the Ultra (and possible quad-chip M3) being on N3E. Their reasoning was incorrect, but I noticed something just a little while ago that bears on that.
Looking at the Max's die shot, where is UltraFusion?? Did they just hide it (IIRC they did in some of the early pictures of the M1 Max)? If so it would have to be at the bottom, under the GPUs and SLC/RAM controllers, which is reasonable.
Apple has been somewhat duplicitous about the Mn Max die twice before. At the introduction of the M2 Max ....
The "substantively less than truthful" pattern is pretty clear now. They can't "delight and surprise" later if they reveal it, so they 'hide it' because it is a 'secret'. Doing it once works well. Doing it twice is a 'seen that trick before' event, so a not really surprising non-trick. Doing it a third time smacks of a magician who has lost the skill to do anything different. It is just a boring part of the 'act'.
But what if there's another explanation?
They've already changed their pattern (if two generations makes a pattern) - the Pro and Max are clearly quite different from each other, not sharing nearly as much of their design as previous generations.
But .... the basic floor plans of the M2 Max and M3 Max do share tons of overlap. Yes, the Pro went in a slightly different direction from its previous iteration, but the Max is sticking to the same 'formula'.
Not sharing is a double-edged sword. It means the Max is losing a 'share some R&D overhead' partner. So would Apple also cut it off from the Studio deployments? If they don't (Max variants of the Studio share with the laptops), then that balloon squeezes the 'lost my R&D overhead sharing partner' problem onto just the Ultra systems, to sort out all by their substantively lower-volume selves.
If the changes to the Pro push up monolithic-only Max sales, then dropping the Max from the Mac Studio could work.
Perhaps the chip that will be the basis of the Ultra is NOT the Max, this generation. Maybe there's no UltraFusion connector in the M3 Max die shot because it doesn't exist.
This would be pretty surprising - I wouldn't expect them to do a whole new floorplan for chips making up an Ultra,
Not too surprising. The Pro is a smaller die (more affordable). Losing the connector would make the Max a smaller die also. (It is not going to make the Max 'cheap', but the Max has more transistors than Nvidia's H100. It is smaller because it is on N3B, but being on N3B costs more too.)
They don't necessarily need a "whole new floorplan". Slice the I/O interfaces off and put UltraFusion on both sides. (The 'messy' part is likely which side of the 'cut' gets the Display Engines; that would probably be the biggest 'reshuffling'. So not a trivial cut. But the further away from the I/O section, the less perturbed things would get.) You can then have one, two, or three of the compute core dies, with one or two I/O dies bracketing them on either side.
They just need to disaggregate what they already worked out in the base Max design. The center of gravity around the connectivity of the GPU cores, memory + system cache, NPU, and CPU cores could all stay the same. Only the 'top' and 'bottom' edges need to change to be more chiplet friendly.
The bigger external problem for Apple, though, is that there isn't access to making "bigger than two" packages, since the AI boom has led Nvidia, AMD, and others to basically buy up all the CoWoS capacity for years. With InFO-LSI limited in size, they really can only economically get access to '2 x Max-class size' die packaging. So if they can only make something in the 1 reticle size zone, the single-sided UltraFusion is the cheapest, least risky path. It keeps the Max laptops + Studio + Mac Pro bundled into a larger group to aggregate costs over.
The even bigger problem is that these larger SoCs are a 'one and done' product. Once the next-gen MBP 14/16, Studio, and/or Mac Pro shows up, demand for the now 'old' M(n-1) craters. There are no 'hand me down' products to stuff these into. So if they replace them every 18-20 months ... that's it ... they only have 18-20 months to get a 100% return on investment. It isn't just the relatively low volume of the products but the relatively short lifespan also. (E.g., the M2 Max died off in the MBP 14/16 in less than a year. That isn't sustainable over the long term.) The 'quick death' thing is a deep problem that seems to spur the odd chase for a Rube Goldberg 4-way system.
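To put a rough shape on the 'one and done' point, here is a minimal sketch of how the fixed design cost each chip has to carry depends on how long it stays on sale. The function name and every number in it are made up purely for illustration; none of them are Apple or TSMC figures.

```python
def overhead_per_unit(fixed_cost, units_per_month, months_on_sale):
    """Fixed (R&D, mask set, etc.) cost that each unit sold has to carry.

    All inputs are hypothetical placeholders, not real Apple numbers.
    """
    total_units = units_per_month * months_on_sale
    if total_units <= 0:
        raise ValueError("need a positive number of units sold")
    return fixed_cost / total_units


# Hypothetical: the same die, sold only until its replacement ships (18 months)
# versus living on twice as long in a 'hand me down' product (36 months).
short_life = overhead_per_unit(360_000_000, 100_000, 18)
long_life = overhead_per_unit(360_000_000, 100_000, 36)
print(short_life, long_life)
```

With these placeholder inputs, doubling the time on sale halves the overhead per chip, which is the whole argument: low volume and short lifespan multiply against each other.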
as it would seem to be way too low volume for that. But... They obviously know a lot more than I do, maybe they see a good reason to do this. Maybe they count it as another learning step towards a future full of high-density chiplet interconnects, and therefore worthwhile just for that.
The Mac Pro is certainly a 'way too low volume' product to support a highly forked die design. But is the Mac Studio also in that category? The Mac Studio very likely is not big enough to chop into 'two', giving one 'half' to overlap with the laptops and the other half to overlap with the Mac Pro.
But if you are disaggregating for a good chiplet design, things should still be largely organized as if you had a magical 2x-4x bigger reticle limit and one big die. Then subdivide it into chunks that are more practical to make, and just attach the virtual single-die network back together again. You should not be disaggregating ('slicing') across boundaries that the design made very tight and extremely highly coupled.
If it is mostly just about 'practice' for the future, then the somewhat superfluous, 'tacked on at the end' UltraFusion already was practice for two iterations. It is time to do some real disaggregation. It doesn't have to be chasing 4-way ... better disaggregation at just "2-way" would be an incremental step forward.
Maybe they need to do more work because they want it to go 4-way as well as 2-way?
There is no TSMC production capacity to do a 'single unified image' 4-way ... so it is curious why it keeps popping up on the radar as a plausible factor. (The 'freight train' heading to this CoWoS bottleneck should have been pretty visible years ago if you were paying attention to the evolution path the AI/ML market was on.)