
Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
I would think Apple will get more bang for their buck spending engineering resources on making sure their AS Mac Pro motherboard has tons of high-speed unified memory and tons of processing cores, rather than trying to squeeze more performance out of a tiny PCIe channel for specialised work. Imagine an AS Mac Pro with 1-2TB of 800GB/s unified memory and 40/128 CPU/GPU cores.

The rumored M1 Max-based Jade 4C SiP/MCM/whatever would be 40-core CPU/128-core GPU, with 1.6TB/s memory bandwidth...

Who is certain? The MP is a challenge and therefore the most interesting one to follow. Does the current MPX module support Infinity Fabric? Yet people put two GPUs into it, so the PCI bus must be sufficient for something.

I was more thinking of 120 CPU/384 GPU ;) based on Jade 4C rumours.

120 CPU/384 GPU are not the Jade 4C rumors...

Yes, you can use Infinity Fabric with MPX GPUs, but not via the MPX socket itself. There is an extra bridge. Scroll to the bottom of this support page: https://support.apple.com/en-ph/HT212546

The MPX slot is basically for more power to the GPU(s) and Thunderbolt connectivity (and to avoid ugly internal power cabling, because Apple)...?

From Apple's website:

"A second connector. An industry first.

The MPX Module starts with an industry-standard PCI Express connector. Then more PCIe lanes integrate Thunderbolt and additional power provided to increase capability. With up to 500 watts, the MPX Module has power capacity equivalent to that of the entire previous-generation Mac Pro.
"
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
The MPX slot is basically for more power to the GPU(s) and Thunderbolt connectivity (and to avoid ugly internal power cabling, because Apple)...?

From Apple's website:

"A second connector. An industry first.

The MPX Module starts with an industry-standard PCI Express connector. Then more PCIe lanes integrate Thunderbolt and additional power provided to increase capability. With up to 500 watts, the MPX Module has power capacity equivalent to that of the entire previous-generation Mac Pro.
"

If I understand it correctly, an MPX module has 24 PCI-e lanes. With PCI-e 5.0, that could provide 94GB/s in each direction. If they use PCI-e 6.0 instead, that's 189GB/s in each direction. Should be more than enough bandwidth to communicate with other modular compute boards, common user-expandable quad-channel DDR5 RAM and route display output signals.
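As a rough check of those figures, here is a minimal sketch of the arithmetic. It assumes the published per-lane signalling rates (8, 16, 32 and 64 GT/s for PCIe 3.0 through 6.0) and applies the 128b/130b encoding efficiency to every generation. PCIe 6.0 actually uses PAM4 signalling with FLIT-based encoding, so its real payload rate will differ somewhat; the function name and the table are made up for illustration.

```python
# Approximate one-direction PCIe bandwidth for a generation and lane count.
# Assumption: 128b/130b efficiency for every generation (a simplification for 6.0).
GT_PER_S = {3: 8, 4: 16, 5: 32, 6: 64}   # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Return approximate payload bandwidth in GB/s, one direction."""
    raw_gbit_per_s = GT_PER_S[gen] * lanes          # 1 bit per transfer per lane
    return raw_gbit_per_s * ENCODING_EFFICIENCY / 8  # bits -> bytes


for gen in (5, 6):
    print(f"PCIe {gen}.0 x24: {pcie_gb_per_s(gen, 24):.1f} GB/s per direction")
```

Under those assumptions a 24-lane link lands at roughly the 94 and 189 GB/s quoted above.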
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Yes one on the mother board (master) and one in each MPX module (slaves).

If they decide to go the modular route, there won't be any masters and slaves, and there won't be a SoC on the board. Just one or more modules, each with their own SoC, working as a single computer with multiple CPUs and GPUs.
 

thenewperson

macrumors 6502a
Mar 27, 2011
992
912
If they decide to go the modular route, there won't be any masters and slaves, and there won't be a SoC on the board. Just one or more modules, each with their own SoC, working as a single computer with multiple CPUs and GPUs.
Could they go this route for the big Pro (assuming it remains)?
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
If they decide to go the modular route, there won't be any masters and slaves, and there won't be a SoC on the board. Just one or more modules, each with their own SoC, working as a single computer with multiple CPUs and GPUs.
Of course if Apple can solve "a single computer" scalable beyond Jade 4C, that would be fine. The lower hanging fruit would be to use SoCs as GPUs are used today.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
The MPX module may well fit additional SoC daughter cards, and I do remember a big thread on the fate of the Mac Pro from a few years ago… someone had mentioned a clamshell design and ‘cartridges’… when the new Mac Pro arrived I assumed the ‘cartridges’ are the MPX modules (no idea what happened to the clamshell rumour though)

There were several loopy "rumors" on the Mac Pro about 'lego' (snap bricks of CPU or GPU, etc.) and some other shapes that made no sense. The notion that those were or are precursors of what Apple is going to do with their own SoC is likely just as far off as they were from what Apple actually did with the Mac Pro 2019.


Now Apple could well release daughter SOC mpx modules (and yes the current ones support 500w cards. No coincidence there.

This is a huge handwaving stab at causality that isn't there. The Vega II Duo and 6800X Duo are what primarily drive the 500W power delivery of the MPX subsystem. Large, power-hungry discrete GPUs need lots of power given the design choices made over the last 6-8 years. If you want to put two of those onto a single card, then you need a solution past the legacy one or two eight-pin Molex connectors. PCI-e v5 is moving to a 12-pin connector, which goes higher still, for upcoming discrete GPU cards.

It is a bit twisted to spin that into some notion that Apple is so deeply in love with dGPUs that they want to move essentially the whole computer onto the dGPU card. More than likely there are two things much higher on Apple's priority list. First, getting rid of a 400W power hog of a GPU. Apple has already said explicitly, numerous times, that Perf/Watt is what they are primarily focused on, not bigger and even more power-hungry GPUs. Second, the MPX connector doesn't just provide power to the card. It also provisions two x4 PCI-e v3 links to enable the on-board Thunderbolt controllers on the full-width MPX cards. Apple has moved the Thunderbolt controllers and GPU cores next to the CPU cores. The main motivator for the MPX connector is that the CPU was decoupled from the GPU. Apple's entire effort is going in the opposite direction (tighter coupling between CPU and GPU). If anything, there is a lowering of the "need" for what MPX provides with all of Apple's efforts. (Likewise, another feature of MPX is providing DisplayPort streams back to the embedded TB controllers in the main system.)

Apple's "4 GPU cluster" is going to run under 360W (4 * 90). There is no need for 500W there. All the Thunderbolt has moved back to the SoC (which cuts way down on complex logic board routing and switching). Apple's whole point has been to remove that complexity. 2/3 of the MPX functionality is rendered "useless" if you drag the SoC onto the add-in board. Even if you repurpose the MPX x4 PCI-e and DisplayPort provisioning, everything the host provides now has to be exported back to the motherboard.
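The power argument is simple enough to write down. This is only a sketch of the arithmetic in the post: the 90W per SoC is the post's own estimate rather than a measured figure, and the names are made up for illustration.

```python
# Power budget check for a hypothetical four-SoC cluster against the MPX limit.
MPX_POWER_BUDGET_W = 500      # per Apple's MPX Module description
ASSUMED_WATTS_PER_SOC = 90    # assumption taken from the post, not a measurement


def cluster_power_w(soc_count: int, watts_per_soc: int = ASSUMED_WATTS_PER_SOC) -> int:
    """Total draw of soc_count SoCs at the assumed per-SoC wattage."""
    return soc_count * watts_per_soc


total = cluster_power_w(4)
headroom = MPX_POWER_BUDGET_W - total
print(f"4 SoCs: {total} W, unused MPX budget: {headroom} W")
```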

Modularity at all costs is what drives higher power consumption (and lower Perf/W), which is exactly where Apple said they did not want to go.

It was obvious with the modular talk and the subsequent release of said modules that’s how Apple saw expansions going forward.)
But if they do, we still don’t know how the system will see these modules :
Nodes ? Does that mean increased license costs ?

There are over a hundred non-GPU cards that work with Macs. The notion that the CPU and GPU are the only aspects of modularity for workflows on Macs is highly myopic. One of the main MPX design objectives is to decouple the augmentation for an extremely narrow subset of Apple GPU modules from the provisioning of slots for those other cards.
There is a much bigger problem with the current SoC only provisioning a small handful of PCI-e v4 lanes (nothing x8 or x16) than in trying to push the SoC onto some altar to the modularity gods. Don't fix that and whatever is being hand-waved at will be a fail.

[ Leaning on a single x16 PCI-e connection to provision the general I/O needs of a Mac Pro would be about as flawed as leaning on three TBv2 controllers to "solve the problem". It is just about as narrow and inflexible for the real scope of the problem. ]

Additionally, folks tend to forget that the Mac Pro 2012 got banned from the EU because Apple didn't meet regulations on protecting fan blades. There is a newer California energy board regulation that gives "get out of jail free" cards to systems, scaling with the number of slots provided. Going to eight slots just gave them more clearance on the Intel/AMD GPUs used in the MP 2019 system. That is not the sole reason the count shot up from 4 to 8 slots (it is also better justification for the 100% base price increase, among a few others). However, the spin that Apple somehow caught some ultimate love of modularity at all costs is not well motivated.



What interface ? PCI-e gen 5/6/7 etc or proprietary?

Highly likely not, if based on an "M1" foundation. Apple dribbled some low-scale PCI-e v4 into the M1. Doubtful that we will see anything like PCI-e v5 before either the M2 or M3 sequence. Pragmatically, PCI-e v5 isn't as useful without CXL, and Apple has shown extremely little movement there. PCI-e v6 without CXL is substantially even less useful (since CXL is one of the main drivers for v6 ... it isn't going mainstream).

Apple's M1 internal SoC already is a bit maxed out just handling the mix of function elements they have now. There is no indication it can handle something like PCI-e v5 thrown on top. The notion that they are trying to build some EPYC, Xeon SP, Arm Neoverse derivative 'killer' is



Besides will the end user be ok with extra CPU (or GPU) cores they may not need for their work ? Certainly it would be cheaper to sell the Mac Pro SOC cards as is with all bells and whistles intact ( no need to design extra options ). Apple could well price them good enough so it doesn’t matter whether your expansion needs CPUs or GPUs ( you get the other one free )

If Apple is just going to make the end user pay for the "extra" CPU/GPU cores up front anyway... it is just cheaper (for Apple) to solder the SoC down and skip the card.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Could they go this route for the big Pro (assuming it remains)?

It wouldn't make sense anywhere else, yes. But it is far from certain that they will choose to deliver a modular Mac Pro of this type. As others have pointed out, using multi-chip modules for gradually more powerful non-modular systems-on-a-package might be a "simpler" solution.

But in case they do want to build a beefy modular workstation, I have tried to describe how I think this could be done in the most Apple-like fashion :)
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
MPX modules are basically PCIe cards with additional pins for power delivery, if I'm not wrong, which have a max bandwidth of 32 GB/s. That is an order of magnitude less bandwidth than what the M1 Max can achieve today.

The MPX modules have both a standard x16 PCI-e v3 connector and an "MPX connector". The "MPX connector" has three components to it: power, two x4 PCI-e v3 links (for the on-board TB controllers), and (up to) 4 DisplayPort v1.4 streams off the card. Some MPX cards/modules don't consume the "extra" PCI-e data streams (there are no TB controllers on board the "half width" MPX cards). Some cards don't send the full set of DP v1.4 channels out (and leave the host system TB controllers under-provisioned).
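To put the quoted 32 GB/s next to the M1 Max, here is a back-of-the-envelope sketch. It assumes PCIe 3.0 at 8 GT/s per lane with 128b/130b encoding and takes Apple's advertised 400 GB/s memory bandwidth for the M1 Max; the names are made up for illustration.

```python
# Compare the x16 PCIe 3.0 link of an MPX slot with M1 Max memory bandwidth.
GEN3_GT_PER_S = 8                   # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130     # 128b/130b line encoding
M1_MAX_MEMORY_GB_PER_S = 400        # Apple's advertised figure


def gen3_gb_per_s(lanes: int) -> float:
    """Approximate PCIe 3.0 payload bandwidth in GB/s, one direction."""
    return GEN3_GT_PER_S * lanes * ENCODING_EFFICIENCY / 8


x16_one_way = gen3_gb_per_s(16)
x16_both_ways = 2 * x16_one_way     # the "32 GB/s" figure counts both directions
ratio = M1_MAX_MEMORY_GB_PER_S / x16_both_ways
print(f"x16 v3: {x16_one_way:.2f} GB/s each way, {x16_both_ways:.1f} GB/s total")
print(f"M1 Max memory bandwidth is about {ratio:.1f}x that")
```

So "an order of magnitude" is about right even when both directions of the link are counted.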

Apple providing power is driven more by Apple's dislike of "messy discrete wires" cluttering up the interior of their system and by wanting to still pursue dual (and quad) GPU setups.



I would think Apple will have better results going wild with the mainboard design with their SoC.

The MPX modules will likely be reserved for storage or network expansions.

Non-GPU modules don't really need MPX (hence the standard connector on an "MPX Slot"). That Promise full-width MPX module with HDDs is a super narrow corner case (one that really doesn't use space/volume effectively; it is just useful where someone is super obsessed with self-contained, 4-wide RAID HDD storage).

Nothing in MPX is needed for any reasonable network card at all. (The 75W you can get off the standard x16 slot is plenty, except for the most egregious network controllers, which extremely likely don't have macOS drivers anyway. You'll be hard pressed to find anything at the >= 40GbE bandwidth level that works on macOS.) A network card does not need extra power, provisions no DP, and has enough bandwidth on the x16 PCI-e to feed the Ethernet 'wire' coming out of the box.

Far more, Apple needs at least 1-2 slots for M.2/U.2 SSD storage cards and A/V capture/output cards (e.g. SDI video out, not generic monitor output; DSP-powered audio input/output; etc.).

Apple's zero effort on 3rd party GPU drivers for macOS M-series means there is no GPU workload driver here. Also not likely Apple is looking to "fork" their own GPU drivers into "non uniform memory" or "much , much higher NUMA" GPU drivers.

The bigger issue is that their SoC doesn't have multiple x16 PCI-e v4 provisioning abilities rather than trying to do some "MPX" provisioning.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
When I am thinking about a modular Apple Silicon Mac Pro, I am thinking about a NUMA system with one or more compute boards (each with their own SoC and memory) that communicate via a high-speed bus. Apple would need introduce some new APIs to support such systems, so that apps can manage tasks across SoCs.

Decent chance Apple needs to be far more worried about, and put more effort into, whatever their multi-chip module (MCM) solution (interconnect and comm) is. AMD's MI250 dual-die GPUs present to the user as two distinct GPU dies on an MCM, and they are less than inches apart. Apple would have to remove the NUMA issues completely from their own MCM before moving on to substantially greater distances between the dies.

AMD has Infinity Fabric and multiple years of evolutionary improvements, and they didn't solve it. Apple doesn't even have a mature first-generation inter-chip communication solution. They should be in 'walk before run' mode on M1.



There is already a precedent here: paired GPUs with the Infinity Fabric Link. Metal has an API that tells you which GPUs support fast data exchange so that you can write your apps taking advantage of this. A modular Mac Pro might work in a similar way.

That's Metal, which is only a GPU core solution. The CPU/NPU/IMG/etc. code isn't. Which means they would need new Apple GPU drivers with a different set of assumptions built into them. Is Apple really going to chase a new, 'forked' driver stack while still trying to get the bulk of the Mac software ported over? Probably not.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Also not likely Apple is looking to "fork" their own GPU drivers into "non uniform memory" or "much , much higher NUMA" GPU drivers.

Why would they need to though? Metal already has support for NUMA GPU setups. And if Apple ever goes the NUMA route, they will have to solve this at the SoC memory hierarchy level and not at the GPU level. GPU is just another consumer where the memory subsystem is concerned.

The bigger issue is that their SoC doesn't have multiple x16 PCI-e v4 provisioning abilities rather than trying to do some "MPX" provisioning.

Not in this generation, they don't. But future Apple Silicon generations might support more PCI-e lanes. Moreover, if Apple goes truly modular route (with multiple SoC boards communicating with each other), a single SoC itself won't even need to support that many lanes to begin with. In such a design, a SoC is just another device on the PCI-e bus. There would be a separate system root device (likely on the mainboard) responsible for configuring and starting the actual SoCs. If the connector features 24 PCI-e lanes (like in the current MPX), that's all the individual SoC needs.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Decent chance Apple needs to be far more worried about, and put more effort into, whatever their multi-chip module (MCM) solution (interconnect and comm) is. AMD's MI250 dual-die GPUs present to the user as two distinct GPU dies on an MCM, and they are less than inches apart. Apple would have to remove the NUMA issues completely from their own MCM before moving on to substantially greater distances between the dies.

But these are different issues. A multi-chip system would be presented to the OS as a single large "chip", while the compute boards I describe here would be presented as different devices. It is clear that each of these comes with its own challenges. But I do not see the second problem (multiple GPUs) as the bigger one. After all, that's how most Intel Macs operate, so the software stack is already there.

That's Metal, which is only a GPU core solution. The CPU/NPU/IMG/etc. code isn't. Which means they would need new Apple GPU drivers with a different set of assumptions built into them.

I am not sure why new GPU drivers are needed. Every GPU is a separate Metal device. Apple would simply need to add an API to distinguish between "local" and "remote" GPU devices. As to the rest, I am completely with you. They would need a set of NUMA APIs and a consolidated approach for coordinating work and data between different SoCs. This is certainly non-trivial, and that's why I am skeptical that we will see a product like that. But if they pull something like that off, it would be a very impressive feat of engineering.
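For reference, the existing Metal mechanism is the peer group: each MTLDevice exposes peerGroupID, peerIndex and peerCount, and GPUs that share a non-zero peerGroupID can exchange data over the fast link. The sketch below only models that grouping idea in Python. The `Device` record, the device names and the IDs are hypothetical, and nothing here calls Metal.

```python
# Model of "which GPUs share a fast link", loosely following Metal's peer groups.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a GPU device record."""
    name: str
    peer_group_id: int  # 0 means "not part of any peer group"


def fast_link_groups(devices):
    """Map each peer group ID to the names of the devices that share it."""
    groups = defaultdict(list)
    for device in devices:
        if device.peer_group_id != 0:
            groups[device.peer_group_id].append(device.name)
    # A group with a single member has nobody to exchange data with.
    return {gid: names for gid, names in groups.items() if len(names) > 1}


devices = [Device("gpu0", 7), Device("gpu1", 7), Device("gpu2", 0), Device("gpu3", 9)]
groups = fast_link_groups(devices)
print(groups)
```

A "local" versus "remote" distinction for SoC boards could follow the same pattern: the app asks which devices sit in its own group and treats everything else as remote.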
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
If I understand it correctly, an MPX module has 24 PCI-e lanes. With PCI-e 5.0, that could provide 94GB/s in each direction. If they use PCI-e 6.0 instead, that's 189GB/s in each direction. Should be more than enough bandwidth to communicate with other modular compute boards, common user-expandable quad-channel DDR5 RAM and route display output signals.

First, 24 lanes provisioned from where? Apple throws gobs of SoC die edge space at just keeping up with the memory I/O required for the CPU+GPU cores stacked on the die. Where are they going to get the space and power budget to do even x16 PCI-e v5?

Second, the TB provisioning of the x8 is decoupled from the x16 coming from the standard PCI-e connector. Chopping up data into the narrower slots isn't going to be well synchronized with the data running over the x16. It isn't like getting a coherent x24 bundle here. The bigger the mismatch with the internal bus width inside the SoC (and/or DDR5 bus mismatches), the less you are going to see that kind of real bandwidth throughput with minimal latency.

If doing something more proprietary anyway, two x16 PCI-e-like couplings to a card would likely work 'better'. Or use a single x16 and follow some of the Open Compute (OCP) OAM (open accelerator) format standards. There is not really a big need for an MPX baseline design constraint on the solution.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
First, 24 lanes provisioned from where?

I wrote it in the previous post: from some central root component. The SoC is only a device on the bus; it does not need that many lanes itself.


Second, the TB provisioning of the x8 is decoupled from the x16 coming from the standard PCI-e connector. Chopping up data into the narrower slots isn't going to be well synchronized with the data running over the x16. It isn't like getting a coherent x24 bundle here. The bigger the mismatch with the internal bus width inside the SoC (and/or DDR5 bus mismatches), the less you are going to see that kind of real bandwidth throughput with minimal latency.

All good points. What I was trying to say is that there is no reason an MPX for a possible upcoming Mac Pro should be the same as the MPX in the current Mac Pro. We have a precedent showing that putting 24 lanes on a card is possible and feasible. The specific implementation of the hypothesized hardware could be whatever. It could bundle all 24 lanes together. It could use 32 lanes. It could reserve some lanes for display signal. Or it could use a proprietary bus.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
The rumored M1 Max-based Jade 4C SiP/MCM/whatever would be 40-core CPU/128-core GPU, with 1.6TB/s memory bandwidth...
That would likely mean soldered LPDDR5 modules with a max of 256GB and a 2048-bit data bus! It will likely not sport ECC memory either.
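Those numbers follow from scaling the M1 Max by four. A minimal sketch, assuming the M1 Max's 512-bit LPDDR5-6400 interface and 64GB maximum carry over unchanged to each of the four dies in the rumored part (the rumor itself confirms none of this, and the names are made up for illustration):

```python
# Scale M1 Max memory figures to a rumored four-die package.
M1_MAX_BUS_BITS = 512       # M1 Max memory bus width
M1_MAX_MAX_RAM_GB = 64      # M1 Max maximum memory configuration
LPDDR5_MT_PER_S = 6400      # mega-transfers per second
DIES = 4                    # assumption: "Jade 4C" = four M1 Max-class dies


def memory_bandwidth_gb_per_s(bus_width_bits: int, mt_per_s: int) -> float:
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * mt_per_s / 1000


bus_bits = DIES * M1_MAX_BUS_BITS
bandwidth = memory_bandwidth_gb_per_s(bus_bits, LPDDR5_MT_PER_S)
max_ram_gb = DIES * M1_MAX_MAX_RAM_GB
print(f"{bus_bits}-bit bus, {bandwidth:.1f} GB/s, up to {max_ram_gb} GB")
```

That gives the 2048-bit bus, roughly 1.6TB/s and the 256GB ceiling mentioned above.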

I doubt Apple will go that route, as the existing Mac Pro already supports more than 1 TB of ECC memory. Otherwise Apple will paint themselves into a memory corner this round.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Non-GPU modules don't really need MPX (hence the standard connector on an "MPX Slot"). That Promise full-width MPX module with HDDs is a super narrow corner case (one that really doesn't use space/volume effectively; it is just useful where someone is super obsessed with self-contained, 4-wide RAID HDD storage).

Nothing in MPX is needed for any reasonable network card at all. (The 75W you can get off the standard x16 slot is plenty, except for the most egregious network controllers, which extremely likely don't have macOS drivers anyway. You'll be hard pressed to find anything at the >= 40GbE bandwidth level that works on macOS.) A network card does not need extra power, provisions no DP, and has enough bandwidth on the x16 PCI-e to feed the Ethernet 'wire' coming out of the box.

Far more, Apple needs at least 1-2 slots for M.2/U.2 SSD storage cards and A/V capture/output cards (e.g. SDI video out, not generic monitor output; DSP-powered audio input/output; etc.).

Apple's zero effort on 3rd party GPU drivers for macOS M-series means there is no GPU workload driver here. Also not likely Apple is looking to "fork" their own GPU drivers into "non uniform memory" or "much , much higher NUMA" GPU drivers.

The bigger issue is that their SoC doesn't have multiple x16 PCI-e v4 provisioning abilities rather than trying to do some "MPX" provisioning.
Well MPX may not even be in the AS Mac Pro. Likely just standard PCIe4/5 slots and that's about it. The rest of 'low-bandwidth' expansion will be handled by TB4 ports.
 

JouniS

macrumors 6502a
Nov 22, 2020
638
399
I doubt Apple will go that route, as the existing Mac Pro already supports more than 1 TB of ECC memory. Otherwise Apple will paint themselves into a memory corner this round.
It's far from certain that the future Mac Pro will be a workstation. The 4x M1 Max rumor sounds like the CPU performance will be in the workstation territory, while GPU performance and memory capacity will be closer to high-end consumer desktops. Either Apple has decided that workstations are rarely necessary anymore, or they will introduce some new hardware instead of simply scaling the M1 Max.
 

pshufd

macrumors G4
Oct 24, 2013
10,150
14,574
New Hampshire
I saw a video with a Threadripper and the size of that chip was monstrous. I was wondering if they had to use more than one tube of thermal paste on it. I imagine the Jade 4C could be really huge too.

I was talking to my son about these chips and was saying that there aren't a lot of people that need the power of the M1 MAX MacBook Pro, maybe 5% of users at most. The percentage of people that need a Jade 4C has to be tiny.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
I saw a video with a Threadripper and the size of that chip was monstrous. I was wondering if they had to use more than one tube of thermal paste on it. I imagine the Jade 4C could be really huge too.

I was talking to my son about these chips and was saying that there aren't a lot of people that need the power of the M1 MAX MacBook Pro, maybe 5% of users at most. The percentage of people that need a Jade 4C has to be tiny.

You realise that an Xbox Series X has 20% more GPU power than an M1 Max, right? So it's not 5% of users that need such power.

That you cannot game on a Mac is a different story, however. But in theory, a lot of people could make use of such power.
 

pshufd

macrumors G4
Oct 24, 2013
10,150
14,574
New Hampshire
You realise that an Xbox Series X has 20% more GPU power than an M1 Max, right? So it's not 5% of users that need such power.

That you cannot game on a Mac is a different story, however. But in theory, a lot of people could make use of such power.

Does the Xbox require the CPU, 256 GB RAM, memory bandwidth, 12 Thunderbolt controllers, and ProRes encode/decode of the J4C?

I need the additional ports and display capacity of the M1 PRO along with the RAM options. I don't need the CPU or GPU horsepower though.

The M2 Air may address those issues.

Apple doesn't make gaming PCs.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
Does the Xbox require the CPU, 256 GB RAM, memory bandwidth, 12 Thunderbolt controllers, and ProRes encode/decode of the J4C?

I need the additional ports and display capacity of the M1 PRO along with the RAM options. I don't need the CPU or GPU horsepower though.

The M2 Air may address those issues.

Apple doesn't make gaming PCs.

The statement was that only 5% need the power of the M1 Max. And this is not true since gamers have needs that exceed the power of the M1 Max.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
Only gamer needs matter, as always.

The gaming revenue is more than 4 times larger than the revenue of the movie industry. They are not a small group of people.

Really, the power of the M1 Max can easily be exceeded by many people.
 