
theorist9

macrumors 68040
May 28, 2015
3,710
2,812
HDMI comes from the GPU, not PCIe.
But what about this from the documentation, where it says the MP's two HDMI ports come from the I/O card, which is PCIe:

[Attached screenshot of Apple's documentation]



This seems like a useful article:


[Attached screenshot of the article]
 
Last edited:

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
But what about this from the documentation, where it says the MP's two HDMI ports come from the I/O card, which is PCIe:

[Attached screenshot of Apple's documentation]

Note the part that says the Apple I/O card cannot be installed in another slot. The edge of the card carries more than just PCI-e. The same thing was true of the MP 2019's I/O card. It is not a generic PCI-e standard card. That is why it doesn't install.

Also, there is likely some 'double counting' going on. Often in Apple systems where there is TB video out and an HDMI socket, it is either/or (e.g., MP 2013, MPX video cards, etc.). The MBPs, where there are only 3 TB sockets but 4 controllers, are somewhat the exception to the general rule.

If someone wanted two HDMI sources on a Studio Ultra, they could just slap an adapter/mini-dock on one of the TB ports. The video sources don't come out over PCI-e, so it's not clear why one would lump that into the same bandwidth pile as the PCI-e data flow.
 
Last edited:

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
Here's another way to think about it: If we get rid of the I/O card (2 x HDMI + 2 x USB-A + audio jack which, with 8 TB4 ports, you might not need), that leaves us with 22 PCIe4 lanes, since the 2 x SATA6, 2 x 10 Gb Ethernet, Wi-Fi, BT, and 1 x USB-A collectively use 2 lanes, and the SSD uses 8.

In that case, here's the I/O the Ultra Studio (US) has that the MP doesn't (I've ignored the SD card reader and headphone jack):
HDMI 2.1 + 1 x USB-A

And here's the I/O the MP has that the US doesn't (the MP has 2 x 10 Gb ethernet, giving it one more than the US):
22 x PCIe4 + 2 x TB4 + 1 x 10 Gb ethernet + 2 x SATA6 (2 x 6 Gb/s).

Thus the difference is:
22 x PCIe4 + 2 x TB4 + 1 x 10 Gb ethernet + 2 x SATA6 (2 x 6 Gb/s) – 1 x HDMI 2.1 – 1 x USB-A.

How does the bandwidth from an HDMI 2.1 port compare with that of TB4? Is TB4's significantly more, b/c HDMI is 48 Gb/s simplex (plus a small return signal for control), while TB4's is 40 Gb/s duplex?
 
Last edited:

joevt

Contributor
Jun 21, 2012
6,689
4,086
How does the bandwidth from an HDMI 2.1 port compare with that of TB4? Is TB4's significantly more, b/c HDMI is 48 Gb/s simplex (plus a small return signal for control), while TB4's is 40 Gb/s duplex?
HDMI 2.1 is only 42.67 Gbps after 16b/18b encoding.
Thunderbolt 4 is 40 Gbps after 64b/66b encoding.
The max display output we've seen from Thunderbolt 4 is ≈38.9 Gbps (XDR without DSC using dual HBR3 x4).
The max display output you can normally get from Thunderbolt 4 is 34.56 Gbps from two displays. A dual tile 5K60 display (dual HBR2 x4) is less than that.
The max display output you can normally get from Thunderbolt 4 for a single display is 25.92 Gbps (single HBR3 x4).

I don't know what the max bandwidth is for Apple Silicon HDMI 2.1 ports. Do they really support HDMI 2.1, or are they using a HBR3 x4 adapter?
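For anyone who wants to check the arithmetic, here's a rough sketch; the raw lane rates are my assumed figures from the published specs, so treat it as illustration rather than gospel:

```python
# Payload bandwidth after line encoding for each link type (rough sketch;
# raw lane rates are assumed figures from the published specs).
def payload_gbps(lanes, gbps_per_lane, payload_bits, raw_bits):
    """Usable bandwidth = raw rate * encoding efficiency."""
    return lanes * gbps_per_lane * payload_bits / raw_bits

hdmi21  = payload_gbps(4, 12.0,   16, 18)   # HDMI 2.1 FRL 4x12G, 16b/18b -> ~42.67
tb4     = payload_gbps(2, 20.625, 64, 66)   # TB3/4 2x20.625G, 64b/66b    -> 40.00
hbr3_x4 = payload_gbps(4, 8.1,     8, 10)   # DP HBR3 x4, 8b/10b          -> 25.92
hbr2_x4 = payload_gbps(4, 5.4,     8, 10)   # DP HBR2 x4, 8b/10b          -> 17.28

print(f"HDMI 2.1 FRL    : {hdmi21:.2f} Gbps")
print(f"Thunderbolt 4   : {tb4:.2f} Gbps")
print(f"HBR3 x4 (1 disp): {hbr3_x4:.2f} Gbps")
print(f"2 x HBR2 x4     : {2 * hbr2_x4:.2f} Gbps")   # 34.56, the two-display case
```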
 
  • Like
Reactions: Chuckeee

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
HDMI 2.1 is only 42.67 Gbps after 16b/18b encoding.
Thunderbolt 4 is 40 Gbps after 64b/66b encoding.
The max display output we've seen from Thunderbolt 4 is ≈38.9 Gbps (XDR without DSC using dual HBR3 x4).
The max display output you can normally get from Thunderbolt 4 is 34.56 Gbps from two displays. A dual tile 5K60 display (dual HBR2 x4) is less than that.
The max display output you can normally get from Thunderbolt 4 for a single display is 25.92 Gbps (single HBR3 x4).

I don't know what the max bandwidth is for Apple Silicon HDMI 2.1 ports. Do they really support HDMI 2.1, or are they using a HBR3 x4 adapter?
I was wondering more about the difference between duplex vs. simplex.

Separately, what did you make of the statement from the documentation that HDMI is run over PCIe instead of coming from the GPU?
 
Last edited:

joevt

Contributor
Jun 21, 2012
6,689
4,086
I was wondering more about the difference between duplex vs. simplex.
I don't think it matters since Thunderbolt can't be simplex, unless it's outputting DisplayPort HBR3 which is 32.4 Gbps (or 25.92 Gbps after 8b/10b decoding).

Separately, what did you make of the statement from the documentation that HDMI is run over PCIe instead of coming from the GPU?
The documentation implies that HDMI subtracts from PCIe. However, it's only the USB on the USB/HDMI card that is using PCIe.

@deconstruct60 already explained that. The USB/HDMI card is connected to a special slot that has separate lines for the display signal. I don't know if those lines are DisplayPort and the card converts that to HDMI or if those lines are HDMI (the GPU produces HDMI or there's a DisplayPort to HDMI converter before the signal gets to the slot). In Mac Pro 2019, the signals to the I/O card are DisplayPort (dual HBR3 x4 to the discrete Thunderbolt controller on the card).
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
The documentation implies that HDMI subtracts from PCIe.


That 3rd-party website is using "USB/HDMI" as a substitute for "Apple I/O" as the adjective in front of the word 'card'. It is just an adjective, not a description of the edge connector(s). At best it is the external connectors that are being referenced, not the interior ones.


@deconstruct60 already explained that. The USB/HDMI card is connected to a special slot that has separate lines for the display signal. I don't know if those lines are DisplayPort and the card converts that to HDMI or if those lines are HDMI (the GPU produces HDMI or there's a DisplayPort to HDMI converter before the signal gets to the slot). In Mac Pro 2019, the signals to the I/O card are DisplayPort (dual HBR3 x4 to the discrete Thunderbolt controller on the card).

The MP 2019 IO card
[Photo: OWC teardown shot of the Mac Pro 2019 I/O card]




On the Mac Pro 2019, the Apple I/O card was set up the following way:

“…
The proprietary portion of slot 8 apparently carries 2x 4-lane DisplayPort main links, 2x DP AUX channels, 2x DP HPD signals, 2x S2I connections to the T2 chip for the audio codec, 2x USB 3.0 SuperSpeed signaling pairs, 4x USB 2.0 D+/D- pairs, and probably some sort of low-speed GPIO for the Thunderbolt controller. …”


[Note: that old thread is mostly about the Thunderbolt controllers all being assigned to Pool B, putting 'load' on the indicator. Apple appears to have just swapped that, dropping the other discrete port controller allocations onto Pool B instead. I suspect that is to make board layout easier and more independent of SoC package dimension specifics.]

Unlike the MPX connector, the second proprietary connector Apple had in the MP 2019 is likely just being reused. Strip the TB controller off and put a discrete USB 3.0 controller on (so swapping bill-of-materials costs). There are DP inputs that can just be run to DP-to-HDMI converters. So toss two HDMI connectors on (to consume the old TB port edge space).

Slot 8 in the MP 2019 is basically the same as slot 7 in the MP 2023. The current 'slot 8' is 100% proprietary now.

The M1 (and Apple TV) devices used a DP to HDMI 2.0 converter (Kinetic Technologies MCDP2920).




(The chip ID effort on teardowns seems to be receding now. The M2 mini teardown stops with the covers still on the chips next to the HDMI ports.)

Parade Tech has a similar one that does enough of HDMI 2.1 to cover 8K. (Realtek was reported to be working on one in 2019-20 also.)


(There are several DP-to-HDMI 2.1 dongles out there. Pretty likely there is some chipset converter that meets Apple's specs. Going to need one for the eventual Apple TV update anyway.)


If Apple were putting HDMI ports on the much higher volume MBA (and iPad) devices, then maybe they would mess with HDMI inside the SoC, but it is too easy to just put an embedded dongle in where they use HDMI (and buy the same IC across multiple product lines to drive down costs).


[The MP 2019 card also sent USB 3 from the PCH to the Type-A ports because Apple had pragmatically already paid for it anyway. It would be easy to skip those pins.]
 

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
The documentation implies that HDMI subtracts from PCIe. However, it's only the USB on the USB/HDMI card that is using PCIe.

@deconstruct60 already explained that. The USB/HDMI card is connected to a special slot that has separate lines for the display signal. I don't know if those lines are DisplayPort and the card converts that to HDMI or if those lines are HDMI (the GPU produces HDMI or there's a DisplayPort to HDMI converter before the signal gets to the slot). In Mac Pro 2019, the signals to the I/O card are DisplayPort (dual HBR3 x4 to the discrete Thunderbolt controller on the card).
Note the part that says the Apple I/O card cannot be installed in another slot. The edge of the card carries more than just PCI-e. The same thing was true of the MP 2019's I/O card. It is not a generic PCI-e standard card. That is why it doesn't install.
OK, I think I see what the two of you are saying:

While the HDMI is output via the I/O card, that card is not pure PCIe. The 2 x USB are delivered to the card via PCIe, but the 2 x HDMI are not; they could be coming directly from the GPU, or via GPU -> DP -> HDMI. And that would also explain why that slot, exceptionally, can't take a standard PCIe card: it accommodates Apple's I/O card only. It would also explain why the card doesn't require more than 4 PCIe lanes, since the full bandwidth of 2 x HDMI + 2 x USB-A ≈ 100 Gbps, while this is only a PCIe3 x4 card, and can thus deliver only 32 Gbps.

But if the card is only delivering 2 x 5 Gbps USB-A, then it does seem wasteful to use 4 PCIe lanes for that. Could they have made it a 2-lane card, enabling you to fill up the remainder of Pool B with a 2-lane card plus a 4-lane card?
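(A rough sanity check of that arithmetic, using nominal spec figures; just an illustrative sketch:)

```python
# Rough sanity check of the port-vs-slot arithmetic (nominal spec figures).
hdmi21_gbps = 48.0                   # raw FRL rate per HDMI 2.1 port
usb_a_gbps = 5.0                     # per 5 Gbps USB-A port
pcie3_lane_gbps = 8.0 * 128 / 130    # ~7.88 Gbps usable per PCIe gen 3 lane

external_ports = 2 * hdmi21_gbps + 2 * usb_a_gbps   # ~106 Gbps of connectors
x4_slot = 4 * pcie3_lane_gbps                       # ~31.5 Gbps over PCIe3 x4
usb_over_pcie = 2 * usb_a_gbps                      # ~10 Gbps actually on PCIe

print(external_ports, x4_slot, usb_over_pcie)
# The x4 link couldn't feed the HDMI ports anyway, which is consistent with the
# video arriving over the extra (non-PCIe) pins rather than over the x4 link.
```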
 

joevt

Contributor
Jun 21, 2012
6,689
4,086
OK, I think I see what the two of you are saying:

While the HDMI is output via the I/O card, that card is not pure PCIe. The 2 x USB are delivered to the card via PCIe, but the 2 x HDMI are not; they could be coming directly from the GPU, or via GPU -> DP -> HDMI. And that would also explain why that slot, exceptionally, can't take a standard PCIe card: it accommodates Apple's I/O card only. It would also explain why the card doesn't require more than 4 PCIe lanes, since the full bandwidth of 2 x HDMI + 2 x USB-A ≈ 100 Gbps, while this is only a PCIe3 x4 card, and can thus deliver only 32 Gbps.

But if the card is only delivering 2 x 5 Gbps USB-A, then it does seem wasteful to use 4 PCIe lanes for that. Could they have made it a 2-lane card, enabling you to fill up the remainder of Pool B with a 2-lane card plus a 4-lane card?
ifixit has a teardown video which shows the slots and the backsides of the two cards:
https://www.ifixit.com/News/77003/2023-mac-pro-teardown-still-grate

The USB/HDMI card looks like PCIe 8 lanes + extra pins for video and maybe audio. Maybe it's 8 lanes physical but only 4 lanes electrical.
The Thunderbolt card is not PCIe at all.

A PCIe card doesn't subtract from a pool unless it is transmitting data so it doesn't matter how many lanes it has. If it can only transmit 10 Gbps then that's less than one lane of PCIe gen 4.

Since it's an actual PCIe slot, you can replace the card with something more useful and get video+usb from the Thunderbolt ports.
 

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
ifixit has a teardown video which shows the slots and the backsides of the two cards:
https://www.ifixit.com/News/77003/2023-mac-pro-teardown-still-grate

The USB/HDMI card looks like PCIe 8 lanes + extra pins for video and maybe audio. Maybe it's 8 lanes physical but only 4 lanes electrical.
That's my understanding as well. I count 49 PCIe pins in the I/O card at top, corresponding to 8 physical PCIe lanes; the HDMI inputs are, I assume, the gold-plated pins on the left. But why would they connect four electrical PCIe lanes for 2 x USB-A?

[Attached photo of the I/O card's edge connectors]

A PCIe card doesn't subtract from a pool unless it is transmitting data so it doesn't matter how many lanes it has.
It sounds like you're saying that what the MP's Expansion Slot Utility is reporting is simply the number of electrical PCIe lanes connected, not the number of lanes that are reserved or used. Thus when it shows an 88% Pool B Allocation for the non-TB built-in I/O plus the I/O card, it just means that 7 PCIe electrical lanes are connected (4 from the I/O card plus 3 for the built-ins = 7; 7/8 = 88%). And if you plug in an additional 8-lane card and assign it to pool B, then the allocation should go to 15/8 = 188%, which is fine.

[Attached screenshot of Expansion Slot Utility]


If it can only transmit 10 Gbps then that's less than one lane of PCIe gen 4.

Just FYI, the I/O card is Gen 3, and it would actually need >1 PCIe3 lane for 10 Gbps. Not sure how that works if you have a Gen 3 card plugged into a Gen 4 slot.

More broadly, how does Pool allocation work?

For simplicity, let's imagine a toy model where you have 2 lanes of PCIe4 capacity (32 Gbps) in a given pool, and two x2 cards plugged in and running (call them cards X and Y) on that pool. Is the system limited to allocating 1 lane each to X and Y? Or, with pooling, can the entire 32 Gbps bandwidth be freely and dynamically shared between X and Y, such that if X needs 10 Gbps (≈0.6 lanes) and Y needs 20 Gbps (≈1.3 lanes), the system can fully supply both?
 
Last edited:

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
Here's another way to think about it: If we get rid of the I/O card (2 x HDMI + 2 x USB-A + audio jack which, with 8 TB4 ports, you might not need), that leaves us with 22 PCIe4 lanes, since the 2 x SATA6, 2 x 10 Gb Ethernet, Wi-Fi, BT, and 1 x USB-A collectively use 2 lanes, and the SSD uses 8.

That is an 'Alice in Wonderland, down the rabbit hole' way of looking at it. If you remove a PCI-e card from a host system, the change in bandwidth of that host system is ZERO. Nothing. Nada. Zip.

That card, detached from the host system has zero bandwidth ( no power -> no pragmatically useful bandwidth).

Don't like Apple's I/O card? Another standard, bus-powered, x4 PCI-e card that has suitable driver support will/should work just fine. The bandwidth isn't being 'lost' (or decremented); it is just unused.

Also doubtful that all of that MP 2023 I/O is using just 2 lanes. The SATA is likely an independent discrete controller. The Ethernet stuff is at best 1 independent controller (if not two). WiFi/BT ... again an independent controller. The USB-A is again likely a small independent controller (like with the iMac/Mini/etc.).
Intel rolls lots of that up into a PCH chip (along with some 'secondary' PCI-e lane provisioning). Modern PCH chips have 'dual use' lanes which are either SATA or something else, or USB or something else. Apple doesn't really build PCH chips.

Apple's SSD modules don't use pure, standard PCI-e. There is a custom 'storage PCI-e' protocol they use. From the descriptions it is suggestive that 8 lanes out of 16 are 'dual use' lanes. (The modules are not SSDs, but Apple does need to transmit data to/from the SSD controller to the modules with the NAND chips. It is a long enough distance that a subset of the PCI-e protocol is useful, but the whole thing is overhead they don't need. There is a max of just two modules plus the controller, so they don't need expansive addressing, hot plug, etc. The cheaper and smaller they can make the data transceiver chip, the better.)


If there is no SSD module, those 8 pins can then be reused for regular PCI-e (e.g., the second die in an Ultra). That makes the whole setup more die edge space efficient. But that is a double-edged sword.

It is really just x4 worth of bandwidth out to the SSD modules. The two x4 is more for concurrency (reading/writing to multiple NAND chips at the same time, and more capacity (addressing more NAND chips)) than it is for plain 'raw' RAID 0 speed. Mac Studio/Pro SSD speeds are in the sub-8,000 MB/s range. That isn't x8 PCI-e v4 speed.
Pragmatically this is somewhat like putting an x8 PCI-e v3 card in a PCI-e v4 host system on a direct link. There is a gap between potential and actually-used bandwidth.
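(Rough numbers as a sketch; the "observed" figure is just the ballpark quoted above:)

```python
# Potential vs. used bandwidth on the SSD link (sketch, nominal figures only).
V4_LANE_GBPS = 16.0 * 128 / 130         # ~15.75 Gbps usable per PCIe gen 4 lane

x8_v4_gbytes = 8 * V4_LANE_GBPS / 8     # ~15.8 GB/s potential for x8 v4
x4_v4_gbytes = 4 * V4_LANE_GBPS / 8     # ~7.9 GB/s potential for x4 v4
observed_gbytes = 8.0                   # sub-8 GB/s ballpark seen on Studio/Pro SSDs

print(f"x8 v4 potential: {x8_v4_gbytes:.1f} GB/s")
print(f"x4 v4 potential: {x4_v4_gbytes:.1f} GB/s")
print(f"observed       : <{observed_gbytes:.1f} GB/s")  # closer to x4 than x8
```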


The major substantive difference between the Mac Pro Ultra and the Mac Studio Ultra is that the second Max die's x16 is hooked to nothing. It is basically a 'bridge to nowhere', a bit like the UltraFusion connector on a solo Max die SoC: mega 'potential' bandwidth that is electrically hooked to nothing. It just soaks up extra die space in that context.


The Mac Pro appears to detach all the 3rd-party discrete controllers from the SoC. There are upsides: if you attach two x2 PCI-e v3 controllers to a PCI-e v4 capable switch, then it is possible the switch can merge those two x2 v3 data streams into a single faster x2 v4 data stream. There is less 'waste' than with direct connections. (Apple used this general approach on the MP 2013, where x4 v3 is used to provide backhaul for two v2 Thunderbolt controllers.)


In that case, here's the I/O the Ultra Studio (US) has that the MP doesn't (I've ignored the SD card reader and headphone jack) :
HDMI 2.1 + 1 x USB-A

The headphone jack is only missing because you tossed it with the I/O card. Ditto with the HDMI and USB. Some folks don't need those specific ones that Apple provides standard. Besides the rather tame, moderate-speed SD card reader, there is really nothing that is lacking on the Mac Pro option.

The empty slots mean end users can add what they want: a FireWire card, 4 Type-A 20 Gb/s sockets, optical audio with a custom DSP, etc. There are over 80 PCI-e add-in cards that work just fine with the Mac Pro 2023. If the end user places a higher value on discrete USB/audio/SD card controllers that Apple does not embed (bulk buy), then the Mac Pro has no 'deficit' here at all.
 

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
That is an 'Alice in Wonderland, down the rabbit hole' way of looking at it. If you remove a PCI-e card from a host system, the change in bandwidth of that host system is ZERO. Nothing. Nada. Zip.
Seriously, dude—do you live in your mother's basement and have zero social skills, or do you just like being an a**hole? Everyone else on this thread has been friendly and collegial. But you decided to spoil it by using gratuitously insulting language. Just saying "I don't think that's the right way of looking at it" would have been sufficient.

Imagine going to a scientific conference, and during the Q/A afterwards hearing a member of the audience say "That is an 'Alice in Wonderland, down the rabbit hole' way of looking at it." I've never heard something like that happen, because it's just an obvious slap in the face, and people at those conferences understand that intuitively. Not saying it never happens — I'm sure it does sometimes — but its rarity should tell you something.

And it's worth pointing this out, because this isn't the first time you've pulled that crap with me.

Plus, the problem is that you were so eager to deliver your put-down that you didn't bother to actually understand what I was saying. I was simply saying that if you want to calculate the difference in bandwidth between the MP and Studio, you can do the calculation either with the I/O card or without.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
772
1,652
More broadly, how does Pool allocation work?
I think you may need to take a step back, because my impression is that you're asking a lot of unfocused questions whose answers would be obvious with a better background in PCIe fundamentals. I might be wrong though, I haven't read all your posts in detail.

PCIe is a packet switched networking protocol, with three layers that map neatly into three of the seven layers of the OSI standard networking model: a physical layer, a data link layer, and a transaction layer. I found a decent looking overview here:


A PCIe bus/network consists of a root complex, switches, links, and endpoints. The root complex (RC) is a bridge between the host computer's memory system and one PCIe bus/network. Switches are optional, and broadly similar to ethernet switches. Links are point-to-point conduits for packets. Endpoints (EPs) are PCIe devices like add-in cards. A RC may provide multiple links for switches or endpoints to connect to. Each link runs at a speed and width independent from all other links. The network topology cannot include loops, it has to be a tree whose single root is the RC. (Leaf nodes in the tree are EPs, non-leaf nodes are switches.)

With all that in mind, the "pools" are simply two different PCIe links between the Apple SoC's RC and the Microchip PCIe switch on the Mac Pro's motherboard. There's 24 physical PCIe gen 4 lanes available to make that connection, and Apple groups them into one 16-lane link (pool A) and one 8-lane link (pool B). (As I believe has already been mentioned somewhere, it's not possible to aggregate all 24 lanes into one link because PCIe link aggregation only permits 1, 2, 4, 8, 16, and 32-lane wide links. Yes, 32 lane links really are allowed, but nobody ever implements them in the real world.)
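(A trivial sketch of that lane-grouping constraint, just to illustrate why 24 lanes can't form a single link; the 16+8 split is Apple's choice, since other legal groupings like 8+8+8 would also work:)

```python
# Why 24 physical lanes end up as a 16-lane link plus an 8-lane link (sketch).
ALLOWED_WIDTHS = (32, 16, 8, 4, 2, 1)   # legal PCIe link widths, widest first

def group_lanes(lanes):
    """Greedily group physical lanes into the widest legal link widths."""
    links = []
    for width in ALLOWED_WIDTHS:
        while lanes >= width:
            links.append(width)
            lanes -= width
    return links

print(group_lanes(24))   # [16, 8] -> one x16 link (pool A) + one x8 link (pool B)
```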

Because PCIe does not permit loops in the network topology, it's illegal for the Microchip switch to route a single PCIe slot's packets through a combination of both the pool A and pool B links. Therefore, the switch provides configuration to allow the host (macOS) to program the switch's virtual connectivity. Each 'downstream' port on the switch (the ones connected to PCIe slots) has its packets routed through only one of the 'upstream' or pool ports. This effectively splits the switch into two independent switches.
 
  • Like
Reactions: Chuckeee

joevt

Contributor
Jun 21, 2012
6,689
4,086
I count 49 PCIe pins in the I/O card at top, corresponding to 8 physical PCIe lanes; the HDMI inputs are, I assume, the gold-plated pins on the left. But why would they connect four electrical PCIe lanes for 2 x USB-A?
The other options are x8 - too many for what's available, x2 - no slot uses x2 but some cards are x2, and x1 - too few for lots of stuff. With x4, you can connect gen 3 cards and get 31.5 Gbps, with gen 2 cards you get 16 Gbps, and with gen 1 cards you get 8 Gbps.
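Those x4 figures fall out of lane rate times encoding efficiency (a quick sketch, ignoring packet/protocol overhead):

```python
# Usable x4 bandwidth per PCIe generation = lanes * GT/s * encoding efficiency.
# (Sketch only; packet/protocol overhead shaves off a bit more in practice.)
GENS = {
    1: (2.5, 8 / 10),       # gen 1: 2.5 GT/s, 8b/10b encoding
    2: (5.0, 8 / 10),       # gen 2: 5.0 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),    # gen 3: 8.0 GT/s, 128b/130b encoding
    4: (16.0, 128 / 130),   # gen 4: 16 GT/s, 128b/130b encoding
}

for gen, (gt_s, eff) in GENS.items():
    print(f"gen {gen}: x4 ≈ {4 * gt_s * eff:.1f} Gbps")
# gen 1: 8.0, gen 2: 16.0, gen 3: 31.5, gen 4: 63.0
```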

It sounds like you're saying that what the MP's Expansion Slot Utility is reporting is simply the number of electrical PCIe lanes connected, not the number of lanes that are reserved or used. Thus when it shows an 88% Pool B Allocation for the non-TB built-in I/O plus the I/O card, it just means that 7 PCIe electrical lanes are connected (4 from the I/O card plus 3 for the built-ins = 7; 7/8 = 88%). And if you plug in an additional 8-lane card and assign it to pool B, then the allocation should go to 15/8 = 188%, which is fine.
The bandwidth that a card can use is a product of the link width (number of lanes) AND link rate. So an x16 gen 3 card can use only up to half the bandwidth of a x16 gen 4 card.
I don't know if the utility is smart enough to take into consideration the protocol of a device. For example: USB 3.0 is only 4 Gbps; SATA is only 4.8 Gbps; USB 3.1 gen 2 is only 9.7 Gbps; Thunderbolt is ≈24 Gbps. etc.

Just FYI, the I/O card is Gen 3, and it would actually need >1 PCIe3 lane for 10 Gbps. Not sure how that works if you have a Gen 3 card plugged into a Gen 4 slot.
A gen 3 card in a gen 4 slot will operate at gen 3 link rate.
The ASMedia ASM1142 is a PCIe gen 3 x1 or PCIe gen 2 x2 device, so it can do 10 Gbps USB (9.7 Gbps) only up to ≈8 Gbps.

More broadly, how does Pool allocation work?

For simplicity, let's imagine a toy model where you have 2 lanes of PCIe4 capacity (32 Gbps) in a given pool, and two x2 cards plugged in and running (call them cards X and Y) on that pool. Is the system limited to allocating 1 lane each to X and Y? Or, with pooling, can the entire 32 Gbps bandwidth be freely and dynamically shared between X and Y, such that if X needs 10 Gbps (≈0.6 lanes) and Y needs 20 Gbps (≈1.3 lanes), the system can fully supply both?
PCIe is like Ethernet or USB. It's a network of devices connected together. A device can use all the bandwidth if it's the only device doing anything at that moment. The devices take turns sending packets. The packets are sent at max speed (say 32 Gbps), but the packets can't be sent consecutively if packets from other devices are also being transmitted (they share time), so the average bandwidth decreases while a device is waiting for its turn to transmit again.

If the total bandwidth that multiple devices connected to a pool can use is less than the bandwidth available to that pool, then they should have little issue performing at their max speed. This is what the Expansion Slot Utility indicates with < 100% allocation to a pool.
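Here's a minimal sketch of the lane-counting interpretation of that allocation figure from a few posts up (illustration only; as noted, I don't know whether the real utility also weighs link rate or device protocol):

```python
# Sketch of the lane-counting interpretation of "Pool Allocation" (illustration
# only; the real utility may also weigh link rate or device protocol).
def pool_allocation_pct(device_lanes, pool_lanes):
    """Sum the electrical lanes attached to a pool vs. the pool's own width."""
    return 100 * sum(device_lanes) / pool_lanes

# Pool B (x8): the x4 I/O card plus ~3 lanes of built-in I/O -> 7/8 ≈ 88%
print(f"{pool_allocation_pct([4, 3], 8):.0f}%")

# Add an x8 card to the same pool: 15/8 -> 188%. Oversubscription is fine;
# the devices simply time-share the pool's upstream link.
print(f"{pool_allocation_pct([4, 3, 8], 8):.0f}%")
```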

PCIe link rate and link width can be different between upstream link and downstream link. You could have 20 downstream links at gen 4 x16 and a single upstream link of gen 1 x1 which would be horrible for some devices but should still work. People do this with GPUs mining bitcoin (except the downstream links are x1 because why give them x16 lanes when the upstream is only x1?).

The pools in the Mac Pro 2019 and 2023 are provided by a PCIe switch.
The Mac Pro 2019 has 4 pools (each gen 3 x16) from the CPU PCIe lanes: slot 1, slot 3, Pool A, and Pool B.
Pool A and Pool B are two upstream links to a 96 lane gen 3 PCIe switch which leaves 64 lanes for downstream links. The Expansion Slot Utility lets you choose which upstream link (Pool A or Pool B) each of the downstream links are connected to (except some downstream links are hard coded to Pool B to make the device tree and ACPI simpler).
 
  • Like
Reactions: theorist9

theorist9

macrumors 68040
May 28, 2015
3,710
2,812
PCIe is like Ethernet or USB. It's a network of devices connected together. A device can use all the bandwidth if it's the only device doing anything at that moment. The devices take turns sending packets. The packets are sent at max speed (say 32 Gbps), but the packets can't be sent consecutively if packets from other devices are also being transmitted (they share time), so the average bandwidth decreases while a device is waiting for its turn to transmit again.
Ah, you're saying different PCIe cards in a given pool share bandwidth via time-sharing rather than lane-sharing.

Thus suppose you have two x16 cards (call them X and Y) allocated to Pool B (16 lanes total) and, during a certain time interval, card X needs 3/16 of the Pool B bandwidth, and card Y needs 13/16 of the Pool B bandwidth. Then, rather than assigning 3 lanes to card X and 13 to card Y during that time, the controller would instead allocate 3/16 of the transmission time to card X, and 13/16 to card Y (less the small amount of time lost due to switching).

Further, since it's time-sharing, the allowed division of bandwidth is of course not restricted based on fraction of lanes. E.g., it could be 34.5% to card X and 65.5% to card Y. And that can obviously change dynamically as the bandwidth needed by each card changes. Plus, if both cards need >50% of the bandwidth, I assume that priority can be assigned, as is commonly done with other applications of time-sharing.
 
Last edited:

mr_roboto

macrumors 6502a
Sep 30, 2020
772
1,652
Ah, you're saying different PCIe cards in a given pool share bandwidth via time-sharing rather than lane-sharing.

Thus suppose you have two x16 cards (call them X and Y) allocated to Pool B (16 lanes total) and, during a certain time interval, card X needs 3/16 of the Pool B bandwidth, and card Y needs 13/16 of the Pool B bandwidth. Then, rather than assigning 3 lanes to card X and 13 to card Y during that time, the controller would instead allocate 3/16 of the transmission time to card X, and 13/16 to card Y (less the small amount of time lost due to switching).

Further, since it's time-sharing, the allowed division of bandwidth is of course not restricted based on fraction of lanes. E.g., it could be 34.5% to card X and 65.5% to card Y. And that can obviously change dynamically as the bandwidth needed by each card changes. Plus, if both cards need >50% of the bandwidth, I assume that priority can be assigned, as is commonly done with other applications of time-sharing.
Correct - this is a packet switched system. At any instant in time, the lanes in a link are never fractionated out, they're always ganged together and being used to transmit a single packet. (Technically two as PCIe links are full duplex, so receive and transmit can and do overlap.)

The way that transmission time is allocated is both more and less complicated than percentages. PCIe uses a credit-based flow control system, in which endpoints periodically receive allocations of credits from the other end of the link. Before transmitting a packet, an EP must consult its credit bucket for that kind of packet (there are several kinds, each with its own credit system). If the EP has nonzero credits available, it uses a credit and transmits, otherwise it has to delay transmission until the other end of the link gives it a credit. This permits the RC and switches to implement a very wide variety of priority and bandwidth allocation schemes.
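To make the sharing picture concrete, here's a toy sketch (illustration only; the real arbitration is the credit/priority machinery just described, not a simple proportional split):

```python
# Toy illustration of two cards time-sharing one link (not the real arbitration).
def share_link(demands_gbps, link_gbps):
    """Scale each card's demand down only if the link is oversubscribed."""
    total = sum(demands_gbps.values())
    scale = min(1.0, link_gbps / total)
    return {card: demand * scale for card, demand in demands_gbps.items()}

# X wants 10 Gbps, Y wants 20 Gbps on a 32 Gbps link: both are fully supplied.
print(share_link({"X": 10, "Y": 20}, 32))   # {'X': 10.0, 'Y': 20.0}

# Both want 24 Gbps: the link is oversubscribed; absent a priority scheme,
# each ends up with about half the link.
print(share_link({"X": 24, "Y": 24}, 32))   # {'X': 16.0, 'Y': 16.0}
```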
 
  • Like
Reactions: theorist9

joevt

Contributor
Jun 21, 2012
6,689
4,086
Correct - this is a packet switched system. At any instant in time, the lanes in a link are never fractionated out, they're always ganged together and being used to transmit a single packet. (Technically two as PCIe links are full duplex, so receive and transmit can and do overlap.)

The way that transmission time is allocated is both more and less complicated than percentages. PCIe uses a credit-based flow control system, in which endpoints periodically receive allocations of credits from the other end of the link. Before transmitting a packet, an EP must consult its credit bucket for that kind of packet (there are several kinds, each with its own credit system). If the EP has nonzero credits available, it uses a credit and transmits, otherwise it has to delay transmission until the other end of the link gives it a credit. This permits the RC and switches to implement a very wide variety of priority and bandwidth allocation schemes.
Does this credit system (or one of the bandwidth allocation schemes) work with isochronous data such as for video input/output where a guaranteed amount of bandwidth is required to eliminate frame drops? Maybe not - in that case you really want to make sure your video input/output device doesn't need to compete with something else that would cause frames to drop.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
772
1,652
Does this credit system (or one of the bandwidth allocation schemes) work with isochronous data such as for video input/output where a guaranteed amount of bandwidth is required to eliminate frame drops? Maybe not - in that case you really want to make sure your video input/output device doesn't need to compete with something else that would cause frames to drop.
Hmm, after looking things up to refresh my memory I might have misspoken a bit. The credit based flow control is real, but seems to be more about lower level problems than "I want my video stream to have guaranteed bandwidth". An example to illustrate what FC credits are for: Consider a PCIe switch in store-and-forward mode. Each switch port will have a receive buffer that can store N packets to be forwarded to other ports. To prevent the downstream EP or switch connected to a port from overflowing the port's receive buffer, the switch hands out an initial allotment of N credits, and only replenishes a credit each time it's able to forward a packet from the receive buffer to a different port's transmit queue.
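A minimal sketch of that receive-buffer credit loop, just to show the shape of it (illustration only; real PCIe tracks separate credit types per packet class):

```python
# Minimal sketch of receive-buffer flow-control credits (illustration only).
from collections import deque

class SwitchPort:
    def __init__(self, buffer_slots):
        self.credits = buffer_slots       # initial allotment advertised to the sender
        self.rx_buffer = deque()

    def sender_can_transmit(self):
        return self.credits > 0

    def receive(self, packet):
        assert self.credits > 0, "sender must wait for a credit"
        self.credits -= 1                 # sender spends one credit per packet
        self.rx_buffer.append(packet)

    def forward_one(self):
        """Forward a buffered packet onward and return one credit to the sender."""
        packet = self.rx_buffer.popleft()
        self.credits += 1
        return packet

port = SwitchPort(buffer_slots=2)
port.receive("pkt1"); port.receive("pkt2")
print(port.sender_can_transmit())   # False: buffer full, the sender stalls
port.forward_one()
print(port.sender_can_transmit())   # True: a credit was returned
```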

The mechanism for priority seems to be traffic classes and virtual channels. These can be used to give some packet streams routing priority over others. They seem to be at least partially optional features so they might not be available in all systems. I don't think there's anything built into PCIe to support truly isochronous mode traffic, where there is a guaranteed amount of bandwidth allocated to a specific flow - and yes, that's why someone might really want to optimize their Mac Pro with the Expansion Slot utility, because you can make sure all your capture cards are located in a pool that is undersubscribed and let lower priority stuff sit in the oversubscribed pool.

In general you have to remember that PCIe was designed to emulate and preserve the semantics of PCI, a fairly straightforward and simple parallel expansion bus. It supports read and write on three address spaces: memory, IO, and configuration. Memory is the normal transaction type, IO is a legacy x86 thing that people almost universally pretend doesn't exist (even on x86) because memory-mapped IO is better, and configuration is a special address space used during startup to probe the bus for devices and configure their MMIO address ranges. Parallel PCI didn't have any notion of priority that I can recall, so I think serial PCI (aka PCIe) only has it as an afterthought.
 

sideshowuniqueuser

macrumors 68030
Mar 20, 2016
2,862
2,875
Given that PCIe 4 support is just now being rolled out (including for AMD and Intel-based systems), it will be a while before anyone starts supporting PCIe 5 in any significant manner.
Intel Gen 12 chips (released Oct 2021) have PCIe 5.
https://en.wikipedia.org/wiki/PCI_Express#PCI_Express_5.0

Apple M3 still doesn't have PCIe 5. In fact, the base M3 chip is still on PCIe 3, you have to get an M1/2/3 Pro or better chip to get PCIe 4.

Why, oh why, has Apple become the laughing stock of the latest tech? Back in Steve Jobs's day, they were often first adopters. Now they are often last adopters. Premium prices, for older tech.

Just hurry up and die Tim.
 
Last edited:

thenewperson

macrumors 6502a
Mar 27, 2011
953
856
Apple M3 still doesn't have PCIe 5. In fact, the base M3 chip is still on PCIe 3, you have to get an M1/2/3 Pro or better chip to get PCIe 4.

The base chips from M1 do have PCIe 4, but the Macs themselves use PCIe 3 flash.

Why, oh why, has Apple become the laughing stock of the latest tech? Back in Steve Jobs's day, they were often first adopters. Now they are often last adopters. Premium prices, for older tech.

Just hurry up and die Tim.

 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
Intel Gen 12 chips (released Oct 2021) have PCIe 5.
https://en.wikipedia.org/wiki/PCI_Express#PCI_Express_5.0

Apple M3 still doesn't have PCIe 5. In fact, the base M3 chip is still on PCIe 3, you have to get an M1/2/3 Pro or better chip to get PCIe 4.
The M1 had (a couple of) x1 PCI-e v4 lanes. (Go back to post #2 of this thread https://forums.macrumors.com/threads/when-will-apple-silicon-support-pcie-5.2288812/post-29699069 and some others on the first page.)

In terms of 'end user' available PCI-e data ... yeah, that is buried in the TBv3 implementation on the plain Mn packages so far.

Apple has used v4 more to reduce the number of lanes out of the SoC die than to win some bandwidth 'war'. 10GbE fits comfortably on x1 v4. WiFi 6E plus Bluetooth would too. So, single lane(s) out to the support chip(s) that Apple uses for embedded basic I/O. (If you squeezed Wi-Fi 7 for every last drop of bandwidth, x1 v4 would be a limitation, but >10 Gb/s WiFi would saturate the RF capacity of the vast majority of Mac deployment locations. Never mind that Macs/iPads won't have the antenna structures to do 'max 7' either. Real-life WiFi 7 for most folks is going to be about 4-7 Gb/s, which x1 v4 will handle.)

Why, oh why, has Apple become the laughing stock of the latest tech. Back in Steve Job's day, they were often first adopters. Now they are often last adopters. Premium prices, for older tech.

Except for the Mac Pro, no Mac system has user access to 'pure raw' PCI-e ... so of those non-SSD, non-dGPU chips, which ones grossly need PCI-e v5? "Maximum Wi-Fi 7" on a client doesn't really exist yet. Apple isn't a huge 20GbE fanboy.

The Apple SSD modules are on a proprietary offshoot of PCI-e. Individual NAND chips aren't getting that much faster. (PCI-e v5 SSDs are doing lots of 'tricks' to crank speeds higher that are not all flowing out of the NAND chips.) Apple is due for a storage interface increase. Likely, part of the delay is trying to speed up without dramatically changing the energy consumption / waste heat parameters. Few folks' PCI-e v5 SSDs are running faster and even cooler than before.

dGPUs aren't really an "early" or "late" adoption question. Apple has chosen a completely different path where there just aren't any dGPUs, period. Some folks may not like it, but it is new tech.

PCI-e v4 via TBv4 ... that isn't widely deployed now either (early 2024). 2025 will likely be when that gets any kind of substantive footprint (some stuff will dribble out as 2024 progresses, but it isn't going 'wide'; AMD isn't going to do much and Intel will only dribble it out to a few configurations).
 

sideshowuniqueuser

macrumors 68030
Mar 20, 2016
2,862
2,875
Apple Silicon has no need for PCIe internally, so it’s only relevant to external connectors, meaning Thunderbolt. Current Thunderbolt3 is still based on PCIe 3.0 though. I suppose once the USB consortium decides to integrate newer versions of PCIe, Apple’s implementation will follow. It might take a while.

Why do you ask anyway?
All computers use PCIe for connections to all kinds of internal chips and external peripherals. For Apple Silicon, it is certainly relevant for the connection to the SSD, as NVMe connects via PCIe. Newer, faster PCIe versions support newer, faster SSDs.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
772
1,652
All computers use PCIe for connections to all kinds of internal chips and external peripherals. For Apple Silicon, it is certainly relevant for the connection to the SSD, as NVMe connects via PCIe. Newer, faster PCIe versions, support newer, faster, SSDs.
Apple does not use off the shelf PCIe NVMe SSDs. The "rules" you've internalized from PCs do not apply.

Apple integrates their own SSD controller directly into the M-series SoC. It connects to the internal "Apple Fabric" bus that links CPUs and high bandwidth peripherals to DRAM.

This SSD controller connects to flash modules over standard x1 PCIe links, one per module. (No, it's not a proprietary offshoot of PCIe, @deconstruct60, please don't confuse things like you so often do. Lots of your post up there was just nonsense.)

An Apple flash module is several flash memory die packaged together with a PCIe-to-NAND bridge IC designed by Apple. These bridges are not full SSD controllers - they don't implement the flash translation layer which does all the LBA-to-physical mapping and wear leveling. They just make it possible to read and write flash memory over PCIe.

The number of x1 PCIe lanes available for flash modules depends on which SoC you have. For base M1/M2/M3, it's 4 lanes (so, 4 modules maximum). For M1/M2/M3 Pro/Max/Ultra, it's 8 lanes, or 8 modules maximum. This means Apple already has the potential for SSD throughput equivalent to Gen5 x4 M.2 SSDs - gen4 x8 is the same bandwidth.
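(A quick check of that equivalence, as a sketch with nominal per-lane rates:)

```python
# Gen 5 x4 vs. gen 4 x8: same usable bandwidth (sketch, ignoring protocol overhead).
EFF = 128 / 130                 # 128b/130b encoding, used by gen 3 and later
gen4_lane_gbps = 16.0 * EFF     # ~15.75 Gbps per gen 4 lane
gen5_lane_gbps = 32.0 * EFF     # ~31.51 Gbps per gen 5 lane

print(f"gen 4 x8: {8 * gen4_lane_gbps:.0f} Gbps")   # ~126 Gbps
print(f"gen 5 x4: {4 * gen5_lane_gbps:.0f} Gbps")   # ~126 Gbps
```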

Getting angry because Apple hasn't jumped on PCIe gen 5 yet is silly. There's not much market penetration of gen 5 yet, and the most important thing which might make use of it (GPUs) isn't a big deal on Apple Silicon since Apple has chosen to not support third party GPUs.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
All computers use PCIe for connections to all kinds of internal chips and external peripherals. For Apple Silicon, it is certainly relevant for the connection to the SSD, as NVMe connects via PCIe.

There are no conventional NVMe SSDs in a Mac. Apple's SSD is not a conventional NVMe SSD. It may present to the user in the OS utilities as such, but it is not. There is no NVMe SSD slot in any of the plain Mn powered Mac systems at all. Not in the Mn Pro or Mn Max either.

The modules in a Mac Studio or Mac Pro are NOT SSDs.

Newer, faster PCIe versions, support newer, faster, SSDs.

For the Mac Pro, there are no native M.2 slots on that system either. To add an M.2 (or other standard-format SSD; ES1, U.2, etc.) you would need a card. If you select a card with a PCI-e v5 to v4 switch on it, that already enables you to put two x4 PCI-e v5 SSDs in a Mac Pro and have it work (in an x16 PCI-e v4 slot: x4 v5 == x8 v4; 2 * x8 = x16. Done.)

You can't do 3-4 v5 SSDs, but 1-2 is already covered. It won't be the cheapest possible path, but the Mac Pro at $6K is already not the cheapest possible path.
 