OK, I found an official source saying that 8 Gb/s is preallocated for video:
That's the crappy marketing material I linked previously.
"...to fill this Thunderbolt link, the silicon extracts and routes up to 4 lanes of PCI Express Gen 3(4 x 8 Gb/s) and up to two full (4 lane) links of DisplayPort out over the Thunderbolt cable and connector to the device(s) attached downstream from the host system."
Yup.
Thus, since its max bandwidth is 40 Gb/s, but the parts that can carry data are limited to 32 Gb/s, the remaining 8 Gb/s is video only.
Just because your car can only go 32 mph on a 40 mph highway doesn't mean the other 8 mph is reserved for someone else's car. It just means your car is not as fast as it could be.
All we know is, for a discrete Thunderbolt controller, the upstream for the controller as a host is PCIe gen 3 x4, and the downstream for the controller as a peripheral is also PCIe gen 3 x4. A single PCIe device can do up to 31.5 Gbps (after 128b/130b encoding) or ≈28 Gbps (after PCIe protocol).
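As a rough check of the 31.5 Gbps figure, here is a minimal sketch of the encoding arithmetic. It assumes only the standard PCIe gen 3 numbers (8 GT/s per lane, 128b/130b encoding); the function name is made up for illustration, and the further drop to ≈28 Gbps from protocol overhead is not modelled.

```python
def pcie_gen3_payload_gbps(lanes: int = 4) -> float:
    """Bandwidth of a PCIe gen 3 link after 128b/130b encoding, in Gbps."""
    raw_gbps = 8.0 * lanes          # 8 GT/s per lane, one bit per transfer
    return raw_gbps * 128 / 130     # 128 payload bits per 130 bits on the wire


if __name__ == "__main__":
    # x4 link: 32 Gbps raw, about 31.51 Gbps after encoding
    print(f"{pcie_gen3_payload_gbps(4):.2f} Gbps")
```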
That tech brief (too brief) was written before integrated Thunderbolt controllers. The upstream of an integrated host Thunderbolt controller is not PCIe gen 3 x4. For integrated Thunderbolt controllers with two ports (Ice Lake and Tiger Lake), it is known that the upstream of the controller is greater than PCIe gen 3 x4. So is there a way for them to get more than PCIe gen 3 x4 from a single port if there are multiple devices connected to the port? Probably not, but it would technically be possible since the Thunderbolt connection is 40 Gbps.
Now what about the 2750 MB/s limit for TB3/4 data devices? The doc doesn't say, but it could be that, of the 32 Gb/s available for data, TB additionally reserves 10 Gb/s for USB Gen 2 devices.
If anything is reserved for USB, it is reserved as tunnelled PCIe data since the USB controllers are PCIe XHCI controllers built into the Thunderbolt controller chip.
Thunderbolt doesn't reserve 10 Gbps for USB. Apple can use up to 38.9 Gbps for the Apple Pro Display XDR. You can connect two displays that use up to 34.56 Gbps total. When I tried, the max display bandwidth I could extract using two HBR2 connections over a single Thunderbolt cable was 31 Gbps (each display at 4096 x 2304 x 68.595 Hz x 24 bpp ≈ 15.5 Gbps of active pixel data, with a 681.76 MHz pixel clock).
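The display arithmetic can be sketched as follows. This is only an illustration using the mode quoted in the post; the function names are hypothetical. It shows that the ≈15.5 Gbps per display figure matches active pixels only, while the 681.76 MHz pixel clock (which includes blanking) works out somewhat higher at 24 bpp.

```python
def active_pixel_gbps(width: int, height: int, refresh_hz: float, bpp: int) -> float:
    """Bandwidth of the visible pixels only (no blanking), in Gbps."""
    return width * height * refresh_hz * bpp / 1e9


def pixel_clock_gbps(pixel_clock_mhz: float, bpp: int) -> float:
    """Bandwidth implied by the full pixel clock (blanking included), in Gbps."""
    return pixel_clock_mhz * 1e6 * bpp / 1e9


if __name__ == "__main__":
    one = active_pixel_gbps(4096, 2304, 68.595, 24)
    print(f"one display, active pixels: {one:.2f} Gbps")      # 15.54
    print(f"two displays, active pixels: {2 * one:.2f} Gbps")  # 31.07
    print(f"one display, pixel clock: {pixel_clock_gbps(681.76, 24):.2f} Gbps")  # 16.36
```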
That would leave 22 Gb/s = 2750 MB/s for the TB3 device.
Except we've seen 25 Gbps benchmarks.
Finally, the document explains that TB gives priority to video over data (makes sense--a frozen display would be much more irritating than having to wait longer for data transmittal) and, from one of their graphics (screenshotted below), it appears it can allocate up to 32 Gb/s for video (less headroom), which means it uses the 8 Gb/s from the DisplayPort and borrows an additional 24 Gb/s from PCIe. This allows it to drive two 4096 x 2160 displays @30bpp*/60 Hz, without compression (*30 bits per pixel = 10-bit color with full color depth, i.e., 4:4:4):
Yes, 32 Gbps is about the limit of two HBR2 connections (34.56 Gbps total).
But Apple can do two HBR3 connections up to 38.9 Gbps.
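For reference, the HBR2 and HBR3 limits come from the standard DisplayPort link rates (5.4 and 8.1 Gbps per lane, 4 lanes, 8b/10b encoding). A small sketch, with a hypothetical function name, shows why two HBR2 links fit in a 40 Gbps Thunderbolt link while two fully loaded HBR3 links would not:

```python
def dp_payload_gbps(lane_rate_gbps: float, lanes: int = 4) -> float:
    """DisplayPort payload rate after 8b/10b encoding, in Gbps."""
    return lane_rate_gbps * lanes * 8 / 10


if __name__ == "__main__":
    hbr2 = dp_payload_gbps(5.4)   # 17.28 Gbps per link
    hbr3 = dp_payload_gbps(8.1)   # 25.92 Gbps per link
    print(f"two HBR2 links: {2 * hbr2:.2f} Gbps")  # 34.56, fits in 40 Gbps
    print(f"two HBR3 links: {2 * hbr3:.2f} Gbps")  # 51.84, more than 40 Gbps
```

So two HBR3 connections can only be used up to what the Thunderbolt link carries, which is consistent with the 38.9 Gbps figure above.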
For my dual HBR2 tests, instead of using 30bpp, I used 24bpp so I wouldn't have to worry about macOS switching from 10bpc to 8bpc. Then I just increased the refresh rate from 60Hz until it would stop allowing the resolution (68.595 Hz).
Anyway this talk of reserved or allocated bandwidth is overly complicated and not very descriptive of what we're seeing. Isn't it simpler to say that DisplayPort uses what is required for the current display mode of each connected display, then whatever remains up to ≈22 - 25 Gbps (less than 26 Gbps so far) can be used for PCIe?
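That simpler description can be written down as a toy model. This is a sketch under stated assumptions, not how the controller actually schedules traffic: the 40 Gbps link size and the ≈25 Gbps PCIe ceiling are just the figures discussed in this thread, the function name is hypothetical, and tunnelling overhead is ignored.

```python
from typing import Sequence


def pcie_available_gbps(display_gbps: Sequence[float],
                        link_gbps: float = 40.0,
                        pcie_cap_gbps: float = 25.0) -> float:
    """Bandwidth left for PCIe after DisplayPort takes what its modes need.

    display_gbps:  bandwidth required by each connected display's current mode
    link_gbps:     Thunderbolt link size (assumed 40 Gbps)
    pcie_cap_gbps: observed PCIe ceiling (assumed 25 Gbps, from benchmarks)
    """
    remaining = link_gbps - sum(display_gbps)
    return max(0.0, min(pcie_cap_gbps, remaining))


if __name__ == "__main__":
    print(pcie_available_gbps([]))            # no displays: PCIe ceiling applies
    print(pcie_available_gbps([15.5]))        # one display from the HBR2 test
    print(pcie_available_gbps([15.5, 15.5]))  # both displays from the HBR2 test
```

With no displays the model returns the PCIe ceiling; with both test displays attached it leaves about 9 Gbps for PCIe.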
For Thunderbolt 4 and USB4, new tests need to be done and the tests should add cases that include USB tunnelling. For example, what's the max total bandwidth that can be achieved using USB tunnelling and PCIe tunnelling as measured by a benchmark like ATTO Disk Benchmark.app that can measure multiple devices at once? Maybe this has more of a chance of exceeding 26 Gbps since the tunnelled USB is totally separate from tunnelled PCIe so there's no squeezing (however unnecessarily) of multiple Thunderbolt devices into a PCIe gen 3 x4 bottleneck.