I've mapped all the non-ground, non-power pins for the first layer of the front of the CPU Riser Card. The text isn't quite legible, but it is easy enough to correlate it with the color-coded spreadsheet, or with the portion of it I used for the mapping (I'll include that here as well). Who knew connect-the-dots, 'staying in the lines', and using a line to traverse a maze would come in so handy later in life!

I've confirmed there are 16 PCIe 3.0 transmit pairs and 4 PCIe 2.0 transmit pairs mapped on this layer.

Top layer, front side of the board. I've circled the 47 pads in the spreadsheet. I guess I'll list them here too for completeness, in the spreadsheet's pad | signal | buffer type | direction format (the PCIe 3.0 entries carry an extra ;2Bxx cross-reference). The PCIe 3.0 transmit pairs are in bold.

C45|DMI_TX_DP[3]|PCIEX|O
E45|DMI_TX_DN[3]|PCIEX|O

B44|DMI_TX_DP[2]|PCIEX|O
D44|DMI_TX_DN[2]|PCIEX|O

C43|DMI_TX_DP[1]|PCIEX|O
E43|DMI_TX_DN[1]|PCIEX|O

B42|DMI_TX_DP[0]|PCIEX|O
D42|DMI_TX_DN[0]|PCIEX|O

J51|PE3A_TX_DP[1]|PCIEX3|O;2B74
L51|PE3A_TX_DN[1]|PCIEX3|O;2B75

H50|PE3A_TX_DP[0]|PCIEX3|O;2B78
K50|PE3A_TX_DN[0]|PCIEX3|O;2B79

P52|PE3B_TX_DP[4]|PCIEX3|O;2B62
T52|PE3B_TX_DN[4]|PCIEX3|O;2B63

R51|PE3B_TX_DP[5]|PCIEX3|O;2B58
U51|PE3B_TX_DN[5]|PCIEX3|O;2B59

P50|PE3B_TX_DP[6]|PCIEX3|O;2B54
T50|PE3B_TX_DN[6]|PCIEX3|O;2B55

R49|PE3B_TX_DP[7]|PCIEX3|O;2B50
U49|PE3B_TX_DN[7]|PCIEX3|O;2B51

P48|PE3A_TX_DP[3]|PCIEX3|O;2B66
T48|PE3A_TX_DN[3]|PCIEX3|O;2B67

R47|PE3A_TX_DP[2]|PCIEX3|O;2B70
U47|PE3A_TX_DN[2]|PCIEX3|O;2B71

P46|PE3C_TX_DP[8]|PCIEX3|O;2B45
T46|PE3C_TX_DN[8]|PCIEX3|O;2B46

R45|PE3C_TX_DP[10]|PCIEX3|O;2B37
U45|PE3C_TX_DN[10]|PCIEX3|O;2B38

P44|PE3D_TX_DN[15]|PCIEX3|O;2B14
T44|PE3D_TX_DP[15]|PCIEX3|O;2B15


R43|DDR_SDA_C23|ODCMOS|I/O
U43|DDR_SCL_C23|ODCMOS|I/O

AA47|PE3C_TX_DP[9]|PCIEX3|O;2B41
AC47|PE3C_TX_DN[9]|PCIEX3|O;2B42

Y46|PE3C_TX_DP[11]|PCIEX3|O;2B33
AB46|PE3C_TX_DN[11]|PCIEX3|O;2B34

AA45|PE3D_TX_DP[12]|PCIEX3|O;2B27
AC45|PE3D_TX_DN[12]|PCIEX3|O;2B28

Y44|PE3D_TX_DP[13]|PCIEX3|O;2B23
AB44|PE3D_TX_DN[13]|PCIEX3|O;2B24

AA43|PE3D_TX_DN[14]|PCIEX3|O;2B19
AC43|PE3D_TX_DP[14]|PCIEX3|O;2B20


AT50|FRMAGENT|CMOS|I
AT48|BIST_ENABLE|CMOS|I

BA55|TEST4||I

BD50|ERROR_N[0]|ODCMOS|O

BJ53|PWRGOOD|CMOS|I

Update (added pins from the other 3/4 of the CPU):

AE27|DDR_RESET_C23_N|CMOS1.5v|O

D4|TEST3||O

F2|TEST2||O

CB18|DDR_RESET_C01_N|CMOS1.5v|O

CW17|DRAM_PWR_OK_C01|CMOS1.5v|I

DB4|TEST0||O

CW1|TEST1||O
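
If anyone wants to work with this list programmatically, here is a minimal sketch (not part of my actual workflow) that parses entries in the pad|signal|buffer|direction format above into a lookup table; the file name is just a placeholder:

```python
# Minimal sketch: parse pad entries like "J51|PE3A_TX_DP[1]|PCIEX3|O;2B74"
# into a dict keyed by pad coordinate. The file name is a placeholder.

def parse_pad_line(line):
    pad, signal, buf_type, last = line.strip().split("|")
    # The last field is the direction, optionally followed by ";<cross-ref>"
    direction, _, xref = last.partition(";")
    return pad, {"signal": signal, "buffer": buf_type,
                 "direction": direction, "xref": xref or None}

def load_pads(path="riser_top_front_pads.txt"):   # hypothetical file name
    pads = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or "|" not in line:
                continue                     # skip blank lines
            pad, info = parse_pad_line(line)
            pads[pad] = info
    return pads

if __name__ == "__main__":
    pads = load_pads()
    tx3 = [p for p, i in pads.items()
           if "PE3" in i["signal"] and "_TX_" in i["signal"]]
    print(len(tx3) // 2, "PCIe 3.0 transmit pairs")   # expect 16 for this layer
```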

full-size
https://www.dropbox.com/s/kjoo7grsaqb8azm/top01.png?dl=0

top01_blah.jpg

overlay.jpg

quarter.png
 
The Vega Frontier is here; it only took one business day to arrive. The only working PC I have at home has one external power plug and this card requires two. I have the correct power supply at work, so I can test tomorrow. The yellow print flakes off, which is kind of lame.

20180423_202524.jpg
20180423_202626.jpg
20180423_202818.jpg
20180425_231110.jpg
20180425_231117.jpg
 
I horizontally mirrored the scan of the top layer of the bottom of the CPU Riser Card. I also tweaked the orientation of the top layers of the front and bottom of the CPU Riser Card (the orientation was off by less than 1 degree), then merged the layers.
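
For anyone reproducing the merge, a rough Pillow sketch of the same steps (the file names and the exact angle below are placeholders, not my real values):

```python
# Rough sketch of the mirror / rotate / merge steps using Pillow.
# File names and the exact angle are placeholders.
from PIL import Image, ImageOps

front = Image.open("riser_top_front.png").convert("RGBA")
back = Image.open("riser_top_back.png").convert("RGBA")

# The back-side scan is viewed from the other side of the board,
# so flip it left-to-right before overlaying.
back = ImageOps.mirror(back)

# Small orientation tweak (well under 1 degree); positive angles
# rotate counter-clockwise in Pillow.
back = back.rotate(0.5, resample=Image.BICUBIC, expand=False)

# Merge the two layers at 50% opacity for a quick visual check.
merged = Image.blend(front, back.resize(front.size), alpha=0.5)
merged.save("riser_top_merged.png")
```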

It looks like there is another layer that tweaks the placement of the traces before the vias travel all the way through to the top layer. But I should be able to wing it, and my guesses should land on the correct CPU pads (fingers crossed). I'll eventually document the corresponding layer to verify.

Top layer, back side of the board. I've circled the 40 pads in the spreadsheet: 4 PCIe 2.0 receive lane pairs and 16 PCIe 3.0 receive lane pairs (in bold). Note that the PCIe 3.0 receive lane pairs don't correspond to the transmit lane pairs on the top layer of the front of the board; they are likely going to the other GPU.

B50|DMI_RX_DP[3]|PCIEX|I
D50|DMI_RX_DN[3]|PCIEX|I

C49|DMI_RX_DP[2]|PCIEX|I
E49|DMI_RX_DN[2]|PCIEX|I

B48|DMI_RX_DP[1]|PCIEX|I
D48|DMI_RX_DN[1]|PCIEX|I

C47|DMI_RX_DP[0]|PCIEX|I
E47|DMI_RX_DN[0]|PCIEX|I

L55|PE2A_RX_DP[0]|PCIEX3|I;1A81
N55|PE2A_RX_DN[0]|PCIEX3|I;1A80

T56|PE2A_RX_DP[2]|PCIEX3|I;1A73
V56|PE2A_RX_DN[2]|PCIEX3|I;1A72

U55|PE2A_RX_DP[3]|PCIEX3|I;1A69
W55|PE2A_RX_DN[3]|PCIEX3|I;1A68

T54|PE2A_RX_DP[1]|PCIEX3|I;1A77
V54|PE2A_RX_DN[1]|PCIEX3|I;1A76

AE57|PE2B_RX_DP[7]|PCIEX3|I;1A53
AF58|PE2B_RX_DN[7]|PCIEX3|I;1A52

AB56|PE2B_RX_DP[5]|PCIEX3|I;1A61
AD56|PE2B_RX_DN[5]|PCIEX3|I;1A60

AC55|PE2B_RX_DP[6]|PCIEX3|I;1A57
AE55|PE2B_RX_DN[6]|PCIEX3|I;1A56

AB54|PE2B_RX_DP[4]|PCIEX3|I;1A65
AD54|PE2B_RX_DN[4]|PCIEX3|I;1A64

AK58|PE2C_RX_DP[9]|PCIEX3|I;1A43
AM58|PE2C_RX_DN[9]|PCIEX3|I;1A44

AJ57|PE2C_RX_DP[10]|PCIEX3|I;1A39
AL57|PE2C_RX_DN[10]|PCIEX3|I;1A40

AH56|PE2C_RX_DP[8]|PCIEX3|I;1A47
AK56|PE2C_RX_DN[8]|PCIEX3|I;1A48

AT58|PE2D_RX_DP[12]|PCIEX3|I;1A29
AV58|PE2D_RX_DN[12]|PCIEX3|I;1A30

AR57|PE2C_RX_DP[11]|PCIEX3|I;1A35
AU57|PE2C_RX_DN[11]|PCIEX3|I;1A36

AP56|PE2D_RX_DP[13]|PCIEX3|I;1A25
AT56|PE2D_RX_DN[13]|PCIEX3|I;1A26

AY58|PE2D_RX_DP[14]|PCIEX3|I;1A21
BA57|PE2D_RX_DN[14]|PCIEX3|I;1A22

AY56|PE2D_RX_DP[15]|PCIEX3|I;1A16
BB56|PE2D_RX_DN[15]|PCIEX3|I;1A17


dual01_preview01.jpg

dual01_preview02.jpg

quarter.png
 
Here is a flatbed scan of a junk Xeon E5 v2 (I had to depopulate it to scan it properly). I tweaked the orientation after scanning to align on either the vertical or the horizontal. The horizontal and vertical don't line up perfectly; if you really need them to, you will have to apply a skew transform.

For reference, on the vertically-aligned image you would change the alignment to horizontal by rotating the image clockwise by 0.23 degrees (but fixing the horizontal then throws off the vertical). Either image should work as-is for most purposes. I scanned at 1200 dpi.
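
If you would rather script the alignment than eyeball it, here is a rough Pillow sketch of the 0.23-degree clockwise rotation plus a small shear for the skew case (the file names and the shear amount are placeholders):

```python
# Sketch of the alignment tweaks described above. File names are placeholders.
import math
from PIL import Image

img = Image.open("xeon_e5_v2_v.png")

# Pillow's rotate() is counter-clockwise for positive angles,
# so a 0.23 degree clockwise rotation is -0.23 here.
rotated = img.rotate(-0.23, resample=Image.BICUBIC, expand=True,
                     fillcolor="white")
rotated.save("xeon_e5_v2_rotated.png")

# To make horizontal and vertical line up at the same time, a small shear
# (skew) via an affine transform is needed. 0.23 degrees is used here only
# as an example magnitude, not a measured value.
shear = math.tan(math.radians(0.23))
skewed = img.transform(
    img.size, Image.AFFINE,
    (1, shear, 0, 0, 1, 0),   # maps each output (x, y) to input (x + shear*y, y)
    resample=Image.BICUBIC, fillcolor="white")
skewed.save("xeon_e5_v2_skewed.png")
```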

Update: I don't like the Epson software, but even with their native driver installed, third-party scanner capture tools seem to be limited to 1200 dpi. I caved and re-installed the Epson scanner software, which is horrendous from a UI usability perspective and even feels a bit malware-y, but I need 2400 dpi. What still sucks is that the scanner is natively 4800 optical dpi, yet it only lets me use 4800 dpi if the scan area is under a certain size limit, which is smaller than most boards. That's true even when running on the Mac Pro 2013 with 64 gigabytes of RAM, so the limitations are artificial rather than based on the computer running the app. Lame. Anyway, I should rescan the Xeon at a dpi of at least 2400.

Update: oh, I remember now. At 4800 dpi, optical scanning (when it's even possible) produces garbage quality, at least on the Epson Perfection V39. In terms of acceptable quality, 2400 dpi is the maximum.

(uncompressed - vertically aligned)
https://www.dropbox.com/s/71j7hukhtwslapz/xeon_e5_v2_v.png?dl=0

(uncompressed - horizontally aligned)
https://www.dropbox.com/s/7tsjb0dmds7u8yv/xeon_e5_v2_h.png?dl=0

(compressed - vertically aligned)
xeon_e5_v2_v.jpg

(compressed - horizontally aligned)
xeon_e5_v2_h.jpg
 
There is still a lot about PCIe that I don't know, and the electromechanical specification is impossible to find floating around online for anything newer than PCIe 2.0. If anyone on the thread is part of the PCI-SIG, maybe they could shed some light on this for me.

OK, I'm going to start mapping the PCIe lanes on the logic board. My saving grace is that I took many pics in between layers, so if you add up all the pics, the only omissions are on the edges where the metal file eroded the board away. I had to rip the connector off the top of the board, but the top of the board doesn't appear to have PCIe lanes until layer three, so ripping the connector off had minimal impact.

I don't yet have a mapping between the two sides of a flex cable (of which there are two types). Nor do I have a mapping between the CPU Riser Card pins and the Logic Board pins, which is probably the most complicated mapping in all of the Mac Pro: 324 pins, and I'm pretty sure all of them are used. I'll need all three of those mappings to fully understand the flow of the PCIe lanes (a sketch of how they chain together follows below).
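
Just to make the bookkeeping concrete, this is roughly how I picture chaining the mappings once they exist; every entry below is made up purely for illustration:

```python
# Sketch of chaining the mappings to follow one PCIe lane end to end.
# All of the example entries below are made up; the real tables have to
# come from the layer photos and the spreadsheets.
cpu_to_riser_edge = {"H50": "A123"}        # CPU pad -> riser edge pin (hypothetical)
riser_edge_to_logic = {"A123": "LB_0456"}  # riser edge pin -> logic board pad (hypothetical)
logic_to_flex = {"LB_0456": "FLEX1_17"}    # logic board pad -> flex cable pin (hypothetical)

def trace_lane(cpu_pad):
    """Follow one CPU pad through every mapping that is known so far."""
    path = [cpu_pad]
    for table in (cpu_to_riser_edge, riser_edge_to_logic, logic_to_flex):
        nxt = table.get(path[-1])
        if nxt is None:
            break                    # this hop isn't documented yet
        path.append(nxt)
    return path

print(trace_lane("H50"))             # ['H50', 'A123', 'LB_0456', 'FLEX1_17']
```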

Preliminary mapping of the top layer of the back of the logic board (I need to do some more work before I know which PCIe lanes are being mapped, and to double-check which pins of the flex cable connector they are going to).
20171207_132228_map01.jpg
With a matrix of SPAD arrays the latency is nanoseconds, so the flow of photons can be modeled. But to do it in real time requires processing power on the order of petaflops. And not single precision, because the breadth of scale from the quantum level on up requires double precision or better. We've got a long way to go before a single computing appliance offers a petaflop of double-precision compute power.
 
the full size Sintech ST-NGFF2013-C adaptor
...
I installed it in my client's machine and it worked every boot. Case closed finally!
OK, I am going to remove the non-C version 'ST-NGFF2013' from the first post of the thread, since that version seems to have issues and so many people have confirmed the C version has none.
 
To reduce confusion, in addition to updating the first post to reference only the ST-NGFF2013-C, I have now put the NVMe adapter identifier 'ST-NGFF2013-C' directly into the title of the thread.

Not really popcorn-worthy, but I did the most basic sanity check of the Vega Frontier card. It was pointless to run a GPU benchmark with such a slow CPU, and besides, the machine was doing an OS update and I was lazy (though I ended up doing a GPU stress test later that night).

I looked up the spec for the set of LEDs known as 'GPU TACH'. It seems to be a load meter of sorts: one [red or blue] LED means the GPU is on and under minimal load, and as the load becomes more substantial, more LEDs come on simultaneously. I verified this around 1 am, a few hours after making the video.

There are three DIP switches. One turns the GPU TACH meter on and off. The second toggles the color of the meter LEDs between red and blue. The third toggles between the dual-BIOS modes, though I've rebooted after flipping it (a few times) and it doesn't seem to affect anything (even for the GPU stress test).

https://www.dropbox.com/s/qu99yd2p17plqx8/20180425_215447.mp4?dl=0
 
I'm guessing the reason Apple designed the Mac Pro 2013 with so little thermal overhead is that they thought the next generation of GPUs would move to a smaller manufacturing process. A smaller process generally means higher efficiency and less heat. The A11 is already at 10 nm and they are trying to go smaller still; in addition to architecture improvements, that is how iOS processors are staying ahead of the curve.

The Vega GPU is still on a 14 nm process. They made the wrong bet. To future-proof, you can't make that kind of assumption: customers want more power faster than the chips can be shrunk, regardless of efficiency.
 
Please stop changing the title of this thread - if the topic changes that much, perhaps a new thread should be started.

There are several unrelated topics here - what does NVMe have to do with the PCB delayering?

Also, instead of trying to connect a different GPU to the MP6,1 - why not focus on putting generic PCIe x16 slots on the GPU connectors of the MP6,1? There are many companies selling external PCIe cages that connect to an x16 slot - you could make it possible to add something like 16 CUDA cards to a trash can.

onestop.JPG
 
Please stop changing the title of this thread - if the topic changes that much, perhaps a new thread should be started.
...
There are several unrelated topics here - what does NVMe have to do with the PCB delayering?
...
Also, instead of trying to connect a different GPU to the MP6,1 - why not focus on putting generic PCIe x16 slots on the GPU connectors of the MP6,1?
Sure, I'll try not to rename the thread again unless the ST-NGFF2013-C stops working or is pulled off the market. I think I am going to keep the combined 'NVMe' + 'Document Mac Pro' topics of the thread for now; I am random that way, it suits my nature. If I successfully connect one external PCIe device over the internal bus, then yes, I will finally pull that half of the topic into a completely separate thread, though at that point I might do one final thread rename.

There are only two topics here as far as I know: NVMe, and an effort to document the Mac Pro 2013 to the point where we are able to connect other external PCIe devices over the internal bus. The two topics aren't related beyond both being about PCIe devices.

Yes, the goal is generic PCIe x16 slots where the GPUs once were. I am testing the slots using an external GPU. Though for your specific example, I really think 16 cards with only one lane of bandwidth each is kind of crap, unless all the work is being done on the cards and there is minimal data transfer between the cards and the CPU.

You want two production-ready, standard PCIe 3.0 x16 slots on a Mac Pro 2013? A PCI-SIG login is $4,000. It costs at least $300 for every round of hardware prototyping, and I am assuming there are at least three rounds of prototyping involved here. Are you gonna get me a working PCI-SIG login? Or donate hardware instead?
 
Yes, the goal is generic PCIe x16 slots where the GPUs once were. I am testing the slots using an external GPU. Though for your specific example, I really think 16 cards with only one lane of bandwidth each is kind of crap, unless all the work is being done on the cards and there is minimal data transfer between the cards and the CPU.
Actually, that expansion box has dual x16 interconnects, so it's two lanes per GPU. ;)

Second, rather than being "crap" it's actually a commercially viable product. If each GPU spends about 10% of its time moving data over PCIe, and there's little synchronicity, then each GPU gets roughly the performance of a dedicated x16 link.

Obviously some applications won't scale as well, but with large-RAM GPUs and common workloads, putting 8 GPUs on a single x16 slot is a sweet spot - you'll find lots of expansion chassis at that point.

A switch splits the available bandwidth according to the instantaneous load. Everyone understands that an 8-port Gigabit Ethernet switch doesn't limit each port to 125 Mbps - why would they think that putting eight GPUs on an x16 PCIe switch gives each GPU only 2 lanes?
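
To put a rough number on the 10% figure: treating the eight GPUs as independent and ignoring protocol details entirely, a quick back-of-the-envelope sketch shows that when the link is busy at all, it is usually just one GPU using it:

```python
# Back-of-the-envelope sketch: 8 GPUs behind one x16 switch, each GPU
# independently busy on the link 10% of the time. Ignores protocol overhead
# and scheduling details; it is only meant to show how rare contention is.
n, p = 8, 0.10

p_idle = (1 - p) ** n                      # no GPU transferring
p_single = n * p * (1 - p) ** (n - 1)      # exactly one GPU transferring
p_contention = 1 - p_idle - p_single       # two or more competing

print(f"link idle:            {p_idle:.1%}")
print(f"one GPU, full x16:    {p_single:.1%}")
print(f"two or more sharing:  {p_contention:.1%}")
```

In this toy model the link sits idle about 43% of the time, carries exactly one GPU's traffic about 38% of the time, and sees any contention at all only about 19% of the time.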
 
Actually, that expansion box has dual x16 interconnects, so it's two lanes per GPU. ;)
...
you'll find lots of expansion chassis at that point.
...
A switch splits the available bandwidth according to the instantaneous load.
Two lanes per device in a dual x16 chassis sounds more reasonable.

The PCIe expansion chassis are rather expensive for my purposes unless someone provides one for me to test. The cheapest device that I can use to fully utilize all 16 lanes of a PCIe connector is probably a GPU.

Yeah, I know the switch reroutes bandwidth instantaneously. But still, only 16 lanes for 16 high-bandwidth devices is bad. Having 32 lanes for 16 high-bandwidth devices is more reasonable, switch or no switch.
 
Two lanes per device in a dual x16 chassis sounds more reasonable.
My point is that it's not a static allocation of two lanes per device. If a GPU needs 16 lanes, and they're available - it gets 16 lanes.

It's a dynamic allocation of 16 lanes across 8 devices (since x32 isn't common, or even possible in practice, I expect that the 16-GPU x32 chassis is really two 8-GPU x16 chassis).

If the traffic is bursty - each card can see more or less full bandwidth.
 
Thanks. I just like to make sure the less technical readers understand that "two lanes per GPU" really means "in an unlikely worst-case scenario of all GPUs running at full memory bandwidth, they'll get about two lanes per GPU".
I assume the worst case, so I am much happier when there are more lanes than devices. Yeah, when not in the worst case there is bandwidth to spare, but when it comes to PCIe lanes I am still anti-switch. Someone with a PCIe logic analyzer and some time to spare could convince me my switch fears are unfounded. Speaking of which, I could really use a PCIe 3.0 logic analyzer if you would loan me yours.
 
Here is the mapping from CPU Riser Card to Logic Board for the back of the board (the B pins). Not as hard as I thought it would be.
back_logic_board_cpu_riser_mapping.jpg

Here is the mapping from CPU Riser Card to Logic Board for the front of the board (the A pins).
160.png
 
I am mixing and matching bits and pieces from different shots of several layers to figure this out, carefully tweaking each piece to align orientation and so on. This is quite satisfying for my OCD tendencies.

Pretty exciting to think about the possibilities, if it works anyway. We could keep 32 PCIe 3.0 lanes free, use only the 8 remaining PCIe 3.0 lanes and the 4 PCIe 2.0 lanes to run the machine, and turn the Mac Pro 2013 into a PCIe 3.0 pass-through logic analyzer (16 lanes in, 16 lanes out, recording each packet as it passes through). I've never heard of a macOS-native PCIe logic analyzer, assuming the macOS API even supports such a thing. But I would probably have to write a hardware driver, and I've only written hardware drivers on Windows, not macOS.

I could do the same thing for the iMac Pro. It actually doesn't matter that the GPU is on the logic board with the CPU; I would just find all the traces going to the GPU, cut them off from their original destination, and reroute them to a standard PCIe x16 connector. I could do crowdfunding and give away the first modded iMac Pro machine (if the crowdfunding pays for the machine, then give it back to the community). Huh, well, even for the Mac Pro I could probably auction it off to get back some of the money I put into the project (I can't afford to give it away). But wherever the money comes from, I also need $4k for a PCI-SIG login. I think that is the most important thing.

It is pretty damn exciting to have a vision of a Mac Pro that can do nearly anything, instead of the grand vision being prematurely limited by Apple. For me, being a 'pro' or professional means you do whatever it takes to get the job done, even if it means modding your $10k computer. I think five years is enough time to wait for someone else to do it; at this point you have to do it yourself (lol, the clichés).

It was good to dream even if nothing comes of it.

mix_and_match.jpg
 
Other than the ground pins, the 40 PCIe 3.0 lanes, the 4 PCIe 2.0 lanes, and the bus bars, there appear to be only 43 other pins coming from the CPU. I've mapped at least 7 of them, so only 36 pins are left to map from the CPU Riser Card (plus possibly a couple of pins from another chip, depending on where the PCIe CLOCKREF signal is coming from). I've color-coded this group of 43 pins in turquoise and black.

Also, for PCIe I should double-check that I didn't swap the P and N pins of any of the 44 receive pairs or 44 transmit pairs.
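
Since the pad entries already live in the (hypothetical) parsed table from the earlier sketch, a quick check can flag any lane whose DN row was listed before its DP row, which are the ones worth re-checking against the photos:

```python
# Hypothetical sanity check: flag lanes whose DN entry was listed before the
# DP entry in the notes, so those pairs can be re-checked against the photos.
# Operates on the ordered pad dict produced by the earlier parsing sketch.
from collections import defaultdict

def find_suspect_pairs(pads):
    lanes = defaultdict(dict)
    for order, info in enumerate(pads.values()):
        sig = info["signal"]
        if "_DP[" in sig or "_DN[" in sig:
            lane = sig.replace("_DP[", "_Dx[").replace("_DN[", "_Dx[")
            polarity = "P" if "_DP[" in sig else "N"
            lanes[lane][polarity] = order          # remember listing order
    return [lane for lane, pol in lanes.items()
            if {"P", "N"} <= pol.keys() and pol["N"] < pol["P"]]
```

On the transmit list earlier in the thread, that would flag the PE3D[14] and PE3D[15] pairs, which are the two I listed DN-first.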

20171207_144540_001_blah.jpg
 
To map all the pins of the CPU simultaneously on the CPU Riser Card, I need to properly model the physical break between the two halves of the CPU socket. I'm keeping the single spreadsheet but also duplicating it twice, with each copy covering only one half of the CPU based on the break. I've crossed the hump; I'm so close!

(split in progress)
splitting.png

Update: split complete. I lost the center square in the split, but I think that is supposed to happen; the center square is just a bunch of VDD and ground pins, so it's not an issue. The original unsplit spreadsheet is still available.

https://www.icloud.com/numbers/0VVCX9gx6KPG0i6PMxZ_AK17w#XeonE5v2NoImageColorCodedLeft2

https://www.icloud.com/numbers/0qGfV2aYlgy9xIosfhYgShoXA#XeonE5v2NoImageColorCodedRight2

https://www.icloud.com/numbers/0QQRHBd3TI8QlGuL3iW412iPQ#XeonE5v2NoImageColorCoded

https://www.dropbox.com/s/5q9dt93jw0zgwzx/cpu_color_left.psd?dl=0

https://www.dropbox.com/s/836wmvlx1obe2px/cpu_color_right.psd?dl=0
 
Full spreadsheet mapped to the CPU socket layout! Now I know what all the pins are on the CPU Riser Card. At a glance [and even with no text labels] I know exactly what each and every pin is, due to the relationship between the color groups and the unique color layout throughout the CPU. (Use the original Numbers spreadsheet as a reference.)

The PSD has empty space instead of a background color, which makes for easy layering. I recommend using the 'color burn' blend mode at 100% rather than layering normally with partial transparency.
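
For anyone scripting the overlays instead of using Photoshop, 'color burn' is easy enough to reproduce; here is a minimal numpy version of the standard formula (my own from-memory implementation, not anything Adobe-specific):

```python
# Minimal 'color burn' blend: result = 1 - (1 - base) / blend, clamped to [0, 1].
# Pixel values are floats in 0..1. Standard formula written from memory,
# so treat it as an approximation of what Photoshop does.
import numpy as np

def color_burn(base, blend, eps=1e-6):
    base = np.asarray(base, dtype=np.float64)
    blend = np.asarray(blend, dtype=np.float64)
    out = 1.0 - (1.0 - base) / np.maximum(blend, eps)
    return np.clip(out, 0.0, 1.0)
```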

https://www.dropbox.com/s/niqhry577jvjyyd/split_socket.psd?dl=0

split_socket.jpg


full_mapping_preview.jpg
 
BCLK1 is the PCIe reference clock, on pins AW45 and BA45. I just found this out, and I am back to thinking I don't need to know anything about the platform controller chip to get a 3rd-party GPU up and running.

https://www.intel.com/content/dam/w...s/4th-gen-core-i7-lga2011-datasheet-vol-1.pdf

https://www.intel.com/content/dam/w.../datasheets/xeon-scalable-datasheet-vol-1.pdf

https://www.intel.com/content/dam/w...atasheets/xeon-e5-2400-v2-datasheet-vol-1.pdf

https://www.inet.se/files/pdf/5311100_0.pdf

"100 MHz typical BCLK0 is the Intel QPI reference clock (system clock) and BCLK1 is the PCI Express* reference clock."

Not sure what the '*' caveat is about; I couldn't find the corresponding footnote in the document. (Intel datasheets generally use the asterisk just to mark third-party trademarks, so it's probably nothing technical.)
 