Mikas, I know this is a lot of text. However, it will probably give you some, if not all, of the answers you are looking for. It's a direct copypasta from my initial work:
Update:
Killed a GPU card needlessly (probably) for the purposes of understanding the failure modes. Not sure what that will yield yet.
Replaced interconnect cable on GPU A, and interconnect cable on GPU B with a replacement card.
Still had trouble with High Sierra and Mojave, but could still boot into safe mode.
Tore the unit down completely.
Popped in a new BR2032 battery (NOT CR2032!) while I was in there and cleaned everything up.
It wasn't until I replaced the CPU with an out of band model that I got a declared memory error with the 5 second beep code.
Neither the tools I ran nor the built-in diagnostic tool were able to catch it.
Symptom was general slowness and random odd behavior.
Pulling the stick and reconfiguring the RAM resulted in an operational state.
Started at rev 129.0.0.0, went to 133.0.0.0 through initial Mojave, then moved into Big Sur.
Firmware updated to the current version without an issue through Big Sur.
System has been stable over 4 days.
Wakes and sleeps no issue. Works fantastically.
Hope this helps someone find their way.
Side notes:
I followed the Apple service manual methodically, and moved through the troubleshooting flowcharts to solve my problems.
One standoff did come out of the core on the CPU riser. I was able to remove it with a clamping wrench gently attached to the ring of the standoff, not the threads, with the board wrapped in an anti-static bag held in place with Kapton tape. This was done carefully to make sure no damage, excessive force, or torsion was applied to the board.
The core was repaired by chasing the threads and verifying they were fine, then applying Loctite 242 to the threads of the standoff and securing it. Level measurements were taken across all 4 standoffs during several steps of tightening, and a final level measurement was taken on all 4 to ensure it was true. I stopped applying pressure just slightly after the standoff bottomed out in the core to avoid completely stripping the core and rendering it useless. The 242 was allowed to cure for over 24 hours before any further re-assembly steps were taken.
It is vital that any core repairs are done with utmost concern and caution.
I cleaned the board fingers with a lint-free cotton bud and 91% isopropyl alcohol. I also used this to clean the GPUs up. All boards were given ample time to dry out, with verification in several ways prior to re-assembly.
Every step was taken with extreme care and a great attention to detail.
All specs and tightening patterns were followed completely, per the Apple service manual.
All fasteners that had traces of blue on them received a very small amount of Loctite 242 on reassembly.
Separators were made for the IO board and others out of cardboard surrounded by antistatic bag material for reassembly.
MX4 thermal compound was used for both GPUs and the CPU.
OEM VRAM thermal pads were discarded after measuring in every possible way. These were replaced with the Thermalright 2 mm type. I used 2x of the 85x45x2 mm packs. These were cut down to cover the entirety of the GPU VRAM copper interface areas, test fitted, and then applied to the GPUs before re-assembly.
I run the system with an advanced fan curve through Macs Fan Control, using the sensor-based value option, assigned to GPU 2 Diode. Exhaustive testing revealed that this was the best temperature sensor to key off of for all the cases I could come up with. Fan speed starts to increase from 55 °C, and maximum temperature has been defined as 75 °C. The sound footprint is louder in some cases, but the system stays much cooler. This is a trade-off I am happy with and doesn't require 3rd party solutions to be involved.
What's the fan noise like with these settings? Idle duty is indistinguishable from ambient noise (with the meter on the top this was mid 20s dB), virtually silent. Light duty lands in the low 30s dB. Moderate usage is high 30s dB. Heavy usage is mid to upper 40s dB.
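For anyone who wants to reason about where the fan should sit at a given temperature, here is a rough sketch of the 55 °C / 75 °C curve described above. It assumes a plain linear ramp between the two temperatures, which is my reading of how a sensor-based curve behaves and not something confirmed against the app itself. The function names and the RPM limits in the usage lines are made up for illustration; plug in your own fan's min and max.

```python
def fan_duty(temp_c, start_c=55.0, max_c=75.0):
    """Fraction (0.0 to 1.0) of the fan's range to use at a given sensor temp.

    Assumes a linear ramp between start_c and max_c (an assumption, not a
    documented behavior of any particular fan control app).
    """
    if temp_c <= start_c:
        return 0.0  # at or below the start temp: fan stays at minimum
    if temp_c >= max_c:
        return 1.0  # at or above the max temp: fan pinned at maximum
    return (temp_c - start_c) / (max_c - start_c)


def fan_rpm(temp_c, min_rpm, max_rpm, start_c=55.0, max_c=75.0):
    """Map the duty fraction onto an RPM range supplied by the caller."""
    return min_rpm + fan_duty(temp_c, start_c, max_c) * (max_rpm - min_rpm)


if __name__ == "__main__":
    # 800 and 1900 RPM are placeholder limits, not measured values.
    for temp in (50, 55, 60, 65, 70, 75, 80):
        print(temp, round(fan_rpm(temp, 800, 1900)))
```

With a 20 degree window, every degree above 55 °C adds 5% of the fan's range, which is why the noise climbs steadily under moderate and heavy load instead of jumping.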
I take no responsibility for what you do with this information, as the 6,1 is a fragile wonder of the world. Don't do any of this if you aren't moderately or highly advanced in electronics repair and have a strong ability for attention to detail. This information has been provided because I couldn't find the answers I needed.
Overall, I love this thing, even if I facepalmed at the engineering excessively. It was worth it.
Additional information:
If anyone is wondering what processor was installed and why:
I settled on the 8 core E5-2667 V2.
This processor has the highest single-core performance index and the same amount of cache as the 10-core E5-2690 V2, meaning more possible cache per core. It can also punch in at 4 GHz. It's a clock-for-clock winner.
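To put a number on the cache-per-core point, here is a quick bit of arithmetic. The 25 MB L3 figure is from Intel's published specs for both chips, not from my notes above, and the helper name is just something I made up for the example.

```python
# L3 cache size shared by both chips per Intel's spec sheets (not from the post).
L3_CACHE_MB = 25

# Core counts as stated above.
CORES = {
    "E5-2667 V2": 8,
    "E5-2690 V2": 10,
}


def cache_per_core(l3_mb, cores):
    """Even split of shared L3 across cores, a rough comparison figure only."""
    return l3_mb / cores


if __name__ == "__main__":
    for name, cores in CORES.items():
        print(name, cache_per_core(L3_CACHE_MB, cores), "MB per core")
```

L3 is shared, so this is not a hard per-core allocation. It is just a way of showing that fewer cores on the same cache leaves more headroom for each one.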
12 core was out of the running quickly, having only really pondered the E5-2697 V2 loosely.
Most applications need higher single core scores, and the ones I use are pretty poor at multiprocessor utilization anyway. I'm not using it for rendering. YMMV.
No <115 W or 150 W processors were considered. The E5-2687W V2 isn't worth the extended TDP when weighed against the design of the thermal core and single fan, for such mild gains. I scored the E5-2667 V2 for about half the cost of that model.
Storage information:
The 6,1 arrived with no SSD. I picked up a reasonably sized official SSUAX SSD, and it was cheap. Good for testing. I'm probably going to use it for another few days, then replace it with a 1 TB SSUBX. I didn't want to mess with any adapters or anything during the rebuild, as it was stuck on 129.0.0.0 at the beginning. Not currently planning on using anything but OEM drives on the inside. I'll support that with solid NVMe drives as externals and network attached storage.