Apple Silicon is nothing like the trays in a Mac Pro. For Unified Memory to work, all GPUs and all CPUs in the system would need to use the same bank of RAM. Putting different RAM on different trays doesn't solve that. You're just putting multiple Mac Minis in a single case. Nothing makes that unified.
This is a bit over the top about what unified memory does or doesn't mean. DIMMs are RAM modules on separate daughter boards. Using DIMMs doesn't necessarily "disallow" unified memory.
The irregularity that federated memory implementations bring imposes constraints that a relatively high-performance GPU probably doesn't want to deal with if it is out to maximize "frame rate refresh" and a few other GPU-specific functions.
You can still have unified memory if the nodes of the system are separated by distance. Non-Uniform Memory Access (NUMA) adds synchronization overhead to cache coherency and/or write updates, but it can still be a "flat" unified memory space if the memory mapping (MMUs) is coherent across all the nodes. It isn't maximum Perf/Watt, though. All that "long distance" travel is going to soak up more power than a more physically compact system would. (Ryzen with an iGPU can be unified, but it isn't Perf/Watt efficient.)
Unified goes back to how the bus(es) that carry the memory input/output are coordinated and "unified". A modern four-package Xeon SP system still has unified memory across all the CPU cores even though there are four CPU sockets and an order of magnitude more DIMM slots. The packages have UPI links between them and an internal on-die network (a mesh on Xeon SP) to move memory requests and coherency information around so that every core can access all of the memory.
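To make the unified-versus-uniform distinction concrete, here is a rough toy sketch in Python. It is not how any real interconnect is implemented. The class name, node count, capacities and cost numbers are all made up for illustration. The only point is that every requester uses the same flat addresses and gets the same data, while the cost depends on where the address happens to live.

```python
# Toy model of one flat address space spread over several nodes.
# Everything here is hypothetical: the sizes and cost units are
# illustration values, not measurements of any real system.

NODE_COUNT = 4
BYTES_PER_NODE = 1024      # made-up capacity per node
LOCAL_COST = 1             # arbitrary cost units for a local access
REMOTE_HOP_COST = 3        # arbitrary extra cost for crossing the link


class FlatMemory:
    """Single address space whose backing storage lives on several nodes."""

    def __init__(self, node_count=NODE_COUNT, bytes_per_node=BYTES_PER_NODE):
        self.bytes_per_node = bytes_per_node
        self.banks = [bytearray(bytes_per_node) for _ in range(node_count)]

    def home_node(self, address):
        # The "interconnect" decides which node owns a given address.
        return address // self.bytes_per_node

    def _locate(self, address):
        node = self.home_node(address)
        if not 0 <= node < len(self.banks):
            raise IndexError(f"address {address} is outside the address space")
        return node, address % self.bytes_per_node

    @staticmethod
    def cost(requester, home):
        # Same data either way; only the cost differs (the NUMA part).
        if requester == home:
            return LOCAL_COST
        return LOCAL_COST + REMOTE_HOP_COST

    def write(self, requester, address, value):
        node, offset = self._locate(address)
        self.banks[node][offset] = value
        return self.cost(requester, node)

    def read(self, requester, address):
        node, offset = self._locate(address)
        return self.banks[node][offset], self.cost(requester, node)


mem = FlatMemory()
mem.write(requester=0, address=3000, value=42)   # address 3000 lives on node 2
print(mem.read(requester=2, address=3000))       # (42, 1): local to node 2
print(mem.read(requester=0, address=3000))       # (42, 4): remote, same data
```

The requester never has to know which "tray" holds address 3000. Routing is transparent, which is the unified part. The different cost is the NUMA part.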
Basically it's adding a bunch of discrete SoCs into the same box, and then handwaving and going "and unification magic happens somehow."
Having the nodes off on different "trays" isn't the important part. Lacking a bus to carry requests and transfers would be the issue.
Apple could use something like Infinity Fabric to link multiple SoCs together, like AMD is doing. But that's not true unified memory space, it just makes the CPUs and GPUs act like they're in a single unified space. Still interesting, but it's not the same performance outcome.
Again, Infinity Fabric isn't a "disqualifier" in and of itself. It can lead to some NUMA impacts, but NUMA is tangential to "unified". How the memory traffic bus transparently routes requests is the bigger factor.
A different performance outcome doesn't make something not "unified". That is a tangential dimension of measure. A system can be unified and still perform slower. Unified and performance each have a slightly different set of trade-offs. Go "too far" on one and you can start to lose the other.
It's also not really necessary. The rumored "quad" M1 Max chip will be plenty hot and plenty fast. And probably extremely expensive.
Likely hot collectively, but each component is likely to be relatively low power (relative to the total), which lets Apple pack the components densely. Not the cheaper PCB mounting that AMD uses on Ryzen non-iGPU packages, but much, much closer together. Part of that trade-off is probably capping scale-out at four dies (or maybe just two for M1, if Apple abandons Jade4C in the first M-series iteration).