I don't believe making huge amounts of RAM (~1 TB) available to the GPU was ever intended as one of the purposes of UMA, since it would be a rare use case where the GPU would need that much RAM, and Apple doesn't design its architecture for rare use cases.
Why not?
Apple are positioning themselves as a Hollywood 3D production company, and access to GPU memory is a massive constraint for high-end modelling.
The Mac Pro itself is aimed at a rare use case. Afterburner is aimed at a rare use case, and Apple created those cards for the prior Mac Pro and then turned them into on-die processors in the M1 Pro/Max.
The only reasons we don't have GPUs with 1 TB of RAM today are impracticality and expense - not because it isn't desirable to have the GPU able to work on ANYTHING the CPU has access to, as and when required, without needing to copy data around.
Even if the GPU itself doesn't need 1 TB, the whole point is that it can process whatever the CPU needs it to process, which could live anywhere inside the 1 TB of memory the CPU has.
Apple's developer talks at WWDC explain that the purpose of UMA was to give the GPU and CPU access to the same RAM pool, obviating the need to copy data back and forth between GPU and CPU memory.
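That zero-copy model is visible in Metal's shared storage mode. A minimal sketch (buffer size and the written value are arbitrary, chosen just for illustration; this needs an Apple silicon Mac to run):

```swift
import Metal

// On Apple silicon, system memory *is* the GPU's memory, so a buffer
// created with .storageModeShared is a single allocation visible to
// both the CPU and the GPU - no blit/copy step required.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal device available")
}
let buffer = device.makeBuffer(length: 4096, options: .storageModeShared)!

// The CPU writes directly into the same allocation the GPU reads:
let ptr = buffer.contents().bindMemory(to: Float.self, capacity: 1024)
ptr[0] = 42.0
// A compute kernel dispatched against `buffer` sees this value without
// any explicit upload or synchronize-and-copy call.
```

On a discrete-GPU Mac the equivalent code would typically use `.storageModeManaged` plus `didModifyRange(_:)` to push CPU writes across the PCIe bus - exactly the copy traffic UMA removes.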
You even posted a quote from Apple (which backs up my point) saying that the whole point is that the GPU can access the memory the CPU has. If the CPU has 1 TB (and that's not a massive amount in 2021), then the GPU needs access to that 1 TB for the whole unified model to work.
If you have a separate island of memory (some fraction of the 1 TB, for the CPU only) that the GPU can't access in its entirety, you're back to copying data around, which is inefficient.