Correction: in some of his videos Max Yuryev has baselessly speculated that MMU TLB bottlenecks etc. He's not a good source for technical information, he's a YouTube talking head who doesn't even seem to really understand what a TLB is, much less how it could affect graphics performance.
Thanks for pointing that out, it is very informative. The last slide mentioned that Xcode has several new performance counters related to the MMU (Memory Management Unit), which is the hardware block that translates logical (virtual) addresses to physical addresses. Those were MMU Limiter, MMU Utilization Counter, and MMU TLB Miss Rate. TLB = Translation Lookaside Buffer, which is the MMU's cache for recent logical-to-physical translations.
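For anyone unsure what a "TLB miss rate" counter measures, here is a toy sketch of the idea. It is not a model of Apple's GPU MMU: the page size, the 4-entry capacity, the LRU replacement policy, and every name in it (`ToyTLB`, `run`, the fake page table) are assumptions made up for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 16 * 1024   # assumed page size in bytes (illustrative only)
TLB_ENTRIES = 4         # assumed capacity, deliberately tiny


class ToyTLB:
    """Hypothetical LRU cache of virtual-page -> physical-frame translations."""

    def __init__(self, entries=TLB_ENTRIES):
        self.entries = entries
        self.cache = OrderedDict()  # virtual page number -> physical frame number
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr, page_table):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.cache:
            self.hits += 1
            self.cache.move_to_end(vpn)         # mark as most recently used
        else:
            self.misses += 1
            self.cache[vpn] = page_table[vpn]   # stands in for the slow page-table walk
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)  # evict the least recently used entry
        return self.cache[vpn] * PAGE_SIZE + offset

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


def run(addresses, num_pages=64):
    """Translate a stream of virtual addresses and return the TLB miss rate."""
    page_table = {vpn: vpn + 1000 for vpn in range(num_pages)}  # made-up mapping
    tlb = ToyTLB()
    for addr in addresses:
        tlb.translate(addr, page_table)
    return tlb.miss_rate()


# Walk 8 pages in 64-byte steps: one miss per page, then hits.
sequential = run(range(0, 8 * PAGE_SIZE, 64))
# Touch 8 different pages round-robin: more pages than TLB entries.
strided = run([(i % 8) * PAGE_SIZE for i in range(2048)])
print(f"sequential miss rate: {sequential:.4f}")
print(f"strided miss rate:    {strided:.4f}")
```

Both streams make the same number of accesses, but the second cycles through more pages than the toy TLB can hold, so nearly every lookup falls back to the page table. A miss-rate counter reports that kind of ratio; on its own it says nothing about whether a given real workload is limited by it.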
The speaker threw that in at the very end of his talk, with no other explanation. However, it implies that besides the normal GPU-related bottlenecks, there are possible MMU-related bottlenecks the programmer must be aware of. In some of his videos Max Yuryev has speculated that MMU TLB bottlenecks may be limiting GPU scalability on the M1 Max and Ultra.
You shouldn't take the existence of performance counters for various pieces of the system as corroboration for Yuryev's ideas.