This could be solved with high-bandwidth system RAM (e.g. HBM), a large enough cache, and more memory controllers. Such an approach won't be cheap, but cost is less of an issue for Apple: their products are already priced at a premium, and they don't have to compete with other chip makers since they build chips only for themselves.
Are you referring to TBDR? That's an interesting topic. From my layman's understanding, TBDR renderers never established themselves in the desktop segment because they are much more complex, and because with a larger thermal budget a forward renderer can simply brute-force its way through. A criticism often levelled at TBDR is poor geometry throughput - less of an issue for mobile applications with their traditionally lower polygon counts, but critical for high-poly PC games. That was the state of the art ten years ago, though. Apple seems to have solved it by exploiting the unified shader pipeline: since geometry, compute and fragment processing run asynchronously on the same hardware, it's easier to balance out the eventual bottlenecks.

As to why Nvidia and co don't use it - well, probably because they simply weren't interested in this tech. Their approach works well enough, and they did borrow some ideas, like tiling (but without deferred fragment shading), to make their GPUs more efficient. Revolutions sometimes happen simply because someone tried (and succeeded at) something that others thought would not work. Again, think of the MacBook Air or HiDPI screens - those were laughed at in the beginning.
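To make the "deferred fragment shading" point a bit more concrete, here's a toy sketch in plain Python (nothing like real GPU or Metal code; the tile size, `Prim` type and scene are made up purely for illustration). It models why a TBDR pipeline shades each pixel at most once, while an immediate-mode renderer fed back-to-front geometry shades the same pixel repeatedly (overdraw):

```python
# Toy model: count fragment-shader invocations for an immediate-mode renderer
# vs. a tile-based deferred renderer (TBDR). "Triangles" are simplified to
# axis-aligned rectangles at a constant depth.
from dataclasses import dataclass

TILE = 4                 # tile edge length in pixels (hypothetical)
WIDTH, HEIGHT = 16, 16   # toy framebuffer size

@dataclass
class Prim:
    x0: int
    y0: int
    x1: int
    y1: int
    depth: float         # smaller = closer to the camera

def covers(p, x, y):
    return p.x0 <= x < p.x1 and p.y0 <= y < p.y1

def immediate_mode(prims):
    """Shade fragments as primitives arrive; the depth test decides visibility,
    so fragments later hidden by closer geometry are still shaded (overdraw)."""
    zbuf = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
    shaded = 0
    for p in prims:                      # submission order matters
        for y in range(HEIGHT):
            for x in range(WIDTH):
                if covers(p, x, y) and p.depth < zbuf[y][x]:
                    zbuf[y][x] = p.depth
                    shaded += 1          # fragment shader runs here
    return shaded

def tbdr(prims):
    """Bin primitives into tiles, resolve visibility per pixel inside the tile,
    then shade only the front-most fragment (deferred fragment shading)."""
    shaded = 0
    for ty in range(0, HEIGHT, TILE):
        for tx in range(0, WIDTH, TILE):
            # Binning: keep only the primitives that touch this tile.
            binned = [p for p in prims
                      if p.x1 > tx and p.x0 < tx + TILE
                      and p.y1 > ty and p.y0 < ty + TILE]
            for y in range(ty, ty + TILE):
                for x in range(tx, tx + TILE):
                    front = min((p for p in binned if covers(p, x, y)),
                                key=lambda p: p.depth, default=None)
                    if front is not None:
                        shaded += 1      # shade each visible pixel exactly once
    return shaded

# Back-to-front submission: worst case for the immediate-mode renderer.
scene = [Prim(0, 0, 16, 16, depth=0.9),
         Prim(2, 2, 14, 14, depth=0.5),
         Prim(4, 4, 12, 12, depth=0.1)]

print("immediate-mode fragments shaded:", immediate_mode(scene))
print("TBDR fragments shaded:          ", tbdr(scene))
```

On that three-layer scene the immediate path shades 464 fragments while the deferred path shades 256 (one per pixel), which is the efficiency argument in miniature. Real desktop GPUs mitigate overdraw with tricks like early-Z, but only when submission order cooperates; the TBDR approach makes the saving structural at the cost of the binning pass, which is where the geometry-throughput criticism comes from.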