We've both agreed that AMD and Nvidia have immediate mode GPUs. That's fundamentally different from the work a deferred mode GPU can optimize out. AMD and Nvidia don't make deferred mode GPUs. Deferred mode is not one of the optimizations they've made.
If you crawl down deep into the "mud" of any particular GPU vendor's implementation, they all have branches off the 'pure' basic modes.
"...
NVIDIA Maxwell/Pascal/Turing GPUs don't have PowerVR's "deferred tile render", but they do have an immediate mode tile cache render.
...
...
AMD Vega Whitepaper:
The Draw-Stream Binning Rasterizer (DSBR) is an important innovation to highlight. It has been designed to reduce unnecessary processing and data transfer on the GPU, which helps both to boost performance and to reduce power consumption.
...
PowerVR's deferred tile render is patent heavy."
AMD Radeon VII Detailed Some More: Die-size, Secret-sauce, Ray-tracing, and More
I concur, however I was pointing out that the IMC has less consequences in a TBR & L2-ROP design. AMD would certainly be able to clock the gpu higher in case they integrated TBR, but also most of Nvidia's advantage is due to r:w amplification through TBR, not frequency alone. They can only write...
www.techpowerup.com
Even between Nvidia's 'tile caching' and AMD's DSBR there will be L2/L3 hit rate differences as tile sizes vary. If you crawl too far down into the weeds, all implementations are different. AMD and Nvidia aren't trying to implement TBDR exactly because they don't want a patent war.
But using a cache to avoid trips to memory in general... that is doable by anybody who puts in the effort. Whether or not there is a special cache chunk with "tile"-tweaked content replacement parameters, fixating on that is kind of missing the forest for a tree.
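To put a rough number on the "cache avoids trips to memory" point, here is a toy sketch in Python. It is not a model of any real GPU: the LRU policy, the workload (4 draws that each touch the same 8 screen tiles), and the names `dram_trips`, `submission_order`, and `binned_order` are all assumptions made up for illustration. It just counts how often a tile has to be fetched from memory when the same work is processed in submission order versus grouped by tile.

```python
from collections import OrderedDict


def dram_trips(accesses, cache_tiles):
    """Count misses for a sequence of tile IDs against an LRU cache
    that can hold `cache_tiles` tiles (hypothetical toy model)."""
    cache = OrderedDict()
    misses = 0
    for tile in accesses:
        if tile in cache:
            cache.move_to_end(tile)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: one trip to memory
            cache[tile] = True
            if len(cache) > cache_tiles:
                cache.popitem(last=False)  # evict least recently used tile
    return misses


# Hypothetical workload: 4 draws, each touching screen tiles 0..7.
draws = [list(range(8)) for _ in range(4)]

# Immediate style: process tiles in the order the draws were submitted.
submission_order = [tile for draw in draws for tile in draw]

# Binned style: group all work for a tile together before moving on.
binned_order = sorted(submission_order)

small_cache_immediate = dram_trips(submission_order, cache_tiles=4)
small_cache_binned = dram_trips(binned_order, cache_tiles=4)
big_cache_immediate = dram_trips(submission_order, cache_tiles=8)

print(small_cache_immediate, small_cache_binned, big_cache_immediate)
```

With a cache that only fits 4 tiles, submission order thrashes and every access misses, while the binned order misses once per tile. Give the cache room for all 8 tiles and plain submission order does just as well. In this toy setup at least, the win comes from the working set fitting in the cache, however a vendor chooses to get there.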