Thanks for sharing the links, but I'm too afraid to click them because I'd spend the rest of my day figuring out TBDR. I never got into the Apple side of development back in the day, and as tempted as I am, I'm going to refrain as I have too many things to do this week.
Judging from Otoy's response when they announced PR1 of Octane-X, they were impressed with the gains achieved from the Metal API and still had more optimisations to do. I'm guessing the Metal API is one half of the GPU equation for Apple. Otoy are going to release Octane-X for iPhone/iPad. The idea is that you can offload rendering onto the iPad. So what does that indicate about the Metal API and Apple Silicon (AS)? Otoy didn't specify how complex the rendered scenes were.
The other part of me (wishful thinking) is looking at the Afterburner card. The card has an open API that any software vendor can use to execute computations on the card. Perhaps this is the foundation of an Apple discrete GPU in the future. I haven't dug into the Afterburner API either.
I'm no expert either, but as I understand it TBDR favours pixel shader effects, and its weakness is high polygon counts: "Any game developer looking to put out a successful title is going to make sure it runs well on iOS hardware. Game developers will likely rely on increasing visual quality through pixel shader effects rather than ultra high polygon counts."
"Immediate mode renderers (IMRs) brute force the problem of determining what to draw on the screen. They take polygons as they receive them from the CPU, manipulate and shade them. The biggest problem here is although data for every polygon is sent to the GPU, some of those polygons will never be displayed on the screen. A character with thousands of polygons may be mostly hiding behind a pillar, but a traditional immediate mode renderer will still put in all of the work necessary to plot its geometry and shade its pixels, even though they'll never be seen. This is called overdraw. Overdraw unfortunately wastes time, memory bandwidth and power - hardly desirable when you're trying to deliver high performance and long battery life. Immediate mode renderers work in a very straightforward manner. They take vertices, create polygons, transform and light those polygons and finally texture/shade/blend the pixels on them.
Tile based deferred renderers take a slightly different approach. TBDRs subdivide the scene into smaller tiles on the order of a few hundred pixels. Vertex processing and shading continue as normal, but before rasterization the scene is carved up into tiles. This is where the deferred label comes in. Rasterization is deferred until after tiling and texturing/shading is deferred even longer, until after overdraw is eliminated/minimized via hidden surface removal (HSR). Hidden surface removal is performed long before we ever get to the texturing/shading stage. If the frontmost surface being rendered is opaque, there's absolutely zero overdraw in a TBDR architecture. Everything behind the frontmost opaque surface is discarded by performing a per-pixel depth test once the scene has been tiled. In the event of multiple overlapping translucent surfaces, overdraw is still minimized. Only surfaces above the farthest opaque surface are rendered. HSR is performed one tile at a time, only the geometry needed for a single tile is depth tested to keep the problem manageable.
With all hidden surfaces removed then, and only then, is all texture data fetched and all pixel shader code executed. Rendering (or more precisely texturing and shading) is deferred until after a per-pixel visibility test is passed. No additional work is expended and no memory bandwidth wasted. Only what is visible in the final scene is rasterized, textured and shaded on each tile. The application doesn't need to worry about the order polygons are sent for rendering when dealing with a TBDR, the hidden surface removal process takes care of everything. In memory bandwidth constrained environments TBDRs do incredibly well. Furthermore, the efficiencies of a TBDR really shine when running applications and games that are more shader heavy rather than geometry heavy. As a result of the extensive hidden surface removal process, TBDRs tend not to do as well in scenes with lots of complex geometry."
www.anandtech.com
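To make the overdraw point in the quote concrete, here is a toy Python sketch. It is nothing like real GPU hardware and everything in it is an assumption for illustration: the `Rect` type, both function names, and the simplification that every surface is an opaque axis-aligned rectangle at a constant depth. The immediate mode model here shades a fragment whenever it passes the depth test at submission time, so its cost depends on draw order; the tiled model bins geometry per tile, resolves visibility per pixel first, and only then shades.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical opaque surface: half-open pixel bounds, constant depth."""
    x0: int
    y0: int
    x1: int
    y1: int
    depth: float  # smaller value = closer to the camera


def imr_shaded_fragments(rects, width, height):
    """Toy immediate mode model: shade in submission order.

    A fragment is shaded whenever it passes the depth test at the time
    it is submitted, even if something closer covers it later.
    """
    depth = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for r in rects:
        for y in range(max(r.y0, 0), min(r.y1, height)):
            for x in range(max(r.x0, 0), min(r.x1, width)):
                if r.depth < depth[y][x]:
                    depth[y][x] = r.depth
                    shaded += 1  # one pixel shader invocation
    return shaded


def tbdr_shaded_fragments(rects, width, height, tile=4):
    """Toy tile based deferred model: bin, remove hidden surfaces, then shade."""
    shaded = 0
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            tx1, ty1 = min(tx + tile, width), min(ty + tile, height)
            # Binning: keep only the geometry that overlaps this tile.
            binned = [r for r in rects
                      if r.x0 < tx1 and r.x1 > tx and r.y0 < ty1 and r.y1 > ty]
            for y in range(ty, ty1):
                for x in range(tx, tx1):
                    covering = [r for r in binned
                                if r.x0 <= x < r.x1 and r.y0 <= y < r.y1]
                    if covering:
                        # Hidden surface removal: only the nearest opaque
                        # surface survives, so the pixel is shaded once.
                        _visible = min(covering, key=lambda r: r.depth)
                        shaded += 1
    return shaded


# Three stacked surfaces on an 8x8 screen, submitted back to front.
scene = [
    Rect(0, 0, 8, 8, 3.0),  # background wall
    Rect(2, 2, 6, 6, 2.0),  # character
    Rect(3, 3, 5, 5, 1.0),  # pillar in front
]
print("IMR, back to front:", imr_shaded_fragments(scene, 8, 8))
print("IMR, front to back:", imr_shaded_fragments(scene[::-1], 8, 8))
print("TBDR, any order:   ", tbdr_shaded_fragments(scene, 8, 8))
```

In this toy scene the tiled model shades each of the 64 covered pixels exactly once regardless of order, while the immediate mode model only matches that when the surfaces happen to arrive front to back. Note that the per-tile binning and per-pixel visibility loops grow with the number of surfaces, which loosely mirrors the quote's point about geometry-heavy scenes.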