
PortoMavericks

macrumors 6502
Jun 23, 2016
288
353
Gotham City
Word on the street is that AMD's RT implementation is different than Nvidia's.
I wonder if an Apple core is the same as a Shader Engine for AMD.
Ray tracing itself does require additional functional hardware blocks, and AMD has confirmed for the first time that RDNA2 includes this hardware. Using what they are terming a ray accelerator, there is an accelerator in each CU. The ray accelerator in turn will be leaning on the Infinity Cache in order to improve its performance, by allowing the cache to help hold and manage the large amount of data that ray tracing requires, exploiting the cache’s high bandwidth while reducing the amount of data that goes to VRAM.

source: https://www.anandtech.com/show/1620...-starts-at-the-highend-coming-november-18th/2
 

MrGunnyPT

macrumors 65816
Mar 23, 2017
1,313
804


Microsoft is claiming that the AMD ML solution uses a very small area on the die.

Also, they’re claiming it’s not a DirectX feature but an open solution, with Sony on board as well. By open, I mean Nvidia could use it on their tensor cores if they wanted to.

This looks pretty cool. I'm tempted to get a 6900XT
 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
If it was on par AMD would have said something about it during the presentation.
Thing is, AMD’s approach is different from nVidia’s. Getting to grips with what that means, and how to get the best out of either, is work that remains to be done.
All RT game code up until now has been financed and co-written by nVidia. All RT in future AAA titles will target next generation consoles (AMD). It’s very early days for RT as a way of dealing with some lighting issues, and I’m personally not convinced it’s a great idea for consumers. The interest of the graphics IHVs to come up with new stuff to sell doesn’t necessarily align with the consumer interest in energy efficient, cheap and performant graphics.
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
Wouldn’t those differences be hidden from the developer if they are using DXR (or Metal)? I know Vulkan added NV-specific extensions to enable RT (this is what Crysis uses while still using D3D11).
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
I expect nothing stronger than Ryzen’s APUs.

Without dGPUs, Macs are gonna be even more incompetent for graphically demanding tasks.

Already the iPhone 12 has a faster GPU than most Ryzen APUs. We are talking about a 5Watt chip here. Scale up the GPU cores, give them a large cache (not unlike AMD’s Infinity Cache) and reasonably fast RAM, and you can easily compete with any mid-range GPU at half the power draw.
 
  • Like
Reactions: StellarVixen

StellarVixen

macrumors 68040
Mar 1, 2018
3,254
5,779
Somewhere between 0 and 1
Already the iPhone 12 has a faster GPU than most Ryzen APUs. We are talking about a 5Watt chip here. Scale up the GPU cores, give them a large cache (not unlike AMD’s Infinity Cache) and reasonably fast RAM, and you can easily compete with any mid-range GPU at half the power draw.
You might be right, we’ll see.
 

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,629
Already the iPhone 12 has a faster GPU than most Ryzen APUs. We are talking about a 5Watt chip here. Scale up the GPU cores, give them a large cache (not unlike AMD’s Infinity Cache) and reasonably fast RAM, and you can easily compete with any mid-range GPU at half the power draw.
For TBDR, you don’t even need fast RAM. The GPU’s only working on a small section of the image at a time.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
For TBDR, you don’t even need fast RAM. The GPU’s only working on a small section of the image at a time.

You still need bandwidth if you want to break past a certain performance threshold. Apple won’t have much of a problem reaching the performance level of something like the 1650 with LPDDR bandwidth (especially with their state-of-the-art cache technology), but if they want to go faster, they will need faster RAM.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
One more thing, was looking now, GPU inside A14 seems to be on par with RX 480, maybe even RX 580.

Not bad at all.
That’s a bit too good :) where did you look? The A14 iPhone GPU is roughly comparable to an Nvidia 950M. The RX 580 is a completely different weight class.
 

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,629
You still need bandwidth if you want to break past a certain performance threshold. Apple won’t have much of a problem reaching the performance level of something like the 1650 with LPDDR bandwidth (especially with their state-of-the-art cache technology), but if they want to go faster, they will need faster RAM.
I don’t think you can look at what IMR requires and extend that to TBDR solutions. For example TBDR not only deals with part of the screen at once, it only renders/textures actual triangles intended to be visible. IMR solutions draw and texture the entire screen. For example, if your view is of a castle wall, all the scenery beyond the wall is also drawn with IMR solutions, THEN it removes the hidden surfaces before outputting to the screen. That’s why they require so much bandwidth.
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
I don’t think you can look at what IMR requires and extend that to TBDR solutions. For example TBDR not only deals with part of the screen at once, it only renders/textures actual triangles intended to be visible. IMR solutions draw and texture the entire screen. For example, if your view is of a castle wall, all the scenery beyond the wall is also drawn with IMR solutions, THEN it removes the hidden surfaces before outputting to the screen. That’s why they require so much bandwidth.
I don’t think we need faster video ram, we need more of it. Otherwise we are going to be stuck with low resolution/detail textures.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
I don’t think you can look at what IMR requires and extend that to TBDR solutions. For example TBDR not only deals with part of the screen at once, it only renders/textures actual triangles intended to be visible. IMR solutions draw and texture the entire screen. For example, if your view is of a castle wall, all the scenery beyond the wall is also drawn with IMR solutions, THEN it removes the hidden surfaces before outputting to the screen. That’s why they require so much bandwidth.

The TBDR approach does save a lot of work and memory bandwidth, and it naturally coalesces memory requests, but it's not magic. Its savings are not exponential, they are linear. As you try to make your GPUs faster, the need for moving data on and off the chip will increase. For example, you mention that TBDR only deals with a part of the screen... but that's an oversimplification. A GPU is still a parallel processor. Each GPU core will work on a different tile (or even multiple tiles) and so will need to access different areas of memory. As you increase the number of GPU cores, you have to proportionally increase the available memory bandwidth or your cores will be stalled. So if you want to break a certain performance barrier, faster RAM (or at least tons of ultra-fast cache, like Navi 2 does) is a hard requirement. And of course, TBDR only helps with rasterization performance. Complex geometry, compute tasks, raytracing: all of this needs memory bandwidth too.
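
To put rough numbers on the "cores will be stalled" point, here is a minimal sketch of a roofline-style model. Everything in it is an assumption made for illustration: the function name, the parameters, and the figures are made up and do not describe any real GPU.

```python
def effective_throughput(cores, per_core_rate, bandwidth, bytes_per_unit):
    """Work units per second the GPU can sustain (toy model).

    cores          -- number of GPU cores (hypothetical)
    per_core_rate  -- work units/s one core processes when never stalled
    bandwidth      -- off-chip memory bandwidth in bytes/s
    bytes_per_unit -- off-chip traffic generated per work unit
    """
    compute_limit = cores * per_core_rate
    memory_limit = bandwidth / bytes_per_unit
    # Whichever limit is lower decides the sustained rate.
    return min(compute_limit, memory_limit)


# Made-up numbers, chosen only to show the shape of the curve.
BANDWIDTH = 64e9      # 64 GB/s
PER_CORE_RATE = 1e9   # work units/s per core
BYTES_PER_UNIT = 8    # TBDR and caches lower this, they don't make it zero

for cores in (4, 8, 16, 32):
    rate = effective_throughput(cores, PER_CORE_RATE, BANDWIDTH, BYTES_PER_UNIT)
    print(cores, "cores ->", rate, "units/s")
```

In this toy model the rate stops growing once the memory limit is reached, so adding cores past that point only helps if bandwidth goes up or traffic per unit of work goes down.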

Besides, let's not dismiss IMR GPUs as some kind of super-dumb devices. They have a lot of sophisticated tech to save memory bandwidth. Memory compression, early depth/stencil rejection, complex cache hierarchies... and of course, modern IMR GPUs also use tiling (working on portions of the screen only) to optimize memory access behavior. Many games do a depth-only render pass anyway, which provides reasonably effective hidden surface removal.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
I don’t think we need faster video ram, we need more of it. Otherwise we are going to be stuck with low resolution/detail textures.

You need both, really. More detailed assets = more RAM needed to store them, more bandwidth needed to use them (mip-mapping takes care of it to some extent, but still)...
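
As a rough illustration of how texture detail drives memory use, here is a minimal sketch that adds up the size of a full mip chain. It assumes uncompressed 4-byte texels (real games usually use compressed formats, so actual sizes are smaller), and `mip_chain_bytes` is a name made up for this example.

```python
def mip_chain_bytes(width, height, bytes_per_texel=4):
    """Total bytes for a texture plus all of its mip levels down to 1x1.

    Assumes an uncompressed format with a fixed number of bytes per texel.
    """
    total = 0
    while True:
        total += width * height * bytes_per_texel
        if width == 1 and height == 1:
            break
        # Each mip level halves both dimensions, never going below 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
    return total


base = 4096 * 4096 * 4
full = mip_chain_bytes(4096, 4096)
print("4096x4096 base level:", base, "bytes")
print("4096x4096 with mips: ", full, "bytes, ratio", full / base)
print("8192x8192 with mips: ", mip_chain_bytes(8192, 8192), "bytes")
```

The mip chain only adds about a third on top of the base level, while doubling the texture resolution roughly quadruples the total.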
 

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,629
As you increase the number of GPU cores, you have to proportionally increase the available memory bandwidth or your cores will be stalled. So if you want to break a certain performance barrier, faster RAM (or at least tons of ultra-fast cache, like Navi 2 does)
It’ll be interesting to see what graphic capabilities Apple provides as their initial Mac offering. That certainly won’t utilize fast RAM BUT the level of performance could reset folks’ expectation of TBDR.

Besides, let's not dismiss IMR GPUs as some kind of super-dumb devices. They have a lot of sophisticated tech to save memory bandwidth. Memory compression, early depth/stencil rejection, complex cache hierarchies... and of course, modern IMR GPUs also use tiling (working on portions of the screen only) to optimize memory access behavior. Many games do a depth-only render pass anyway, which provides reasonably effective hidden surface removal.
No, they’re not super-dumb. I’d wager that TBDR is “dumber” as it’s simpler. With IMR there are a lot of things that need to be done and done FAST (huge bandwidth) when you’re dealing with the entire screen at once (while also decreasing the need as much as possible for expensive, power-hungry fast RAM). Even the games factor into it as they’re spending additional processing time performing a depth check that wouldn’t be required for TBDR. There’s a LOT here that doesn’t apply directly to TBDR.

I don’t think we need faster video ram, we need more of it. Otherwise we are going to be stuck with low resolution/detail textures.
Interesting point, I need to watch Apple’s presentation again as I pick something new out of it each time.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
It’ll be interesting to see what graphic capabilities Apple provides as their initial Mac offering. That certainly won’t utilize fast RAM BUT the level of performance could reset folks’ expectation of TBDR.

I agree. 3DMark Wild Life (a benchmark I tend to trust, because it does exactly the same work across platforms) suggests that the 4-core GPU in the iPhone is a match for any AMD APU (and significantly faster than the Iris Plus in the current 13" MBP). A higher-clocked version of it with more cores will definitely be a big upgrade for Apple's entry-level systems.

No, they’re not super-dumb. I’d wager that TBDR is “dumber” as it’s simpler. With IMR there are a lot of things that need to be done and done FAST (huge bandwidth) when you’re dealing with the entire screen at once (while also decreasing the need as much as possible for expensive, power-hungry fast RAM). Even the games factor into it as they’re spending additional processing time performing a depth check that wouldn’t be required for TBDR. There’s a LOT here that doesn’t apply directly to TBDR.

Not really. From an engineering (and system design) perspective, TBDR is much more complicated. With TBDR, you need to deal with binning, tile rasterization (while tracking the front-most primitive for each pixel), pixel shading while fetching primitive data for each pixel... all while treating tricky corner cases like transparent pixels and overflowing tile buffers. Basic IMR rendering is much simpler in comparison: you fetch a primitive, you rasterize it, you shade the rasterized pixels, done.
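
To make the comparison concrete, here is a minimal sketch that counts how many pixels each approach ends up shading. It is a toy model under stated assumptions: primitives are opaque axis-aligned rectangles `(x0, y0, x1, y1, depth)` with smaller depth meaning closer, the tile size and function names are made up, and transparency and tile-buffer overflow are ignored entirely.

```python
from collections import defaultdict

TILE = 4  # tile edge in pixels (illustrative value)


def imr_shaded_fragments(prims):
    """Immediate mode: rasterize and shade each primitive in submission
    order, skipping pixels that fail the depth test."""
    depth = {}
    shaded = 0
    for x0, y0, x1, y1, z in prims:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if z < depth.get((x, y), float("inf")):
                    depth[(x, y)] = z
                    shaded += 1  # shaded now, may be overwritten later
    return shaded


def tbdr_shaded_fragments(prims):
    """Tile-based deferred: bin primitives per tile, resolve the front-most
    primitive for every pixel, then shade each covered pixel once."""
    # Step 1: binning. Record which primitives touch which tiles.
    bins = defaultdict(list)
    for i, (x0, y0, x1, y1, z) in enumerate(prims):
        for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
            for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
                bins[(tx, ty)].append(i)

    shaded = 0
    for (tx, ty), ids in bins.items():
        # Step 2: per-tile visibility, tracking the front-most primitive.
        front = {}  # pixel -> (depth, primitive index)
        for i in ids:
            x0, y0, x1, y1, z = prims[i]
            for y in range(max(y0, ty * TILE), min(y1, (ty + 1) * TILE)):
                for x in range(max(x0, tx * TILE), min(x1, (tx + 1) * TILE)):
                    if (x, y) not in front or z < front[(x, y)][0]:
                        front[(x, y)] = (z, i)
        # Step 3: shade every covered pixel exactly once.
        shaded += len(front)
    return shaded


# An 8x8 screen: distant scenery submitted first, then a wall in front of it.
scene = [(0, 0, 8, 8, 0.9), (0, 0, 8, 8, 0.1)]
print("IMR, back to front: ", imr_shaded_fragments(scene))
print("IMR, front to back: ", imr_shaded_fragments(scene[::-1]))
print("TBDR, either order: ", tbdr_shaded_fragments(scene))
```

In this sketch the IMR path shades the hidden scenery only when it is submitted before the wall, while the deferred path does more bookkeeping up front in exchange for never shading hidden pixels.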

And of course a TBDR GPU needs to do a depth check: how else would it detect which objects are in front and which are behind? If you mean a depth pre-pass instead, then yes, a TBDR GPU doesn't benefit from it (actually, it will suffer a performance penalty from it), but there are other things that developers need to keep in mind, like correctly annotating render pass attachments or drawing transparent objects in the proper order.
 
  • Like
Reactions: Brazzan

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,629
Not really. From an engineering (and system design) perspective, TBDR is much more complicated. With TBDR, you need to deal with binning, tile rasterization (while tracking the front-most primitive for each pixel), pixel shading while fetching primitive data for each pixel... all while treating tricky corner cases like transparent pixels and overflowing tile buffers. Basic IMR rendering is much simpler in comparison: you fetch a primitive, you rasterize it, you shade the rasterized pixels, done.
Yeah, “simple” and “complex” can only ever be relative when considering complex things like modern GPUs. :) In the above, I’d say that the goals of TBDR and IMR, to get an image to the screen, are identical. But the systems they’re built into, and the price, power, and speed requirements of THOSE, are more likely to define what’s possible. I mean, I don’t think we’ve seen a high-performance, mass-produced TBDR part in a while (outside of what Apple’s been doing, I mean).
 

jeanlain

macrumors 68020
Mar 14, 2009
2,463
958
I agree. 3DMark Wild Life (a benchmark I tend to trust, because it does exactly the same work across platforms) suggests that the 4-core GPU in the iPhone is a match for any AMD APU (and significantly faster than the Iris Plus in the current 13" MBP). A higher-clocked version of it with more cores will definitely be a big upgrade for Apple's entry-level systems.
I couldn't find results from AMD APUs. The 3DMark database is broken. The search filters don't appear to work.

EDIT: OK, I can filter for the best results and find devices whose results are close to the A14 results reported on other sites. Indeed, the best mobile AMD APUs are about as fast as the A14 (score of 8600). The Intel Iris GPUs are nowhere to be seen.
 

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,629
I thought with TBDR there is no more video RAM.
That’s correct, in Apple Silicon Macs, there’s one memory pool, like on iPhones, that the GPU and CPU both access. They call it Unified Memory.

It’s rumored that they will come with at least 16 Gigs which would be shared between the CPU/GPU.
 

dmccloud

macrumors 68040
Sep 7, 2009
3,146
1,902
Anchorage, AK
I don’t think we need faster video ram, we need more of it. Otherwise we are going to be stuck with low resolution/detail textures.

Throwing more RAM at the problem is pointless if you can't swap data in and out of it fast enough to keep up with the system demands. If you have a parking lot that can hold 1000 cars, it would take forever to fill up if only one car could enter or leave at a time. If the parking lot next door holds the same number of cars but allows 4 cars at a time instead of one, everyone would be using it instead since there is less of a wait to either enter or exit the lot.
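
The parking lot analogy boils down to one division. A minimal sketch, with a made-up helper name and made-up numbers for the memory case:

```python
def time_to_move(amount, rate):
    """Time needed to move `amount` (cars, bytes) at `rate` units per time step."""
    return amount / rate


# Parking lot: same capacity, different number of cars let through at a time.
print("1 car at a time: ", time_to_move(1000, 1), "steps")
print("4 cars at a time:", time_to_move(1000, 4), "steps")

# Same arithmetic for memory: 8 GB of assets at two hypothetical bandwidths.
ASSETS = 8e9
for bandwidth in (50e9, 200e9):
    print(bandwidth / 1e9, "GB/s ->", time_to_move(ASSETS, bandwidth), "s")
```

Capacity sets how much can be held, while the rate sets how long it takes to fill or drain it, which is why the two are not interchangeable.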
 
  • Like
Reactions: leman