
ChromeCloud

macrumors 6502
Jun 21, 2009
359
840
Italy
People always harped on AMD for being "hot and loud," but now there's nothing but silence about the fact that their lowest-power card announced draws damn near the amount of power their high-end card did two gens ago...
It's about performance per watt. AMD cards were criticized as being hot and loud because they were drawing more power while also being slower than competing Nvidia cards.
Isn't it a pastime to roast Intel over making nuclear furnace processors? Despite the fact that they still hold the performance crown in games? What makes NVidia different?
Again, what makes Nvidia different from Intel is performance per watt. Quoting directly from their press website: "The GeForce RTX 30 Series, NVIDIA’s second-generation RTX GPUs, deliver up to 2x the performance and 1.9x the power efficiency over previous-generation GPUs."
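Just to put rough numbers on that claim, here is an illustrative sketch (Swift, taking Nvidia's marketing figures at face value; the numbers come straight from the quote above):

```swift
import Foundation

// Performance per watt = performance / power.
// If performance goes up 2.0x and performance-per-watt goes up 1.9x
// (Nvidia's own marketing figures), the implied increase in power draw is:
let perfRatio = 2.0
let efficiencyRatio = 1.9
let powerRatio = perfRatio / efficiencyRatio   // ≈ 1.05, i.e. only ~5% more power
print(String(format: "Implied power increase: %.0f%%", (powerRatio - 1) * 100))
```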

Intel has been struggling for years to improve the efficiency of its CPUs, and as a result has been able to deliver only modest performance upgrades with each new generation of processors while substantially raising the TDP under full load.
And for $600.00 WAOW!
The MSRP of the Nvidia 3070 is actually $499. And it genuinely looks like a good deal considering the level of performance Nvidia claims.
 

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
The interesting thing is that this year's Apple Metal ships with rather feature-complete raytracing support, including goodies like dynamic linking of shader functions, user-programmable intersection shaders and recursive shader execution. Right now it runs in "software" (GPU-accelerated of course), but it seems to suggest that hardware raytracing is coming.
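For reference, here is a minimal host-side sketch of the Metal raytracing API mentioned above, building a primitive acceleration structure in Swift. The device, command queue, and vertex buffer are assumed to already exist, and the parameter names are placeholders:

```swift
import Metal

// Minimal sketch: build a primitive acceleration structure for ray tracing.
// Assumes a MTLDevice, a MTLCommandQueue and a buffer of packed float3 triangle
// vertices already exist (placeholder names, illustrative only).
func buildAccelerationStructure(device: MTLDevice,
                                queue: MTLCommandQueue,
                                vertexBuffer: MTLBuffer,
                                triangleCount: Int) -> MTLAccelerationStructure? {
    // Describe the triangle geometry.
    let geometry = MTLAccelerationStructureTriangleGeometryDescriptor()
    geometry.vertexBuffer = vertexBuffer
    geometry.vertexStride = MemoryLayout<SIMD3<Float>>.stride
    geometry.triangleCount = triangleCount

    let descriptor = MTLPrimitiveAccelerationStructureDescriptor()
    descriptor.geometryDescriptors = [geometry]

    // Allocate storage based on the sizes Metal reports for this descriptor.
    let sizes = device.accelerationStructureSizes(descriptor: descriptor)
    guard let accel = device.makeAccelerationStructure(size: sizes.accelerationStructureSize),
          let scratch = device.makeBuffer(length: sizes.buildScratchBufferSize,
                                          options: .storageModePrivate),
          let commandBuffer = queue.makeCommandBuffer(),
          let encoder = commandBuffer.makeAccelerationStructureCommandEncoder()
    else { return nil }

    // Encode the build on the GPU and wait for it to finish.
    encoder.build(accelerationStructure: accel,
                  descriptor: descriptor,
                  scratchBuffer: scratch,
                  scratchBufferOffset: 0)
    encoder.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted()
    return accel
}
```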

Octane X for macOS

Which also runs REALLY fast on an iPhone 11, so make of that what you will... ;^p

I do not game, but I use several pro 3D packages, some of which also have mobile versions (AutoCAD, FormZ, Rhino, etc.). I performed several tests a few years ago (probably late 2015) using the first-gen iPad Pro and a maxed-out Mac Pro 6,1, and the results were better on the iPad Pro: butter-smooth scrolling in AutoCAD and smooth spinning of 3D files in FormZ/Rhino. Of course, when dealing with large files the iPad Pro reaches its limit much sooner than a desktop, but that's mostly due to the small RAM/VRAM amount. I'm pretty confident that even an integrated GPU will do a more than decent job with 3D handling, and after all there's no need to guess... we already saw it in the latest keynote, where a Maya production scene ran smoothly (and in emulation!) on the A12Z-equipped developer kit.

See above, Octane X is a ripper, and its performance on an A-series chip (iPhone 11) is incredible...

multi-gpu hasn’t been a popular solution to increasing graphics performance (as Apple found out with the trash can Mac Pro) for a while now.
And if you look at the number of shader processors it has doubled (2080 to 3080). Even the 3070 has more shader processors than the 2080.

The problem with the Trashcan was that the second GPU was set up as a GPGPU, compute-centric over raster, but software was not taking advantage of that yet...?

I guess the ability to run two dual chip GPUs in the Mac Pro means nothing...?
 
  • Like
Reactions: 2Stepfan

leman

macrumors Core
Oct 14, 2008
19,516
19,664
As multiple people mentioned here, Ampere effectively doubles the number of CUDA cores. Some new information was released today, so now it's clear how they do it. As usual, it's mostly marketing.

You see, current Nvidia GPUs are ultimately organized around processing units (four such units make up a quad-core streaming multiprocessor, and the GPU is then built from these quad-core units). In Turing, a single processing unit contains a 16-wide FP32 ALU and a 16-wide INT32 ALU. This allows it to process floating point and integer (e.g. address calculation) operations simultaneously. In Ampere, Nvidia simply “taught” the INT32 ALU to do FP32 math. So on paper a processing unit can now do either an FP32+INT32 or an FP32+FP32 operation. This indeed doubles the peak theoretical performance, but it's very far from the magical performance doubling Nvidia's marketing suggests. A lot of games are FP32-heavy though, so this design definitely makes sense. I'd expect an improvement of 25-30% on average, more in games with very heavy shader math.
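To make the bookkeeping concrete, here's a toy model in Swift. The 16-wide figures are the ones from the description above; the integer-work share is a made-up illustrative number, not an Nvidia figure:

```swift
// Toy model of one processing unit's per-clock FP32 throughput.
// Turing: a dedicated 16-wide FP32 ALU plus a dedicated 16-wide INT32 ALU.
// Ampere: the second 16-wide ALU can do either INT32 or FP32 on a given cycle.
func effectiveFP32Lanes(intWorkShare: Double) -> (turing: Double, ampere: Double) {
    let turingFP32 = 16.0                                 // FP32 pipe is always free for FP work
    let ampereFP32 = 16.0 + 16.0 * (1.0 - intWorkShare)   // second pipe does FP32 only when not busy with INT
    return (turingFP32, ampereFP32)
}

// Example: if the second pipe spends ~25% of its cycles on integer work (illustrative),
// Ampere delivers 16 + 12 = 28 effective FP32 lanes vs Turing's 16, i.e. ~1.75x on paper,
// before memory bandwidth and other real-world limits shave that down further.
let (turing, ampere) = effectiveFP32Lanes(intWorkShare: 0.25)
print(ampere / turing)   // 1.75
```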

To sum it up, no, they didn't double the number of cores. The cores were there already. They just taught them some new tricks. Looking forward to real-world benchmarks.

P.S. And now it turns out that 8K@60FPS gaming is actually ML-upscaled 1440p gaming...
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,658
OBX
As multiple people mentioned here, Ampere effectively doubles the number of CUDA cores. Some new information was released today, so now it's clear how they do it. As usual, it's mostly marketing.

You see, current Nvidia GPUs are ultimately organized around processing units (four such units make up a quad-core streaming multiprocessor, and the GPU is then built from these quad-core units). In Turing, a single processing unit contains a 16-wide FP32 ALU and a 16-wide INT32 ALU. This allows it to process floating point and integer (e.g. address calculation) operations simultaneously. In Ampere, Nvidia simply “taught” the INT32 ALU to do FP32 math. So on paper a processing unit can now do either an FP32+INT32 or an FP32+FP32 operation. This indeed doubles the peak theoretical performance, but it's very far from the magical performance doubling Nvidia's marketing suggests. A lot of games are FP32-heavy though, so this design definitely makes sense. I'd expect an improvement of 25-30% on average, more in games with very heavy shader math.

To sum it up, no, they didn't double the number of cores. The cores were there already. They just taught them some new tricks. Looking forward to real-world benchmarks.

P.S. And now it turns out that 8K@60FPS gaming is actually ML-upscaled 1440p gaming...
You have a link for the upscaling 8k?
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
You have a link for the upscaling 8k?

Straight from Nvidia:




explained here:


It's still one hell of a powerful GPU of course, no question. But let's not buy into the marketing propaganda more than we have to.
 
Last edited:

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,658
OBX
Straight from Nvidia:




explained here:


It's still one hell of a powerful GPU of course, no question. But let's not buy into the marketing propaganda more than we have to.
I dislike those Nvidia slides. They seem to be mixing titles that use DLSS and ones that don't, never admitting that the DLSS titles are being upscaled.
Gameplay "benchmarks" should clear things up.
 

Kostask

macrumors regular
Jul 4, 2020
230
104
Calgary, Alberta, Canada
multi-gpu hasn’t been a popular solution to increasing graphics performance (as Apple found out with the trash can Mac Pro) for a while now.
And if you look at the number of shader processors it has doubled (2080 to 3080). Even the 3070 has more shader processors than the 2080.

Correction:

"multi-gpu hasn’t been a popular solution to increasing graphics performance (as Apple found out with the trash can Mac Pro) for a while now."

It actually was, for quite a while, on the PC side of things. That is why SLI and Crossfire exist in the first place. It started declining when single GPUs became powerful enough not to need the extra help. It was very popular for a number of years on the highest-end gaming rigs. We also saw a brief burst of cards with two GPUs on board. And yes, I am talking about PCs, because we aren't going to see nVidia's RTX 30X0 on Macs any time soon.

The implementation on the trashcan Mac Pro was just not good. Simple as that. There are multi-GPU implementations on the current Mac Pro that seem to work well. Again, it doesn't matter, because there will not be any nVidia GPUs on Macs.

I can see AMD pushing Crossfire with two or more of their Big Navi GPUs, depending on how they actually perform. Most PC gaming motherboards still have more than one GPU slot. If the Big Navi GPUs can perform competitively with the RTX 3070, a pair of them may be able to compete with an RTX 3080 Ti or even a 3090. The current thinking on the flagship RX 6000 GPU is that it will come in at $549, so two of them at about $1,100 vs. $1,500 for an RTX 3090. It all depends on how Big Navi performs.
 

tdar

macrumors 68020
Jun 23, 2003
2,102
2,522
Johns Creek Ga.
Well, we have had Apple Silicon on iPads for a while now, with the GPU using shared memory with the CPU, so it's not all that new. The scaling to desktop components is going to be interesting though, and likely well ahead of the competition, but calling it a paradigm shift is a bit much IMHO.
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.
The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.
 
  • Like
Reactions: 2Stepfan

leman

macrumors Core
Oct 14, 2008
19,516
19,664
John Sculley, former Apple CEO

((He was never Apple's or Intel's CEO. He was a manager at Intel. He was, however, considered for the CEO position by Intel once.))

Edit: My mistake, I was thinking about Johny Srouji

The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.

No, it redraws the entire screen like everyone else.

Where did you get these “facts” anyway?
 
Last edited:

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,658
OBX
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.
The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.
How would that help improve the rendering performance of, say, Cyberpunk 2077? I can see it being useful for website rendering, where a large amount of what is displayed is whitespace and unchanging.
 

Andropov

macrumors 6502a
May 3, 2012
746
990
Spain
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.

John Sculley left Apple in 1993, so anything he says about Apple now is largely irrelevant. Besides, desktop CPU performance was increasing at about 22% per year last decade (it's even slower now), so for Apple CPUs to be 25 times more performant it would mean they have a 15-year lead over everyone else. As good as their CPUs may be (and I think they'll be very, very good), you're off by an order of magnitude.
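Just to spell out that arithmetic (a quick sketch; the ~22%/year figure is the rough estimate mentioned above):

```swift
import Foundation

// How many years of ~22%/year gains would it take to accumulate a 25x advantage?
let annualGain = 0.22          // rough yearly desktop CPU performance growth cited above
let claimedAdvantage = 25.0    // the "25 times the performance" claim
let yearsOfLead = log(claimedAdvantage) / log(1.0 + annualGain)
print(yearsOfLead)             // ≈ 16 years, i.e. a lead of well over a decade
```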

The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.

That's not really how it works. Besides, we've had Apple GPUs on iPads for years now and while they're good it's not a "paradigm shift".
 

dmccloud

macrumors 68040
Sep 7, 2009
3,138
1,899
Anchorage, AK
Your biggest problem with Apple silicon will be the lack of software you can actually use. It'll be more like a Chromebook running iPad-type apps, which could be fine for the masses. It will be locked down to the App Store to load anything anyway.

It doesn't even work that way on the Developer Transition Kit...
 

jinnyman

macrumors 6502a
Sep 2, 2011
762
671
Lincolnshire, IL
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.
The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.
Don't underestimate how competitive the GPU market once was. That TBDR concept, I believe, is not something that came out recently.

What good is it if you have to constantly change every pixel, such as in real-time rendering?
The level of performance shown by RTX 3080 and 3090 seems staggering.
If one concept is truly superior to the others, industries usually adopt it. I'm not an expert in computer graphics and rendering, but going by that description alone, there must be some kind of line that makes one way of rendering more beneficial than another. What if you are playing games where every pixel has to be redrawn 120 times a second? What if some graphic designer has to render their artwork? Based on how many pixels have to be drawn and how fast, there must be a line somewhere.

I can see that Apple's TBDR can be very good for power consumption and efficiency. If you are working on an iPad where most of the screen stays the same (like web pages or the home screen), that method has got to consume less power. But what if we have to go full power? Nvidia must have considered all the possibilities and chosen their method because it's the best/most efficient way for their target demographic.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Don't underestimate how competitive the GPU market once was. That TBDR concept, I believe, is not something that came out recently. [...] If one concept is truly superior to the others, industries usually adopt it.

TBDR is certainly not new, but it's more complicated and therefore more challenging to implement. Historically, there was a split where mobile went with TBDR (because of its superior energy efficiency) while desktop went with forward rendering (because it was easier to implement and scale, while energy efficiency was a secondary concern). TBDR traditionally suffered from a front-end bottleneck (geometry throughput), so it was optimal for mobile games, which featured lower-quality graphics anyway.

As to industries adapting... if you have literally decades of R&D invested into a particular approach and it works very well, there's not much point in pursuing an alternative with unclear benefits. Apple is in a slightly different position, since they currently have the fastest TBDR GPUs on the market — it makes perfect sense for them to build on that technology.

In the end, TBDR comes from the need to do "more with less" — it was developed to overcome memory bandwidth limitations on slower, power-constrained hardware. This is less important in a desktop (or even laptop) situation. And TBDR is certainly not a magic bullet. Yes, it allows one to get better performance with slower memory. But this will only apply to games; it won't do anything for compute tasks, for example. What is interesting about Apple's approach is not that they are leveraging TBDR. It is that they are opening the architectural details to the developers. You have direct access to the on-chip tile caches, which allows one to code some algorithms much more efficiently than would ever be possible with forward rendering.
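If you're curious what that access looks like in practice, here's a minimal host-side sketch in Swift. The shader library, the tile function name "resolveTile", and the already-open render encoder are assumptions for illustration, not a definitive implementation:

```swift
import Metal

// Minimal sketch: run a per-tile function that works on the on-chip tile memory
// (imageblock) in the middle of a render pass. Assumes a MTLDevice, a shader
// library containing a tile function named "resolveTile" (hypothetical), and an
// already-open MTLRenderCommandEncoder for the current render pass.
func encodeTileResolve(device: MTLDevice,
                       library: MTLLibrary,
                       encoder: MTLRenderCommandEncoder,
                       colorFormat: MTLPixelFormat) throws {
    let descriptor = MTLTileRenderPipelineDescriptor()
    descriptor.tileFunction = library.makeFunction(name: "resolveTile")!
    descriptor.colorAttachments[0].pixelFormat = colorFormat
    descriptor.threadgroupSizeMatchesTileSize = true

    let pipeline = try device.makeRenderPipelineState(tileDescriptor: descriptor,
                                                      options: [],
                                                      reflection: nil)

    // Dispatch one threadgroup per tile; each thread can read the pixel data
    // currently sitting in the tile cache without a round trip to main memory.
    encoder.setRenderPipelineState(pipeline)
    encoder.dispatchThreadsPerTile(MTLSize(width: encoder.tileWidth,
                                           height: encoder.tileHeight,
                                           depth: 1))
}
```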


The level of performance shown by RTX 3080 and 3090 seems staggering.

And so is the power consumption. Some early benchmarks suggest that the 3080 might be somewhere between 60-80% faster than the 2080... but it also consumes 50% more power! If you normalize for power usage, you get about a 10-20% improvement... which is still great, but suddenly not as impressive-looking to an average gamer. Which really tells you what sells on the market. Current Ampere designs are deliberately power-hungry monsters simply because people will buy that.
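Spelling out that normalization (just the arithmetic on the figures above):

```swift
// Perf-per-watt gain if the 3080 is 60-80% faster while drawing ~50% more power.
let speedupRange = (low: 1.6, high: 1.8)   // early-benchmark range mentioned above
let powerRatio = 1.5                        // ~50% higher power draw
let perfPerWatt = (low: speedupRange.low / powerRatio,    // ≈ 1.07
                   high: speedupRange.high / powerRatio)  // ≈ 1.20
// i.e. roughly a 10-20% improvement once you normalize for power.
```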

I think it's a bit sad that these outstanding engineering achievements are trivialized by the fact that the biggest gains still come from fatter, more power-hungry chips.

P.S. By the way, one particular thing I am grateful to Apple for is that they didn't join the others in all the "shader core" nonsense. They just say — "our GPUs have 8 cores" and that's it — instead of "512 cores" like everyone else does.
 
  • Like
Reactions: Andropov

tevion5

macrumors 68000
Jul 12, 2011
1,967
1,603
Ireland
  • Apple Arcade, while great on paper, has a selection of games that is very skewed toward kids and family minigames.

As a "proper" gamer I totally get what you're saying here but in Apple's defence this is not an invalid entry into the gaming market.

Back in the mid 2000s, when my go-to systems were the PSP and the PS3, I was always confused by how popular the Nintendo DS and Wii were despite the overwhelming majority of their libraries being very family/kid oriented, compared to the "serious gamer" games on PlayStation, Xbox and PC. The Wii also heavily outsold either console in that generation.

I doubt many would question the validity of Nintendo's gaming credentials. And I later discovered a lot of those goofy-looking Nintendo games are actually a lot of fun!

That said, I'd obviously love to be able to replace my Windows gaming experience for games like Fallout 4 or The Witcher III with macOS, but I really doubt that's going to happen anytime soon.
 
  • Like
Reactions: ChromeCloud

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,658
OBX
TBDR is certainly not new, but it's more complicated and therefore more challenging to implement. Historically, there was a split where mobile went with TBDR (because of its superior energy efficiency) while desktop went with forward rendering (because it was easier to implement and scale, while energy efficiency was a secondary concern). TBDR traditionally suffered from a front-end bottleneck (geometry throughput), so it was optimal for mobile games, which featured lower-quality graphics anyway.

As to industries adapting... if you have literally decades of R&D invested into a particular approach and it works very well, there's not much point in pursuing an alternative with unclear benefits. Apple is in a slightly different position, since they currently have the fastest TBDR GPUs on the market — it makes perfect sense for them to build on that technology.

In the end, TBDR comes from the need to do "more with less" — it was developed to overcome memory bandwidth limitations on slower, power-constrained hardware. This is less important in a desktop (or even laptop) situation. And TBDR is certainly not a magic bullet. Yes, it allows one to get better performance with slower memory. But this will only apply to games; it won't do anything for compute tasks, for example. What is interesting about Apple's approach is not that they are leveraging TBDR. It is that they are opening the architectural details to the developers. You have direct access to the on-chip tile caches, which allows one to code some algorithms much more efficiently than would ever be possible with forward rendering.

And so is the power consumption. Some early benchmarks suggest that the 3080 might be somewhere between 60-80% faster than the 2080... but it also consumes 50% more power! If you normalize for power usage, you get about a 10-20% improvement... which is still great, but suddenly not as impressive-looking to an average gamer. Which really tells you what sells on the market. Current Ampere designs are deliberately power-hungry monsters simply because people will buy that.

I think it's a bit sad that these outstanding engineering achievements are trivialized by the fact that the biggest gains still come from fatter, more power-hungry chips.

P.S. By the way, one particular thing I am grateful to Apple for is that they didn't join the others in all the "shader core" nonsense. They just say — "our GPUs have 8 cores" and that's it — instead of "512 cores" like everyone else does.
Has Apple ever shown an architectural diagram, like the ones below showing what each core looks like?
[Attached images: NVIDIA GeForce RTX 30 Series (Ampere GA102/GA104) deep-dive block diagram, plus a second architecture diagram]

I think I asked before what an Apple GPU core looks like. I can’t seem to find a similar block diagram that shows what Apple calls a core. And it isn’t clear if they kept the Unified Shader Cluster design from PVR and are just calling it a core now.

Average gamers aren't buying a 3080. They are probably going to get a 3070, which only consumes 5 more watts than the 2080, and less power than the 2080 Super or Ti for the same performance (let's assume Nvidia isn't lying).
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Has Apple ever shown an architectural diagram, like the ones below showing what each core looks like?

I don't think they did. Overall they are quite secretive with all this stuff. Looking at API constants etc. it seems that an Apple GPU core contains two 32-wide ALUs, but I haven't seen anything more detailed.

Average gamers aren't buying a 3080. They are probably going to get a 3070, which only consumes 5 more watts than the 2080, and less power than the 2080 Super or Ti for the same performance (let's assume Nvidia isn't lying).

That's why I am very interested in seeing some real-world benchmarks for the 3070. That GPU should give us a clear understanding of how much the architectural changes between Turing and Ampere have impacted performance.
 

Andropov

macrumors 6502a
May 3, 2012
746
990
Spain
TBDR is certainly not new, but it's more complicated and therefore more challenging to implement. Historically, there was a split where mobile went with TBDR (because of its superior energy efficiency) while desktop went with forward rendering (because it was easier to implement and scale, while energy efficiency was a secondary concern). TBDR traditionally suffered from a front-end bottleneck (geometry throughput), so it was optimal for mobile games, which featured lower-quality graphics anyway.

Interesting. Do you have any sources to read further on this?
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Interesting. Do you have any sources to read further on this?

For the TBDR architecture, Imagination's blogs are where I got most of my information: you can start with this and read the rest. If you are asking about GPU history, what I wrote is basically a compilation of bits and pieces I read here and there, plus some of my own thoughts, so unfortunately I can't give you an authoritative link...
 
  • Love
Reactions: Andropov

endlessike

macrumors member
Jun 8, 2010
80
72
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.

You don't actually believe that, do you? C'mon.
 

matrix07

macrumors G3
Jun 24, 2010
8,226
4,895
John Sculley, former Apple CEO, who you can bet still has contacts within the company, has said AS will deliver 25 times the performance of the fastest Intel chips.
The Apple GPU, when a pixel has to be changed on the display, only changes that one pixel. Every other provider has to redraw the entire screen.
If those are not paradigm shifts, I don’t know what is.

I believe you've confused "%" with "times".
 