
johngwheeler

macrumors 6502a
Original poster
Dec 30, 2010
639
211
I come from a land down-under...
There appears to be a fair bit of skepticism that Apple Silicon can achieve decent graphics performance without a discrete GPU, with the implied unsuitability of new Macs for games or intensive graphical applications.

Given that both the upcoming Xbox Series X and PlayStation 5 will have powerful GPUs on a SoC, I am curious as to why some people think that Apple won't be able to do the same.

I understand that the Sony and Microsoft consoles will be running their AMD SoCs at a high TDP (maybe >200W?), but this wouldn't be a problem for the iMac Pro, which can already run Intel Xeon + Radeon Pro GPU combinations with over 300W of combined TDP.

If the Xbox Series X will allegedly have performance close to an Nvidia RTX 2080 Ti on a SoC, then what would stop Apple from doing the same?

Obviously, a lower TDP would be needed for the laptops. I think current 16" MBPs have a TDP of about 100W for the combined CPU + dGPU, but this should still mean that it would be possible to have a pretty powerful 50-70W TDP GPU on the SoC, similar in capability to the current AMD dGPUs.

So I don't understand why there is doubt that it's possible to have a powerful GPU on Apple Silicon.

Or is it simply a lack of confidence that Apple has the expertise in this area to build one, compared to AMD and Nvidia, who have been producing GPUs for longer?
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
So I don't understand why there is doubt that it's possible to have a powerful GPU on Apple Silicon.

Because people like their preconceived notions and because looking at details is difficult. It's much easier to operate with notions like "iGPU slow, dGPU fast" than to try to understand the technology and what actually makes things slow or fast. In other words, the answer to your question is ignorance and laziness.
 
Last edited:

Apple Knowledge Navigator

macrumors 68040
Mar 28, 2010
3,690
12,911
Possibly very little, but do consider the machines that this would actually be practical for.

Take for instance the Mac Pro. Sure, Apple could put a similar SoC inside that product, but as a business they need to be able to scale their profits up and down. The selling point of the Mac Pro is that it's very modular, enabling the customer to customise the computer to their needs over time. Instead, Apple would benefit more from continuing to sell additional MPX modules with different levels of graphics power, because the profit margins would be far greater than those from manufacturing their best possible SoC for every machine.

And as has been pointed out numerous times, not every 'power' customer requires lots of graphics compute power. Therefore, Apple will no doubt design some fantastic CPUs, but I don't believe that the embedded graphics will be anything in that power range.
 

johngwheeler

macrumors 6502a
Original poster
Dec 30, 2010
639
211
I come from a land down-under...
Possibly very little, but do consider the machines that this would actually be practical for.

Take for instance the Mac Pro. Sure, Apple could put a similar SoC inside that product, but as a business they need to be able to scale their profits up and down. The selling point of the Mac Pro is that it's very modular, enabling the customer to customise the computer to their needs over time. Instead, Apple would benefit more from continuing to sell additional MPX modules with different levels of graphics power, because the profit margins would be far greater than those from manufacturing their best possible SoC for every machine.

And as has been pointed out numerous times, not every 'power' customer requires lots of graphics compute power. Therefore, Apple will no doubt design some fantastic CPUs, but I don't believe that the embedded graphics will be anything in that power range.

Your points are valid, but that wasn't really the question I was asking! I'm sure that Apple will create various levels of SoC with different GPU capabilities, depending on the model and price point.

My question was whether there is any underlying problem with creating a powerful GPU on an SoC, given that Sony and Microsoft appear to have done just this.

To combine both points, I think that Apple will probably produce a reasonably good GPU in the first Apple Silicon Macs that will be good enough to impress but not so outstanding that it will eclipse existing dGPUs.
 
  • Like
Reactions: UnbreakableAlex

ChromeCloud

macrumors 6502
Jun 21, 2009
359
840
Italy
I totally agree; the skepticism is mostly unfounded.

Let's talk numbers by comparing Metal GPU performance figures...

Fastest Apple GPU to date (current iPad Pro) - Apple A12Z --> 9105.
Fastest GPU option available for current MacBook Pro 13" - Intel Iris Plus --> 8499 (7% slower).
Fastest GPU option available for current MacBook Pro 16" - AMD Radeon Pro 5600M --> 40714 (3.5x faster).
Fastest GPU option available for current iMac 27" - AMD Radeon Pro Vega 48 --> 49589 (4.4x faster).

So basically Apple needs roughly a 4x jump in performance to match the current dedicated GPU offerings.

A 4x increase in performance over the A12Z seems perfectly achievable considering that the A12Z is a slightly revised two-year-old SoC (the A12X) designed to work within the thermal constraints of the iPad Pro.

When we talk about CPU performance, the A13 found in the iPhone 11 is already faster in single core performance than the fastest available Intel processor you can get in a Mac (both laptops and desktops), so Apple has already proved it can match and exceed (and even embarrass) its competition when it comes to CPU performance.

But Apple still has to prove it can do the same for GPU performance (even though I'm convinced they will succeed, considering the miraculously low power consumption of the A12Z), so I think that's where the skepticism originates.
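
If anyone wants to sanity-check those multiples, here's a quick Swift sketch using the scores quoted above (the numbers are just the ones from this post, and reading "x faster" as the relative gap over the A12Z is my interpretation, not something the benchmark defines):

```swift
import Foundation

// Sanity check of the multiples above, using the Metal scores quoted in this post.
let a12z = 9_105.0          // Apple A12Z (current iPad Pro)
let irisPlus = 8_499.0      // Intel Iris Plus (MacBook Pro 13")
let radeon5600M = 40_714.0  // AMD Radeon Pro 5600M (MacBook Pro 16")
let vega48 = 49_589.0       // AMD Radeon Pro Vega 48 (iMac 27")

// "x faster" read as the relative gap: (other - a12z) / a12z.
func gap(_ other: Double) -> Double { (other - a12z) / a12z }

print(String(format: "Iris Plus: %.0f%% slower", (a12z - irisPlus) / a12z * 100)) // ~7%
print(String(format: "5600M:     %.1fx faster", gap(radeon5600M)))                // ~3.5x
print(String(format: "Vega 48:   %.1fx faster", gap(vega48)))                     // ~4.4x
```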
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
There appears to be a fair bit of skepticism that Apple Silicon can achieve decent graphics performance without a discrete GPU, with the implied unsuitability of new Macs for games or intensive graphical applications.

Given that both the upcoming Xbox Series X and PlayStation 5 will have powerful GPUs on a SoC, I am curious as to why some people think that Apple won't be able to do the same.

I understand that the Sony and Microsoft consoles will be running their AMD SoCs at a high TDP (maybe >200W?), but this wouldn't be a problem for the iMac Pro, which can already run Intel Xeon + Radeon Pro GPU combinations with over 300W of combined TDP.

If the Xbox Series X will allegedly have performance close to an Nvidia RTX 2080 Ti on a SoC, then what would stop Apple from doing the same?

Obviously, a lower TDP would be needed for the laptops. I think current 16" MBPs have a TDP of about 100W for the combined CPU + dGPU, but this should still mean that it would be possible to have a pretty powerful 50-70W TDP GPU on the SoC, similar in capability to the current AMD dGPUs.

So I don't understand why there is doubt that it's possible to have a powerful GPU on Apple Silicon.

Or is it simply a lack of confidence that Apple has the expertise in this area to build one, compared to AMD and Nvidia, who have been producing GPUs for longer?
Nothing stops Apple from trying to do the same.
I totally agree; the skepticism is mostly unfounded.

Let's talk numbers by comparing Metal GPU performance figures...

Fastest Apple GPU to date (current iPad Pro) - Apple A12Z --> 9105.
Fastest GPU option available for current MacBook Pro 13" - Intel Iris Plus --> 8499 (7% slower).
Fastest GPU option available for current MacBook Pro 16" - AMD Radeon Pro 5600M --> 40714 (3.5x faster).
Fastest GPU option available for current iMac 27" - AMD Radeon Pro Vega 48 --> 49589 (4.4x faster).

So basically Apple needs roughly a 4x jump in performance to match the current dedicated GPU offerings.

A 4x increase in performance over the A12Z seems perfectly achievable considering that the A12Z is a slightly revised two-year-old SoC (the A12X) designed to work within the thermal constraints of the iPad Pro.

When we talk about CPU performance, the A13 found in the iPhone 11 is already faster in single core performance than the fastest available Intel processor you can get in a Mac (both laptops and desktops), so Apple has already proved it can match and exceed (and even embarrass) its competition when it comes to CPU performance.

But Apple still has to prove it can do the same for GPU performance (even though I'm convinced they will succeed, considering the miraculously low power consumption of the A12Z), so I think that's where the skepticism originates.
Assuming all Apple does is add GPU cores, to hit 12 TF they would need ~72 cores. To match the 4x increase, just adding cores would put them at 32, and that would make it slightly slower than an Xbox One X.
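
Here is a rough sketch of that core math in Swift, assuming (big assumption) that FP32 throughput scales linearly with core count, and taking the commonly quoted ~1.3 TFLOPS estimate for the A12Z's 8-core GPU, which is not an official Apple figure:

```swift
// Back-of-the-envelope GPU core scaling under a linear-scaling assumption.
let a12zCores = 8.0
let a12zTFLOPS = 1.3                          // rough public estimate, not an Apple number
let tflopsPerCore = a12zTFLOPS / a12zCores    // ~0.16 TFLOPS per core

let coresFor12TF = 12.0 / tflopsPerCore       // ~74 cores, i.e. in the ~72-core ballpark
let fourXCores = a12zCores * 4                // 32 cores, the "4x" scenario above
let fourXTFLOPS = fourXCores * tflopsPerCore  // ~5.2 TFLOPS, a bit under the Xbox One X's ~6

print(coresFor12TF, fourXTFLOPS)
```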
 

Pressure

macrumors 603
May 30, 2006
5,178
1,544
Denmark
Nothing stops Apple from trying to do the same.

Assuming all Apple does is add GPU cores, to hit 12 TF they would need ~72 cores. To match the 4x increase, just adding cores would put them at 32, and that would make it slightly slower than an Xbox One X.

Teraflops is a bad way to measure performance, especially when comparing something like an Immediate Mode Renderer with a Tile-Based Deferred Renderer.

You would need to measure actual performance in an application.
 
Last edited:

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
Teraflops is a bad way to measure performance, especially when comparing something like an Immediate Mode Renderer with a Tile-Based Deferred Renderer.

You would need to measure actual performance in an application.
For a given TFLOP level, Metal compute should be similar. For actual rendering I agree, as AMD has historically been good at crunching numbers but not so good at running games (compared to, say, Nvidia). Games that show off the power of TBDR compared to IMR don't really appear to be common.
 

Pressure

macrumors 603
May 30, 2006
5,178
1,544
Denmark
For a given TFLOP level, Metal compute should be similar. For actual rendering I agree, as AMD has historically been good at crunching numbers but not so good at running games (compared to, say, Nvidia). Games that show off the power of TBDR compared to IMR don't really appear to be common.

Keep in mind that the A12X/Z operates with a paltry ~30GB/s of memory bandwidth shared between the CPU and GPU cores.

Still, it matches the performance of the Xbox One S, which has a 256-bit memory bus and over twice the memory bandwidth, not to mention 32MB of ESRAM giving 219GB/s of bandwidth.

Compute workloads are often limited by memory bandwidth, which you can easily see on AMD's Vega 10/20 chips.
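
For reference, peak bandwidth is just bus width times transfer rate, so the gap is easy to reproduce. A small Swift sketch (the Xbox One S figures of a 256-bit bus at 2133 MT/s are the commonly cited ones; the 128-bit line is purely illustrative, not a claim about the A12X/Z's actual memory configuration):

```swift
// Peak memory bandwidth: bytes per transfer (bus width / 8) times transfers per second.
func peakBandwidthGBs(busBits: Double, megaTransfersPerSec: Double) -> Double {
    (busBits / 8.0) * megaTransfersPerSec / 1_000.0
}

// Xbox One S main memory: 256-bit DDR3 at 2133 MT/s -> ~68 GB/s,
// a bit over twice the ~30 GB/s quoted above for the A12X/Z.
print(peakBandwidthGBs(busBits: 256, megaTransfersPerSec: 2133))

// A narrower 128-bit interface at the same rate lands around ~34 GB/s,
// which is the same ballpark as that ~30 GB/s estimate.
print(peakBandwidthGBs(busBits: 128, megaTransfersPerSec: 2133))
```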
 
Last edited:
  • Like
Reactions: burgerrecords

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
Keep in mind that the A12X/Z operates with a paltry ~30GB/s of memory bandwidth shared between the CPU and GPU cores.

Still, it matches the performance of the Xbox One S, which has a 256-bit memory bus and over twice the memory bandwidth, not to mention 32MB of ESRAM giving 219GB/s of bandwidth.

Compute workloads are often limited by memory bandwidth, which you can easily see on AMD's Vega 10/20 chips.
I still don't understand how folks are comparing the graphics performance between the A12X/Z and the Xbox One S, aside from Apple saying so.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Games that show off the power of TBDR compared to IMR don't really appear to be common.

You don't need to do anything particular to show the advantage of the TBDR approach. Look at this review of the 2018 iPad Pro, where it outperforms a dedicated Nvidia GPU while consuming around 1/4-1/3 of the power.
I still don't understand how folks are comparing the graphics performance between the A12X/Z and the Xbox One S, aside from Apple saying so.

That’s also something I’d like to know.
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
You don't need to do anything particular to show the advantage of the TBDR approach. Look at this review of the 2018 iPad Pro, where it outperforms a dedicated Nvidia GPU while consuming around 1/4-1/3 of the power.


That’s also something I’d like to know.
Yeah, the second paragraph of that URL has me asking a whole lot of questions...
In rough terms, the Xbox One S is roughly 1.4 TFLOPS at its peak. But for better or worse, when the PC moved to unified shaders, the industry moved to FP32 for all GPU functions. This is as opposed to the mobile world, where power is an absolute factor for everything: vertex shaders are typically 32bpc, while pixel and compute shaders can often be 16bpc. We've seen some movement on the PC side to use half-precision GPUs for compute, but for gaming, that's not currently the case.
 

JacobHarvey

macrumors regular
Apr 2, 2019
118
107
Somewhere
Yeah, the second paragraph of that URL has me asking a whole lot of questions...


The next paragraph expands further. It seems like the benchmarks on the A12 iPad have its GPU run many functions at half precision (FP16), which likely gives it some performance boost vs PC GPUs that have to run all functions at full precision (FP32). The benchmarks therefore give us a somewhat rough idea of performance between the A12 and those GPUs, rather than being a perfect comparison of raw GPU performance.

"Overall, that makes like-for-like PC comparisons difficult. An AMD Ryzen 2700U SoC has a Vega GPU which offers 1.66 TFLOPS of FP32 performance, in theory. If run at 16-bit, that number would double, in theory. The iPad Pro would likely use half-precision for some of the GPU workload. This has been an issue for years and has made it difficult easily compare any cross-platform benchmark against the PC."
 
Last edited:

Pressure

macrumors 603
May 30, 2006
5,178
1,544
Denmark
So in the end PCs (and consoles?) are doing double the work.

No, some workloads might be half-precision without us knowing which ones. This is a problem with the software.

Again, you need to focus on real workloads and not rely on benchmarks to give the broader picture.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Modern desktop GPUs have fast FP16 units and the drivers often contain “optimizations” that will downgrade the precision for some popular games where appropriate. For most color operations, there is no difference between 32 bit and 16 bit precision. These optimizations make benchmarking non-trivial. I don’t think that popular benchmarks do correctness checking to catch precision degradation.

I have no idea whether Apple actively manipulates precision in fragment shaders; their documentation seems to suggest that they do not. They do encourage developers to use half precision explicitly where it is sufficient. I don't know whether mobile graphics benchmarks use different precision than the desktop ones (I hope not, that would be awkward...)
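
For anyone curious what "precision degradation" actually looks like numerically, here is a minimal Swift sketch (Float16 needs a recent toolchain and isn't available on Intel Macs); a benchmark that only times the work, with no correctness check, would accept both results equally:

```swift
// Accumulate the same value in FP32 and FP16 and compare the results.
let step: Float = 0.1
var sum32: Float = 0
var sum16: Float16 = 0

for _ in 0..<10_000 {
    sum32 += step
    sum16 += Float16(step)   // each addition rounds to the nearest representable half
}

print("FP32 accumulator:", sum32)  // ~1000, with only a small rounding error
print("FP16 accumulator:", sum16)  // stalls around 256, where 0.1 is below half a ulp
```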
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
No, some workloads might be half-precision without us knowing which ones. This is a problem with the software.

Again, you need to focus on real workloads and not rely on benchmarks to give the broader picture.
So the only real workload that we can sort of compare would be the Shadow of the Tomb Raider Rosetta demo?
Modern desktop GPUs have fast FP16 units and the drivers often contain “optimizations” that will downgrade the precision for some popular games where appropriate. For most color operations, there is no difference between 32 bit and 16 bit precision. These optimizations make benchmarking non-trivial. I don’t think that popular benchmarks do correctness checking to catch precision degradation.

I have no idea whether Apple actively manipulates precision in fragment shaders. They do encourage developers to use half precision explicitly where it is sufficient. I don't know whether mobile graphics benchmarks use different precision than the desktop ones (I hope not, that would be awkward...)
Interestingly, the Unity development page flat out says...
https://docs.unity3d.com/2020.2/Documentation/Manual/SL-DataTypesAndPrecision.html said:
One complication of float/half/fixed data type usage is that PC GPUs are always high precision. That is, for all the PC (Windows/Mac/Linux) GPUs, it does not matter whether you write float, half or fixed data types in your shaders. They always compute everything in full 32-bit floating point precision.

The half and fixed types only become relevant when targeting mobile GPUs, where these types primarily exist for power (and sometimes performance) constraints. Keep in mind that you need to test your shaders on mobile to see whether or not you are running into precision/numerical issues.
 

Erehy Dobon

Suspended
Feb 16, 2018
2,161
2,017
No service
My question was whether there is any underlying problem with creating a powerful GPU on an SoC, given that Sony and Microsoft appear to have done just this.
No.

There are tons of similar GPU units in data centers all around the world right now.

High-performance GPU architecture made cryptocurrency mining on CPUs obsolete years ago.

What you are asking happened about five years ago.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Interestingly, the Unity development page flat out says...

That page must not be entirely up to date. Nvidia has supported half-precision floats since Turing, and AMD since Vega, if I remember correctly. On these GPUs, using half precision can bring a performance advantage, same as on mobile chips. Interestingly enough, half precision used to be a big thing on the desktop a decade or so ago, but as transistor budgets and performance increased, it was dropped from consumer products. Interest in machine learning as well as more sophisticated shading algorithms brought half-float back.
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
That page must not be entirely up to date. Nvidia has supported half-precision floats since Turing, and AMD since Vega, if I remember correctly. On these GPUs, using half precision can bring a performance advantage, same as on mobile chips. Interestingly enough, half precision used to be a big thing on the desktop a decade or so ago, but as transistor budgets and performance increased, it was dropped from consumer products. Interest in machine learning as well as more sophisticated shading algorithms brought half-float back.
This page talks about having to use min16float to get half precision in HLSL, as otherwise half gets mapped to float by the compiler. Explicit FP16 isn't supported in D3D11 at all; you would have to run D3D12.
 

Erehy Dobon

Suspended
Feb 16, 2018
2,161
2,017
No service
Sony streamed a technical presentation about the upcoming PS5 in June.

This page includes an image from Sony's slide deck that shows the CPU and GPU on the same custom chip. Since the presentation was given by Sony's lead architect on the project, I assume he knew what he was talking about.

I would not be surprised if there is a similar article about the Xbox Series X on the same site, but I did not bother to search for it myself. I will leave that as an exercise for you.

My guess is that johngwheeler is not making stuff up, unlike some people here.
 
  • Like
Reactions: Mojo1019