
ian87w

macrumors G3
Feb 22, 2020
8,704
12,638
Indonesia
I think Apple's way would be to look at how GPUs are being used by pros. For example, video encoding and decoding: instead of just tacking on a dGPU, Apple created the media engine. I have a feeling the trend will be more of these specialized parts of the silicon accelerating those tasks, instead of having the GPU do everything. The M1 and M2 already have an AI accelerator. I can see more and more of these accelerators rendering the need for a traditional discrete GPU obsolete.
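Just to illustrate the point, this is roughly what using the media engine looks like from the developer side. A minimal Swift/VideoToolbox sketch; the resolution and codec are my own placeholders, and the output callback is left out:

import VideoToolbox

// Rough sketch: ask VideoToolbox for a hardware (media engine) H.264 encoder.
// Real code would supply an output handler and feed frames via
// VTCompressionSessionEncodeFrame.
var session: VTCompressionSession?
let spec = [kVTVideoEncoderSpecification_RequireHardwareAcceleratedVideoEncoder: true] as CFDictionary

let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 1920,
    height: 1080,
    codecType: kCMVideoCodecType_H264,
    encoderSpecification: spec,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,
    refcon: nil,
    compressionSessionOut: &session
)

if status == noErr, let session = session {
    // Frames pushed into this session are compressed by the media engine,
    // leaving the CPU and GPU cores free for other work.
    print("Hardware encode session ready: \(session)")
} else {
    print("No hardware encoder available, status \(status)")
}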
 

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
I think Apple's way would be to look at how GPUs are being used by pros. For example, video encoding and decoding: instead of just tacking on a dGPU, Apple created the media engine. I have a feeling the trend will be more of these specialized parts of the silicon accelerating those tasks, instead of having the GPU do everything. The M1 and M2 already have an AI accelerator. I can see more and more of these accelerators rendering the need for a traditional discrete GPU obsolete.
Yes. GPUs are increasingly being thought of as general processing units rather than pure graphics hardware.
Dedicated accelerators may be a better idea.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
Yes. GPUs are increasingly being thought of as general processing units rather than pure graphics hardware.
Dedicated accelerators may be a better idea.

You can't solve every problem with a dedicated accelerator. Accelerators can vastly improve your capabilities in certain domains, but a lot of applications still require general programmability. What's more, capable general-purpose machines allow developers to come up with new approaches and solutions.

To be clear: I have nothing against special accelerators; they are very useful and I want to see this model developed further and further. But it cannot happen at the expense of programmable power.
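As a minimal sketch of what that programmability looks like in practice (the kernel name and sizes are just illustrative, and error handling is trimmed): any algorithm you can phrase as a compute kernel runs on the same GPU cores, no dedicated block required.

import Metal

// Tiny general-purpose Metal compute kernel: double every element of a buffer.
let source = """
#include <metal_stdlib>
using namespace metal;
kernel void doubleValues(device float *data [[buffer(0)]],
                         uint id [[thread_position_in_grid]]) {
    data[id] = data[id] * 2.0f;
}
"""

let device = MTLCreateSystemDefaultDevice()!
let queue = device.makeCommandQueue()!
let library = try! device.makeLibrary(source: source, options: nil)
let pipeline = try! device.makeComputePipelineState(function: library.makeFunction(name: "doubleValues")!)

var input = [Float](repeating: 1.5, count: 1024)
// Unified memory: a shared-storage buffer is visible to both the CPU and the GPU.
let buffer = device.makeBuffer(bytes: &input,
                               length: input.count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

let commandBuffer = queue.makeCommandBuffer()!
let encoder = commandBuffer.makeComputeCommandEncoder()!
encoder.setComputePipelineState(pipeline)
encoder.setBuffer(buffer, offset: 0, index: 0)
encoder.dispatchThreads(MTLSize(width: input.count, height: 1, depth: 1),
                        threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
encoder.endEncoding()
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// Results come straight back from the same buffer, no copy over a bus required.
let results = buffer.contents().bindMemory(to: Float.self, capacity: input.count)
print(results[0])   // 3.0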
 

Colstan

macrumors 6502
Jul 30, 2020
330
711
What AMD is currently doing is mostly about optimising cost (and maybe production capacity). I don't think that Apple is that concerned about this, unless the costs of newer nodes truly become astronomical. I could see them doing more modular packages in the future, just to offer more flexible configurations, but I wouldn't be surprised if they don't go that way either.
That's the impression I get as well. Gamers Nexus did an interview with AMD engineer Sam Naffziger about RDNA 3. While he tends to stress the technical benefits of the chiplet approach, I think it's fairly obvious that AMD is doing it to save money.

Simply put, with its gigantic war chest, Apple doesn't have this problem. AMD is doing much better these days, but they're dwarfed by Apple's budget for the latest nodes. Apple can integrate with impunity, which I think is making some folks skittish about the next Mac Pro, since it's supposed to be a high-performance desktop built using Apple's "new way" of design. We've become so accustomed to the PC mindset that any other way of achieving that goal is going to unsettle some folks.

I'm not in the market for an Apple Silicon Mac Pro and never will be, but since the Mac Pro is the last remaining Intel Mac, the end result is infinitely fascinating... and I suspect predictable based upon the other Apple Silicon Macs, but that's just my guess.
 
  • Like
Reactions: altaic

leman

macrumors Core
Oct 14, 2008
19,521
19,673
Simply put, with its gigantic war chest, Apple doesn't have this problem. AMD is doing much better these days, but they're dwarfed by Apple's budget for the latest nodes.


It’s not just that; it’s mostly the fact that Apple doesn’t need their chips to be cost-competitive on the market. They can literally afford to spend $1,000 to produce a single high-end chip and it will still cost them less than buying comparable tech from other companies. That’s the big advantage of doing it in-house, if you have the capabilities, of course.
 

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
You can't solve every problem with a dedicated accelerator. Accelerators can vastly improve your capabilities in certain domains, but a lot of applications still require general programmability. What's more, capable general-purpose machines allow developers to come up with new approaches and solutions.
Fair enough. And it explains why GPUs have come to their current position in the computing space. So future ideas can thrive in the GPGPU space.

But if solutions for something like, say, physics (fluid/particle/rigid-body, etc.) have been developed, they have a big market use case, and a dedicated accelerator can do the job, say, 2x as fast as a GPU, wouldn't it make sense to create such an accelerator?

If a few years down the line a superior solution is developed that requires a different kind of accelerator, then perhaps creating a new one would make sense?

On the other hand, dedicated solutions have existed in the past too, but economies of scale tilted in favour of general-purpose solutions.

Apple, though, has the heft to continue creating accelerators as a differentiator. I mean, even in games, its iOS offerings (and a possible AR/VR headset) can function like the console market.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
But if solutions for something like, say, physics (fluid/particle/rigid-body, etc.) have been developed, they have a big market use case, and a dedicated accelerator can do the job, say, 2x as fast as a GPU, wouldn't it make sense to create such an accelerator?

If a few years down the line a superior solution is developed that requires a different kind of accelerator, then perhaps creating a new one would make sense?

On the other hand, dedicated solutions have existed in the past too, but economies of scale tilted in favour of general-purpose solutions.

I think that’s exactly how technology develops. The big push into ML happened mostly because there was suddenly a lot of cheap computational power (GPGPU) to play with, and as the domain matured dedicated hardware solutions were developed. If something similar can be done for physics simulations, why not (of course, those units probably would be a waste of space in a prosumer laptop)? Right now there is a big push into specialized matrix coprocessors, and I’m sure these things will transform and adapt as we move on.
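For a sense of how that already looks on the software side, here is a minimal Core ML sketch; the model path is a placeholder, and the framework ultimately decides where the work actually runs:

import CoreML

// Sketch: prefer the dedicated ML accelerator (Neural Engine) over the GPU.
// "SomeModel.mlmodelc" stands in for any compiled Core ML model;
// .cpuAndNeuralEngine needs macOS 13+, .all is the older catch-all.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine

do {
    let url = URL(fileURLWithPath: "SomeModel.mlmodelc")
    let model = try MLModel(contentsOf: url, configuration: config)
    print("Model loaded, Neural Engine preferred: \(model.modelDescription)")
} catch {
    print("Could not load model: \(error)")
}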
 
  • Like
Reactions: singhs.apps

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
I think that’s exactly how technology develops. The big push into ML happened mostly because there was suddenly a lot of cheap computational power (GPGPU) to play with, and as the domain matured dedicated hardware solutions were developed.
I believe the biggest change from previous "AI" was the development of a new AI algorithm that did a much better job of finding patterns. My colleague in the computing department explained it once, but I forgot the details. As this was research, the researchers probably had a supercomputer at hand during the initial phases, or were very patient using a PC workstation. ML in most cases is simply pattern recognition in data, so essentially a stupid but very usable AI. As it is usable, dedicated hardware was developed, as we also see for ray-tracing hardware and video encoders.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
I believe the biggest change from previous "AI" was the development of a new AI algorithm that did a much better job of finding patterns. My colleague in the computing department explained it once, but I forgot the details. As this was research, the researchers probably had a supercomputer at hand during the initial phases, or were very patient using a PC workstation. ML in most cases is simply pattern recognition in data, so essentially a stupid but very usable AI. As it is usable, dedicated hardware was developed, as we also see for ray-tracing hardware and video encoders.

Modern ML does not rely on pattern recognition per se. It uses signal propagation networks that remix data in complex ways and “learn” by tweaking the data flow until a useful result can be produced. Think about it this way: if you convert pixel data to numbers in some way and then combine these numbers in some other way, you can get a number that gives you the likelihood that the picture depicts a horse. You are not actually actively searching the image for a horse-like pattern; it just happens, since a horse image is likely to exhibit certain features that your number-mixing network can isolate. I would characterize this as pattern abstraction rather than pattern search. Pattern search is heuristic; deep learning is not.
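To make that number mixing concrete, here is a toy sketch with made-up weights; a real network has millions of learned parameters, but the mechanics are the same:

import Foundation

// Toy "network": mix four pixel values into a single horse-likelihood score.
// All the weights are invented for illustration; in practice they are learned.
func sigmoid(_ x: Double) -> Double { 1.0 / (1.0 + exp(-x)) }

let pixels: [Double] = [0.1, 0.8, 0.4, 0.9]
let hiddenWeights: [[Double]] = [[ 0.5, -0.2,  0.8,  0.1],
                                 [-0.3,  0.9,  0.4, -0.7]]
let outputWeights: [Double] = [1.2, -0.6]

// Layer 1: each hidden unit blends all the pixels into one number.
let hidden = hiddenWeights.map { row in
    sigmoid(zip(row, pixels).map { $0.0 * $0.1 }.reduce(0, +))
}
// Layer 2: blend the hidden numbers into a single score between 0 and 1.
let horseLikelihood = sigmoid(zip(outputWeights, hidden).map { $0.0 * $0.1 }.reduce(0, +))
print("P(horse) ≈ \(horseLikelihood)")   // no explicit "ears and tail" rules anywhere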

The basic methodology was known for decades, but the computational capability was missing. Parallel processors are very good at this particular kind of computation, and the programmability of GPUs as massively parallel processors allowed people to play around with more complex networks, sending the field into overdrive.

Pattern search still has its place of course, but it’s a different topic.
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
Modern ML does not rely on pattern recognition per se. It uses signal propagation networks that remix data in complex ways and “learn” by tweaking the data flow until a useful result can be produced. Think about it this way: if you convert pixel data to numbers in some way and then combine these numbers in some other way, you can get a number that gives you the likelihood that the picture depicts a horse. You are not actually actively searching the image for a horse-like pattern; it just happens, since a horse image is likely to exhibit certain features that your number-mixing network can isolate. I would characterize this as pattern abstraction rather than pattern search. Pattern search is heuristic; deep learning is not.

The basic methodology was known for decades, but the computational capability was missing. Parallel processors are very good at this particular kind of computation, and the programmability of GPUs as massively parallel processors allowed people to play around with more complex networks, sending the field into overdrive.

Pattern search still has its place of course, but it’s a different topic.
Isn't identifying unique features in complex data pattern recognition? I have never seen ML being used for anything other than finding patterns in complex data. It can be a horse in a picture, as you say, a biomarker in clinical data, finding a signal in noisy data, or speech recognition. In all of these examples data is processed with the aim of finding a pattern, or finding patterns that can be connected to a word (images, sound files) or a disease (biomarkers).

I agree that GPGPU made it widespread, but the computational power to demonstrate its usability was there long before GPGPU became mainstream.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Isn't identifying unique features in complex data pattern recognition? I have never seen ML being used for anything other than finding patterns in complex data. It can be a horse in a picture, as you say, a biomarker in clinical data, finding a signal in noisy data, or speech recognition. In all of these examples data is processed with the aim of finding a pattern, or finding patterns that can be connected to a word (images, sound files) or a disease (biomarkers).
Deep learning uses brute force, whereas earlier image recognition relied on hand-crafted features with no inherent meaning, such as Haar-like features.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
Isn't identifying unique features in complex data pattern recognition? I have never seen ML being used for anything other than finding patterns in complex data. It can be a horse in a picture, as you say, a biomarker in clinical data, finding a signal in noisy data, or speech recognition. In all of these examples data is processed with the aim of finding a pattern, or finding patterns that can be connected to a word (images, sound files) or a disease (biomarkers).

Even if the end result is the same, the path is different. Of course, it depends on your terminology. Usually when one talks about pattern search (or recognition), it's about heuristics-based methods, e.g. if you are looking for a cat, look for something with two triangular ears and a fluffy tail. Modern deep learning doesn't really work this way; it just aggregates the information in some (often non-transparent) way to find associations between input and output.

It's a bit like the distinction between external and internal knowledge. In pattern search you algorithmize external knowledge (i.e. what you know is relevant about the thing you want to find). In modern deep learning you let a network acquire its own internal knowledge instead.
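A crude sketch of that distinction, with all names and numbers invented for illustration:

import Foundation

struct ImageFeatures {
    let hasTriangularEars: Bool
    let hasFluffyTail: Bool
    let pixels: [Double]
}

// Pattern search: external knowledge, written down by a programmer as explicit rules.
func looksLikeCatHeuristic(_ img: ImageFeatures) -> Bool {
    img.hasTriangularEars && img.hasFluffyTail
}

// Deep learning: internal knowledge, encoded as learned weights with no human-readable rules.
func looksLikeCatLearned(_ img: ImageFeatures, weights: [Double], bias: Double) -> Double {
    let score = zip(img.pixels, weights).map { $0.0 * $0.1 }.reduce(bias, +)
    return 1.0 / (1.0 + exp(-score))
}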
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
BTW, according to AMD's nomenclature (at least as presented here), Apple Silicon does not have a GPU, since the Apple GPU does not output any video signals. It just takes data from memory, does some processing on it, and writes it back to memory. What happens to that data afterwards is none of the GPU's concern. There is a completely independent hardware unit that reads this data and sends it to the display.
Do traditional integrated GPUs from Intel and AMD have video outputs, or do they also send the data back to the CPU?

If the latter, and if it were the case that AMD's nomenclature required devices to have video outs to be considered GPUs, then AMD would not label their own integrated GPUs as GPUs, but they do.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,663
OBX
Do traditional integrated GPUs from Intel and AMD have video outputs, or do they also send the data back to the CPU?

If the latter, and if it were the case that AMD's nomenclature required devices to have video outs to be considered GPUs, then AMD would not label their own integrated GPUs as GPUs, but they do.
Are you referring to their desktop or their laptop designs? They are different, as far as I can tell.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
Do traditional integrated GPUs from Intel and AMD have video outputs, or do they also send the data back to the CPU?

I don't know whether the display controller is nominally part of the GPU or a separate device on the SoC. Not that it matters, really. All this is just to show how ridiculous these kinds of "definitions" can be.

Fun fact: many dGPU-equipped laptops send the frame buffer data back to the iGPU since the dGPU is not connected to the internal display. E.g. Nvidia Optimus.
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
All this is just to show how ridiculous these kinds of "definitions" can be.
This is getting crazy. The thread title has three words in it. We have discussed two of them: dedicated and GPU.
Should we discuss what Apple means?
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,663
OBX
I don't know whether the display controller is nominally part of the GPU or a separate device on the SoC. Not that it matters, really. All this is just to show how ridiculous these kinds of "definitions" can be.

Fun fact: many dGPU-equipped laptops send the frame buffer data back to the iGPU since the dGPU is not connected to the internal display. E.g. Nvidia Optimus.
Yeah, MUX switches are a newer thing to look for in a PC laptop these days.
This is getting crazy. The thread title has three words in it. We have discussed two of them: dedicated and GPU.
Should we discuss what Apple means?
Aside from us all agreeing that Apple doing a "traditional" dedicated GPU would break what they are telling developers to expect from Apple Silicon GPUs, what else is there to talk about?
 

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
I am assuming, with the way the whole Unified Memory Architecture thing works, the best we can hope for in relation to add-in graphics processing cards would be an ASi GPGPU, strictly for offloading compute/render jobs while the "iGPU" in the SoC handles display output...?
 

sam_dean

Suspended
Sep 9, 2022
1,262
1,091
What are the odds that Apple will move away from integrated graphics for their Mac Pro and iMac Pro?

1 big GPU card is better than 4 M2 Max chips fused together in the end.

Combining multiple AMD or NVIDIA GPUs using SLI or Crossfire also wasn't very good. It is better to just have one big powerful one.
Apple's execution so far is better than SLI or Crossfire.

Apple's iGPU has discrete-GPU performance, but with the added bonus of near-zero latency, as the GPU cores are right beside the CPU and both share the same memory.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,673
I am assuming, with the way the whole Unified Memory Architecture thing works, the best we can hope for in relation to add-in graphics processing cards would be an ASi GPGPU, strictly for offloading compute/render jobs while the "iGPU" in the SoC handles display output...?

I think the reasonable possibilities for the Mac Pro GPU are one of the following:

- a single SoC with a big GPU (e.g. four dies, around 40-50k FP32 ALUs). That’s the easiest option for Apple, and it won’t really be able to challenge any of the high-end multi-GPU systems

- a single SoC with a VERY BIG GPU (multiple GPU-only tiles, a lot of cores). Very expensive, very custom, very big, but it still uses the same programming model as any other model

- multiple SoCs on separate compute boards, connected via some sort of PCIe-facilitated bus (maybe cache-coherent CXL), maybe with a shared pool of traditional RAM. This is something I’ve been thinking about for a while, as this approach would solve the issues with modularity and expandability. But it will require a new programming model that can efficiently use non-local compute clusters.
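To put the last point in perspective: Metal today exposes each GPU as its own MTLDevice and leaves the splitting of work to the application. A rough sketch, with the work-assignment helper being purely hypothetical:

import Metal

// macOS only: list every GPU the system exposes. On current Apple silicon Macs
// this is a single device; a multi-SoC machine would presumably show several.
let devices = MTLCopyAllDevices()
for device in devices {
    print("\(device.name): unified memory = \(device.hasUnifiedMemory)")
}

// Hypothetical helper: without a shared address space, each chunk of work has to
// be assigned to a specific device (and its data moved there) by the application.
func assignWork(chunks: Int, to devices: [MTLDevice]) -> [Int: MTLDevice] {
    guard !devices.isEmpty else { return [:] }
    var assignment: [Int: MTLDevice] = [:]
    for chunk in 0..<chunks {
        assignment[chunk] = devices[chunk % devices.count]
    }
    return assignment
}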
 

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
I think the reasonable possibilities for the Mac Pro GPU are one of the following:

- a single SoC with a big GPU (e.g. four dies, around 40-50k FP32 ALUs). That’s the easiest option for Apple, and it won’t really be able to challenge any of the high-end multi-GPU systems

Going from the M1 family of SoCs, this would be the expected outcome...?

- a single SoC with a VERY BIG GPU (multiple GPU-only tiles, a lot of cores). Very expensive, very custom, very big, but it still uses the same programming model as any other model

This could be an ASi Mac Pro only variant:
  • M2 Ultra = one CPU/GPU SoC paired with one GPU-specific SoC
  • M2 Extreme = two CPU/GPU SoCs paired with two GPU-specific SoCs
- multiple SoCs on separate compute boards, connected via some sort of PCIe-facilitated bus (maybe cache-coherent CXL), maybe with a shared pool of traditional RAM. This is something I’ve been thinking about for a while, as this approach would solve the issues with modularity and expandability. But it will require a new programming model that can efficiently use non-local compute clusters.

Least possible, due to the whole UMA thing; add-in GPGPU/Compute cards more likely...?

Is it Spring 2023 yet...? ;^p
 