
leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
Isn't that specific to MSAA? Is FXAA or SMAA less costly on Apple GPUs?
MSAA isn't as frequent as it used to be.

Yes, it's specific to MSAA. But since MSAA on Apple GPUs is programmable, I wouldn't be surprised if you could use it to implement more advanced modern temporal AA techniques. Can't really comment too much here since it's been years since I last looked at how modern AA shaders work :)

It also makes me wonder why SSAO and shadows aren't free (or very low cost) on TBDR as well.

Apple gives you low-level control over how the GPU does rendering, so you can do a lot of interesting things to make rendering more efficient and/or faster. One problem, though, is that many of these techniques are limited to processing a single tile at a time. If you need to examine neighborhoods of pixels, as in SSAO techniques, that might introduce artifacts at tile boundaries. Not sure.
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
I feel like this has been asked, but does Apple Metal have a DLSS equivalent? The lack of one on the AMD side has DF scratching their heads.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
I feel like this has been asked, but does Apple Metal have a DLSS equivalent? The lack of one on the AMD side has DF scratching their heads.

I don’t think that Apple offers a ready-made solution like DLSS. I’ve read that AMD and co. are planning to release an open-source model for this, so maybe it can later be ported to Apple platforms...
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
I don’t think that Apple offers a ready-made solution like DLSS. I’ve read that AMD and co. are planning to release an open-source model for this, so maybe it can later be ported to Apple platforms...
Not sure why Apple isn't leading in this since they started the whole resolution scaling thing. Why render a game (or anything, really) at native 2160p when you can do your fancy resolution scaling thing and pump the 1080p image up to 2160p with no loss in quality.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
Not sure why Apple isn't leading in this since they started the whole resolution scaling thing. Why render a game (or anything, really) at native 2160p when you can do your fancy resolution scaling thing and pump the 1080p image up to 2160p with no loss in quality.

Probably because you’d need a ton of training data from actual games, and it’s not like Apple is a gaming hardware company. It makes a lot of sense for Nvidia, not so much for someone like Apple. Besides, it’s not like ML upscaling is free; you need non-trivial ML compute performance. Is the M1 fast enough to do real-time upscaling from HD to its native resolution? I’m not so sure...
 

gnomeisland

macrumors 65816
Jul 30, 2008
1,097
833
New York, NY
Probably because you’d need a ton of training data from actual games, and it’s not like Apple is a gaming hardware company. It makes a lot of sense for Nvidia, not so much for someone like Apple. Besides, it’s not like ML upscaling is free; you need non-trivial ML compute performance. Is the M1 fast enough to do real-time upscaling from HD to its native resolution? I’m not so sure...
But a significant part of the SoC (perhaps more than the GPU) is devoted to the NPU and ML accelerators. I'm not saying it *can*, but it seems like if anything could, it would be the M1.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
But a significant part of the SoC (perhaps more than the GPU) is devoted to the NPU and ML accelerators. I'm not saying it *can*, but it seems like if anything could, it would be the M1.

I don't disagree with you. It's just that Apple quotes the M1 Neural Engine performance at 11 TOPS, where even the Nvidia 2060 (the "lowest" GPU that supports DLSS) has over 50 TFLOPS of throughput on its tensor cores. It is entirely possible that the Neural Engine is fast enough to do this kind of upscaling with acceptable performance, but this needs to be tested.
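For a rough sense of scale, here's a back-of-envelope ops budget per output pixel. Both throughput figures are public marketing numbers, and INT8 ops vs FP16 FLOPs aren't directly comparable, so treat this as order-of-magnitude only:

```python
# Back-of-envelope: how many NPU/tensor ops each output pixel can "spend"
# during real-time ML upscaling to 4K at 60 fps. Throughput figures are
# rough public numbers (Apple's 11 TOPS for the M1 Neural Engine,
# ~52 TFLOPS FP16 for the RTX 2060's tensor cores), not measured values.

def ops_per_pixel(throughput_ops_per_s, width, height, fps):
    pixels_per_second = width * height * fps
    return throughput_ops_per_s / pixels_per_second

m1_budget = ops_per_pixel(11e12, 3840, 2160, 60)   # M1 Neural Engine, 4K60
rtx_budget = ops_per_pixel(52e12, 3840, 2160, 60)  # 2060 tensor cores, 4K60

print(f"M1 Neural Engine: ~{m1_budget:,.0f} ops per 4K60 output pixel")
print(f"RTX 2060 tensor:  ~{rtx_budget:,.0f} ops per 4K60 output pixel")
```

Roughly a 5x gap per pixel, which is why "fast enough, maybe, needs testing" seems like the right conclusion.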
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
If you mean signed distance fields, not that I know of.
Yeah. I found two PS4 games that use it: Dreams and Claybook. Seems like an interesting technique that would need all the pipeline tools to catch up for support (since everyone else seems to work in polygons).
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
Yeah. I found two PS4 games that use it: Dreams and Claybook. Seems like an interesting technique that would need all the pipeline tools to catch up for support (since everyone else seems to work in polygons).

With tile shaders etc., I think Metal on Apple GPUs is well suited for these techniques. I'm using SDFs to draw 2D caves for my hobby game project, but it's a bit different since I'm still using polygons; I'm just computing distances from the nearest cave edge to draw visually complex boundaries without needing extra geometry. So it's more of a hybrid technique.
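For illustration, the distance-from-edge idea can be sketched in plain Python (a stand-in for what would really be fragment-shader code; the edge list and border width here are made up for the example):

```python
# Hybrid SDF sketch: render ordinary polygons, but shade each fragment by
# its distance to the nearest cave edge, so the boundary can look detailed
# without extra geometry. Illustrative only; a real version runs per-fragment
# on the GPU.
import math

def dist_to_segment(px, py, ax, ay, bx, by):
    """Unsigned distance from point P to segment AB (standard projection formula)."""
    abx, aby = bx - ax, by - ay
    t = ((px - ax) * abx + (py - ay) * aby) / (abx * abx + aby * aby)
    t = max(0.0, min(1.0, t))          # clamp the projection onto the segment
    cx, cy = ax + t * abx, ay + t * aby
    return math.hypot(px - cx, py - cy)

def cave_wall_intensity(px, py, edges, border=0.25):
    """0 at a cave edge, ramping up to 1 once we're 'border' units inside."""
    d = min(dist_to_segment(px, py, *e) for e in edges)
    return min(d / border, 1.0)

edges = [(0, 0, 10, 0), (10, 0, 10, 10)]   # two example wall segments
print(cave_wall_intensity(5, 0.1, edges))   # very close to a wall -> near 0
print(cave_wall_intensity(5, 5, edges))     # interior -> 1.0
```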
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
With tile shaders etc., I think Metal on Apple GPUs is well suited for these techniques. I'm using SDFs to draw 2D caves for my hobby game project, but it's a bit different since I'm still using polygons; I'm just computing distances from the nearest cave edge to draw visually complex boundaries without needing extra geometry. So it's more of a hybrid technique.
Yeah that was talked about on Beyond3d. You should check out Dreams if you have a PS4 (or PS5).
 

theorist9

macrumors 68040
May 28, 2015
3,882
3,061
They are hiding a lot of hardware details: clocks, RAM type, TDP... RAM will most likely be LPDDR4/5 with some sort of wide multi-channel configuration. Probably around 80GBps bandwidth, something that will be plenty for a chip of this spec.
...I'm pretty sure this is dual channel LPDDR4X. Compare the image you saw today to this image of the A12X..
Which uses LPDDR4X. I think, frankly, LPDDR5 supplies were more limited than they expected.
Grain of salt...?!?

Memory type: LPDDR4X-4266 / LPDDR5-5500
Max. memory: 16 GB
Memory channels: 2
ECC: No
Where did they get info like the frequency from? Do they have a preview model?

The following is from ifixit's Nov. 19 teardown of the 8 GB M1 Air and M1 MBP (https://www.ifixit.com/News/46884/m1-macbook-teardowns-something-old-something-new). Not sure if LPDDR4X-4266 is also used in the 16 GB machines:

[attached teardown photos]
 

awesomedeluxe

macrumors 6502
Jun 29, 2009
262
105
The following is from ifixit's Nov. 19 teardown of the 8 GB M1 Air and M1 MBP (https://www.ifixit.com/News/46884/m1-macbook-teardowns-something-old-something-new). Not sure if LPDDR4X-4266 is also used in the 16 GB machines:

Yeah, anandtech calls the 16GB of RAM in the Mac Mini LPDDR4X-4266-class, so I'm pretty sure it's LPDDR4X everywhere.

In light of the recent Bloomberg report about "32 core" graphics in laptops, I'm again wondering what Apple will do with their GPU memory.

I guess it's plausible that a 16-core CPU and 32-core GPU could be married into a giant APU. The article does imply that Apple is prepared for abysmal yield on these things, and yield could definitely be a blood bath at that size. But if it's a big APU, 4x LPDDR5-6400 modules would probably cut it. That would be a threefold increase in bandwidth relative to a fourfold increase in core count. Bandwidth starved, sure, but still performant.

But it really sounds like Apple is making separate GPUs. Chips that, at the very least, are separate enough to need their own memory controller. A 16 firestorm core part, minus the GPU cores, plus some I/O is probably under 200mm2. 32 GPU cores plus a memory controller is around 150mm2, and it just makes more sense for Apple to target parts this size. But it's weird to imagine a discrete GPU plugging back into the same memory pool the CPU is using, not to mention challenging. I can't think of a good way to go about it.
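The bandwidth arithmetic behind that "threefold increase" can be checked quickly (the per-package bus widths are assumptions based on typical LPDDR configurations):

```python
# M1's two 64-bit LPDDR4X-4266 packages (128-bit bus) vs. the hypothetical
# 4x LPDDR5-6400 (assumed 64-bit each, so a 256-bit bus).

def bandwidth_gbps(mega_transfers_per_s, bus_width_bits):
    return mega_transfers_per_s * 1e6 * bus_width_bits / 8 / 1e9

m1 = bandwidth_gbps(4266, 128)     # ~68.3 GB/s
big = bandwidth_gbps(6400, 256)    # ~204.8 GB/s

print(f"M1 (2x 64-bit LPDDR4X-4266):   {m1:.1f} GB/s")
print(f"4x LPDDR5-6400 (64-bit each):  {big:.1f} GB/s ({big/m1:.2f}x)")
```

So the "threefold" figure checks out almost exactly, against roughly 4x the GPU cores.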
 

Pressure

macrumors 603
May 30, 2006
5,182
1,545
Denmark
Yeah, anandtech calls the 16GB of RAM in the Mac Mini LPDDR4X-4266-class, so I'm pretty sure it's LPDDR4X everywhere.

In light of the recent Bloomberg report about "32 core" graphics in laptops, I'm again wondering what Apple will do with their GPU memory.

I guess it's plausible that a 16-core CPU and 32-core GPU could be married into a giant APU. The article does imply that Apple is prepared for abysmal yield on these things, and yield could definitely be a blood bath at that size. But if it's a big APU, 4x LPDDR5-6400 modules would probably cut it. That would be a threefold increase in bandwidth relative to a fourfold increase in core count. Bandwidth starved, sure, but still performant.

But it really sounds like Apple is making separate GPUs. Chips that, at the very least, are separate enough to need their own memory controller. A 16 firestorm core part, minus the GPU cores, plus some I/O is probably under 200mm2. 32 GPU cores plus a memory controller is around 150mm2, and it just makes more sense for Apple to target parts this size. But it's weird to imagine a discrete GPU plugging back into the same memory pool the CPU is using, not to mention challenging. I can't think of a good way to go about it.
Here is a die shot of the current M1 chip.

Scaling it straight up to 12 Firestorm cores and a 32-core GPU would take up less than 260mm2. That’s not a giant chip by any means.
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
Here is a die shot of the current M1 chip.

Scaling it straight up to 12 Firestorm cores and a 32-core GPU would take up less than 260mm2. That’s not a giant chip by any means.
Do we know how they disable the 8th GPU core?
 

awesomedeluxe

macrumors 6502
Jun 29, 2009
262
105
Here is a die shot of the current M1 chip.

Scaling it straight up to 12 Firestorm cores and a 32-core GPU would take up less than 260mm2. That’s not a giant chip by any means.
Is 12 a typo? The article says 16 with some cores potentially disabled.

I got 263mm2, eyeballing the Firestorm block at about 18mm2 and GPU cores at about 30mm2. We still have to dedicate some space for extra I/O. I'd round up to 300mm2 to account for that and potential increases to SLC and other areas, but we're in the same ballpark.

I agree with you that that's not a giant chip. But it's... really big. It pretty much puts the final design in the hands of TSMC's N5 process. I guess that's consistent with the article, which is suggesting they could bin these pretty aggressively. They'd probably have to before putting it into the MBP14 anyway, so it's not like these binned chips wouldn't have a home.

A 16+32 core APU still gives me pause. Like, even if we assume the CPU scaled down to iPhone speeds, it's still consuming 40W under load... and the GPU also consumes 40W under load... all within about a square inch of silicon. Dell struggled to cool Kaby G, which was 65W over a much larger area, and I think that's the biggest accomplishment to date.
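A quick heat-density comparison makes the point; the die areas are rough assumptions (~300mm2 for a 16+32 part as estimated above, ~360mm2 of combined CPU + GPU + HBM silicon for Kaby G), not confirmed figures:

```python
# Comparing power density (W per cm^2 of silicon) for a hypothetical
# 80W 16+32-core SoC vs. the 65W Kaby G package. Areas are assumptions.

def watts_per_cm2(watts, area_mm2):
    return watts / (area_mm2 / 100.0)   # 100 mm^2 = 1 cm^2

apple_soc = watts_per_cm2(80, 300)   # assumed 40W CPU + 40W GPU
kaby_g = watts_per_cm2(65, 360)      # assumed combined die area

print(f"Hypothetical 16+32 SoC: {apple_soc:.1f} W/cm^2")
print(f"Kaby G (assumed):       {kaby_g:.1f} W/cm^2")
```

Under these assumptions the Apple part would be noticeably more heat-dense, which is the worry.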
 

diamond.g

macrumors G4
Mar 20, 2007
11,437
2,665
OBX
Is 12 a typo? The article says 16 with some cores potentially disabled.

I got 263mm2, eyeballing the Firestorm block at about 18mm2 and GPU cores at about 30mm2. We still have to dedicate some space for extra I/O. I'd round up to 300mm2 to account for that and potential increases to SLC and other areas, but we're in the same ballpark.

I agree with you that that's not a giant chip. But it's... really big. It pretty much puts the final design in the hands of TSMC's N5 process. I guess that's consistent with the article, which is suggesting they could bin these pretty aggressively. They'd probably have to before putting it into the MBP14 anyway, so it's not like these binned chips wouldn't have a home.

A 16+32 core APU still gives me pause. Like, even if we assume the CPU scaled down to iPhone speeds, it's still consuming 40W under load... and the GPU also consumes 40W under load... all within about a square inch of silicon. Dell struggled to cool Kaby G, which was 65W over a much larger area, and I think that's the biggest accomplishment to date.
Would it be easier to cool if they used more of a chiplet design and spread the chip out more?
 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
Is 12 a typo? The article says 16 with some cores potentially disabled.

I got 263mm2, eyeballing the Firestorm block at about 18mm2 and GPU cores at about 30mm2. We still have to dedicate some space for extra I/O. I'd round up to 300mm2 to account for that and potential increases to SLC and other areas, but we're in the same ballpark.

I agree with you that that's not a giant chip. But it's... really big. It pretty much puts the final design in the hands of TSMC's N5 process. I guess that's consistent with the article, which is suggesting they could bin these pretty aggressively. They'd probably have to before putting it into the MBP14 anyway, so it's not like these binned chips wouldn't have a home.

A 16+32 core APU still gives me pause. Like, even if we assume the CPU scaled down to iPhone speeds, it's still consuming 40W under load... and the GPU also consumes 40W under load... all within about a square inch of silicon. Dell struggled to cool Kaby G, which was 65W over a much larger area, and I think that's the biggest accomplishment to date.
Apple is already cooling WAY beyond that in the iMac, in total thermal load and even in power density, both on the Intel CPU and the 5700 XT GPU. In the same enclosure, Apple should be able to cool the chip outlined above very quietly, since the amount of heat needing to be dissipated determines the required airflow.
It’s a big chip compared to their phone SoCs, but Sony's and Microsoft's new console chips are 306 and 360mm2 respectively on the closely related 7nm node (and they are cheap). TSMC already reports comparable defect rates on their 5nm node, and will have further dialed in the process after manufacturing a hundred million or so A14 SoCs by now.
And, as opposed to their phone SoCs, Apple will have opportunities to increase yield further both by disabling defective/underperforming functional blocks and by binning, standard procedure in the industry.
I really wouldn’t worry.
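The yield argument can be sketched with the standard Poisson model, yield = exp(-D0 × area). The defect density used here is an assumption in the range publicly reported for mature TSMC nodes, not a confirmed figure:

```python
# Poisson yield model: fraction of dies with zero defects as a function of
# die area, at an assumed defect density of 0.09 defects/cm^2. Dies with a
# defect can often still be sold as binned parts, so this is a lower bound
# on usable chips.
import math

def poisson_yield(defects_per_cm2, die_area_mm2):
    return math.exp(-defects_per_cm2 * die_area_mm2 / 100.0)

for name, area in [("M1-sized (~120 mm^2)", 120),
                   ("PS5 SoC (306 mm^2)", 306),
                   ("Series X SoC (360 mm^2)", 360)]:
    print(f"{name}: ~{poisson_yield(0.09, area):.0%} of dies defect-free")
```

Even at console-SoC sizes, the majority of dies come out clean under this assumption, which is why the console chips can be cheap.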
 

awesomedeluxe

macrumors 6502
Jun 29, 2009
262
105
Would it be easier to cool if they used more of a chiplet design and spread the chip out more?
Yeah. Like, there's no problem with 80W in the abstract; you can buy MacBooks right now that use 90W. I've never seen that much power in that small a space, though.

Apple is already cooling WAY beyond that in the iMac, in total thermal load and even in power density, both on the Intel CPU and the 5700 XT GPU. In the same enclosure, Apple should be able to cool the chip outlined above very quietly, since the amount of heat needing to be dissipated determines the required airflow.
It’s a big chip compared to their phone SoCs, but Sony's and Microsoft's new console chips are 306 and 360mm2 respectively on the closely related 7nm node (and they are cheap). TSMC already reports comparable defect rates on their 5nm node, and will have further dialed in the process after manufacturing a hundred million or so A14 SoCs by now.
And, as opposed to their phone SoCs, Apple will have opportunities to increase yield further by disabling defective/underperforming functional blocks, standard procedure in the industry.
I really wouldn’t worry.
Oh, for sure! It's no problem in the iMac. But I think these chips are supposedly going into the MBP14 and MBP16. I'm just taking for granted right now that the parts going into the MBP14 are binned and have a lot of cores disabled. But the article implies that they're testing a full 16+32 loadout for the MBP16.

That's not impossible but it's certainly unprecedented. Take a gander at this article about cooling Kaby G in the XPS. The XPS is the same thickness as the current MBP16, and Kaby G is a 65W part which comes with its own high bandwidth memory solution. A 16+32 core part would probably have a similar TDP, but still be in need of high bandwidth memory (creates more heat) and is trying to accomplish that in a smaller area (more heat-dense).
 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
Yeah. Like there's no problem with 80W in the abstract; you can buy Macbooks right now that use 90W. I've never seen that much power in that small a space, though.


Oh, for sure! It's no problem in the iMac. But I think these chips are supposedly going into the MBP14 and MBP16. I'm just taking for granted right now that the parts going into the MBP14 are binned and have a lot of cores disabled. But the article implies that they're testing a full 16+32 loadout for the MBP16.

That's not impossible but it's certainly unprecedented. Take a gander at this article about cooling Kaby G in the XPS. The XPS is the same thickness as the current MBP16, and Kaby G is a 65W part which comes with its own high bandwidth memory solution. A 16+32 core part would probably have a similar TDP, but still be in need of high bandwidth memory (creates more heat) and is trying to accomplish that in a smaller area (more heat-dense).
Fully agree when it comes to the portables. It wouldn’t really be worse than what they are already handling in the enclosure; on the other hand, I think putting that kind of thermal load inside a MBP is... suboptimal.
 

theorist9

macrumors 68040
May 28, 2015
3,882
3,061
Do we know how they disable the 8th GPU core?
According to several reports, they don't. They knew a certain percentage of the chips were coming out with a bad GPU core. Thus, rather than throwing those out, they created a SKU for the Air with 7 GPU cores, and use those chips in that. The process is called "binning".

To the extent they offer different CPU/GPU core counts within each model in the upcoming generation of AS Macs, some (but not all) of that will likely result from binning as well.
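A toy model of why this works out (the per-core defect probability is purely illustrative):

```python
# If GPU-core defects are independent, the binomial distribution tells you
# what fraction of 8-core dies the 8-core and 7-core (Air) SKUs can absorb
# between them. The 2% per-core defect probability is an assumption for
# illustration, not a real figure.
from math import comb

def binomial_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_bad = 0.02
all_good = binomial_pmf(8, 0, p_bad)   # sellable as the full 8-core GPU
one_bad = binomial_pmf(8, 1, p_bad)    # sellable as the 7-core Air SKU

print(f"8 good cores:   {all_good:.1%}")
print(f"exactly 1 bad:  {one_bad:.1%}")
print(f"usable overall: {all_good + one_bad:.1%}")
```

With the extra SKU, almost every die with at most one bad core finds a home, which is the whole point of binning.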
 

theorist9

macrumors 68040
May 28, 2015
3,882
3,061
Yeah, anandtech calls the 16GB of RAM in the Mac Mini LPDDR4X-4266-class, so I'm pretty sure it's LPDDR4X everywhere.

In light of the recent Bloomberg report about "32 core" graphics in laptops, I'm again wondering what Apple will do with their GPU memory.

I guess it's plausible that a 16-core CPU and 32-core GPU could be married into a giant APU. The article does imply that Apple is prepared for abysmal yield on these things, and yield could definitely be a blood bath at that size. But if it's a big APU, 4x LPDDR5-6400 modules would probably cut it. That would be a threefold increase in bandwidth relative to a fourfold increase in core count. Bandwidth starved, sure, but still performant.

But it really sounds like Apple is making separate GPUs. Chips that, at the very least, are separate enough to need their own memory controller. A 16 firestorm core part, minus the GPU cores, plus some I/O is probably under 200mm2. 32 GPU cores plus a memory controller is around 150mm2, and it just makes more sense for Apple to target parts this size. But it's weird to imagine a discrete GPU plugging back into the same memory pool the CPU is using, not to mention challenging. I can't think of a good way to go about it.
This post, by cmaier, speaks to some of the issues you've raised:

 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
But if it's a big APU, 4x LPDDR5-6400 modules would probably cut it. That would be a threefold increase in bandwidth relative to a fourfold increase in core count. Bandwidth starved, sure, but still performant.

I don't think that is going to be enough bandwidth. For that kind of chip, you really want 200+ GB/s... so something "HBM-like". Frankly, I'm starting to think that Apple will do a DIY HBM with stacked LPDDR chips and a very wide memory bus, with 8 memory controllers or more. They already stack RAM on top of the iPhone chip, so I don't see why it wouldn't be possible.
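A quick sanity check on the "8 memory controllers or more" idea (channel width and data rate are assumptions based on typical LPDDR5 configurations):

```python
# How many 16-bit LPDDR5 channels would a 200+ GB/s target need?
import math

def channels_needed(target_gbps, mega_transfers_per_s, channel_bits=16):
    per_channel = mega_transfers_per_s * 1e6 * channel_bits / 8 / 1e9  # GB/s
    return math.ceil(target_gbps / per_channel), per_channel

n, per = channels_needed(200, 6400)
print(f"each 16-bit LPDDR5-6400 channel: {per:.1f} GB/s")
print(f"channels needed for 200 GB/s:    {n}")
```

Sixteen 16-bit channels is eight dual-channel controllers, which lines up with the "8 memory controllers or more" guess.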



That's not impossible but it's certainly unprecedented.

I don't think it's unprecedented. There are laptops shipping with large and hot GPUs. A mobile RTX 2080 is 545mm2 with a TDP of 80W in the Max-Q configuration. An Apple SoC with a 12+4 CPU and a 32-core GPU will probably have a combined TDP of around 60-70watts. Shouldn't be that much of a challenge to cool in the current 16" chassis.

But it really sounds like Apple is making separate GPUs. Chips that, at the very least, are separate enough to need their own memory controller. A 16 firestorm core part, minus the GPU cores, plus some I/O is probably under 200mm2. 32 GPU cores plus a memory controller is around 150mm2, and it just makes more sense for Apple to target parts this size. But it's weird to imagine a discrete GPU plugging back into the same memory pool the CPU is using, not to mention challenging. I can't think of a good way to go about it.

For high-end configs (like the Mac Pro), I don't really see them doing monolithic chips — yields will probably be abysmal. But a multi-chip package, with CPU+GPU dies connected to a shared I/O+cache die (possibly stacked with RAM) — that should be doable. AMD does it with Zen3 and it seems to work just fine.

But then again, it's Apple we are talking about. I can totally see them using a 1000mm2 monolithic die that costs $1000 to make just to prove a point. Still cheaper than the Xeons and the GPUs they have to buy from a third party :)
 

Pressure

macrumors 603
May 30, 2006
5,182
1,545
Denmark
Is 12 a typo? The article says 16 with some cores potentially disabled.

I got 263mm2, eyeballing the Firestorm block at about 18mm2 and GPU cores at about 30mm2. We still have to dedicate some space for extra I/O. I'd round up to 300mm2 to account for that and potential increases to SLC and other areas, but we're in the same ballpark.

I agree with you that that's not a giant chip. But it's... really big. It pretty much puts the final design in the hands of TSMC's N5 process. I guess that's consistent with the article, which is suggesting they could bin these pretty aggressively. They'd probably have to before putting it into the MBP14 anyway, so it's not like these binned chips wouldn't have a home.

A 16+32 core APU still gives me pause. Like, even if we assume the CPU scaled down to iPhone speeds, it's still consuming 40W under load... and the GPU also consumes 40W under load... all within about a square inch of silicon. Dell struggled to cool Kaby G, which was 65W over a much larger area, and I think that's the biggest accomplishment to date.
No, that’s 12 High Performance cores and 4 High Efficiency cores.

But let us be honest with ourselves.

I think the 16” MacBook Pro will get at maximum 8 High Performance cores, 4 High Efficiency cores and up to 16-core GPU (12- to 16-cores depending on configuration).
 