The other aspects of a potential 'M1X'

quarkysg · Feb 22, 2021

bla1011 said:
Apple silicon magic strikes again.

I think you need to cool down and understand the context of the discussion.

quarkysg · Feb 22, 2021

crazy dave said:
On the subject at hand, I think LPDDR5 is unlikely for the M1X.

I think you'll be right in this case. I think the M1/X probably only has an LPDDR4X memory controller built in. We'll probably have to wait for the next-gen M series SoCs for a DDR5/LPDDR5 memory controller.

Most likely the M1X, if it comes out, will use LPDDR4X with more channels. It really depends on whether Apple thinks that will be enough to beat the performance of the models being replaced by the M1X, so that they can claim 2-3x performance increases, especially in the GPU department.

leman · Feb 22, 2021

quarkysg said:
I think you'll be right in this case. I think the M1/X probably only has an LPDDR4X memory controller built in. We'll probably have to wait for the next-gen M series SoCs for a DDR5/LPDDR5 memory controller.

Most likely the M1X, if it comes out, will use LPDDR4X with more channels. It really depends on whether Apple thinks that will be enough to beat the performance of the models being replaced by the M1X, so that they can claim 2-3x performance increases, especially in the GPU department.

I think that the main reason why we don't see LPDDR5 in Apple products yet is because the supply is still very limited. The controller itself probably supports it. I believe there is some hope that we will see LPDDR5 in higher-end, lower volume Macs.

JMacHack · Feb 23, 2021

ArPe said:
HUrRY uP IwAnT 16 inCh M2 nOW!!

This but unironically, it’s like being blueballed.
Also, the posts in this thread seem to point to Apple’s GPU being on par with the traditional players, except in memory bandwidth, where DDR and LPDDR won’t provide as much.

So what’s more likely: that Apple preserves the UMA and uses the same RAM for the CPU and GPU, or goes back to a traditional RAM arrangement for their higher-end machines?

Joe The Dragon · Feb 23, 2021

quarkysg said:
I would think with Apple's experience with high performance system (i.e. Mac Pros, xServes, etc) their internal fabric would be designed to handle really high bandwidth. Like you said, Apple's pocket is deep enough to go really wild as far as SoC design is concerned.

Wow! I suspect, though, that what you saw were probably calculations performed wholly within the SoC's cache? FP32 values are 32 bits long. Completing 2.6 TFLOPS with each data item 32 bits long means we need about 10 TB/s of bandwidth in steady state, notwithstanding the other processing cores' need for memory bandwidth. I'm sure my calculation is oversimplifying the scenario, but I somehow think that simply doubling the bandwidth will double the M1's 8 GPU cores' performance in real-world use.

I think you'll be right in that most likely they'll go with multi-channel DDR5. I still can't reconcile how Apple will implement it for the iMacs and Mac Pros tho. Using soldered memory in notebooks may be fine, but I don't think Apple would want to manufacture a bunch of iMacs and Mac Pros with soldered memory and find themselves stuck with unmovable inventories. It'll be interesting to see how Apple is going to address this.

Also, I'm not sure if there are designs where the internal fabric is designed to connect to two types of memory controllers (e.g. HBM2 and DDR5/LPDDR4X). If possible, the fixed HBM2 memory will be used for the GPUs, while the slower DDR5/LPDDR4X could be provided via DIMM slots for the Mac Pros and iMacs, maybe even the 16" MBP. The drivers will have the smarts to delineate the memory regions for the various processing cores' use. If anyone can do it, it'll probably be Apple.

I think and hope that the issue is driver related rather than in the actual silicon. Intel's CPUs and their north/south bridge chipsets are mature, with equally mature drivers, while the M1 has yet to be battle tested, so to speak. So I'm hopeful existing issues will be resolved with future Big Sur updates.

RAM and CPU on a card?

But in a pro system, storage needs to be on cards, like the iMac Pro and Mac Pro have now.

leman · Feb 23, 2021

JMacHack said:
So what’s more likely: that Apple preserves the UMA and uses the same RAM for the CPU and GPU, or goes back to a traditional RAM arrangement for their higher-end machines?

Why not preserve UMA and use a memory configuration that offers higher bandwidth? Seems like the best of both worlds to me.

fisha · Feb 23, 2021

JMacHack said:
So what’s more likely: that Apple preserves the UMA and uses the same RAM for the CPU and GPU, or goes back to a traditional RAM arrangement for their higher-end machines?

Would it be possible to have 16/32GB of on-chip UMA memory and have it supplemented with a DIMM-style memory pool? Effectively making a super cache on the chip.

crazy dave · Feb 23, 2021

leman said:
I think that the main reason why we don't see LPDDR5 in Apple products yet is because the supply is still very limited. The controller itself probably supports it. I believe there is some hope that we will see LPDDR5 in higher-end, lower volume Macs.

Maybe, but even with that logic, I just don't see the M1X Macs as having it. The rumored models getting the M1X are the Mac Mini, the 14/16 MacBook pro, and the low end iMac and these are still among the higher selling models. By the time Apple gets up to the really low-volume Macs, the SOC will probably be the M2 generation and hopefully LPDDR5 will be standard across AS. But who knows? I'll be happy to be wrong about it coming earlier.

flopticalcube · Feb 23, 2021

crazy dave said:
Maybe, but even with that logic, I just don't see the M1X Macs as having it. The rumored models getting the M1X are the Mac Mini, the 14/16 MacBook pro, and the low end iMac and these are still among the higher selling models. By the time Apple gets up to the really low-volume Macs, the SOC will probably be the M2 generation and hopefully LPDDR5 will be standard across AS. But who knows? I'll be happy to be wrong about it coming earlier.

I don't think the mini will get an M1X. Traditionally, the mini has been a parts bin Mac and will live off the excess M1's and LPDDR4 until the supply runs out or demand begins to wane. Then, when the Air is ready for the next low-power Apple Silicon chip, the mini will also get it.

crazy dave · Feb 23, 2021

flopticalcube said:
I don't think the mini will get an M1X. Traditionally, the mini has been a parts bin Mac and will live off the excess M1's and LPDDR4 until the supply runs out or demand begins to wane. Then, when the Air is ready for the next low-power Apple Silicon chip, the mini will also get it.

The reason I think so is that Apple left higher range Intel minis in their lineup - they only replaced the lowest end mini and continued selling i5/i7 minis with more (and, in the case of ethernet, better) ports. In contrast, Apple replaced the totality of MacBook Airs with M1s.

Beyond higher CPU/GPU core counts, the M1X is supposed to come with improved IO. It thus stands to reason that an M1X mini could ably replace the i5/i7 minis currently in the lineup, in both power and IO. After all, as powerful as it is rumored to be, the M1X is reportedly still only a 35W chip. So the mini would be M1 and M1X instead of i3/i5/i7.

I grant you the other possibility is that Apple won't replace those Intel mac minis until the M2. But I don't see why they wouldn't.

flopticalcube · Feb 23, 2021

Yes, they may have an M1/M1X configuration option.

Boil · Feb 24, 2021

Gotta have a M1X to go along with a Space Grey chassis...!?!

leman · Feb 24, 2021

fisha said:
Would it be possible to have 16/32GB of on-chip UMA memory and have it supplemented with a DIMM-style memory pool? Effectively making a super cache on the chip.

You can, but would it really be useful? You are basically talking about introducing another level of cache. If your active RAM requirements are low enough that most of the active data can fit into the "fast RAM", well, do you even need the larger "slow RAM" pool? You could probably just use SSD swap. And if you need more active RAM, the performance will be suboptimal. Let's not forget that the problem of "fast RAM and plenty of it" only really exists if we talk about the Mac Pro (consumer-level Macs don't need that much RAM), which is a multi-core machine designed for parallel workloads, so you really want to have a lot of memory bandwidth per CPU core.

EntropyQ3 · Feb 24, 2021

crazy dave said:
Maybe, but even with that logic, I just don't see the M1X Macs as having it. The rumored models getting the M1X are the Mac Mini, the 14/16 MacBook pro, and the low end iMac and these are still among the higher selling models. By the time Apple gets up to the really low-volume Macs, the SOC will probably be the M2 generation and hopefully LPDDR5 will be standard across AS. But who knows? I'll be happy to be wrong about it coming earlier.

Whether the rumoured M1X would get LPDDR5 would depend on just how cheap Apple would want to go.
The rumoured configuration of 8 performance CPU cores and twice as many GPU cores as the M1, would suggest that an M1X would need twice the bandwidth for performance to scale with computational capabilities vis a vis the M1.
I can see four cheap scenarios -
* 128-bit LPDDR4x, same as M1. This is a possibility, but it would suck.
* 128-bit LPDDR5. This would offer a 50% (in some scenarios better) improvement in bandwidth over the M1. It would be as cheap to implement as the M1 solution, and could offer up to 32GB if I read Samsung correctly. Max performance would be compromised relative to linear scaling.
* 256-bit LPDDR4x, twice the M1. Straightforward on all fronts, and a pretty reasonable configuration. Using the same parts as other devices probably makes procurement easier/cheaper, not that these devices should be very sensitive to such concerns, we are talking small money here.
* 256-bit LPDDR5, three times the M1 nominally, and better under certain conditions. This would allow superlinear scaling with computational resources vs. the M1, and provide a great hike in performance vs. the M1 with minimum expenditure of 5nm SoC area. Would also provide a wider range of memory configurations.

These are all cheap/minimum engineering effort variations of what Apple is already doing, and thus trivial to predict. So if any of this shows up in a rumour, it doesn't lend any credence to that rumour, much as just doubling the performance cores and GPU cores is a trivial extrapolation of the M1 that any internet bot can come up with. Doesn't mean it's wrong, but there is nothing there that suggests anything other than speculation as a source.
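
The four scenarios above come down to one line of arithmetic: peak bandwidth is bus width in bytes times transfer rate. A rough sketch in Python, assuming LPDDR4X-4266 (which matches the M1's roughly 68 GB/s) and LPDDR5-6400; both speed grades are assumptions, since the thread only names the memory types, and the function name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: int) -> float:
    """Theoretical peak bandwidth in decimal GB/s for a given bus width and data rate."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


# Assumed speed grades (not stated in the thread).
LPDDR4X_MTS = 4266
LPDDR5_MTS = 6400

scenarios = {
    "128-bit LPDDR4X": (128, LPDDR4X_MTS),
    "128-bit LPDDR5": (128, LPDDR5_MTS),
    "256-bit LPDDR4X": (256, LPDDR4X_MTS),
    "256-bit LPDDR5": (256, LPDDR5_MTS),
}

# The M1 configuration is the baseline every scenario is compared against.
baseline = peak_bandwidth_gbs(128, LPDDR4X_MTS)

for name, (width, rate) in scenarios.items():
    bandwidth = peak_bandwidth_gbs(width, rate)
    print(f"{name}: {bandwidth:6.1f} GB/s ({bandwidth / baseline:.2f}x the M1)")
```

Under these assumptions the ratios come out at about 1.0x, 1.5x, 2.0x and 3.0x, in line with the "50%", "twice" and "three times" figures in the list. Sustained bandwidth in practice will be lower than these theoretical peaks.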

crazy dave · Feb 24, 2021

EntropyQ3 said:
Whether the rumoured M1X would get LPDDR5 would depend on just how cheap Apple would want to go.
The rumoured configuration of 8 performance CPU cores and twice as many GPU cores as the M1, would suggest that an M1X would need twice the bandwidth for performance to scale with computational capabilities vis a vis the M1.
I can see four cheap scenarios -
* 128-bit LPDDR4x, same as M1. This is a possibility, but it would suck.
* 128-bit LPDDR5. This would offer a 50% (in some scenarios better) improvement in bandwidth over the M1. It would be as cheap to implement as the M1 solution, and could offer up to 32GB if I read Samsung correctly. Max performance would be compromised relative to linear scaling.
* 256-bit LPDDR4x, twice the M1. Straightforward on all fronts, and a pretty reasonable configuration. Using the same parts as other devices probably makes procurement easier/cheaper, not that these devices should be very sensitive to such concerns, we are talking small money here.
* 256-bit LPDDR5, three times the M1 nominally, and better under certain conditions. This would allow superlinear scaling with computational resources vs. the M1, and provide a great hike in performance vs. the M1 with minimum expenditure of 5nm SoC area. Would also provide a wider range of memory configurations.

These are all cheap/minimum engineering effort variations of what Apple is already doing, and thus trivial to predict. So if any of this shows up in a rumour, it doesn't lend any credence to that rumour, much as just doubling the performance cores and GPU cores is a trivial extrapolation of the M1 that any internet bot can come up with. Doesn't mean it's wrong, but there is nothing there that suggests anything other than speculation as a source.

I’m kind of thinking option 3, hoping not option 1.

For what it’s worth, potentially very little, the M1X “leak” on CPU Monkey said it was LPDDR4X memory.

EntropyQ3 · Feb 27, 2021

crazy dave said:
I’m kind of thinking option 3, hoping not option 1.

For what it’s worth, potentially very little, the M1X “leak” on CPU Monkey said it was LPDDR4X memory.

I waited to see if the thread would generate some more traffic. Honestly I think we have exhausted the subject in the absence of new information, but I wanted to state somewhere that I’d really like to see a tile-based deferred renderer with some serious hardware grunt, and the backing and support of a strong company such as Apple. Preferably priced within the reach of mere mortals (such as me) so that it benefits as many people as possible.
I’ll pay for it out of pure technical curiosity, and I’ll pay for games that are actually coded targeting the architecture in preference to buying them on other platforms available to me.
If anyone working at Apple who feels the same reads this, know that there are people on the sidelines cheering you on! 😀

leman · Feb 27, 2021

EntropyQ3 said:
I waited to see if the thread would generate some more traffic. Honestly I think we have exhausted the subject in the absence of new information, but I wanted to state somewhere that I’d really like to see a tile-based deferred renderer with some serious hardware grunt, and the backing and support of a strong company such as Apple. Preferably priced within the reach of mere mortals (such as me) so that it benefits as many people as possible.
I’ll pay for it out of pure technical curiosity, and I’ll pay for games that are actually coded targeting the architecture in preference to buying them on other platforms available to me.
If anyone working at Apple who feels the same reads this, know that there are people on the sidelines cheering you on! 😀

I suppose your prayers will be answered in full very soon then.

And yes, I agree that TBDR is a great thing to be excited about. Frankly, it is incomprehensible to me how a GPU enthusiast wouldn’t be excited about it. Of course, the credit goes to PowerVR, but they didn’t have the resources or the product stack of Apple to really pull it off.

Sophisticatednut · Oct 18, 2021

So... what do you guys think of the M1 Pro and M1 Max now? The M1 Max seems to be close to an RTX 3080 Mobile but with 100W less power, and the 400GB/s transfer rate is way beyond what I'd expect from LPDDR5 RAM.

thenewperson said:
At this point we've all seen the Bloomberg report about future ASi chips (refresher) which gets into a bit of detail about the core count variations of both the CPU and GPU that Apple is testing. But what do we speculate about other aspects of it? Will there be clock speed increases? Ray tracing hardware? Will the neural engine be the same across all? What kind of RAM will they use? How will the chips be packaged?

leman said:
We are still talking about an M1/A14 variant, so I wouldn’t expect any microarchitectural changes. Memory bus will be doubled to 256 bit, that’s almost certain, with twice as many memory controllers as M1 and four RAM chips instead of two. Packaging I’d expect to stay the same, just with two more RAM chips on the other side. No changes to neural engine from M1. Maybe LPDDR5. More thunderbolt controllers. That’s about it.

quarkysg said:
Doubling the memory bus width would also double the memory bandwidth to 136 GB/s, assuming the memory technology used is still LPDDR4X. It'll be interesting to see if doing that alone will double the M1's throughput for all processing cores (i.e. CPUs, GPUs, NPUs, etc.)

Tests conducted by AnandTech on the M1 Macs show that a single Firestorm core is enough to saturate the 68 GB/s bandwidth. So it would seem that the M1 Macs are severely bandwidth constrained, with more potential yet to be unleashed?

With UMA, I would think that the M1's system interconnect fabric would have to implement some sort of fair share algorithm for each of the processing cores to prevent data starvation. So the 68 GB/s bandwidth provided by the 128-bit LPDDR4X memory could not be allocated fully to any single processing core's use.

According to Apple, the M1's GPU could perform 2.6 TFLOPS (presumably FP32). From my limited understanding, 68 GB/s is nowhere near enough to keep the M1's 7/8 GPU cores fed to achieve 2.6 TFLOPS.

For iMacs and Mac Pros, I would think it's unlikely Apple will go with higher bandwidth memory, e.g. HBM2, for main memory, as it'll be too cost prohibitive to implement for consumer products. What I think would be likely is that HBM2 or equivalent (costly) memory tech will be used solely for the GPU, and DDR5/LPDDR5 will be used for main memory, with the GPU sitting on a separate die/board with its own memory, but with custom circuitry to ensure memory coherency with main memory so as to preserve the UMA architecture. The 68000 Macs used to have proprietary bus slots (if memory serves) for such purposes, so Apple may go back to custom designs instead of using PCIe.

I am probably completely off tho.

Thoughts?

leman said:
If you increase the number of processing cores, you have to increase the memory bandwidth. The GPU in M1 is already likely bandwidth-limited, if one wants to have 16 cores one needs to at least double the bandwidth.

That’s interesting, right? Bandwidth to individual cores is usually constrained, but not with Apple design. I wouldn’t say that M1 CPU is bandwidth constrained, more that it’s able to utilize all available bandwidth. As to what maximal bandwidth the internal fabric can support, we can only guess.

I was able to get pretty much exactly 2.6 TFLOPS using long chains of fused multiply-adds. The FP16 performance is identical to FP32 (which is a big difference from the A14, which has half the FP32 throughput). As to bandwidth... no GPU or CPU has enough of it. The assumption is that you do a bunch of calculations between loads and stores, or your ALUs are running empty.

I think we will see “real” unified memory. Ensuring coherency as you describe is really complicated, and I don't think design purists at Apple would be happy with it. Maybe not HBM, but multi-channel stacked DDR5 (8 to 16 channels, should provide plenty of bandwidth). And yeah, it’s costly but still cheaper than buying Xeons. And Apple is the only company that can afford it.
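
The point about doing "a bunch of calculations between loads and stores" can be put in numbers. The following is a back-of-the-envelope sketch that uses only figures quoted in this thread (2.6 TFLOPS, 68 GB/s, 4-byte FP32 values). The function names are hypothetical, and the model deliberately ignores caches, on-chip tile memory and bandwidth shared with the CPU.

```python
PEAK_FLOPS = 2.6e12       # M1 GPU peak, as quoted in the thread
MEM_BANDWIDTH = 68e9      # bytes per second, 128-bit LPDDR4X as quoted
BYTES_PER_FP32 = 4


def naive_streaming_bandwidth(flops: float, bytes_per_value: int) -> float:
    """Bandwidth needed if every floating-point operation fetched one fresh value from DRAM."""
    return flops * bytes_per_value


def required_flops_per_byte(flops: float, bandwidth: float) -> float:
    """Arithmetic intensity a kernel needs so that DRAM bandwidth is not the limit."""
    return flops / bandwidth


naive = naive_streaming_bandwidth(PEAK_FLOPS, BYTES_PER_FP32)
intensity = required_flops_per_byte(PEAK_FLOPS, MEM_BANDWIDTH)
flops_per_value = intensity * BYTES_PER_FP32

print(f"Naive streaming requirement: {naive / 1e12:.1f} TB/s")
print(f"Required intensity: {intensity:.1f} FLOPs per byte")
print(f"That is about {flops_per_value:.0f} FLOPs per FP32 value loaded")
```

The first number is where the roughly 10 TB/s estimate in this thread comes from. The second shows the other way to read it: a kernel reaches peak only if it performs on the order of 150 operations per value fetched from memory, which is what a long chain of fused multiply-adds on register-resident data does.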

quarkysg said:
I would think with Apple's experience with high performance system (i.e. Mac Pros, xServes, etc) their internal fabric would be designed to handle really high bandwidth. Like you said, Apple's pocket is deep enough to go really wild as far as SoC design is concerned.

Wow! I suspect, though, that what you saw were probably calculations performed wholly within the SoC's cache? FP32 values are 32 bits long. Completing 2.6 TFLOPS with each data item 32 bits long means we need about 10 TB/s of bandwidth in steady state, notwithstanding the other processing cores' need for memory bandwidth. I'm sure my calculation is oversimplifying the scenario, but I somehow think that simply doubling the bandwidth will double the M1's 8 GPU cores' performance in real-world use.

I think you'll be right in that most likely they'll go with multi-channel DDR5. I still can't reconcile how Apple will implement it for the iMacs and Mac Pros tho. Using soldered memory in notebooks may be fine, but I don't think Apple would want to manufacture a bunch of iMacs and Mac Pros with soldered memory and find themselves stuck with unmovable inventories. It'll be interesting to see how Apple is going to address this.

Also, I'm not sure if there are designs where the internal fabric is designed to connect to two types of memory controllers (e.g. HBM2 and DDR5/LPDDR4X). If possible, the fixed HBM2 memory will be used for the GPUs, while the slower DDR5/LPDDR4X could be provided via DIMM slots for the Mac Pros and iMacs, maybe even the 16" MBP. The drivers will have the smarts to delineate the memory regions for the various processing cores' use. If anyone can do it, it'll probably be Apple.

I think and hope that the issue is driver related rather than in the actual silicon. Intel's CPUs and their north/south bridge chipsets are mature, with equally mature drivers, while the M1 has yet to be battle tested, so to speak. So I'm hopeful existing issues will be resolved with future Big Sur updates.

dmccloud said:
That could be how Apple replaces the models with dGPUs (16" MBP, iMac), but I don't think you'll see that type of setup in a MBA or sub-16" Pro. There's a reason dGPUs (even the ones used in Macs) are running GDDR instead of DDR, so I think that you'd still have the SoC with its CPU and GPU cores, then a dedicated GPU with its own memory that connects via a (likely proprietary) high-speed bus to the SoC.

iBug2 said:
I don't think there's anything left to say. RTX 3080 Mobile has the following specs:

[Attachment 1870019: RTX 3080 Mobile specifications]

The M1 Max 32-core GPU offers 10.4 TFLOPS of compute, and 327 GTexels/s and 165 GPixels/s fill rates.

Texture and pixel fill rates exceed the RTX 3080 Mobile, but compute performance is half as much. (Wonder why that is.)
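
One sanity check on the 10.4 TFLOPS figure: it is exactly four times the 2.6 TFLOPS quoted earlier in the thread for the M1's 8-core GPU, so per-core throughput is unchanged and the gain comes from core count alone. The sketch below also derives an implied clock, assuming 128 FP32 lanes per GPU core and 2 FLOPs per lane per cycle (one fused multiply-add). The lane count is a commonly reported figure, not something stated in this thread, so treat the clock as illustrative only; the function names are made up.

```python
# Assumptions, not from the thread:
FP32_LANES_PER_CORE = 128      # commonly reported ALU count per Apple GPU core
FLOPS_PER_LANE_PER_CYCLE = 2   # one fused multiply-add counts as two FLOPs


def per_core_tflops(total_tflops: float, cores: int) -> float:
    """Peak TFLOPS contributed by each GPU core."""
    return total_tflops / cores


def implied_clock_ghz(total_tflops: float, cores: int) -> float:
    """Clock needed to reach the quoted peak under the lane-count assumption."""
    flops_per_cycle = cores * FP32_LANES_PER_CORE * FLOPS_PER_LANE_PER_CYCLE
    return total_tflops * 1e12 / flops_per_cycle / 1e9


for name, tflops, cores in [("M1", 2.6, 8), ("M1 Max", 10.4, 32)]:
    print(
        f"{name}: {per_core_tflops(tflops, cores):.3f} TFLOPS per core, "
        f"implied clock {implied_clock_ghz(tflops, cores):.2f} GHz"
    )
```

Both chips land on the same per-core figure and the same implied clock, which is consistent with the M1 Max GPU being a wider version of the M1 GPU rather than a faster-clocked one.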

Sophisticatednut · Oct 18, 2021

crazy dave said:
I’m kind of thinking option 3, hoping not option 1.

For what it’s worth, potentially very little, the M1X “leak” on CPU Monkey said it was LPDDR4X memory.

It's funny how Apple actually went with 512-bit LPDDR5.

thenewperson · Oct 19, 2021

Sophisticatednut said:
So... what do you guys think of the M1 Pro and M1 Max now? The M1 Max seems to be close to an RTX 3080 Mobile but with 100W less power, and the 400GB/s transfer rate is way beyond what I'd expect from LPDDR5 RAM.

I'm definitely impressed by the RAM bandwidth, though I guess it's mainly from them moving to LPDDR5 when I expected them to stay with LPDDR4X. A bit sad we didn't get the A15 cores here, which probably means next year's Mac Pro will be boasting 2020 cores 🥴

The bandwidth is in-line with LPDDR5 with a wider interface. It's probably ~408GB/s even.
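
To put "a wider interface" in numbers, one can work backwards from the advertised bandwidth to a bus width. This is a rough sketch that assumes LPDDR5-6400 and assumes the bus width rounds up to a power of two; neither is stated in the thread, and the helper names are made up for illustration.

```python
import math

LPDDR5_MTS = 6400  # assumed data rate in mega-transfers per second


def required_bus_width_bits(target_gbs: float, transfer_rate_mts: float) -> float:
    """Bus width in bits needed to reach a target peak bandwidth (decimal GB/s)."""
    bytes_per_second = target_gbs * 1e9
    transfers_per_second = transfer_rate_mts * 1e6
    return bytes_per_second / transfers_per_second * 8


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is greater than or equal to n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


raw_width = required_bus_width_bits(400, LPDDR5_MTS)
bus_width = next_power_of_two(math.ceil(raw_width))
peak_gbs = bus_width / 8 * LPDDR5_MTS * 1e6 / 1e9

print(f"Width needed for 400 GB/s: {raw_width:.0f} bits")
print(f"Rounded up to: {bus_width} bits")
print(f"Theoretical peak at that width: {peak_gbs:.1f} GB/s")
```

Under these assumptions a 512-bit interface gives a theoretical peak just above the advertised 400 GB/s, close to the ~408 GB/s guess above.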

Sophisticatednut · Oct 19, 2021

thenewperson said:
A bit sad we didn't get the A15 cores here, which probably means next year's Mac Pro will be boasting 2020 cores 🥴

What do you mean? Why would we have A15 cores, considering it's their M1 version?

thenewperson said:
The bandwidth is in-line with LPDDR5 with a wider interface. It's probably ~408GB/s even.

I can't find this to be normal either, considering GPUs normally use GDDR6(X), which has more latency than LPDDR5.

Sophisticatednut · Oct 19, 2021

thenewperson said:
I'm definitely impressed by the RAM bandwidth, though I guess it's mainly from them moving to LPDDR5 when I expected them to stay with LPDDR4X. A bit sad we didn't get the A15 cores here, which probably means next year's Mac Pro will be boasting 2020 cores 🥴

The bandwidth is in-line with LPDDR5 with a wider interface. It's probably ~408GB/s even.

thenewperson · Oct 20, 2021

Sophisticatednut said:
What do you mean? Why would we have A15 cores, considering it's their M1 version?

It was just my hope that it wouldn't be A14 cores but A15 cores. Oh well.

Sophisticatednut said:
I can't find this to be normal either, considering GPUs normally use GDDR6(X), which has more latency than LPDDR5.

Well it's probably cheaper to go with GDDR6 vs a very wide LPDDR5 interface, but Apple has money to spend.

quarkysg · Oct 21, 2021

thenewperson said:
Well it's probably cheaper to go with GDDR6 vs a very wide LPDDR5 interface, but Apple has money to spend.

From what I know, GDDR memory runs too hot and is too power hungry, so it'll likely never be considered. It also has 2-3 times the access latency.

leman · Oct 21, 2021

thenewperson said:
Well it's probably cheaper to go with GDDR6 vs a very wide LPDDR5 interface, but Apple has money to spend.

And get 1 hour battery life max? Not to mention ultra high latency?
