
Gnattu

macrumors 65816
Sep 18, 2020
1,106
1,668
That one has to wait for benchmarks. Apple knows there is going to be a performance loss due to the reduction in memory bandwidth. That's why they created Dynamic Caching; let's wait and see.
Oh, they are not related. Dynamic Caching is more useful for low memory capacity, not bandwidth.
And 150GB/s is already more than a 6-core P cluster can pull; GPU performance is probably more affected, but that remains to be seen.

The M3 Pro is actually a downgrade because its transistor count is lower than the M2 Pro's. Apple just realized that compute people are not buying the Max variants.
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Oh, they are not related. Dynamic Caching is more useful for low memory capacity, not bandwidth.
And 150GB/s is already more than a 6-core P cluster can pull; GPU performance is probably more affected, but that remains to be seen.

The M3 Pro is actually a downgrade because its transistor count is lower than the M2 Pro's. Apple just realized that compute people are not buying the Max variants.
That's actually aimed at me. I'm a high-CPU-compute user who was happily buying the Pro chip because it matched the Max in CPU performance. Now Apple has broken that.
 

Homy

macrumors 68030
Jan 14, 2006
2,507
2,459
Sweden
At least it exists and we don’t have to wait until the middle of 2024.

20% is very good indeed, and that's only the GPU. The M3 Max CPU is up to 50% faster than the M2 Max. Intel 14th-gen CPUs like the i9-14900K are only a few percent better than 13th-gen parts like the i9-13900K, according to Puget Systems among others, but I guess in the PC world Intel's 5% is equal to Apple's 20%.
 
Last edited:
  • Like
Reactions: krell100

Gnattu

macrumors 65816
Sep 18, 2020
1,106
1,668
Most curious fact in the announcement: the 6P + 6E config for Pro was correct. How utterly bizarre.

It looks like they decided they really need a bigger CPU perf differential between Pro and Max. Not surprising, but it is surprising that they did it this way. Did they actually build a 2P cluster, or is there an entire 4P cluster there with 2P disabled? At their volume, I'd expect the former.
Actually, these new chips come with a new 6-core cluster design. Even the 4-core M3 follows a similar pattern. You can see big square-like structures in all three SoCs, each with one corner without CPU cores. Instead, there is an AMX coprocessor in that corner. The AMX coprocessor is smaller than 2 P-cores, so the square still appears to be missing one corner. The 4-core M3 seems to have 2 more cores removed from the other 2 corners, as annotated in the image: the red square marks the approximate cluster, the blue square the AMX coprocessor, and the yellow square the 'missing cores' in the M3.

[Attached image: Apple M3 chip series architecture diagram]
 

Kazgarth

macrumors 6502
Oct 18, 2020
318
834
[Attached image]

Interesting

The M3 has an 18 TOPS NPU for AI, less than the iPhone 15 Pro's 35 TOPS or the Snapdragon X Elite's 45 TOPS.
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,318
Damn, it's still 8GB, but with a 512GB SSD.

Something odd about the memory bandwidth of the M3 Pro: 150GB/s and 18GB of RAM, hmm. Apple is reducing the memory bus from 256-bit to 192-bit. Damn, Apple.
I'm seeing a lot of these sorts of claims. I'm not convinced.
My guess is that
- M3 Pro (and Max) use LPDDR5X
- 150GB/s is "enough" for the target uses of this chip. Note that QC, targeting the same sort of price segment for their chip, also goes with only 150GB/s.

- look at the Max. If we assume the obvious stuff scattered around the edges of the GPU is memory controller plus PHY, then two things stand out:
+ Pro and M3 both have eight of them (though the layout geometry is slightly different between the two)
+ Max has something different: 8+8, and each is double wide.

The obvious assumption is that the Max has something like four times as many EXTERNAL pins to DRAM, but twice as many internal pins; meaning roughly quadruple the RAM capacity but twice the RAM bandwidth, which seems to be the case.

This suggests that for the M1 and M2 generations, Apple was kinda forced to tie together RAM capacity and bandwidth in a way that perhaps was not optimal. (Of course one always likes more RAM bandwidth, but not at the expense of area and power.) With the new design (and the higher intrinsic speed of LPDDR5X) they can recalibrate this so that DRAM capacity can double in the Max without requiring the full array of DRAM paraphernalia of the M1 and M2 Max. Compared to the M1 or M2 generation, the M3 Max's memory hardware looks the same, while the M3 and Pro memory hardware is half as wide. So in a sense Apple has substantially shrunk the area (and power?) required to communicate with memory in the M3 and Pro, and while the area has not shrunk on the M3 Max, the memory capacity has doubled.
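As a rough sanity check on these numbers, bandwidth is just bus width times transfer rate. The transfer rates below are my assumptions, picked because they reproduce the quoted bandwidth figures; nothing here is a confirmed Apple spec:

```python
# Back-of-envelope DRAM bandwidth: bytes per transfer (bus width / 8) times transfer rate.
# The transfer rates are assumptions chosen to match the quoted figures, not confirmed specs.

def bandwidth_gbs(bus_bits: int, mt_per_s: int) -> float:
    """Peak bandwidth in GB/s for a given bus width (bits) and transfer rate (MT/s)."""
    return bus_bits / 8 * mt_per_s / 1000

print(bandwidth_gbs(256, 6400))  # 204.8 -> M2 Pro-style 256-bit LPDDR5-6400
print(bandwidth_gbs(192, 6250))  # 150.0 -> M3 Pro's quoted figure on a 192-bit bus
print(bandwidth_gbs(512, 6250))  # 400.0 -> top M3 Max's quoted figure on a 512-bit bus
```

The point being that a 192-bit bus only needs ordinary LPDDR5-class transfer rates to hit 150GB/s, so if these parts really are LPDDR5X there is headroom left in the PHY.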

The especially interesting cases will be the Ultra and Extreme, presumably at 600 and 1200GB/s. That seems low compared to the nVidia high end, but Apple may believe (and may be correct?) that
- their large available memory (presumably 512GB for a maxed out Extreme) supports customers trying models that simply won't fit even on the largest nVidia system
- their various technologies (eg SLC and tagging of GPU streams to give them much more intelligent memory behavior) allow them to match much higher raw bandwidth at lower power?

Other cases of interest:
- Seems, like I suggested, that the new cluster size is (up to) 6 cores. In principle all 6 could share the same infrastructure (same L2, same L2 TLB, same page de/compression engine, same AMX). OR you could sub-cluster so that, eg, three cores share an AMX. By eye it's not clear to me. The 6 P-core clusters are very clear, but AMX could be duplicated or not.

- anyone know what the "Dynamic Caching" stuff is about? My best guess is that
+ GPUs do not allow for dynamic allocation of many resources (like Scratchpad or Ray tracing temporaries) and so apps are forced to allocate the maximum size they might require. Which in turn means that you often can't pack as many threadblocks onto the core as you would like to because they all claim to want a lot of (then unused) Scratchpad.
+ Apple works around this; perhaps by the second level paging that I discovered in the patents but did not understand.
+ Apps allocate lots of space in Scratchpad or Ray address space, but that's "virtual" allocation. Attempts to touch the address space trigger a physical allocation in Scratchpad or Ray cache, but if you never touch the address space...? Basically like standard VM and its various magics like page faults for demand allocation, only handled by the GPU (presumably the GPU companion core) rather than the OS proper.

An alternative possibility is they copy what (Ampere? one of the recent nVidia chips) has started to do where one core can use the local storage of a neighboring core. So rather than separate Scratchpads per core, it's more like there is a large pool of Scratchpad, and any threadblock can allocate within that large pool.

Yet a third option (again copying nVidia) is that you have common storage for GPU L1D and what Apple calls Tile memory (ie basically Scratchpad), so instead of say 8K of L1D and 64K of Tile, you get 128K combined: use as much Tile/Scratchpad as you need, and the rest is "dynamically" used as L1D.
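To make the first option concrete, here's a toy sketch of the occupancy argument. This is entirely my own illustration with made-up page and pool sizes; Apple has not documented how Dynamic Caching actually works:

```python
# Toy model of "virtual" scratchpad reservation with demand-backed physical pages.
# Page size and pool size are made up purely for illustration.

PAGE_BYTES = 1024    # assumed scratchpad page granularity
PHYS_PAGES = 64      # assumed number of physical scratchpad pages per core

def pages(n_bytes: int) -> int:
    return -(-n_bytes // PAGE_BYTES)   # ceiling division

def blocks_that_fit(per_block_pages: int) -> int:
    """How many threadblocks fit when each one consumes per_block_pages of scratchpad."""
    return PHYS_PAGES // per_block_pages

declared = pages(16 * 1024)   # each block *declares* 16 KiB (its worst case)
touched  = pages(2 * 1024)    # ...but only ever touches 2 KiB

print(blocks_that_fit(declared))  # 4  -> occupancy limited by the declared maximum
print(blocks_that_fit(touched))   # 32 -> occupancy limited by what is actually used
```

If physical pages are only committed on first touch, occupancy is bounded by real usage rather than by the worst-case declaration, which is the win this sort of scheme is after.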


Other non-obvious points:
- why no Pro or Max in an iMac? Apple knows their market better than I do, but my guess would be lots of people want that!
The obvious rejoinder is that (at some point...) an iMac Pro is coming, maybe 32", and at that point we'll see the full range.
But even so, to my eyes the obvious configs are iMac with M3 or M3 Pro [cf the mini], and iMac Pro with Max, Ultra or even Extreme.

- why not announce the minis at the same time? What's the point of delaying them? Is it purely a business decision, in the sense that different products get announced in different quarters to smooth revenue? That's my best guess.
 
Last edited:

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
I'm starting to think that M3 might in fact be a different architecture than A17 Pro...

Different NPU, GPU features not in the A17 Pro, and "only 30%" faster than the M1 in CPU, which would yield a GB6 ST score of only around 3,000. That seems off.
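(Back-of-envelope, with rough numbers rather than anything official: an M1 lands around 2,300 in GB6 ST, and 2,300 × 1.3 ≈ 3,000, which is barely above what the A17 Pro already scores, so an A17-Pro-derived M3 at laptop clocks ought to do better than that.)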
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
The M3 series is what happens when the MBAs take over. It felt like the M1 was an engineer's dream. Simple. Elegant. Then the MBAs needed something to do, so they segmented the hell out of the chips. Now we have a mess of configurations.
 

Homy

macrumors 68030
Jan 14, 2006
2,507
2,459
Sweden
No on-chip GPU memory (the UMA RAM is on-package), just Dynamic Caching of the UMA RAM with regard to GPU usage of said UMA RAM...?

There has always been on-chip memory (core private memory and caches). Dynamic Caching refers to the use of on-core memory by the shaders. It has nothing to do with UMA.

I'm not familiar with the details but was referring to this image. So perhaps there's always been on-chip memory as Leman says.

[Attached screenshot, 2023-10-31]
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,318
The M3 series is what happens when the MBAs take over. It felt like the M1 was an engineer's dream. Simple. Elegant. Then the MBAs needed something to do, so they segmented the hell out of the chips. Now we have a mess of configurations.
It always looks like that after a Keynote because we get a limited view of the full story.
We don't have a "mess of configurations"; what we have is what we have always had – if you want the nicest stuff, you have to pay more for it. For some reason, there are always people who find this a personal insult, and imagine that the rest of us actually care about this complaint.

If you have something TECHNICAL to say, stick around.
If you're angry that Apple (like every other company on earth) charges more for nice stuff, please share your fascinating insights on this issue somewhere ELSE...
 

sunny5

macrumors 68000
Jun 11, 2021
1,837
1,706
I would say I'm disappointed, especially with what they did with the memory bandwidth, the M3 Pro, and the pricing. Besides, the M3 Max's GPU is only UP TO 20% faster, which is only good for limited uses. And who even mainly uses a Mac for 3D, AI, or research? Most of that will be done on PCs, not Macs, especially given the lack of software.
 

zlt1228

macrumors member
Mar 13, 2019
66
107
Damn Apple, the whole M3 lineup is still using LPDDR5, with a reduced memory bus. The M3 SoC keeps the same config, the M3 Pro uses a 192-bit memory bus, and the M3 Max supports two memory bus configurations: the 30-core GPU version uses a 384-bit bus while the 40-core GPU version uses a 512-bit bus. Damn, only Apple can do it. :mad:

Memory Density | Size per chip (x64) | 128-bit bus (100GB/s, 2 chips) | 192-bit bus (150GB/s, 3 chips) | 384-bit bus (300GB/s, 6 chips) | 512-bit bus (400GB/s, 8 chips)
32 Gigabit     | 4 GB                | 8 GB                           | -                              | -                              | -
48 Gigabit     | 6 GB                | -                              | 18 GB                          | 36 GB                          | 48 GB
64 Gigabit     | 8 GB                | 16 GB                          | -                              | -                              | 64 GB
96 Gigabit     | 12 GB               | 24 GB                          | 36 GB                          | -                              | -
128 Gigabit    | 16 GB               | -                              | -                              | 96 GB                          | 128 GB
They could have done 6GB × 2 to create 12GB on the M3, but chose not to. At this point it just seems petty. Once you get to an M3 with 16GB, it is only 200 dollars cheaper than the M3 Pro, which has much more powerful… everything.
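For what it's worth, the whole table falls out of a little arithmetic; here's a quick sketch. The x64-per-package assumption comes from the table itself, and 6250 MT/s is just the rate that reproduces the quoted 100/150/300/400 GB/s figures, not an official spec:

```python
# Capacity and bandwidth from LPDDR5 package count and density.
# Assumes x64 packages (per the table above) and a 6250 MT/s effective rate that
# matches the quoted 100/150/300/400 GB/s figures; neither is an official Apple spec.

MT_PER_S = 6250

def config(density_gbit: int, packages: int):
    per_chip_gb = density_gbit // 8           # e.g. 48 Gbit -> 6 GB per package
    bus_bits    = packages * 64               # each x64 package adds 64 bits of bus
    capacity_gb = per_chip_gb * packages
    bandwidth   = bus_bits / 8 * MT_PER_S / 1000
    return capacity_gb, bus_bits, bandwidth

print(config(48, 2))   # (12, 128, 100.0) -> the hypothetical 12GB base M3
print(config(48, 3))   # (18, 192, 150.0) -> M3 Pro 18GB
print(config(96, 3))   # (36, 192, 150.0) -> M3 Pro 36GB
print(config(128, 8))  # (128, 512, 400.0) -> 40-core-GPU M3 Max 128GB
```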
 
  • Like
Reactions: Ruftzooi

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
It always looks like that after a Keynote because we get a limited view of the full story.
We don't have a "mess of configurations"; what we have is what we have always had – if you want the nicest stuff, you have to pay more for it. For some reason, there are always people who find this a personal insult, and imagine that the rest of us actually care about this complaint.

If you have something TECHNICAL to say, stick around.
If you're angry that Apple (like every other company on earth) charges more for nice stuff, please share your fascinating insights on this issue somewhere ELSE...
I didn't know this thread was titled "M3 Chip Generation - TECHNICAL Speculation Megathread". Get your BS out of here.

Don't get mad because you keep writing essay-length posts that no one reads.
 
Last edited:

zlt1228

macrumors member
Mar 13, 2019
66
107
I would say I'm disappointed, especially with what they did with the memory bandwidth, the M3 Pro, and the pricing. Besides, the M3 Max's GPU is only UP TO 20% faster, which is only good for limited uses. And who even mainly uses a Mac for 3D, AI, or research? Most of that will be done on PCs, not Macs, especially given the lack of software.
They chose to cut the M3 out of the M3 Pro this year, instead of cutting the M3 Pro out of the M3 Max. The resulting CPU performance is only 20% faster than the M1 Pro, so about the same as the M2 Pro. It is laughable at this point.
 

TigeRick

macrumors regular
Oct 20, 2012
144
153
Malaysia
They could have done 6GB × 2 to create 12GB on the M3, but chose not to. At this point it just seems petty. Once you get to an M3 with 16GB, it is only 200 dollars cheaper than the M3 Pro, which has much more powerful… everything.
Yeah, that is what I thought when I created the table. With 12GB of RAM, the MBP14 at $1599 would have justified the Pro naming, but Apple being Apple... damn, Tim :mad:
 
  • Like
Reactions: tk111