Well, you can measure the peak memory bandwidth, but what good will it do? It’s going to be 350GB/s. We know how many channels there are and how wide the bus is.
Actually we DON'T know what that peak memory bandwidth will be...
On the one hand, the memory may be compressed, so that effective bandwidth is larger than 300GB/s.
On the other hand, that effective bandwidth may not be visible to a cluster (or to compute generally) if the bandwidth between a cluster and the SLC is still capped at around 100GB/s.
On the third hand, maybe that bandwidth is capped higher, since this is a newer SoC.
On the fourth hand, with six rather than four CPUs in a cluster, a higher cluster<->SLC bandwidth makes sense.
So the bottom line is:
- unless you're willing to do a serious deep dive into the new architecture, testing a lot of different bandwidths under different conditions (multiple threads, GPU bandwidth, compressible vs incompressible data sets, etc.), you're unlikely to be able to conclude much of interest or validity from a single number derived from code you can't modify and control.
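Just to illustrate the shape of that testing, here's a rough sketch of one such micro-benchmark: compare copy bandwidth for compressible (all-zero) vs incompressible (random) buffers, with one stream and then several concurrent streams. The buffer size, repetition count, stream counts, and the plain copy loop are arbitrary choices of mine, not a validated methodology.

```python
# Sketch of a bandwidth micro-benchmark along the lines described above.
# Buffer size, repetition count, and stream counts are illustrative, not validated.
import time
import numpy as np
from multiprocessing import Pool

BUF_BYTES = 1 << 28   # 256 MiB per buffer: well beyond any SLC, still OK on an 8GB machine
REPS = 8              # copies per measurement

def copy_bandwidth(compressible: bool) -> float:
    """Measure GB/s for REPS copies of one buffer (counting read + write traffic)."""
    if compressible:
        src = np.zeros(BUF_BYTES, dtype=np.uint8)                   # all zeros: trivially compressible
    else:
        src = np.random.randint(0, 256, BUF_BYTES, dtype=np.uint8)  # incompressible
    dst = np.empty_like(src)
    start = time.perf_counter()
    for _ in range(REPS):
        np.copyto(dst, src)
    elapsed = time.perf_counter() - start
    return (2 * BUF_BYTES * REPS) / elapsed / 1e9

if __name__ == "__main__":
    print(f"1 stream, zeros  : {copy_bandwidth(True):6.1f} GB/s")
    print(f"1 stream, random : {copy_bandwidth(False):6.1f} GB/s")
    # Several concurrent streams (one process each, to sidestep the GIL); the aggregate
    # number is rough, since the streams don't start and stop in lockstep.
    for nproc in (2, 4, 6):
        with Pool(nproc) as pool:
            per_stream = pool.map(copy_bandwidth, [False] * nproc)
        print(f"{nproc} streams, random: {sum(per_stream):6.1f} GB/s aggregate")
```

Even this only scratches the surface: it says nothing about GPU traffic, and a copy loop is just one access pattern among many.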
What COULD be done to investigate this, by amateurs, is to take a low-end machine (i.e. an 8GB machine), create the scenarios that people claim lead to thrashing on these machines (which appears to be something like opening 20 tabs in Chrome), do the same thing on an equivalent 8GB M2 machine, and see if there is a noticeable difference.
I don't think you can do a perfect investigation right now; the memory footprints between the M3 Pro and the M2 generation don't quite match. But you could try this sort of thing and at least see what happens: whether there is a noticeable performance drop at, say, 1.5x as many tabs, or whether you can go quite a bit further (which would suggest some sort of transparent [non-page-based!] memory compression).
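For the tab experiment, something as crude as the following could get the ball rolling (macOS only; Chrome must already be running, and the URL, batch size, and settle time are placeholders I made up). It just opens tabs in batches and logs the swap and compressor counters from vm_stat after each batch, so the two machines can be compared step by step.

```python
# Rough sketch of the tab-thrashing comparison (macOS). Run the same script on the
# 8GB M2 and the 8GB M3-generation machine and compare where the counters take off.
import subprocess
import time

URL = "https://en.wikipedia.org/wiki/Special:Random"   # placeholder: any reasonably heavy page
BATCH = 5       # tabs opened per step
STEPS = 8       # up to 40 tabs total
SETTLE = 20     # seconds to let pages load and the pager react

KEYS = ("Pageouts", "Swapouts", "Compressions", "Pages occupied by compressor")

def vm_counters():
    """Return the vm_stat lines for the counters we care about, unparsed."""
    out = subprocess.run(["vm_stat"], capture_output=True, text=True).stdout
    return [line.strip() for line in out.splitlines()
            if any(line.startswith(k) for k in KEYS)]

for step in range(1, STEPS + 1):
    for _ in range(BATCH):
        # `open -a` hands the URL to the running Chrome instance as a new tab
        subprocess.run(["open", "-a", "Google Chrome", URL], check=True)
    time.sleep(SETTLE)
    print(f"--- {step * BATCH} tabs open ---")
    for line in vm_counters():
        print(line)
```

If the newer machine keeps leaning on the compressor noticeably longer before Swapouts start climbing, that would at least be weakly consistent with the transparent-compression idea; identical curves would argue against it.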