64 GB may be good for entry-level stuff, but considering the target industry it's most likely a cache, though as your screen capture suggests it's flexible (bottom right)
Systems composed of a single HBM package will likely be rare. Put 50-1000 of those 64 GB packages into a coherent system and that's 3.2-64 TB of overall system RAM. That really is not "entry level stuff".
The target industry is supercomputer nodes. Not solitary, single user, single app desktops.
It isn't homogeneously flexible across applications. The mixed mode in the middle requires an app rewrite. For Apple, that is likely the mode you are pragmatically pushing toward once you take into account that the CPU and GPU would likely have to share the HBM / "poor man's HBM via LPDDR5". Caches for CPUs do not have the same access patterns as caches for GPUs.
Pragmatically, this is mostly second-generation stuff for Intel. This "flex mode with HBM" stuff was in the Xeon Phi (Knights Landing). So was Intel's OS work on weaving substantially different Optane DIMMs in with plain RAM DIMMs (which required application modifications and/or kernel tweaks in many/most cases).
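To make the "app rewrite" point concrete, here is a minimal sketch (C, assuming the memkind library is installed) of what flat/mixed-mode code had to do on Knights Landing-class hardware: every hot allocation has to be found and redirected by hand. The array size and fallback policy are purely illustrative.

```c
/* Sketch: explicitly placing one hot array in high-bandwidth memory
 * via memkind's hbwmalloc API (the interface used for flat-mode
 * MCDRAM on Knights Landing). Build: cc demo.c -lmemkind */
#include <stdio.h>
#include <stdlib.h>
#include <hbwmalloc.h>

int main(void) {
    size_t n = 1 << 20;                      /* illustrative size */
    int have_hbw = (hbw_check_available() == 0);
    double *hot = have_hbw ? hbw_malloc(n * sizeof *hot)   /* HBM */
                           : malloc(n * sizeof *hot);      /* DDR fallback */
    if (!hot) return 1;

    for (size_t i = 0; i < n; i++)           /* bandwidth-bound touch */
        hot[i] = (double)i;
    printf("last = %f\n", hot[n - 1]);

    if (have_hbw) hbw_free(hot); else free(hot);
    return 0;
}
```

Cache mode needs none of this, which is exactly why it is the path of least resistance, and why the mixed mode is the one that costs you an app rewrite.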
Not particularly surprising it shows up here, because when Xeon Phi "Knights Hill" was killed off, a new Intel package for Aurora was to take its place.
To a certain extent, the “Knights” family of parallel processors, sold under the brand name Xeon Phi, by Intel were exactly what they were supposed to be:
www.nextplatform.com
A supercomputer node that is focused on HPC, with a couple of local GPGPU compute accelerators it needs to keep up with, would/could go HBM only. The very high bandwidth RAM is spread between the CPU package and the GPU packages to keep the cores fed. Nodes that are tasked with file I/O can probably lean on a bigger/better cache hierarchy.
There are also some configurations being aimed at in-memory data warehouses, where the apps are already tweaked to deal with different-speed "fast RAM".
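That "already tweaked" usually means explicit placement in the application. A rough sketch with Linux's libnuma, treating the fast tier as its own NUMA node (node 1 here is an assumption; check numactl --hardware on a real box):

```c
/* Sketch: pinning a table buffer to a specific memory tier that the
 * OS exposes as a NUMA node (HBM and Optane tiers show up this way).
 * Node id 1 is a placeholder. Build: cc demo.c -lnuma */
#include <stdio.h>
#include <numa.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support\n");
        return 1;
    }
    size_t bytes = 64UL << 20;          /* 64 MB, illustrative */
    int fast_node = 1;                  /* assumption: the fast tier */

    void *table = numa_alloc_onnode(bytes, fast_node);
    if (!table) return 1;

    /* ... scan/aggregate the in-memory table here ... */

    numa_free(table, bytes);
    return 0;
}
```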
The vast bulk of Xeon SP gen 4 chips thrown into individual workstation sockets are not going to be these. A relative few will be, but that isn't what is paying the bills for this HBM variant's development.
And this "everything and the kitchen sink" is one reason why this hundreds bugs and ridiculous number of steppings to work through that greatly delayed its time to market. Jumping up and down trying to push maximum complexity at Apple SoC isn't likely going to get them to come out quicker ( or last long term as a viable product. )
[ Time will tell how well Intel's HBM "bolt on" works versus AMD's far cleaner, more purposeful 3D stacking of simply more L3 cache in HPC supercomputer node use cases (a smaller capacity jump, but a lower latency hit). ]
AMD has basic graphics in the TR 7 series … itself a sign of how AMD is seeing stuff going forward, or Intel's much-derided consumer GPU venture which they plan to use as part of the CPU itself.
What are the signs that Threadripper 7000 has an iGPU? The desktop Ryzen 7000 parts have an iGPU, but that is in a completely different I/O hub chiplet.
Sightings of TR 7000 Pro so far have pegged it as very likely being the same package as Epyc (just like the previous "Pro" iterations).
AMD's Ryzen Threadripper 7000 CPUs, codenamed "Storm Peak", have appeared within the Einstein@Home database & feature 64 Zen 4 cores.
wccftech.com
AMD Threadripper series getting more cores with “Storm Peak” New AMD OPN codes have shown up on Einstein @ Home website. It looks like someone is testing unreleased AMD CPUs with high core count. This is definitely not the first time we spot an unreleased AMD processor on volunteer computing...
videocardz.com
AMD slowed down the rollout of this generation's Epyc package to weave CXL 1.1 into the package. There is no iGPU at the server level for the Zen 4 generation, hence very likely none for the Threadripper space either.
A GPGPU as a compute accelerator is being driven by both CXL 1.1 and by proprietary Infinity Fabric links to AMD's Instinct MI series accelerators.
There might be a "non Pro" Threadripper, but it would likely come with socket SP6, which again is derived from server packages.
AMD SP6 socket has been confirmed It didn't take long for the rumors about AMD's next-gen SP6 socket to find confirmation through newly leaked photos. Over at the AnandTech forums one can find photos and schematics for the socket codenamed SP6. This socket has been explained in leaked...
videocardz.com
Somewhat possible if the CPU core count is pushed down below 64 to offset the bandwidth pressure an iGPU would add. A smaller, more manageable socket, but also possibly losing memory channels (at least as standard server usage assigns them; other pins could be repurposed if there are no power distribution issues to cover).
Either way, Threadripper is a derivative product forked from the server package products. The volume to support the specific chiplet development is driven by server package unit volumes, not by diving deep into a 10%-of-1% niche.
Towards the end, he even wonders why no one's released a pure raytracing accelerator yet (something I have been wondering too)
A raytracing accelerator completely decoupled from the raster process is going to buy what? Similarly, what does decoupling from the display output process buy in most use cases?
So maybe Apple has something planned? Maybe.
Apple has something planned to jump eyeball-deep into the server and supercomputer node market? Probably not. macOS still doesn't do more than 64 threads. The kernel shared with the rest of the Apple product lineup probably means that isn't going to change.
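If you want to sanity-check that on your own machine, a quick sysctl query shows how many logical CPUs macOS exposes; nothing Apple currently ships comes anywhere near 64. A trivial sketch:

```c
/* Sketch: ask macOS how many logical CPUs it exposes. Build: cc demo.c */
#include <stdio.h>
#include <sys/sysctl.h>

int main(void) {
    int ncpu = 0;
    size_t len = sizeof ncpu;
    if (sysctlbyname("hw.logicalcpu", &ncpu, &len, NULL, 0) != 0) {
        perror("sysctlbyname");
        return 1;
    }
    printf("logical CPUs: %d\n", ncpu);
    return 0;
}
```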
Apple has shown distinct disinterest in CXL. Their chiplet interconnect is homegrown and likely not looking to couple to third-party dies. It is also extremely focused on highest performance per watt (not maximum interoperability).
So yeah, interesting developments, but as the narrator cautioned, it's a big 'if'. Nevertheless a sign of where things are headed all over. Some of it may trickle down to the consumer market. Apple's success (so far) with the M1 in the consumer space may have the others trying it out.
AMD's 3D stacked cache already has. CXL will in future iterations from Intel (Gen 14) and AMD (probably another iteration or two).
Some of the previous Xeon Phi cards ran a lightweight Linux on the card and communicated with the host system over a virtual ethernet connection implemented on top of the PCIe x16 link (set up a virtual ethernet device on both sides, and then apps that used physical-wire mechanisms to talk from node to node could communicate unchanged).
An MI300 node or Mx Max node on a card communicating back to the host Mac would be doing scale-out clustering inside the container.
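The reason the virtual ethernet trick works is that applications written to ordinary sockets don't care what the "wire" underneath is. A minimal sketch of the node-to-node side (the card's address and port are made-up placeholders):

```c
/* Sketch: plain TCP client talking to the coprocessor card. Over the
 * Xeon Phi setup above, 192.168.1.100 would be the card's virtual
 * ethernet address riding on PCIe x16; the socket code is identical
 * to talking to a physical host. Build: cc demo.c */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) { perror("socket"); return 1; }

    struct sockaddr_in card = {0};
    card.sin_family = AF_INET;
    card.sin_port = htons(5000);                     /* placeholder */
    inet_pton(AF_INET, "192.168.1.100", &card.sin_addr);

    if (connect(fd, (struct sockaddr *)&card, sizeof card) < 0) {
        perror("connect");
        return 1;
    }
    const char *msg = "hello, coprocessor\n";
    write(fd, msg, strlen(msg));
    close(fd);
    return 0;
}
```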