http://www.anandtech.com/show/10578...rs-micro-op-cache-memory-hierarchy-revealed/2
This is the best analysis of the Zen CPU. Now it is all down to core clocks on final silicon.
They will have 32 lanes. Zen is a modular architecture. It has 2 clusters, each built from 4 cores and 8 MB of L3 cache.
Both clusters need to have their own separate 16 PCIe lanes, for complete scalability.
4 cores, 8 MB L3 cache, 16 PCIe lanes.
8 cores, 16 MB L3 cache, 32 PCIe lanes.
....
Bandwidth between the CPU and the GPU is not usually a bottleneck. With Zen topping out at 32 cores, that is going to be a very big die.
I doubt they have any room left to fit any sort of GPU on there, especially one that has even mainstream performance.
http://www.planet3dnow.de/vbulletin...95W-TDP-DDR4?p=5110384&viewfull=1#post5110384
Anandtech hasn't dug into the UnCore part of the flavors of Zen yet. Perhaps after the Hot Chips presentation.
Errrr. no. Here is the die photo of the 8 core model.
http://dresdenboy.blogspot.com/2016/08/some-last-chance-pre-hot-chips.html
The PCIe+FCH (south bridge) portion is not replicated per core cluster. The memory channel controller is replicated. The GMI link (GMI: Global Memory Interconnect; think HyperTransport or Intel's QPI links) is replicated. The south bridge is not.
8 cores may have 16 or 32 lanes, depending on which UnCore south bridge is attached. That is likely why the 32 core model has 32 PCIe v3 lanes [an MCM package with two of these 8 core dies, each with 16; 16 * 2 = 32]. Therefore, the two package solution of the 32 core model has 64 [which lines up with Anandtech's motherboard layout].
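The lane arithmetic in the post above can be sketched in a few lines. This is only a back-of-the-envelope illustration: the 16 lanes per die and two dies per package are the poster's assumptions, not confirmed AMD specifications, and the function name `pcie_lanes` is made up for this example.

```python
# Back-of-the-envelope PCIe lane count, using the poster's assumptions.
# None of these figures are confirmed AMD specifications.
LANES_PER_DIE = 16     # PCIe v3 lanes exposed by one 8 core die (assumed)
DIES_PER_PACKAGE = 2   # dies in one MCM package (assumed)


def pcie_lanes(packages: int,
               dies_per_package: int = DIES_PER_PACKAGE,
               lanes_per_die: int = LANES_PER_DIE) -> int:
    """Total lanes if every die contributes its own lanes (hypothetical helper)."""
    return packages * dies_per_package * lanes_per_die


single_package = pcie_lanes(1)  # one MCM package
dual_package = pcie_lanes(2)    # two package solution
print(single_package, dual_package)
```

If the south bridge attached to a die exposed 32 lanes instead of 16, passing `lanes_per_die=32` would show how quickly the totals change.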
... to have at least 36 PCIe lanes. According to him, the listed configuration seems to be a bit chaotic. BTW, "Promotory" should actually be written "Promontory".
From the same blog.
What about Thunderbolt?
A trash can thermal core could easy 3x APU, a nMP based on Zen APU should be a monster.
You mean relevancy for the Mac Pro design? Chuckle.... this thread doesn't seem to be concerned about that.
If there is not enough PCIe v3 bandwidth to go around, then it is a non-starter even before we start to wade into boot/support issues.
Intel is still the sole supplier of Thunderbolt controllers. Not buying them bundled with CPU packages isn't going to be cheaper. Intel probably isn't going to bend over backwards on boot/compatibility issues. Likewise, I'm not sure Thunderbolt has reached the critical mass where AMD cares about being "left out". The USB 3.1 Type C alternative modes of DP+USB gen 2 cover much of what the original TB v1 did. AMD has more than enough drama to fix to get back to being competitive in the overall market without adding TB to the pile at the moment.
Unless there are more than a couple of other system vendors besides Apple who want TB v3, it doesn't really make a lot of sense for AMD to chase that. Especially if Apple is being Scrooge McDuck and not paying for the whole effort in advance. It is too high a risk at the moment to spend a significant amount of money and lose out in a design bake-off. AMD needs to get healthy, and then they may be able to afford that kind of stuff.
A trash can thermal core could easy 3x APU,
What cube is asking about is supposedly of interest for his potential Zen build at home.
IIRC, Thunderbolt was supposed to work with any brand, not only Intel, but also AMD, ARM, and... Nvidia.
Back to thread for a second. 95W APUs also available? That is more interesting.
We'll have to wait for benchmarks, but I'm growing ever more suspicious that Zen's SMT implementation is more like Power8's than anything Intel's produced so far. Intel's approach has been to allow a second thread to use unused CPU resources without really over-provisioning those resources (a single thread can very nearly saturate the whole CPU). Power8 can scale up to 8 threads per core (Zen will only do 2), but it makes that viable by doubling down on key CPU resources in the first place (instruction cache, rename registers, etc.). The end result is that the second SMT thread on Intel increases overall performance by around 15-25%, but on Power8 the second SMT thread can increase overall performance by around 60% in some workloads. In layman's terms, Power8's 'hyperthreads' are more useful than Intel's.
AMD haven't talked about rename registers yet, but they have revealed that the instruction cache is 64KB per core; perhaps not-so-coincidentally, that's double the size of Skylake's instruction cache, and the same size as Power8's. The L1 data cache is only 32K in all of these processors, but it's rather odd in processor design to have your instruction cache be twice the size of your L1 data cache -- unless you have a good reason. There are only two reasons I can think of -- either that second thread chews through a lot more instructions than in competing SMT designs, or possibly the uOp cache can spill to L1. Looking at the slide from HotChips that shows which CPU resources are exclusive, competitively shared, or arithmetically arbitrated has me leaning toward the former, though they might not have over-provisioned CPU resources enough to match Power8 fully. There were also rumors months back about Zen doing some really novel things with SMT, which would seem to back that up.
The implication of that would be that Zen could run at a lower clock speed than Intel's current Broadwell DE but still match it in overall threaded performance (while perhaps giving up 10-15% single-threaded performance, not clock-normalized). For the mainstream, they could release a quad-core CPU at similar clocks to Skylake and outperform it in threaded workloads. In gaming workloads, since current consoles make 6-7 threads available to games, a quad-core Zen with 4 hyper-threads giving ~60% additional performance would give a lot better performance than a quad-core i7 with 4 hyper-threads giving ~20% additional performance. In fact, that Zen would have a throughput comparable to 6-7 dedicated cores.
We won't know until someone does an architecture deep-dive or we have benches showing SMT gains much larger than Intel's. But it's looking increasingly likely from what I see.
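The "comparable to 6-7 dedicated cores" estimate in the post above is simple multiplication, sketched here. The ~60% and ~20% SMT gains are the poster's guesses, not measurements; the model assumes equal per-core performance and ignores clocks, IPC, and workload effects. The helper name `effective_cores` is invented for this illustration.

```python
# Rough "equivalent core" estimate from an assumed SMT uplift.
# The gains below are the poster's estimates, not benchmark results.
def effective_cores(physical_cores: int, smt_gain: float) -> float:
    """Physical cores scaled by the fractional gain from the second thread."""
    return physical_cores * (1.0 + smt_gain)


zen_quad = effective_cores(4, 0.60)    # hypothetical Zen quad-core, ~60% SMT gain
intel_quad = effective_cores(4, 0.20)  # quad-core i7, ~20% SMT gain
print(f"Zen: {zen_quad:.1f} core-equivalents, i7: {intel_quad:.1f} core-equivalents")
```

Under these assumptions the Zen part lands at about 6.4 core-equivalents against about 4.8 for the i7, which is where the 6-7 figure comes from.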
Dec, the technology required to interconnect CPUs is old. Given that AMD foresees those APUs being a key offering for HPC, AMD should already have some provisions for it, named Coherent Link (I also read somewhere that they will launch an HPE Moonshot card).
No. The inter-socket/package connections that AMD will have (if they could be used with an APU) likely have distance limitations that the current Mac Pro design can't get around. Also, as mentioned elsewhere, where do the additional RAM DIMMs go?
The desktop/'single socket' APUs can't really be connected. Having three separate computers inside the Mac Pro case doesn't really buy all that much.
I don't care about a proprietary Metal API.
Apple should support Vulkan.
Same reason why I prefer OpenCL to CUDA, and OpenGL to DirectX.
GL was created by a corporation. DX does kill GL though. Not everything corporate sucks.