SK Hynix has a dual-link interposer, called 8-hi, that can manage 2 layers of 4GB.
8-hi means 8 dies stacked.
1. Do you get to keep all the bandwidth with the increase in distance?
HBM is going extremely wide (parallel) instead of faster (and mostly serial). All of those wider paths have to stay synchronized in flight.
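For a rough sense of what "wide instead of fast" buys, here is a back-of-the-envelope sketch (my numbers from the commonly quoted HBM1/GDDR5 specs, not from the article): a 1024-bit HBM1 stack at ~1 Gbps per pin versus a 32-bit GDDR5 chip at ~7 Gbps per pin.

```c
/* Back-of-the-envelope peak bandwidth: bus width * per-pin rate.
   Figures are the commonly quoted HBM1 and GDDR5 specs. */
#include <stdio.h>

static double peak_gb_per_s(double bus_width_bits, double pin_rate_gbps) {
    return bus_width_bits / 8.0 * pin_rate_gbps;
}

int main(void) {
    /* HBM1: 1024-bit interface per stack, ~1 Gbps per pin (500 MHz DDR) */
    printf("HBM1 stack: %.0f GB/s\n", peak_gb_per_s(1024.0, 1.0)); /* 128 */
    /* GDDR5: 32-bit interface per chip, ~7 Gbps per pin */
    printf("GDDR5 chip: %.0f GB/s\n", peak_gb_per_s(32.0, 7.0));   /* 28 */
    return 0;
}
```

The flip side is exactly the synchronization problem above: 1024 traces across an interposer all have to arrive together, where a 32-bit link only has to manage 32.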
2. At what cost?
"... It could also simply be down to cost. HBM, while cheaper than rival stacked memory technology Hybrid Memory Cube (HMC), is still likely to be pricier than the equivalent DRAM, making eight stacks impractical. ... "
http://arstechnica.com/information-...nfirms-4gb-limit-for-first-hbm-graphics-card/
The cost is probably going up, not just because of more dies but because the interconnect complexity is higher (4 more paths from the 4 additional layers down through the stack and into the interposer layer and package substrate).
8-hi is coming in a later standardization. It would be far less risky, though, to just get the current version out the door. This is generation 1 technology; there are bound to be hiccups in the production ramp. 8-hi would just add more risk in both complexity and "clean up" costs.
Seems like SkyLake really is a big jump in IPC, comparable to the jump from P4 to Core series.
Where does that come from? The slide deck you point to claims better performance per watt and higher memory bandwidth. The first should allow them to pack more cores into the package; the second basically allows them to keep those cores fed with data/instructions.
AVX-512 could be counted toward higher IPC if you count the parallel operations implicit in the vector processing. But that isn't going to help the folks who want to run legacy code faster.
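To make the "implicit parallelism" point concrete, a minimal sketch (my illustration, assuming an AVX-512F part and a build with -mavx512f): one vector add retires 16 single-precision operations, but only if the code is built to use it, which is why legacy binaries see none of this.

```c
#include <immintrin.h>

/* out[i] = a[i] + b[i]; each vector iteration handles 16 floats at once. */
void add_arrays(const float *a, const float *b, float *out, int n) {
    int i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        _mm512_storeu_ps(out + i, _mm512_add_ps(va, vb)); /* 16 adds/insn */
    }
    for (; i < n; i++)        /* scalar tail for leftover elements */
        out[i] = a[i] + b[i];
}
```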
The biggest change here is similar to the big bandwidth change at Nehalem, which uncorked the performance you can get if you really run that many cores in parallel. The single front-side bus prior to Nehalem was a chokepoint. On extremely parallel workloads, 4 Nehalem cores could approach 8 cores of the previous generation, in part because the previous ones were bandwidth-throttled.
Skylake looks to be far more attuned to an era where there is a relatively large amount of data on PCIe SSDs and substantially larger RAM (in-memory databases), and where the inflow/outflow can be much higher if you grid the data and attack it with a double-digit number of cores.
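As a toy sketch of that "grid the data" pattern (my example, nothing Skylake-specific): an OpenMP reduction that parcels one big in-memory array out across however many cores are available — precisely the kind of job where memory bandwidth, not core count, sets the ceiling.

```c
/* Build with: cc -O2 -fopenmp grid.c */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {
    const long n = 1L << 27;                 /* 128M doubles = 1 GiB */
    double *data = malloc(n * sizeof *data);
    if (!data) return 1;
    for (long i = 0; i < n; i++) data[i] = 1.0;

    double sum = 0.0;
    /* Each core sweeps its own slice of the "grid"; the partial sums
       are combined at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += data[i];

    printf("sum = %.0f across %d threads\n", sum, omp_get_max_threads());
    free(data);
    return 0;
}
```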
Rumours of MorphCore, same base core for all CPUs.
Errrrrr, probably not. The WCC site sells a lot of kool-aid; it is primarily good for generating ad page views, far more than it is grounded in reality.
http://www.fool.com/investing/general/2015/04/22/intel-corporations-skylake-morphcore-and-unrealist.aspx
While the Xeon E5 implementation of Skylake could be tweaked from what is in the mainstream lineup... that would be a bit of a huge tweak.
I suspect it is more along the lines of: if you don't need the AVX-512 subsystem, it can be shut off and that power shifted over to "turbo mode" for the functional units that are left. E.g., when running single threaded, even more of the sub-units in the core are turned off until just the subset that is "drag racing" stays on. The core doesn't morph; it is just a bigger core. A bigger transistor budget means lots of stuff can be present that you aren't really using (but still pay for). All of that will probably yield "more performance per watt," which is exactly what Intel is claiming.
The same "turn stuff off you are not using" works on the desktop/laptop designs which have fewer power hogs internally.
MorphCore I would more expect to see in the next iteration of Xeon Phi after Knights Landing. Phi got AVX-512 before the rest (well, kind of before, given rollout delays), and MorphCore is a more natural fit there and probably a better match to compete with the GPGPU competitors.
The "lash stuff up as vector units" and then unconnect them to do high thread counts has been played with before.