What is being done in your FCPX h.264 comparison? ... Is some of the speedup GPU related? That is the question I asked in the other thread, which is how much of the FCPX improvements with the 2017 i7 are due to the new 14 nm Radeon Pro 580. The 580 is vastly superior to the 2015 i7's Radeon R9 M395. The reason this is pertinent is you can also get the i5-7600K with the Radeon Pro 580. In fact, in some artificial tests at least, the base model i5's Radeon Pro 570 is as fast as the M395 (if not slightly faster). ... It would be good to get a FCPX video editor's review of the 7600K/580 directly compared to the 7700K/580.
The FCPX H264 comparisons included either (1) importing 4k H264 long GOP material and building ProRes proxies, or (2) exporting an H264 timeline to 4k H264. The material came from a variety of cameras, but mainly a Panasonic DVX200 and a Sony A7RII.
In the FCPX import and proxy creation case, the 2017 iMac 27 i7 was about 2x faster than the 2015 iMac 27 i7.
In the Premiere import and proxy creation case, the 2017 iMac 27 i7 was about 25% faster than the 2015 iMac 27 i7. Both improved, but FCPX improved vastly more.
In the 4k H264 export case, the improvement was smaller: 16% for FCPX and about 11% for Premiere, measured as reduction in export time. However, FCPX was 3.7x faster than Premiere, with far less heat and noise, on the exact same hardware.
It seems unlikely the improvement is GPU related per se. This is because a GPU by itself cannot significantly accelerate long GOP encode/decode. The core algorithm is sequential: frame 1 must be calculated before frame 2 can begin, etc. Only by doing the innermost part in hardware can it be greatly accelerated, which is what Quick Sync does.
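To illustrate the dependency described above, here is a toy sketch in Python. It is not H.264 and is far simpler than real motion-compensated prediction; the function name decode_gop and the sample numbers are made up for illustration. It only shows that each predicted frame is rebuilt from the previously decoded frame, so the frames within a GOP cannot simply be handed out to independent workers.

```python
# Toy model of inter-frame (long GOP) dependency. This is NOT H.264;
# it is an assumption-laden illustration of why each predicted frame
# needs the previously decoded frame before it can be reconstructed.

def decode_gop(keyframe, deltas):
    """Rebuild every frame from one keyframe plus per-frame differences."""
    frames = [list(keyframe)]
    for delta in deltas:
        prev = frames[-1]  # the previous frame must already be decoded
        frames.append([p + d for p, d in zip(prev, delta)])
    return frames

# Hypothetical 3-pixel "frames": one keyframe followed by three deltas.
keyframe = [10, 20, 30]
deltas = [[1, 0, -1], [2, 2, 2], [0, -5, 0]]

frames = decode_gop(keyframe, deltas)
print(frames[-1])  # [13, 17, 31]
```

In this sketch the loop body for frame N reads the result for frame N-1, which is the chain that makes the outer loop hard to spread across many GPU cores. Only the work inside each step can be sped up.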
This is evident from running similar FCPX tests on a 12-core Mac Pro with dual D700 GPUs, whose Xeon CPU has no Quick Sync. Despite the powerful CPU and GPUs, it is half the speed of a 2015 iMac 27 at exporting H264 or transcoding H264 to ProRes proxy. If a GPU could make this task faster, it would show here, but it does not.
Quick Sync is associated with but functionally separate from the GPU. It is entirely separate fixed-function logic. Unfortunately, due to shared resources (frame buffer, busses, etc.), it cannot exist without the on-chip GPU, but the work is done by Quick Sync, not by the GPU.
I don't know how Apple got such large performance increases out of FCPX on Kaby Lake for the import and proxy transcode task. It's the exact same code. Supposedly the Quick Sync improvements on Kaby Lake were for function (HEVC 10-bit), not performance.
The huge performance, heat and noise difference between apps that support Quick Sync and those that do not is visible when doing these tests in FCPX vs Premiere Pro CC on the exact same 2017 iMac 27 i7:
Import and create ProRes proxies for 10 H264 4k long GOP files, total running length 11 min 43 sec:
FCPX, 2015 iMac 27: 5 min 37 sec
FCPX, 2017 iMac 27: 2 min 40 sec
Premiere, 2015 iMac 27: 8 min 6 sec
Premiere, 2017 iMac 27: 6 min 27 sec
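
For anyone who wants to check the ratios, the proxy timings above can be turned into speedup figures with a few lines of Python. This is just arithmetic on the numbers listed; the helper names to_seconds and speedup are my own.

```python
# Speedup of the 2017 iMac over the 2015 iMac for the proxy test,
# computed from the timings listed above.

def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

proxy_times = {
    ("FCPX", 2015): to_seconds(5, 37),
    ("FCPX", 2017): to_seconds(2, 40),
    ("Premiere", 2015): to_seconds(8, 6),
    ("Premiere", 2017): to_seconds(6, 27),
}

def speedup(app):
    """Ratio of 2015 time to 2017 time; 2.0 means twice as fast."""
    return proxy_times[(app, 2015)] / proxy_times[(app, 2017)]

for app in ("FCPX", "Premiere"):
    print(f"{app}: {speedup(app):.2f}x")
# FCPX: 2.11x
# Premiere: 1.26x
```

That is where the "about 2x" and "about 25%" figures come from.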
Premiere is much slower in the above test even though it has an artificial advantage: it won't create full-HD ProRes proxies (only 1536 x 790), so its proxies are smaller. To bypass this we can do a straight export test of a 1 min 51 sec 4k clip to 20 Mbps single-pass H264 4k:
FCPX, 2015 iMac 27 i7: 1 min 21 sec; CPU levels: moderate, noise & heat: low
FCPX, 2017 iMac 27 i7: 1 min 8 sec; CPU levels: moderate, noise & heat: low
Premiere, 2015 iMac 27 i7: 4 min 43 sec; CPU levels: high, noise & heat: high
Premiere, 2017 iMac 27 i7: 4 min 11 sec; CPU levels: high, noise & heat: high
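
The same kind of check works for the export test. The sketch below computes the percentage reduction in export time between the two iMacs and the FCPX vs Premiere ratio on each machine, again using only the timings listed above; the names time_reduction_pct and fcpx_advantage are my own.

```python
# Export test: percent time saved on the 2017 iMac, and how many times
# faster FCPX is than Premiere on the same machine.

def to_seconds(minutes, seconds):
    return minutes * 60 + seconds

export_times = {
    ("FCPX", 2015): to_seconds(1, 21),
    ("FCPX", 2017): to_seconds(1, 8),
    ("Premiere", 2015): to_seconds(4, 43),
    ("Premiere", 2017): to_seconds(4, 11),
}

def time_reduction_pct(app):
    """Percent of export time saved going from the 2015 to the 2017 iMac."""
    old = export_times[(app, 2015)]
    new = export_times[(app, 2017)]
    return 100 * (old - new) / old

def fcpx_advantage(year):
    """Premiere export time divided by FCPX export time on one machine."""
    return export_times[("Premiere", year)] / export_times[("FCPX", year)]

for app in ("FCPX", "Premiere"):
    print(f"{app}: {time_reduction_pct(app):.0f}% less export time")
print(f"2017 iMac: FCPX is {fcpx_advantage(2017):.1f}x faster than Premiere")
# FCPX: 16% less export time
# Premiere: 11% less export time
# 2017 iMac: FCPX is 3.7x faster than Premiere
```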