You're right, it wasn't perfect in the early implementations, but the tech press, whether they normally favour Apple or not, seems to be suggesting this big.LITTLE concept is some new, unheard-of innovation.
I'm intrigued to see how this works for Apple, and whether it's preferable to 2 or 4 identical cores. I always thought the Samsung and Qualcomm implementations were an acknowledgement that they couldn't build a scalable core suitable for both low and high power usage. Dedicating 2 cores to low power usage takes a significant amount of die space that could otherwise be used by high power cores.
There are 3 ways to improve performance: add more cores, scale up the current cores, or increase clock speed. I believe increasing clock speed is the least efficient, so you're left with the other two options. Any "active" quad core SoC, whether from Qualcomm or Samsung, is an admission that they weren't able to scale a dual core SoC.
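To put rough numbers on the clock speed point, here's a back-of-the-envelope sketch. It assumes the textbook CMOS dynamic power relation (power ≈ capacitance × voltage² × frequency) and, as a big simplification, that voltage has to rise in proportion to frequency. The function names and the normalised units are mine, not anything from Apple, Samsung or Qualcomm, and it ignores leakage, voltage floors and workloads that don't parallelise.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Textbook CMOS dynamic power: P = C * V^2 * f (normalised units)."""
    return capacitance * voltage ** 2 * frequency


def relative_power(freq_scale):
    """Power of one core relative to baseline when its clock is scaled.

    Assumption (simplification): voltage scales linearly with frequency,
    so both V and f are multiplied by freq_scale.
    """
    return dynamic_power(1.0, freq_scale, freq_scale)


# Two ways to aim for roughly 2x throughput on a perfectly parallel workload:
one_core_double_clock = relative_power(2.0)      # one core at 2x clock
two_cores_base_clock = 2 * relative_power(1.0)   # two cores at base clock

print(one_core_double_clock, two_cores_base_clock)
```

Under those assumptions, doubling the clock costs about 8x the power while a second core costs about 2x, which is the intuition behind calling clock speed the least efficient lever.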
Could Apple have used the extra die space to improve the two high power cores and stuck with dual core? I don't think so. I think the A10's two high power cores are the upper limit of what Apple was able to achieve, and they needed the extra 2 low power cores to get to a really low power state.
The iPhone 7 is up to 40% faster, has a 25% brighter screen, and gets 1-2 hours more battery life than the 6S. That's impressive.
I'm actually curious about the implementation logic. One of the things I liked most about the A9/A9X was that it rendered gif- and JavaScript-heavy Tumblr webpages as fast as my desktops and laptops did (finally). It would be quite annoying if the A10 used the slow cores for web browsing.
Referring to Apple's custom performance controller:
"...in real time makes sure the correct processes are running for maximum performance or maximum battery life." Phil Schiller
I have no doubt in my mind that it'll work well. Apple has created custom dedicated chips before, for example the TCON for the iMac and the motion co-processor for the iPhone. There's also a custom image processing chip in the iPhone 7,