I am not making any excuses. I am trying to understand why a chip with a 33% wider execution backend and 50% more branch units only achieves 3-5% higher IPC. Either Apple has royally botched something inside that CPU, or their IPC is already so high that expecting more substantial gains is unreasonable.
Also, AMD is hardly a good example. Zen 4 IPC is on par with the Apple A12, so of course it's easier for AMD to get notable IPC improvements. Hell, they got a decent IPC improvement simply from increasing the cache size. They are basically repeating steps Apple took years ago. At some point the bag of tricks is empty.
I am not sure what you are basing all these speculations on. Analysis (however limited) of the A17 microarchitecture is available. We know that this is Apple's first really new microarchitecture in many years, and we know that it's substantially wider than Firestorm and its iterations. It is extremely unlikely that Apple will have another massive microarchitecture update within a year. Maybe some minor tweaks that help extract more IPC from that wide core, sure, that cannot be discounted. But A17 is the basis for the next few years at least.
What's the difference? It's ultimately the same thing. On the fundamental level, a modern superscalar CPU core is a multi-device machine trying to concurrently chip away at a serial program in the most efficient way.
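To make that concrete, here is a toy sketch in Python of what "several units chipping away at a serial program" means for IPC. It is nothing like a real core: every instruction takes one cycle, the scheduling window is unlimited, and there are no caches or branch mispredictions. The names (`simulate_ipc`, `serial_chain`, `independent`, `random_deps`) and the 0.8 chain probability are made up for illustration, not taken from any analysis of Apple's cores. The only point is that extra issue width pays off only when the program has enough independent work to fill it.

```python
import random

def simulate_ipc(deps, width):
    """Toy out-of-order issue model (an assumption-laden sketch, not a real core).

    deps[i] lists the earlier instruction indices that instruction i waits on.
    Each cycle, up to `width` ready instructions issue, oldest first, and every
    instruction completes in exactly one cycle.
    """
    done = set()                      # instructions finished in earlier cycles
    pending = list(range(len(deps)))  # program order, oldest first
    cycles = 0
    while pending:
        cycles += 1
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            # Ready only if every producer finished in a previous cycle.
            if all(d in done for d in deps[i]):
                issued.append(i)
        done.update(issued)
        pending = [i for i in pending if i not in done]
    return len(deps) / cycles

def serial_chain(n):
    """Every instruction depends on the one before it: no parallelism at all."""
    return [[] if i == 0 else [i - 1] for i in range(n)]

def independent(n):
    """No dependencies: the ideal case for a wide core."""
    return [[] for _ in range(n)]

def random_deps(n, chain_prob, seed=0):
    """Each instruction depends on its predecessor with probability chain_prob."""
    rng = random.Random(seed)
    return [[i - 1] if i > 0 and rng.random() < chain_prob else []
            for i in range(n)]

if __name__ == "__main__":
    for name, prog in [("serial", serial_chain(48)),
                       ("independent", independent(48)),
                       ("mixed", random_deps(4800, chain_prob=0.8))]:
        print(name, [round(simulate_ipc(prog, w), 2) for w in (6, 8)])
```

In this model the fully serial program sits at an IPC of 1 no matter how wide the machine is, and the fully independent one scales exactly with width. Real code lives somewhere in between, which is one hedged way to read a small IPC gain from a much wider backend.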
Not to mention: at what point does it stop being worth it? Is it really worth pouring hundreds of millions or billions into R&D just to get 3% more IPC?
I'm running an M1 Max (24-core iGPU) right now and don't plan to upgrade for 7+ years if this rate of improvement continues.