I'm just not getting it. Like, 35% more efficient at the same power consumption means 35% better performance. Or it could mean reduced power with the same (or slightly better) performance. But... why would Apple choose to further reduce power consumption? Their power consumption is already ridiculously low, but they are lagging behind in performance. Just doesn't make much sense to me.
I think you are skipping over a bunch of likely constraints if all of the information is correct.
1. The core baseline design is being shared across N5P and N3. The P core complex is basically going to be the same P core complex, likely with the same memory bandwidth allocation from the internal bus (namely 100GB/s per complex). So if you crank the clocks higher than the memory system can deliver data from RAM, what are you really 'buying'? Except for goosing some "smaller than cache" tech porn benchmarks, a clock faster than memory can feed is only going to buy corner cases.
[ Since it is held to the same N5P constraints and minimizing design differences, there is little upside to taking the P core complex off the limiters on this iteration. But there could be more substantive differences in L2 cache size and/or in how long turbo mode can be held. If consumption is down 20%, how much longer can a turbo mode be held with one core in a four core complex that is fully 'lit up'? ]
2. Likely a fixed overall TDP budget (e.g., the Ultra needed a different cooling block than the Max in the Studio; the MBP 14 and 16 are limited on thermal dissipation). If you add two more power- and bandwidth-consuming cores to the lineup, then the P cores probably need to give something back from the total CPU thermal/consumption budget (more consumers of the budget, but the budget is the same size).
Similar issue with "middle" cores soaking up more bandwidth. If the new overall system bandwidth is 300GB/s, the 'extra' 100GB/s gets allocated to the "middle core" complex and additional GPU cores.
3. N3's main focus this iteration is smaller "medium large" dies, not max "hot rod" features. N3 is more expensive than N5P; way more. Decent chance there is a $/working-unit number that Apple has self-constrained on also.
The M1 Max is constrained on size to hit the InFO-LSI packaging limit of 1 reticle. The "Pro" class can afford some N5P bloat, but the "Max" class is at the edge in the first place. That would/could only be done on N3, to pull back from the limit and crank up yields per wafer. Once you have a working N3 Max die, doing an N3 Pro "chop" of that isn't a huge additional expense.
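To put point 1 in rough numbers, here is a minimal sketch using the standard roofline model (my choice of model, not anything Apple has published). Only the 100GB/s per-complex figure comes from the discussion above; the clock speeds, the FLOPs per cycle, and the FLOPs-per-byte intensities are made-up illustrative values.

```python
# Roofline-style sketch: throughput is capped by compute OR by memory traffic.
# Only the 100 GB/s per-complex figure comes from the post; all other
# numbers here are hypothetical and chosen purely for illustration.

COMPLEX_BANDWIDTH_GBS = 100.0  # GB/s per P-core complex (from the post)


def attainable_gflops(clock_ghz, flops_per_cycle, cores, flops_per_byte,
                      bandwidth_gbs=COMPLEX_BANDWIDTH_GBS):
    """Return the lower of the compute peak and the memory-fed ceiling."""
    compute_peak = clock_ghz * flops_per_cycle * cores  # GFLOP/s
    memory_cap = bandwidth_gbs * flops_per_byte         # GFLOP/s
    return min(compute_peak, memory_cap)


if __name__ == "__main__":
    for clock in (3.2, 3.6, 4.0):  # hypothetical clock speeds in GHz
        # Streaming workload: little reuse, so RAM bandwidth is the limit.
        streaming = attainable_gflops(clock, 16, 4, flops_per_byte=0.5)
        # "Smaller than cache" workload: heavy reuse, so the clock matters.
        in_cache = attainable_gflops(clock, 16, 4, flops_per_byte=64.0)
        print(f"{clock:.1f} GHz: streaming {streaming:.1f}, "
              f"in-cache {in_cache:.1f} GFLOP/s")
```

Under these assumptions the streaming case does not move at all as the clock goes up, while the cache-resident case scales with it. That is the "corner case" shape being described: higher clocks pay off only where the working set fits in cache.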
In the scope of #1, that 0.1 (10%) could be an aggregate single threaded mark, which sets reasonable expectations. There probably would be some tech porn benchmarks where the number was higher. Depends upon where Apple sets the "turbo" limiters. They may not be as far behind on single threaded drag racing as you are worried about. Just racing down a longer track than is typically used in the tech porn press.
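A back-of-the-envelope sketch of how the 35%, 10%, and 20% figures in this exchange could fit together. It assumes the efficiency claim is a flat performance-per-watt multiplier that holds at any operating point, which real voltage/frequency curves do not strictly obey. The function name is just for illustration.

```python
# Assumption: "35% more efficient" means perf-per-watt is multiplied by 1.35
# regardless of where on the voltage/frequency curve the chip is run.
# Real silicon is non-linear, so treat the results as rough guides only.


def relative_power(perf_gain, efficiency_gain):
    """Power draw relative to the old chip (1.0 = unchanged).

    perf_gain:       fractional performance increase, e.g. 0.10 for +10%
    efficiency_gain: fractional perf-per-watt increase, e.g. 0.35 for +35%
    """
    return (1.0 + perf_gain) / (1.0 + efficiency_gain)


if __name__ == "__main__":
    for perf_gain in (0.35, 0.10, 0.0):
        power = relative_power(perf_gain, 0.35)
        print(f"+{perf_gain:.0%} performance -> {power:.2f}x power")
```

With a 35% efficiency gain, taking only 10% more performance leaves power at about 0.81x, which is in the same ballpark as the roughly 20% reduction mentioned earlier. Taking the full 35% as performance leaves power unchanged.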
What definitely does make sense to me is that they might be working on two prosumer chips: one based on the A15 family (using N5P) and one based on A16 family (using N3).
A16 or A17? Something shipping in Fall '22 in the tens-of-millions unit volume range on N3 is doubtful.
A16 on a slightly at risk N4P would be a similar mode of just doing a limited "shrink the same thing" to get the wafer economics back. The A15 is a bloated die size for an iPhone chip. More performance on an average size iPhone die could have been the primary objective for the A16 (iPhone 14). N3 A17 (iPhone 15) is a Fall '23 issue. [ And the M2 Pro/Max etc. on N3 would have been the 'pipe cleaners' on the N3 process ahead of the A17. ]
If the work on the N3 chip progresses well, Apple might decide to skip the rest of the A15 chips entirely and go straight for A16 ones.
What is not here, in looking at already packaged chips, is what UltraFusion subsystem(s) could be in the loop. Let's say there was an M2 Pro with no UltraFusion (like the M1 generation), but also another M2 Pro class die with an UltraFusion connector (i.e., one that used some of the N3 space savings to add something extra). That could open up an "Ultra lite" class where you pair two N3 M2 Pros alongside two N3 M2 Maxes. Or, if it didn't have to be completely symmetrical: N3 Max + N3 Pro along with an N3 Max + N3 Max-Desktop. The first would give 40 CPU + 60 GPU and the second 40 CPU + 80 GPU, with probably a better value match for some folks looking to grow CPU count without having to pay top dollar for more GPU cores they may not need.
That would essentially amount to rebranding the original "M3 Pro/Max" as "M2 Pro/Max". Such a strategy would be interesting for at least two reasons: a) it would help them get back on their already massively delayed schedule and b) it would make prosumer chips more advanced than the consumer chips, generating positive PR for pros.
Not so sure. Apple could also easily be following something close to a "Tick/Tock" strategy, where they change the microarchitecture and make large fab process shrinks at two different times. Make it work. Make it smaller. Then go back to make it work. Control the complexity at each of these stages. M2 is a "control micro arch design" generation: just worry about fab complexity. M3 would be a "mastered fab issues, now go back to micro arch" generation.
One thing I find particularly interesting is that the iPhone 14 is claimed to continue using A15. This suggests one of two things: either there are massive issues with volume production/availability on TSMC N3 or Apple wants to free up some volume for more important high-perf chips.
The first has been true for years. N3 wasn't supposed to even start high volume manufacturing until 2H '22 at best. Given the more than several weeks of 'baking' time, that is simply too late for a Fall '22 iPhone. HVM for iPhone needs to start in April-to-early-June. N3 was roadmapped too late.
The N3 "massive" issues with using it while in 'at risk' mode are overblown. It was always a 'bad' fab tech for mid '22, even on the pre-pandemic roadmaps. At best it was a "Nov-December '22" tech. It has pragmatically blown past that.
Even N4P is likely a slide.
iPhone 14 Pro getting the A16 will cause Apple to sell more Pro than regular iPhones. Couple that with the price hikes coming for Pro while the 'regular' iPhone stays at a "bargain" $799, and the average selling price for iPhone 14s will go up. Fatter margins. Bigger stock price bump. It is also Apple goosing profits higher on a product in a maturing (lower growth) market.
N3 does make some aspects harder. It does about nothing for analog circuits (no shrinkage). So it does start to push a move to more disaggregated designs. The SRAM shrink is also smaller (which doesn't help with Apple's "larger than everyone else's caches" method of staying ahead on slower main memory RAM). It would be good to see what does/doesn't work so well here before fully jumping into a deep dive into N3 (or N3E).