My reasoning for the CPU core count not increasing was:
1) The A17 Pro didn't increase CPU core count, only GPU core count.
2) The higher clock frequency afforded by the N3 process would be enough to keep the M3's performance relative to the A17 Pro about the same as the M2's relative to the A15 and the M1's relative to the A14.
3) It would help manage higher fab costs and leave room for die growth in the M4, similar to how the die grew from M1 to M2.
Now, is it possible that the M3 Max could have 12P+4E? That would mean the M3 would have 6P+4E.
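To spell out the arithmetic behind that, here is a quick sketch. It assumes the Max keeps the base chip's E-core count and doubles its P-core count, which is how the M2 (4P+4E) relates to the M2 Max (8P+4E); nothing guarantees Apple keeps that ratio for the M3 family. The function name is just something I made up for illustration.

```python
def base_from_max(max_p: int, max_e: int) -> tuple[int, int]:
    """Infer a base chip's (P, E) core counts from a Max chip's counts.

    Assumption (not confirmed for M3): the Max has twice the P-cores of the
    base chip and the same number of E-cores, as with M2 -> M2 Max.
    """
    if max_p % 2 != 0:
        raise ValueError("P-core count must be even under the doubling assumption")
    return max_p // 2, max_e


print(base_from_max(8, 4))   # M2 Max 8P+4E -> (4, 4), matching the real M2
print(base_from_max(12, 4))  # hypothetical M3 Max 12P+4E -> (6, 4)
```

So under that assumption, a 12P+4E M3 Max only works if the base M3 goes from 4 to 6 P-cores.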
Here is a block diagram showing the relative die areas between the M1, M2 and the M2 if it was scaled down using the N3 process:
View attachment 2302643
Based on the diagram, 2 CPU P-cores and 2 GPU cores could be added if you allow the theoretical M3 die to have about the same width as the M2 die.
If the M3 Max had 4 additional P-cores, it would be in i9-13900KS/M2 Ultra territory in terms of multi-core performance, which feels a bit ridiculous considering the M2 Max is currently 20% slower than the M1 Ultra.
However, Intel's upcoming Meteor Lake will have up to 6 P-cores, so who knows, maybe the 12P+4E prediction will be correct.