I fail to understand where the "*extremely* notable performance improvements" should come from. They are just sizing down the structures. If you don't change the architecture and don't change the clock speed, it will only result in a smaller die that's potentially cheaper to produce and possibly more energy efficient.
Wrong. And historically, wrong for every single year of the last ten. (Edit: To be specific, wrong about "they are just sizing down the structures". Your "if...then" is correct but not relevant to the real world, where they do change the design every year. See below.)
Some years, you get a process shrink. Some years, you don't. This year, they're getting one, which will allow them to build more cores (probably only on the Pro/Max chips) or shrink the chip. It would also allow them to boost performance by raising clocks, though I don't expect them to do that (or not much) on the laptops. Instead they'll take the lower power, translating to better battery life or lighter batteries.
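To put a rough shape on that tradeoff, here's a minimal sketch using the standard first-order model for dynamic power, P ≈ C·V²·f. The scaling factors (0.8 for capacitance, 0.9 for voltage) are made-up illustrative numbers, not figures for any real process node, and the function name is just mine.

```python
def dynamic_power(capacitance, voltage, frequency):
    """First-order dynamic (switching) power: P ~ C * V^2 * f.

    All inputs are relative to the old process (1.0 = unchanged).
    """
    return capacitance * voltage ** 2 * frequency


# Old process as the baseline.
baseline = dynamic_power(1.0, 1.0, 1.0)

# Hypothetical shrink: less switched capacitance, slightly lower voltage,
# SAME clock. These factors are assumptions for illustration only.
shrunk_same_clock = dynamic_power(0.8, 0.9, 1.0)

print(f"relative power at the same clock: {shrunk_same_clock / baseline:.3f}")
```

That saving is what you can leave as battery life, or spend on clocks or extra cores. Keep in mind that higher clocks usually need higher voltage too, so power climbs faster than frequency does.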
The performance improvement comes from redesigning the chip, which (as I said) they do every single year. Most years the redesign has gotten them decent speed boosts. Some years it's been dramatic. Exceptionally, last year it was negligible, because they had to shelve their original designs at the (relatively) last moment due to TSMC, mostly reusing the M1 design. Most (but not all) of the M1->M2 boost comes from clocks, not design.
This year, they have taken the original A16/M2 designs, worked on them more, and finally implemented them. They should be getting roughly double the usual yearly architectural improvements, plus likely significant further improvements on the high end (the M1 was very much a learning experience), even before you factor in additional cores on the higher-end chips. So, *before* you factor in any boosted clocks or added cores, you're likely looking at +20-30% IPC for the CPU cores. GPU I'm less clear on but it's likely to be similar, maybe better.
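For anyone wondering how IPC, clocks and cores stack up, here's a minimal sketch of the usual back-of-the-envelope model: throughput scales roughly with IPC × clock × core count. The function name is hypothetical, the only inputs taken from this post are the +20-30% IPC range, and the core term assumes a perfectly parallel workload, which real software rarely is.

```python
def relative_throughput(ipc_gain, clock_gain=0.0, core_gain=0.0):
    """Speedup vs. the previous generation under a simple multiplicative model.

    Each gain is a fraction (0.25 means +25%). The core_gain term only
    applies to workloads that scale perfectly across cores (an assumption).
    """
    return (1 + ipc_gain) * (1 + clock_gain) * (1 + core_gain)


# IPC alone, same clock and same core count: the +20-30% range from above.
for ipc in (0.20, 0.30):
    print(f"IPC +{ipc:.0%}: x{relative_throughput(ipc):.2f} at the same clock")

# Hypothetical: any clock bump multiplies on top of the IPC gain.
print(f"IPC +25%, clock +10%: x{relative_throughput(0.25, 0.10):.3f}")
```

The point is that the gains multiply rather than add, so an architectural improvement still pays off in full even if clocks stay flat.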
But smaller structures don't automatically increase the performance.
The increase in energy efficiency means the thermal load is decreased, which in turn allows for higher clock speeds. And it also allows for more transistors on the same size of die. So you could add more cores. Or you could change the architecture (ALUs/branch prediction/registers/...) to a more powerful and complex one... But that's not easy, and it would require a serious redesign.
It's not easy, but they do it every year.