I wonder if they're eventually going to do that? What you say makes a lot of sense - although the problem is exacerbated this year by how fast the M3 Max is. It benefited from BOTH a die shrink and a 50% increase in P-cores. In total, it's about 150% as fast as last year's M2 Max, which is the typical gap between a Max and an Ultra of the same year. I'd imagine this double whammy, where the Max actually catches the previous year's Ultra (except in GPU performance), will be rather uncommon - maybe one year in four or five?
In a more typical year, when the overall performance upgrade on the Max is more like 10-15%, the previous year's Ultra will still be noticeably ahead, and even the Ultra from two years ago (assuming that neither of the two previous years was a huge year for the Max) will still be slightly faster.
Using Geekbench 6 multicore scores, simply because they're easy to search (I know the benchmark isn't perfect):
M1 Max =~12000
M2 Max =~14500
"M3 Max" expected value (based on M1 Max and M2 Max) =~17000
M3 Max (the four extra P-cores make a big difference) =~21100
M1 Ultra =~18500
M2 Ultra =~21200
M3 Ultra expected value =~31650??? (150% of M3 Max - historically, the Ultra is very close to 150% as fast as a Max on GB6)
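
For anyone who wants to check the arithmetic behind those numbers, here's a rough Python sketch. It only uses the approximate scores listed above. The straight-line extrapolation (each "typical" generation adds the same number of points as M1 Max to M2 Max did) is my assumption about how the expected values were reached, and the "expected M4 Max" figure is a hypothetical projection, not a real score.

```python
# Approximate Geekbench 6 multicore scores, copied from the list above.
scores = {
    "M1 Max": 12000,
    "M2 Max": 14500,
    "M3 Max": 21100,
    "M1 Ultra": 18500,
    "M2 Ultra": 21200,
}

# Assumption: a "typical" generation adds the same number of points as
# the M1 Max -> M2 Max step (simple linear extrapolation).
typical_step = scores["M2 Max"] - scores["M1 Max"]
expected_m3_max = scores["M2 Max"] + typical_step
expected_m4_max = expected_m3_max + typical_step  # hypothetical projection

# Ultra-to-Max ratio within each generation that has both chips.
ultra_ratio = {
    gen: scores[f"{gen} Ultra"] / scores[f"{gen} Max"] for gen in ("M1", "M2")
}

# Actual M2 Max -> M3 Max jump, and an Ultra projected at 150% of the Max.
m3_jump = scores["M3 Max"] / scores["M2 Max"]
projected_m3_ultra = 1.5 * scores["M3 Max"]

print(f"Expected M3 Max (typical year): {expected_m3_max}")
print(f"Expected M4 Max (typical year): {expected_m4_max}")
print(f"Actual M3 Max vs M2 Max: {m3_jump:.0%}")
for gen, r in ultra_ratio.items():
    print(f"{gen} Ultra vs {gen} Max: {r:.0%}")
print(f"Projected M3 Ultra: {projected_m3_ultra:.0f}")
```

Under that assumption the M1 Ultra's score lands between the expected M3 Max and the expected M4 Max, while the real M3 Max sits almost level with the M2 Ultra.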
Were it not for the huge performance jump from M2 Max to M3 Max, the M1 Ultra would fall in between the expected speeds of an M3 Max and an M4 Max. A 50% jump in P-cores is a VERY significant architectural update, especially coupled with a die shrink, and that temporarily upset the relationship between the Max and the Ultra (which will be restored when the M3 Ultra comes out).
Of course, Apple playing their cards close to the vest doesn't help. Prior to the actual release of the M3 Max, no rumor had suggested that it might have extra P-cores and be unexpectedly faster than it "should be". The M3 almost exactly matched expectations, the M3 Pro slightly underperformed expectations (nobody suggested that it would trade in two P-cores for E-cores), and the M3 Max outperformed expectations. As little as two weeks prior to the introduction, nobody expected the M3 Max to arrive at all before next March.
Another possibility that would help is if they get an "Extreme" chip into the Mac Pro. Assuming that it's a straight-up quadruple Max or double Ultra, that should be around 250% the speed of a same-year Max (with GPU performance well over 300%, since GPU cores scale better). It should then be faster than any lesser chip for a number of years, barring a complete architecture shift.
Even WITH a complete architecture shift (and a very successful one) in the mix, 250-300% is a lot. The M1 Max MacBook Pro is about 180% the speed of the final (Late 2019) Intel MacBook Pro, the M2 Max is around 210%, and the M3 Max is about 315%. Desktops are similar - the M1 Ultra Mac Studio is about 180% the speed of the final Intel Mac Pro, the M2 Ultra is around 210%, and I'd expect the M3 Ultra to tuck right in at 315%.
Without an architecture shift, we don't have enough data for Apple Silicon to say how long it will take to get that kind of performance improvement, but for Intel Macs, it was roughly seven years - the final Intel MacBook Pro (Late 2019 16") is about 270% the speed of a Mid 2012 MacBook Pro (the first Retina model). There was an "M3-type transition" in there too - the move from four to six, and finally eight cores. Five to seven years for a newer "Max" to catch an older "Extreme" is a pretty safe guess, with a minimum of four years in the case of a highly successful architecture shift.
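
As a sanity check on that timescale, here's a back-of-the-envelope Python sketch. It assumes performance compounds at a steady yearly rate, which is my simplification (real gains are lumpy), and the function names are made up for illustration. It turns the Intel-era figure (about 270% over roughly seven years) into an average annual gain, then asks how long that pace would take to reach the 250% assumed for an "Extreme".

```python
import math


def annual_rate(total_ratio: float, years: float) -> float:
    """Average yearly gain that compounds to total_ratio over the given years."""
    return total_ratio ** (1 / years) - 1


def years_to_reach(target_ratio: float, rate: float) -> float:
    """Years of steady compounding at `rate` needed to reach target_ratio."""
    return math.log(target_ratio) / math.log(1 + rate)


# Figures from the post: ~270% over ~7 years (Mid 2012 -> Late 2019),
# and an "Extreme" at ~250% of a same-year Max.
intel_rate = annual_rate(2.70, 7)
years_at_intel_pace = years_to_reach(2.50, intel_rate)

print(f"Average Intel-era gain: {intel_rate:.1%} per year")
print(f"Years for a Max to reach 250% at that pace: {years_at_intel_pace:.1f}")
```

That works out to roughly 15% a year and about six and a half years, which sits inside the five-to-seven-year guess as long as the steady-growth assumption holds.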