BTW it appears that the first round of A17 GB6 results were dramatically lower than they "should" have been.
Look at
or
I'm not sure what to make of this. Do some reviewers already have models but are still under embargo? Were the first reviewers running GB6 the moment they opened the phones (so that it was running while the phone was still doing initial setup and backup-restore stuff?)
Hell, maybe even this "iOS17.0" that they all claim to be running is not the same? Maybe they ship with an iOS17.0xxx and, as soon as you go through the usual updates, you get a 17.0yyy that, among other things, flips some chicken bits? I've no idea how leading-edge reviewer phones are!
Certainly the newer (~20% improvement) results don't look obviously faked. If there is any pattern, it's that stuff I would expect to depend more on NEON/FP gets more of a boost, so perhaps a 5th NEON unit?
The obvious next config (IMHO!) would be
10-wide decode
[connected by queue, not just direct coupling as in earlier designs]
8-wide rename
5x NEON
8x INT (including 3 branch units)
probably still 4x LD/ST but MAYBE switch from the current 2x LD, 1x ST, 1x ambidextrous to 2x LD, 2x ambidextrous?
The point of 10-wide decode feeding 8-wide rename is that
(a) decode is easy, rename is hard. Might as well do extra decode and store it in a queue as a buffer against cycles when there is less work to decode because of earlier branches.
(b) some decode results in no work for rename (e.g. NOPs and fusions), so a 10-wide decode results in something like an average of 8-wide clients for rename anyway
A recent design at HotChips (I can't remember which, maybe the newest ARM? maybe the Veyron) already has a similar wider-decode-than-rename arrangement, so it's not like it's my crazy idea; it's the logical next step given aggressive fusion.
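To put rough numbers on (a) and (b), here's a toy Python sketch of a decoder feeding a rename stage through a queue. Everything in it is my assumption, not anything known about the A17: the 20% "no rename work" fraction (NOPs plus fusions) is just what makes 10 x 0.8 = 8, and the 10% chance of a decode-starved cycle, the 32-entry queue, and the name `simulate` are all made up for illustration.

```python
import random

def simulate(decode_width=10, rename_width=8, drop_fraction=0.2,
             stall_prob=0.1, queue_cap=32, cycles=100_000, seed=0):
    """Toy model: return the average number of ops renamed per cycle.

    drop_fraction: assumed share of decoded instructions that need no
                   rename slot (NOPs, fused pairs). Hypothetical value.
    stall_prob:    assumed chance that decode delivers nothing in a cycle
                   (e.g. fetch redirected by a taken branch). Hypothetical.
    """
    rng = random.Random(seed)
    queued = 0    # ops sitting in the decode->rename queue
    renamed = 0   # total ops consumed by rename
    for _ in range(cycles):
        # Rename drains up to rename_width ops from the queue.
        take = min(rename_width, queued)
        queued -= take
        renamed += take
        # Decode refills the queue unless the front end is starved.
        if rng.random() >= stall_prob:
            produced = sum(1 for _ in range(decode_width)
                           if rng.random() >= drop_fraction)
            queued += min(produced, queue_cap - queued)
    return renamed / cycles

if __name__ == "__main__":
    print("8-wide decode :", simulate(decode_width=8))
    print("10-wide decode:", simulate(decode_width=10))
```

Under these made-up parameters the 8-wide decoder can supply at most about 8 x 0.8 x 0.9 = 5.76 ops per cycle to rename, while the 10-wide one can supply about 7.2, so the extra decode width keeps the 8-wide rename noticeably busier. The exact figures mean nothing; the direction of the effect is the point.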
There's probably also interesting work in other parts of the CPU (for example there's a weird "pattern" cache patent which makes no sense to me except as a fancy way of describing a zero-content cache, but a zero-content cache is in fact a nice addition!) And as I've said, there are multiple patents suggesting big changes in virtualization (probably? not relevant to phone -- though "work phone vs home office" profile?), but part of these changes is large pages, which will boost even some phone workloads. Of course, with any patent, who knows if it's implemented this year or next year?
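For anyone who hasn't met the idea, here's a minimal Python sketch of what I mean by a zero-content cache: a side structure that remembers which lines are entirely zero, so it only needs tags and no data storage. This is the generic concept as I understand it, not what the patent describes; the 64-byte line size, the FIFO eviction, and the class and method names are all my assumptions.

```python
LINE_BYTES = 64  # assumed cache line size

class ZeroContentCache:
    """Tracks addresses of all-zero lines; stores tags only, never data."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.zero_lines = {}  # dict keeps insertion order, used for FIFO eviction

    def fill(self, line_addr, data):
        """Called when a line is filled or written. Returns True if tracked here."""
        if any(data):
            # Line has non-zero bytes: it no longer belongs in this structure.
            self.zero_lines.pop(line_addr, None)
            return False
        if line_addr not in self.zero_lines and len(self.zero_lines) >= self.capacity:
            oldest = next(iter(self.zero_lines))
            del self.zero_lines[oldest]
        self.zero_lines[line_addr] = True
        return True

    def lookup(self, line_addr):
        """Return a line of zeros on a hit, or None so the normal cache handles it."""
        if line_addr in self.zero_lines:
            return bytes(LINE_BYTES)
        return None
```

The appeal is that a hit costs no data array access at all, and freshly zeroed memory is common enough that even a small tag-only structure can catch a useful share of accesses.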