Vadim said that M4 uses Arm v9 instructions, but did not say which version.
After reading some Arm documentation, I am under the impression that all Arm v9 extensions are optional. If that is true, is it far-fetched to say that M4 is an Arm v9 core if it is at least Armv8.5 compatible and uses a couple of optional Arm v9 extensions?
"New" features in an Arm architecture generation tend to start off optional in the early releases. There are going to be v9.1, 9.2, 9.4, ... 9.6, 9.7, 9.8, etc. There is usually a hard subset of those 'optional' features that goes 'required' by x.6, x.7, x.8. Arm lets implementors ease their way "into the pool" instead of giving folks lots of work (complexity) upfront that they have to deal with all at once. It gives folks room to order which features they work on first and to incrementally grow a more complex implementation.
The problem with hyping up Apple "v9" is that they are covering SME but not SVE2. There is a better than good chance that SVE2 goes 'required' toward the end of the v9 "go required" sequence. If Apple is "v9" in version 9.4 and then not "v9" in version 9.8, then the big hype train about them being "v9" is a bit overblown. If Apple skips SVE2, that would pragmatically pull them off being "v9" in the future.
The part that should give folks pause is that Arm describes SME as being a 'superset' of SVE2. Most of Arm's presentations about SME2 say that it is built "on top of" SVE2. However, SME has some loopholes built in.
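To make the "v9 now, not v9 later" scenario concrete, here is a minimal sketch. The feature names are real Arm extensions, but the "required from" minor versions are made up for illustration (SVE2 going required at 9.8 is my speculation, not something Arm has published), and `is_compliant` is a hypothetical helper.

```python
# Hypothetical model of "optional now, required later".
# The minor versions below are assumptions for illustration only.
REQUIRED_FROM = {
    "SME": None,  # assumed to stay optional throughout
    "SVE2": 8,    # assumption: becomes mandatory at v9.8
}


def is_compliant(implemented, minor_version):
    """Return True if every feature that is mandatory at v9.<minor_version>
    is present in the set of implemented features."""
    for feature, required_from in REQUIRED_FROM.items():
        if required_from is not None and minor_version >= required_from:
            if feature not in implemented:
                return False
    return True


# A core that implements SME but skips SVE2, as in the scenario above.
core_features = {"SME"}

print(is_compliant(core_features, 4))  # True: nothing is mandatory yet at v9.4
print(is_compliant(core_features, 8))  # False: SVE2 is assumed mandatory at v9.8
```

Under those assumptions the same core passes at 9.4 and fails at 9.8 without changing anything about the hardware.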
" ... SME – possible implementations
The spec leaves some key details for implementations to define
...
... SVE vs SSVE
By default, some SVE 2 instructions are not supported in the streaming mode ...
..."
[ Also, the MLIR AMX in the slide deck is Intel's AMX. Apple's AMX covers the same general issue (matrix math). ]
The Arm picture of possible implementations shows two options: SVE and SME both implemented inside the Arm core, or alternatively SVE inside the Arm core and SME as a co-processor. The latter looks like Apple's AMX co-processor (only with no SVE in the Arm core).
If what Apple has done is something akin to putting both SSVE and SME in the co-processor (e.g., layered on top of their AMX implementation), then long term it is questionable whether they are going in the same direction as Arm.
Apple already had Apple AMX. Is putting a compatibility interface on top of it really moving 'toward' v9, or just getting an inexpensive win in the intermediate term?
Arm v9.1-9.4 introduce a lot more than just SME2.
For example, say there are 10 new v9 features. If one implementor does 4 of 10 and another does 1 of 10, which implementor is more on the "v9" track?
Geekbench having some 'hook' for SME, and better scores falling out of it, shouldn't be the primary indicator of a complete Arm v9 implementation.
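The 4-of-10 versus 1-of-10 comparison can be sketched as a simple coverage ratio. The feature names here are placeholders, not real Arm feature identifiers, and `coverage` is a hypothetical helper; treating every feature as equally weighted is also a simplification.

```python
def coverage(implemented, new_features):
    """Fraction of the generation's new features that an implementation covers."""
    return len(set(implemented) & set(new_features)) / len(new_features)


# Ten placeholder features standing in for "10 new v9 features".
NEW_FEATURES = [f"feature_{i}" for i in range(1, 11)]

implementor_a = NEW_FEATURES[:4]  # does 4 of 10
implementor_b = NEW_FEATURES[:1]  # does 1 of 10

print(coverage(implementor_a, NEW_FEATURES))  # 0.4
print(coverage(implementor_b, NEW_FEATURES))  # 0.1
```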