I have been quietly sitting back and reading lots of really great opinions and viewpoints, and it’s great to hear all of the different perspectives here.
If I may, I’d like to add a viewpoint that I hope contributes to the friendly discussion that is taking place.
The conversation around ‘lack of optimized software’ for M1, I think, warrants further discussion, particularly on the question of what that actually means.
There is definitely room to grow for some of the larger apps like Adobe’s, AutoCAD, Maxon Cinema 4D, etc. However, there is also a wealth of apps today that have at the very least been ported from x86 (running under Rosetta) to native ARM64, and in the best cases have been fully optimized for Apple Silicon to use the accelerators and co-processors that only become available to you when you go through Apple’s API and compiler stack. I think that last level of optimization is important to differentiate: ARM64 optimized versus Apple Silicon optimized.
Of course, not all problems lend themselves to Apple’s co-processors and accelerators.
Yet when you consider all the background jobs that are taking place even when running menial tasks on Windows or macOS, farming instructions off to dedicated, fast, highly energy-efficient co-processors no doubt also has the side benefit of freeing up the ALU/FPU for other, more traditional workloads. Everybody wins!
A case in point is utilization of the AMX co-processor in Apple Silicon. Code that goes through Apple’s own native APIs ends up issuing instructions that target co-processors such as AMX, a matrix co-processor that I believe is still not publicly acknowledged by Apple (likely for ARM licensing reasons) but that nevertheless speeds up matrix workloads (as distinct from the Neural Engine, which Apple does advertise).
Here is a really lovely essay from Erik Engheim that presents this far better than I can regurgitate here!
Link to M1, Co-Processors and Accelerators discussion
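To make the ‘ARM64 optimized’ versus ‘Apple Silicon optimized’ distinction a bit more concrete, here is a rough C sketch of the same matrix multiply done two ways. The function names matmul_portable and matmul_platform are my own, purely for illustration. The first is plain loops, which run on the regular CPU cores whatever you compile them for. The second hands the work to cblas_sgemm from Apple’s Accelerate framework when built on macOS, which is the kind of API path that is reported (not documented by Apple, as far as I know) to reach AMX. On any other platform it simply falls back to the loops.

```c
#if defined(__APPLE__)
#include <Accelerate/Accelerate.h> /* Apple's framework; provides cblas_sgemm */
#endif

/* Portable path: C = A * B with plain loops, row-major storage.
   A is m x k, B is k x n, C is m x n.
   Compiled for ARM64 this is "native", but it only uses the CPU cores. */
void matmul_portable(const float *a, const float *b, float *c,
                     int m, int n, int k) {
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (int p = 0; p < k; ++p) {
                sum += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

/* Platform path: on macOS, let Accelerate's BLAS decide how to execute the
   multiply (assumption: this is where any co-processor use would happen).
   Elsewhere, fall back to the portable loops so the code still builds. */
void matmul_platform(const float *a, const float *b, float *c,
                     int m, int n, int k) {
#if defined(__APPLE__)
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0f, a, k,  /* alpha, A, leading dimension of A */
                b, n,        /* B, leading dimension of B */
                0.0f, c, n); /* beta, C, leading dimension of C */
#else
    matmul_portable(a, b, c, m, n, k);
#endif
}
```

On a Mac this needs to be linked with -framework Accelerate. Both functions should give the same answer; the difference is only in who decides which hardware does the work, which is really the point I am trying to make about the two levels of optimization.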
Apple’s SoC architecture and overall ‘own the entire widget’ approach to design uniquely positions them to take a path towards long-term, scalable performance beyond the traditional ‘throw more wattage, increase clock speed, use longer pipelines, shrink to a smaller die, throw more cores’ approach.
Intel, on the other hand (at least from a business perspective), doesn’t have the same ‘ease’ with which a similar SoC approach could be taken towards performance, because adding co-processors and accelerators to your silicon design means that you also need tighter integration and industry alignment (this one in particular should not be underestimated from a business perspective). Because Intel and AMD need to partner with large vendors such as Microsoft, they need to ensure partnership, agreement and alignment with their silicon vision so that the entire ecosystem (dev tools, operating system, right down to the silicon) is aware of these co-processors and accelerators and can take advantage of them. This takes time. Getting alignment in a single organization is challenging; doing so across companies is incredibly difficult. In that sense, one could argue that this is more a business ‘people’ problem than a technical problem.
Apple can not only drive more efficient designs (and more powerful ones per watt) from their approach, but can also unilaterally dictate the rollout timeframes for solutions (at least for native 1st and 3rd party software). They still need to convince 3rd party developers to develop for what is effectively a niche Mac platform as judged by market share. However, at least today when a developer builds for Mac, they are also in a position to port to iPad or iPhone, where Apple commands a sizeable market share and in turn a larger revenue stream worth pursuing.
That being said, Apple is making big efforts to contribute to open source projects in order to drive Apple Silicon optimization where possible.
Regarding Cinebench and Maxon’s Cinema 4D, I fully expect to see significant performance improvements as and when Maxon optimizes more for the Apple Silicon stack.
These numbers that we are seeing today are, IMHO, a worst-case scenario for M1, M1 Pro and M1 Max, and yet we are comparing a laptop chip (very favourably on raw performance) with the absolute latest and greatest desktop/workstation-class Core i9 chip.
Finally, I’m not sure if anybody checked out Apple’s videos on ray tracing acceleration during WWDC this year, but there is some nice documentation on how to accelerate ray tracing on Apple Silicon:
https://developer.apple.com/documentation/metal/accelerating_ray_tracing_using_metal/
Obviously the level of precision may not be sufficient for some of the fine folks here, in which case a fallback to more traditional CPU core execution would be required. Nevertheless, Apple had a lovely demo during WWDC on acceleration of ray tracing and how to optimize for a TBDR versus an immediate-mode renderer. Again, I expect to see optimization improvements in 3rd party renderers over time.
Thanks for humouring a long diatribe! Hope everybody is having a really great Sunday and enjoying their MacBooks and Alder Lakes.