No one should be surprised by this.
Apple is probably paying by the wafer for everything TSMC makes for them.
Of course they are going to use Apple Silicon and probably a stripped-down OS, call it AIOS or serverOS, to do what it needs.
After 4 years, Apple probably has a warehouse full of binned Apple Silicon parts that have too many unusable cores, or that can't hit the targeted frequency. Why spend $100k per server with Nvidia when they have their own goldmine?
Full transparency: none of us know what the yield rate is for Apple Silicon parts. They could have enough to start this off for the first 6 months (likely more); they may have enough for the first few years. There's no logical way we can know.
No logical way?
iPhones sell in volumes an order of magnitude higher than Macs. This 'Cloud Compute' rolled out to iPhones is likely going to swamp the number of Max chips made. The largest-selling Macs are the MBA and the 13/14" MBP with the plain Mn SoC in them. The Mn Pro is more affordable than the Mn Max, so it likely also substantively outsells the Max.
Max die volume is likely in the single-digit millions range. iPhone Pro volume is likely in the tens of millions range.
The bigger factor here is how many users 'opt in' to Apple Intelligence. It is not going to be 'on' by default.
If only 10% of folks opt in, then the iPhone number could drop into the single-digit millions range. But if it is 50%, then there are substantive capacity issues.
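To make that concrete, here's a rough back-of-envelope sketch in Python. Every figure in it (die volume, opt-in rate, users served per die) is an illustrative guess pulled from the ranges above, not a real Apple number:

```python
# Back-of-envelope sketch of the supply/demand mismatch described above.
# Every number here is an illustrative guess, not a real Apple figure.

max_dies_per_year = 5_000_000           # "single-digit millions" of Mn Max dies
iphone_pro_units_per_year = 50_000_000  # "tens of millions" of iPhone Pros

for opt_in_rate in (0.10, 0.50):
    active_users = iphone_pro_units_per_year * opt_in_rate
    # How many active users one binned Max die can serve is a pure guess;
    # it depends on load factor and how well requests aggregate.
    for users_per_die in (5, 20):
        dies_needed = active_users / users_per_die
        print(f"opt-in {opt_in_rate:.0%}, {users_per_die} users/die: "
              f"~{dies_needed / 1e6:.1f}M dies needed vs "
              f"~{max_dies_per_year / 1e6:.0f}M Max dies made per year")
```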
The M2 is on N5P, which at this point is several years old. The defects are not just going to land in specific CPU/GPU cores. A defect in the SSD controller will kill the die as something that can boot iBoot. A defect in a memory controller likely renders it ineffective as an "inference server" as well (inference is a bandwidth-intensive task).
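One way to see why defect location matters: assume, purely for illustration, that a random defect lands roughly uniformly across the die area, and that only defects inside CPU/GPU cores can be fused off and salvaged by binning. The area fractions below are invented; the real M-series floorplan split is not public.

```python
# Toy model of where a single random defect lands, assuming defects are
# roughly uniform across die area. Area fractions are invented for
# illustration; the real floorplan split is not public.

area_fraction = {
    "cpu_gpu_cores": 0.55,        # salvageable: fuse off the bad core (binning)
    "memory_controllers": 0.15,   # fatal for a bandwidth-hungry inference server
    "ssd_controller_io": 0.10,    # fatal: die can no longer boot iBoot from NAND
    "other_uncore": 0.20,         # assume also fatal for this use
}

salvageable = area_fraction["cpu_gpu_cores"]
print(f"~{salvageable:.0%} of single-defect dies could be binned into servers,")
print(f"~{1 - salvageable:.0%} are dead for this purpose despite 'working' cores")
```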
Apple also sells binned Max dies, so this is all not coming from the 'cannot be sold' defect pile. A larger number of 'out of the garbage can' servers isn't going to help. If you need twice as many server boards, that takes up twice as much datacenter space. It also eats up more electricity, and ditto for network switching costs. Dies that are mostly garbage do not necessarily lower long-term ongoing operating costs. (Early on, Google bought lots of stuff out of the discount bargain bins at Fry's. That really didn't help long term in delivering five-nines-like service stability.)
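A quick sketch of that 'more, weaker boards' overhead, with placeholder numbers (none of these are known Apple datacenter figures):

```python
# Sketch of the "more, weaker boards" overhead. Numbers are placeholders,
# not known Apple datacenter figures.

target_throughput = 1000   # arbitrary capacity target (normalized units)
watts_per_board = 150      # board power, roughly fixed regardless of bin
rack_units_per_board = 1

for name, throughput_per_board in (("full dies", 1.0), ("heavily binned dies", 0.5)):
    boards = target_throughput / throughput_per_board
    print(f"{name}: {boards:.0f} boards, "
          f"{boards * watts_per_board / 1000:.0f} kW, "
          f"{boards * rack_units_per_board:.0f}U of rack space to power and switch")
```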
What the typical load factor is and how well they can aggregate service is what will matter. If it is in a similar ballpark to competing hardware and Apple doesn't charge themselves the 'Apple Tax' for hardware, then yes, there is upside in using their own 'dogfood' here. It isn't 'free' hardware though. Minimally, the REST of the server board (ethernet, NAND chips, RAM, etc.) isn't going to be free at all. (Apple bought none of those wafers from TSMC.)
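One hypothetical way to frame that load-factor point: amortized hardware cost per query at a given average utilization. All of the inputs below are placeholder guesses; only the shape of the comparison matters.

```python
# Hypothetical framing of the load-factor point: amortized hardware cost per
# query at a given average utilization. All inputs are placeholder guesses.

def cost_per_million_queries(board_cost, lifetime_queries_at_full_load, load_factor):
    """Board cost spread over the queries it actually serves."""
    served = lifetime_queries_at_full_load * load_factor
    return board_cost / served * 1e6

# "Free" binned die, but RAM, NAND, ethernet, and the rest of the board are not.
apple_board = cost_per_million_queries(3_000, 1e9, load_factor=0.3)
# Bought-in accelerator: higher sticker price, but assume more work per board
# and better aggregation / utilization.
bought_board = cost_per_million_queries(30_000, 1e10, load_factor=0.6)

print(f"apple-silicon board: ${apple_board:.2f} per million queries")
print(f"bought-in board:     ${bought_board:.2f} per million queries")
```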
With that said though, it’s not like Apple HASN’T built SOCs before. It’s not like Apple has never taken a fully developed operating system and made it purpose built before.
Apple has not built a specific SoC for the entry iPad. Or the Apple TV. Or the iPhone SE. Or the iPad Air. Or the Mac Pro. The majority of Apple's lineup is all about reusing SoCs that led out on other products.
Apple has done this more times than Intel made schedule for fabricating 10nm processors and then some.
That is pretty much comparing "apples to oranges". The number of dies that Intel has rolled out on 10nm is likely bigger. Intel's product line is much broader. Intel has Xeon D and Xeon SP, a couple of desktop dies (Xeon E, Core i3, i5, i9), multiple laptop dies, Atom processors, Celeron/Pentium. (And, while it owned Altera, a 10nm FPGA product as well.) Intel sells an order of magnitude more stuff, to two orders of magnitude more system vendors, than the Apple Silicon group does.
Apple mutated A1_X dies into Mn dies to keep the die variant count down.
P.S. If you go back to post 9:
This paper is an overview of the privacy protections in Apple’s ML cloud. (Along the way you learn all the means someone could use to attack current generative products. Scary.) https://security.apple.com/blog/private-cloud-compute/
Apple basically says it is largely a trimmed-down iOS. The primary point is to run a larger variant of the same ML software that they are running locally. Same software, just in a bigger RAM/core container, but over a long-latency external link.
They aren't trying to do "mega LLM #42". All of that is a 'punt' to 3rd parties.