AI server makers are hoping to obtain orders from Apple ahead of its highly anticipated unveiling of new AI features later this year, according to...
www.macrumors.com
BUT... Apple needs to make its own servers sooner or later, especially for AI. Currently, only Nvidia has the AI strength on both the hardware and software side, and therefore...
Some of this is a hype train struggling to justify why Supermicro and Nvidia stock is hyper-inflated. Apple doesn't have to chase their stock hype.
Apple doesn't "need to". The major features of the Vision Pro are AI based and require zero connection to an Nvidia 'powered' server at all. The presumption here is that Apple is going to counter the 100+ B element GPT4 and Gemini/Bard with another 100+ Billion element large language model.
The primary thing Apple has to do is deploy something better than Siri, not some OpenAI-'killer' chatbot.
There is a very real chance that what Apple is going to deploy will run locally, not on servers. Several reasons.
i. They have already done that with Siri. Siri is relatively 'poor' primarily for reasons other than the size of the model. Siri gets confused, just does a Google search, and tells you to sort it out yourself: "here's what I found on the web".
ii. Deploying an 'acre' of Nvidia GPUs would suck up gobs of 'green power' that they don't have access to. Never mind that they don't have the data center enclosures for that anyway.
iii. Apple doesn't have to do as much data siloing for inference in its data centers. There is a decent chance the objective isn't to 'hoover up' the maximum number of inference queries and the associated data.
If Apple runs the AI on end users' electricity, then it has no impact on Apple's power-draw requirements for inference. Nor does it have much impact on capital equipment spend. Again, the end users fund all of the inference compute resources.
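A rough back-of-envelope sketch of that argument follows. Every number in it is an illustrative assumption I'm plugging in (device count, per-query GPU throughput, power draws), not anything Apple or Nvidia has published; the point is only where the energy bill and the capital spend land.

```python
# Back-of-envelope comparison: who pays for the inference energy?
# Every number below is an illustrative assumption, not a measured or reported figure.

devices = 1_000_000_000          # assumed active devices doing on-device inference
device_power_w = 5               # assumed extra SoC draw during inference, watts
minutes_per_day = 10             # assumed daily inference use per device

gpu_power_w = 700                # assumed draw of one data-center GPU, watts
queries_per_gpu_per_s = 20       # assumed queries a single GPU can serve per second
queries_per_device_per_day = 50  # assumed queries per device per day

# Energy the end users collectively pay for (kWh/day) on their own bills
device_kwh_per_day = devices * device_power_w * (minutes_per_day / 60) / 1000

# GPUs Apple would have to buy and power to serve the same queries centrally
total_queries_per_s = devices * queries_per_device_per_day / 86_400
gpus_needed = total_queries_per_s / queries_per_gpu_per_s
dc_kwh_per_day = gpus_needed * gpu_power_w * 24 / 1000

print(f"on-device energy, paid by users: {device_kwh_per_day:,.0f} kWh/day")
print(f"data-center GPUs Apple must buy and power: {gpus_needed:,.0f}")
print(f"data-center energy, paid by Apple: {dc_kwh_per_day:,.0f} kWh/day")
```

Whatever numbers you plug in, the on-device column lands on the users' power bills and the users' already-purchased silicon, while the server column is all Apple's capex and Apple's electricity.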
But, but, but Apple's chatbot won't kill GPT-4/Gemini/etc. So what? Are those bots going to be banned from Apple devices? Nope. So how do end users lose access? All of the 'bots' that are boat-anchored to hyper-expensive servers are going to have to try to pull revenue somehow from Apple devices. Apple doesn't have to duplicate everything that everyone else does.
"no decent AI" can run on a phone. Not really:
This guy is running some reasonably sized models on a Pi 5. Are the results better than GPT-4? No, only incrementally behind. Better than Siri in many cases? By a much bigger margin. The latter is the 'problem' that Apple needs to solve sooner.
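For a sense of what 'local' means in practice, here is a minimal sketch of running a small quantized model on commodity hardware via the llama-cpp-python bindings. The model file name, thread count, and prompt are my own illustrative choices, not what the Pi 5 demo above actually used.

```python
# Minimal sketch of local, on-device inference with a small quantized model.
# Assumes `pip install llama-cpp-python` and a quantized GGUF model file already
# downloaded; the file name below is illustrative, not a recommendation.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # ~4-5 GB of quantized weights
    n_ctx=2048,      # small context window to fit in limited RAM
    n_threads=4,     # a Pi 5 / phone-class SoC has a handful of CPU cores
)

out = llm(
    "Q: Set a timer for 20 minutes and tell me what that is in seconds.\nA:",
    max_tokens=64,
    stop=["Q:"],
)
print(out["choices"][0]["text"].strip())
```

Nothing in that path touches a server: the weights, the compute, and the electricity are all the end user's.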
It is about inference, not training. Apple will need to buy more training hardware, but it doesn't necessarily need to deploy symmetrically on the exact same equipment for inference.
[ Similar to the DSLR versus iPhone cameras. The "good enough" camera you have with you all the time has distinct advantages over the bigger, better camera you have to go get. ]
I suspect the 'hallucination' problem also gets more tractable when you are not trying to build a "does everything for everybody" AI. Build an expert portrait-photography touch-up tool versus an AI that creates photos from scratch. Grounding the AI in the context of a specific problem means it doesn't have to blend major concepts that don't belong together (or get confused trying to blend two conflicting narratives). For example, the advanced editing that Google Photos does on Pixel phones versus what Apple does. Which one is further behind?
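As a hedged illustration of that 'grounding' idea: one cheap way to scope a general model down to a single job is to pin it to a narrow system prompt and declare everything else out of scope. The prompt wording and the retouching task below are assumptions of mine (reusing the hypothetical local `llm` object from the earlier sketch), not a description of how Apple or Google actually do it.

```python
# Illustrative only: "ground" a general model by pinning it to one well-defined task.
SYSTEM = (
    "You are a portrait retouching assistant. You only suggest concrete edits "
    "(exposure, skin smoothing, crop) for the photo described. If asked for "
    "anything outside photo retouching, reply: 'Out of scope.'"
)

def retouch_advice(description: str) -> str:
    # Same local `llm` object as in the earlier llama-cpp-python sketch.
    prompt = f"{SYSTEM}\n\nPhoto: {description}\nEdits:"
    out = llm(prompt, max_tokens=96, stop=["\n\n"])
    return out["choices"][0]["text"].strip()

print(retouch_advice("backlit outdoor portrait, face underexposed, busy background"))
```

A narrowly framed job like that gives the model far fewer unrelated concepts to mash together, which is the whole point of the comparison above.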
Apple has to buy x86- and Nvidia-based servers, which is quite ironic. Apple has neither the hardware nor the software, so how can they even enter the AI market? At least a server is something they can justify making.
A long time ago Apple bought a Cray to support computer design work (while Seymour Cray bought a Mac to do some Cray work. LOL.). It doesn't matter. It may have changed since, but Apple was running its supply chain on SAP software on a large Sun box running Solaris.
Apple doesn't "has to" buy x86 to run AI cards.
" ... NVIDIA unveiled the NVIDIA Arm HPC Developer Kit to support scientific computing amid the growing need for energy-efficient supercomputers and data centers. It includes an Ampere® Altra® CPU, with 80 Arm Neoverse cores running up to 3.3GHz; dual NVIDIA A100 GPUs, each delivering 312 teraflops of FP16 deep learning performance, as well as two NVIDIA BlueField-2® DPUs, which accelerate networking, storage and security. ..."
GTC -- NVIDIA today announced a series of collaborations that combine NVIDIA GPUs and software with Arm®-based CPUs — extending the benefits of Arm’s...
nvidianews.nvidia.com
[ I think AMD is slacking a bit on Linux-on-Arm support for the MI300. Given that the MI300A has x86 cores embedded in the package, that isn't too surprising. ]
Microsoft is running a fair amount of the OpenAI workload on Cobalt 100 (Arm) and Maia (custom inference) hardware. No Nvidia, Supermicro boards, or x86 there at all.
Microsoft unveils two custom chips, new industry partnerships and a systems approach to Azure hardware optimized for internal and customer workloads
news.microsoft.com
This is also why Apple needs to keep making the Mac Pro workstation, not something like the 2023 Mac Pro. It is useful as a server, and Nvidia actually makes servers and workstations together. Apple really needs to make a Mac Pro with superior hardware again in order to start working on the software and its ecosystem; otherwise, the Mac will be too limited.
Where Apple is going to suffer is if "big inference" comes down to the workstation level. If something like "Sora on a workstation" takes off on local hardware, Apple would be caught with its pants down given its "all 3rd-party inference is 'evil'" stance. Reports are that the Blackwell B200 'card' is pushing Dell to deal with 1,000 W modules. If that is the track 'big inference' is on, then Apple probably made the right move.
The wild card is what "reasonably large" sized inference will be able to do in 1-3 years. I think Apple has missed the boat there, but it will take a few more years to bite them in the butt.