How is it different from how others are doing it?
AI done locally on your device instead of shadily done in the cloud for "free" by some suspicious company with an opaque business model like Sam Altman's OpenAI (yes, I realize there is an option to use OpenAI for web searches in iOS 18).

Apple is basically the only tech company with a culture of privacy and user safety (remember they regularly refuse to cooperate with police requests to unlock their users' phones).*

Apple restricting these free AI services to only their premium-tier devices is brilliant marketing. I've never cared for the "Pro" line stuff, but they just made the iPhone Pro, Max, and iPad Pro worth the extra cost for the first time.

*(In fairness to Facebook/Meta, they also have a local AI called Meta Llama 3, which is somewhat open source and runs locally on your Linux machine.)
 
I don't think it's a space problem. The 15 and 15 Pro have mostly the same form factor. Also, the surface area of 6 GB vs. 8 GB chips can't be that different. My guess is that you could fit 16 GB in the same space. There could be a trade-off regarding power consumption, though.
Restricting AI to premium level products was strictly a marketing decision, not a technical one.
 
Restricting AI to premium level products was strictly a marketing decision, not a technical one.
Maybe. Another thing to consider is that Apple can't build out the infrastructure fast enough to serve all halfway-recent iPhones with cloud-based AI capabilities. Imagine hundreds of millions of devices suddenly hitting those data centers. They have to roll it out gradually.

One thing is for sure: compute is not the bottleneck on recent iPhones. If anything, it's RAM.
 
The whole point is the on-device processing.
Is it? That will give it speed (especially when you have poor reception), but in all my months using ChatGPT & Copilot, waiting 10 seconds for the server to provide the output is not exactly a chore. Apple have set up their own servers (running on Apple Silicon) for tasks that are too demanding for on-device processing, so the cloud resources are there.

Perhaps server access will come as Apple builds out its servers - there is apparently going to be a wait list for the server tasks, so perhaps Apple are anticipating heavy demand, even with the functionality limited to M1+/A17 Pro+ devices.
 
I know, hence why I'm stating they are doing Recall. What Recall does is create a semantic index of what is on your screen, with associated data points, into a vectorized database on your device. Then it uses a text/image encoder, etc., to parse for whatever you ask for in natural language - in simple terms. That is simply how it has to work if you want something to run fast enough in real time. This is entirely different from a simple file and system index. It is Recall.
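Roughly, the pipeline is: encode the text (and images) into vectors, store them in an index, then embed the query and find the nearest neighbors. A minimal Python sketch of that idea (my illustration, not Microsoft's actual code; the encoder model name and the sample screenshot texts are just placeholders):

```python
# Toy sketch of a Recall-style semantic index (illustrative only).
# Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any small text encoder

# Pretend these are OCR'd texts extracted from periodic screenshots
snapshots = [
    "Invoice #1042 from ACME Corp, due June 30",
    "Slack: Dana shared the Q3 roadmap slides",
    "Booking confirmation: flight to Lisbon, Aug 12",
]

# The "vectorized database": one normalized embedding per snapshot
index = encoder.encode(snapshots, normalize_embeddings=True)

def recall(query: str, top_k: int = 1):
    """Natural-language lookup via cosine similarity over the index."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = index @ q  # cosine similarity (vectors are unit length)
    best = np.argsort(scores)[::-1][:top_k]
    return [(snapshots[i], float(scores[i])) for i in best]

print(recall("when is my trip to Portugal?"))
# Finds the Lisbon booking without any keyword overlap - that is the
# difference from a plain file/system index.
```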


I know; I have played with some of their models on Hugging Face.

No, they have not shrunk a 7B+ parameter model to run on iPhone; they have created a specific 3B parameter model that is quantized to less than 3.4-bit to make it run on device. It is not magic, there are many tiny models out there: Phi-3 Mini, Gemma 2 2B, etc. I have played with Phi-3 Mini on my iPhone 13 Pro Max and it works.
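For intuition, this is what plain symmetric 4-bit quantization looks like - a toy numpy sketch of the general idea, not Apple's actual (fancier) scheme:

```python
# Toy illustration of 4-bit symmetric weight quantization (numpy only).
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)  # one fake weight row

# Signed 4-bit integers cover [-8, 7]; pick the scale so the largest
# weight magnitude lands on the edge of that range
scale = np.abs(w).max() / 7.0
q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)  # 16 levels per weight

w_hat = q.astype(np.float32) * scale  # dequantized weights used at inference

print("memory: 32 -> 4 bits per weight (8x smaller)")
print("mean abs reconstruction error:", float(np.abs(w - w_hat).mean()))
```

The point being: the parameter count stays at 3B, you just spend far fewer bits per parameter and accept a small reconstruction error.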

Gemini Nano models run on Android phones.


They are not doing H100-level GPU semantic and contextual modeling and understanding on your phone; I don't know what that even means. An H100 can do 3,958 TOPS and the iPhone 15 Pro can do 35 TOPS - a gap of roughly 113x in raw throughput. An H100 can run ChatGPT-4o and Gemini 1.5 class multimodal models with 1+ trillion parameters.

Read the flash paper. What was shipped is a 3B model. They have tested 7B and above and pioneered a novel way to load just enough of the model into RAM, augmented from flash. They got Falcon 7B running in under a gig of DRAM with 1/20th the latency. The groundwork is there for much, much bigger models on the phone, with latency in line with bigger hardware.
 
For the first time I’m not motivated to update to the new iOS when it’s released. I’m not against AI stuff, on the contrary… I would have found a few of the announced features useful, but I have an iPhone 15 and no intention of buying a new model any time soon. The non-AI stuff they added is not interesting to me.
 
For the first time I’m not motivated to update to the new iOS when it’s released. I’m not against AI stuff, on the contrary… I would have found a few of the announced features useful, but I have an iPhone 15 and no intention of buying a new model any time soon. The non-AI stuff they added is not interesting to me.

I’m in no rush to add a beta, but I have a 15 Pro Max so I’m not too concerned. Thing is, none of the so-called AI features announced look compelling to me. I don’t see myself using them. I don’t consider phone-call recording AI. I barely use Siri as it is, and without ChatGPT it didn’t look much smarter.

I think it’s more along the lines of giving a starving man a cracker. iPhone users will eat this up given how stale things have been for years with iOS: doing their emojis, customizing icons. The average iPhone user will be in heaven with that nonsense.
 
Also: the “on-device only or nothing” approach is pointless for at least two reasons:

1) Give me a choice: if my device is not powerful enough for AI on device, forward my request to an external service. You do have plenty of ways to anonymise my data, and you even claim stuff about Private Cloud - so it’s possible when you need it, but not when I want it?

2) Even new iPhones will need to use external services for some features, so the whole “on device” thing doesn’t stand.

Just in case it’s not clear: they could implement the same stuff for older devices using their “private cloud”. They just don’t want to, forcing people to buy new devices instead.
 
Read the flash paper. What was shipped is a 3B model. They have tested 7B and above and pioneered a novel way to load just enough of the model into RAM, augmented from flash. They got Falcon 7B running in under a gig of DRAM with 1/20th the latency. The groundwork is there for much, much bigger models on the phone, with latency in line with bigger hardware.
I've read that paper, and it is not a novel idea. It has nothing to do with shrinking a model; the model size is the same. What the paper describes is storing model parameters in flash, loading chunks on demand into DRAM, and reducing data transfer by reusing previously activated neurons and exploiting how flash storage works.
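The trick is easy to sketch at a toy level: keep the full matrix on flash and page in only the rows a sparsity predictor says will activate. A minimal numpy illustration (mine, not the paper's actual windowing / row-column bundling code; shrink the numbers before running, as this writes an ~800 MB file):

```python
# Toy sketch: weights live on flash, DRAM holds only the active slice.
import numpy as np

ROWS, COLS = 100_000, 4_096  # a big FFN weight matrix, ~800 MB in fp16

# One-time setup: write the weights to disk (stands in for the app bundle)
w = np.memmap("weights.bin", dtype=np.float16, mode="w+", shape=(ROWS, COLS))
w[:] = 0.01
w.flush()

# Inference time: memory-map the file; nothing is read into DRAM yet
w = np.memmap("weights.bin", dtype=np.float16, mode="r", shape=(ROWS, COLS))

# Suppose a predictor says only these neurons will fire for this token
active = np.array([17, 404, 9_021, 73_500])

# Only these rows get paged in from flash; the other ~99,996 stay on disk
chunk = np.asarray(w[active])      # a few KB of I/O instead of ~800 MB
x = np.ones(COLS, dtype=np.float16)
print((chunk @ x).shape)           # (4,) - computed from the active slice
```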

They have not shrunk the model, which was your claim:
Apple has published papers about the work they have done in shrinking 7B+ parameter models to run on device. They are doing H100 level GPU semantic and contextual modeling and understanding on your phone.
If this were at all feasible, they would run the 3B parameter model on the iPhone 14 Pro Max and lower, which have 6GB RAM. That is much more feasible. Instead, they have chosen to cut off devices with less than 8GB RAM. What I imagine they would do is fine-tune a much smaller model with 1B parameters to run on those devices, just like Google has done with Gemini Nano.

A phone is not running an H100-class model. That is impossible; it is barely running a 3B parameter model, which has to be quantized to less than 4-bit to be performant enough to run.
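The back-of-the-envelope memory math supports this (my numbers, assuming a ~3.5-bit average and ignoring the KV cache and runtime overhead):

```python
# Rough memory footprint of a 3B parameter model at different precisions.
params = 3e9

fp16 = params * 16 / 8 / 2**30   # unquantized half precision
q35  = params * 3.5 / 8 / 2**30  # ~3.5-bit quantized, roughly as shipped

print(f"fp16:    {fp16:.1f} GiB")  # ~5.6 GiB - hopeless on a 6GB phone
print(f"3.5-bit: {q35:.1f} GiB")   # ~1.2 GiB - fits, but the OS, apps and
                                   # KV cache still need the remaining RAM
```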
 
I tend to think it was when Apple canceled the car. I take these origin stories with a grain of salt.
I think cancelling the car was a result of the AI decision. Apple had to reprioritize resources and the car didn’t make the cut. Seems like a good choice.
 
I suspect that context memory will all be in the cloud. The local device will take local data, probably convert it into vectors, then feed the cloud-hosted GPT instance the data that it needs. The remote host will run a LangChain tool that requests the data from the small LLM on the local device. That data will go into context in one way or another.

You need context memory to be on the device doing the actual computation.
Neither Apple’s Private Cloud Compute nor the ChatGPT interface that Apple is using will allow any profiles to be stored in the cloud. The Apple servers have their ability to do any local storage disabled, and Apple has said that their contract with OpenAI prevents them from storing or using any history or profile information.
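For what it’s worth, if it did work the way speculated above, the split would look roughly like this. Everything here is hypothetical: post_to_private_cloud() and embed() are made-up stand-ins to keep the sketch self-contained, not real Apple APIs:

```python
# Hypothetical sketch of the device/cloud split speculated above.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Fake deterministic encoder, only to make the sketch runnable.
    A real system would use a small on-device neural encoder."""
    rng = np.random.default_rng(abs(hash(text)) % 2**32)
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def post_to_private_cloud(prompt: str) -> str:
    """Stand-in for the stateless cloud call; per the post above,
    the server keeps no profile or history."""
    return f"[cloud model answers using only: {prompt!r}]"

# Personal context stays on the phone, indexed as vectors
docs = ["Mom's flight lands 6pm Tuesday", "Dentist appointment moved to Friday"]
index = np.stack([embed(d) for d in docs])

def answer(question: str) -> str:
    # 1. Retrieve locally - the raw index never leaves the device.
    #    (With the fake encoder the match is arbitrary; a real encoder
    #    would surface the semantically closest snippet.)
    snippet = docs[int(np.argmax(index @ embed(question)))]
    # 2. Ship only the question plus the retrieved snippet to the cloud.
    return post_to_private_cloud(f"Context: {snippet}\nQ: {question}")

print(answer("When does Mom land?"))
```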
 
Restricting AI to premium level products was strictly a marketing decision, not a technical one.
The cutoff is devices with both 8GB of RAM and a neural processor: the M-series chips and the iPhone 15 Pro. It’s not just marketing. LLMs do need a lot of RAM, and I would expect newer Apple devices to have more.

We are told that Apple decided to pivot to AI two years ago. By that time the iPhone 14 was already in production and the design of the 15 was already locked in. That didn’t give them time to upgrade RAM. It’s fortunate that the 15 Pro did have 8GB, likely intended to help with image processing.
 
Is it? That will give it speed (especially when you have poor reception), but in all my months using ChatGPT & Copilot, waiting 10 seconds for the server to provide the output is not exactly a chore. Apple have set up their own servers (running on Apple Silicon) for tasks that are too demanding for on-device processing, so the cloud resources are there.

Perhaps server access will come as Apple builds out its servers - there is apparently going to be a wait list for the server tasks, so perhaps Apple are anticipating heavy demand, even with the functionality limited to M1+/A17 Pro+ devices.
It depends on what you are asking it to do. If you are basically using it as a smarter web search, then waiting a little for a result is not bad, but if you are using it as an interactive way to control your device and apps and to interact with your phone and your data, then the lag time of server-based AI would be painful and make people not want to use it.
 
As far as I know, they’re the only ones doing it mostly on device without sharing your personal data to subsidize the cost for “free” or for advertising purposes. Once again, with Apple you’re NOT the product.

They deserve a lot of credit for that.
Yeah, but it only works well if you connect to the off-device ChatGPT. Otherwise you merely have the Apple Sirilligence.
 
Apple “Intelligence” my butt. Apple is trying to pretend they did the hard, intellectual work when this is just a repackaged ChatGPT program. It’s the same shenanigans they pulled with Apple Silicon, where they tried to pretend their “designs” are the reason their chips performed well, when chip design is insignificant and about as hard intellectually as ordering pizza.

Stop taking credit for other companies’ inventions!
A bit harsh on Apple Silicon!

Nothing else in the personal computing space has such usable power with such long battery life. If it were so easy, then everyone else would have done it just as well. And it's not just about switching macOS to ARM, but doing it so seamlessly. Windows is falling over itself to play catch-up, and is still quite a way from getting Windows on ARM anywhere near as schmick. ARM chips for Windows are in the ballpark, but still not as good as the Apple ones.

NB: the M-series chips are actually beefed-up A-series iPhone/iPad chips, so it's not like they were doing this from scratch. But it is quite a hefty beef-up to go from the A-series to the M1 Ultra, with some not-insignificant feature additions.
 
Was rewatching this part of the presentation and thought it would’ve benefited greatly from being live on stage instead of a prerecorded presentation. The live audience really changes the way Apple writes its scripts. It creates a more personal and approachable tone, which I think really would’ve been welcome when talking about something a lot of people are wary of.

Just a thought, was wondering if others think similarly.
 
It’s going to be funny come September when users start having to grapple with two definitions of AI. Which version will ultimately win out? Will “artificial intelligence” end up having to rebrand itself thereafter? :p
 
It’s going to be funny come September when users start having to grapple with two definitions of AI. Which version will ultimately win out? Will “artificial intelligence” end up having to rebrand itself thereafter? :p

I think most people will just think of both as artificial intelligence, and Apple’s is just a loose nickname.
 
It’s going to be funny come September when users start having to grapple with two definitions of AI. Which version will ultimately win out? Will “artificial intelligence“ end up having to rebrand themselves thereafter? :p
Firstly, Apple Intelligence is an oxymoron. It has the same chance of replacing “artificial intelligence” as Liquid Retina had of replacing “LCD”.
 