> Also, bfloat16 support on the GPU! I doubt it comes with improved performance though...

Good catch! From the Metal Shading Language Specification:

View attachment 2213746

Is this another case like the ray-tracing API where Apple has built software support before hardware support?
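Worth spelling out what bfloat16 actually buys you: it is essentially the top 16 bits of an IEEE-754 float32, keeping the full 8-bit exponent (and therefore float32's dynamic range) while cutting the mantissa to 7 bits. A minimal Python sketch of the idea (truncation shown for clarity; real hardware conversion typically rounds to nearest even rather than truncating):

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    # Reinterpret the float32 bit pattern and keep only the top 16 bits
    # (sign + 8 exponent bits + 7 mantissa bits).
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(b: int) -> float:
    # Pad the dropped low 16 mantissa bits with zeros to recover a float32.
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x

# 3.140625 needs only 7 mantissa bits, so it survives the round trip exactly.
assert bf16_bits_to_f32(f32_to_bf16_bits(3.140625)) == 3.140625
# Unlike float16, the exponent field is untouched, so large values stay finite.
assert bf16_bits_to_f32(f32_to_bf16_bits(1e30)) != float("inf")
```

This is why bfloat16 is popular for ML workloads: you halve memory traffic versus float32 without the overflow headaches of float16, at the cost of precision.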
> Post WWDC, Apple execs (and influencers close to Apple) are saying that Apple Silicon isn't in the AI training game. Go do it in the cloud. Which, I think, is consistent with thoughts on this thread.
>
> However, wrt LLMs, what about needs for inference, fine tuning, or even extending models with plugins - like the retrieval plugin? Is Apple ceding those 'non-cloud' tasks to be best performed on workstations from other vendors?
>
> I ask this having not watched any of this year's WWDC content.

It makes a ton of sense for Apple to cede the training market. Apple has no advantage there. Nvidia has solutions connecting thousands of CPUs and GPUs together. Apple can't compete.
| Device | `--compute-unit` | `--attention-implementation` | End-to-End Latency (s) | Diffusion Speed (iter/s) |
| --- | --- | --- | --- | --- |
| iPhone 12 Mini | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 20 | 1.3 |
| iPhone 12 Pro Max | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 17 | 1.4 |
| iPhone 13 | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 15 | 1.7 |
| iPhone 13 Pro Max | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 12 | 1.8 |
| iPhone 14 | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 13 | 1.8 |
| iPhone 14 Pro Max | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 9 | 2.3 |
| iPad Pro (M1) | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 11 | 2.1 |
| iPad Pro (M2) | `CPU_AND_NE` | `SPLIT_EINSUM_V2` | 8 | 2.9 |
| Mac Studio (M1 Ultra) | `CPU_AND_GPU` | `ORIGINAL` | 4 | 6.3 |
| Mac Studio (M2 Ultra) | `CPU_AND_GPU` | `ORIGINAL` | 3 | 7.6 |
> Post WWDC, Apple execs (and influencers close to Apple) are saying that Apple Silicon isn't in the AI training game. Go do it in the cloud. Which, I think, is consistent with thoughts on this thread.
>
> However, wrt LLMs, what about needs for inference, fine tuning, or even extending models with plugins - like the retrieval plugin? Is Apple ceding those 'non-cloud' tasks to be best performed on workstations from other vendors?
>
> I ask this having not watched any of this year's WWDC content.

I think Apple means exactly what they said – they're not in the "starting from scratch, using 1000 GPUs" training game. That does not mean they're not interested in the examples you gave, like fine tuning.
> I think Apple means exactly what they said – they're not in the "starting from scratch, using 1000 GPUs" training game. That does not mean they're not interested in the examples you gave, like fine tuning.

Indeed, it's too early to say what will happen in this field. We'll see whether multiple powerful GPUs remain the main way to train, or whether another approach is found. I also believe Apple hasn't given up on the training game; it was briefly mentioned in the WWDC Keynote. For example:

- They're using an LLM for the keyboard (and various other things). This will presumably be fine-tuned as you type to match your particular language usage.
- They're offering personalized synthetic voices. Right now these are low-ish quality and intended for people who have difficulty speaking, but at some point this will probably change.

Basically, use common sense! If a task is being done on a rack of H100s, it's not a task Apple thinks should (for now...) be done on a Mac. Otherwise...