
Appletoni

Suspended
Mar 26, 2021
443
177
In case anyone is interested, I ran a fairly simple MNIST benchmark (proposed here: https://github.com/apple/tensorflow_macos/issues/25) on my recently acquired M1 Pro MBP (16-core GPU, 16GB RAM). I installed TensorFlow using the following guide (https://developer.apple.com/metal/tensorflow-plugin/).
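For anyone who wants to reproduce it, this is roughly the kind of script involved; a minimal sketch of the MNIST CNN from that issue (not the exact code), assuming tensorflow-macos and tensorflow-metal are installed per the guide above.

```python
# Minimal sketch of the MNIST CNN benchmark discussed in the linked issue
# (not the exact script). Assumes tensorflow-macos + tensorflow-metal are installed.
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Keras reports ms/step during training; that is the number quoted below.
model.fit(x_train, y_train, batch_size=128, epochs=5)
```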

For reference, this benchmark seems to run at around 24ms/step on M1 GPU.

On the M1 Pro, the benchmark runs at between 11 and 12 ms/step (twice the TFLOPS of the M1, and twice as fast).

The same benchmark gives 6 ms/step on an RTX-2080 (13.5 fp32 TFLOPS) and 8 ms/step on a GeForce GTX Titan X (6.7 fp32 TFLOPS). A similar level of performance should also be expected from the M1 Max GPU (which should run twice as fast as the M1 Pro).

Of course, this benchmark runs a fairly simple CNN model, but it already gives an idea. Also keep in mind that RTX-generation cards can run faster at fp16 precision; I am not sure whether the same applies to Apple Silicon.

I would be happy to run any other benchmark if suggested (or help someone to run the benchmark on a M1 Max chip), even if I am more of a PyTorch guy. ;-)

[edit] Makes me wonder whether I should have gone for the M1 Max chip... probably not.

Apple Silicon deep learning performance is terrible.

Take a look at the KataGo benchmarks and the LC0 benchmarks.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Apple Silicon deep learning performance is terrible
Honestly, deep learning training on Apple Silicon remains unreliable, but inference (CoreML) seems to be surprisingly good.

Results

YOLOv5 v6.1-25-gcaf7ad0, torch 1.11.0 CPU

YOLOv5s inference time (640x640 image, batch size 1):
PyTorch 1.11.0 CPU: 344 ms
CoreML 5.2.0: 27 ms
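For context, here is roughly how such a comparison can be run. This is only a sketch: it assumes a yolov5s.pt checkpoint and a Core ML model exported with the YOLOv5 export script, and the input name "image" comes from the default export (check your model's spec if it differs).

```python
# Rough timing sketch (not the exact benchmark above). Assumes:
#   - yolov5s.mlmodel was produced by: python export.py --weights yolov5s.pt --include coreml
#   - the Core ML input is named "image" (the YOLOv5 export default)
import time
import numpy as np
import torch
import coremltools as ct
from PIL import Image

img = Image.fromarray(np.zeros((640, 640, 3), dtype=np.uint8))  # dummy 640x640 frame

# PyTorch CPU path
pt_model = torch.hub.load("ultralytics/yolov5", "yolov5s", device="cpu")
t0 = time.perf_counter()
for _ in range(10):
    pt_model(img, size=640)
print(f"PyTorch CPU: {(time.perf_counter() - t0) / 10 * 1000:.1f} ms/image")

# Core ML path (dispatched to the ANE/GPU where possible)
ml_model = ct.models.MLModel("yolov5s.mlmodel")
t0 = time.perf_counter()
for _ in range(10):
    ml_model.predict({"image": img})
print(f"Core ML: {(time.perf_counter() - t0) / 10 * 1000:.1f} ms/image")
```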
 

buckwheet

macrumors 6502
Mar 30, 2014
460
509
I'm really hoping Apple announces some significant move forward with ML at WWDC. Things have been weirdly quiet from them on the ML software support front. The Mac Studio machines could be great for local ML work, if optimized for the job, and Swift has been differentiable for a while but not much gets said about it... (I mean, I realize it's already there for us to use, but it seems like the kind of thing Apple could wrap into some more dev-friendly, higher-level API, to me—e.g., tools for doing RL-like tasks with differentiable programming in pure Swift).

So there seem to be plenty of reasons for excitement, but nothing is really coalescing into useful tools or exciting announcements... weird... Maybe this year? I expected a lot more last year, but maybe they were waiting to have more of their own silicon out before committing heavily?

Of course, I do get that between Nvidia+CUDA and Google+TPU the market is pretty much cornered for enterprise ML stuff, but I do think there's still room for providing better support for end-users to train and test/develop on their local machines. Fingers crossed for WWDC to announce something worth getting excited about...
 
  • Like
Reactions: TiggrToo

TiggrToo

macrumors 601
Aug 24, 2017
4,205
8,838
I'm really hoping Apple announces some significant move forward with ML at WWDC. Things have been weirdly quiet from them on the ML software support front. The Mac Studio machines could be great for local ML work, if optimized for the job, and Swift has been differentiable for a while but not much gets said about it... (I mean, I realize it's already there for us to use, but it seems like the kind of thing Apple could wrap into some more dev-friendly, higher-level API, to me—e.g., tools for doing RL-like tasks with differentiable programming in pure Swift).

So there seem to be plenty of reasons for excitement, but nothing is really coalescing into useful tools or exciting announcements... weird... Maybe this year? I expected a lot more last year, but maybe they were waiting to have more of their own silicon out before committing heavily?

Of course, I do get that between Nvidia+CUDA and Google+TPU the market is pretty much cornered for enterprise ML stuff, but I do think there's still room for providing better support for end-users to train and test/develop on their local machines. Fingers crossed for WWDC to announce something worth getting excited about...

Just as an FYI: https://developer.apple.com/forums/thread/700083
 

jerryk

macrumors 604
Nov 3, 2011
7,421
4,208
SF Bay Area
I'm really hoping Apple announces some significant move forward with ML at WWDC. Things have been weirdly quiet from them on the ML software support front. The Mac Studio machines could be great for local ML work, if optimized for the job, and Swift has been differentiable for a while but not much gets said about it... (I mean, I realize it's already there for us to use, but it seems like the kind of thing Apple could wrap into some more dev-friendly, higher-level API, to me—e.g., tools for doing RL-like tasks with differentiable programming in pure Swift).

So there seem to be plenty of reasons for excitement, but nothing is really coalescing into useful tools or exciting announcements... weird... Maybe this year? I expected a lot more last year, but maybe they were waiting to have more of their own silicon out before committing heavily?

Of course, I do get that between Nvidia+CUDA and Google+TPU the market is pretty much cornered for enterprise ML stuff, but I do think there's still room for providing better support for end-users to train and test/develop on their local machines. Fingers crossed for WWDC to announce something worth getting excited about...
It would be nice if Apple played more in this space, but Nvidia+CUDA is so entrenched, with all the major frameworks (TensorFlow, PyTorch, etc.) supporting it. And those frameworks play well with the Nvidia 3070 in my deskside machine at a relatively low cost.
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Swift has been differentiable for a while but not much gets said about it
What could Apple do that Google hasn't tried? Although Julia and Swift are better equipped for ML and RL than Python, all the major libraries in ML and RL use Python, so I doubt anything will change soon.

I do get that between Nvidia+CUDA and Google+TPU the market is pretty much cornered for enterprise ML stuff, but I do think there's still room for providing better support for end-users to train and test/develop on their local machines.
Unless Apple offers a solution for personal computers and servers, who is going to learn a solution that is not scalable?
 
  • Like
Reactions: jerryk

buckwheet

macrumors 6502
Mar 30, 2014
460
509
What could Apple do that Google hasn't tried? Although Julia and Swift are better equipped for ML and RL than Python, all the major libraries in ML and RL use Python, so I doubt anything will change soon.


Unless Apple offers a solution for personal computers and servers, who is going to learn a solution that is not scalable?
Well, broadly speaking, the "what could Apple do that Google hasn't tried" philosophy is pretty much a non-starter for tech, so I'll let that question go as a non sequitur. I mean, presumably Google is still willing to try things that Google hasn't tried... But, in my understanding, differentiable programming brings online learning to the table, which offers a lot of potential—different applications, perhaps, but lots of potential.

On the second question: anything that can export a graph to ONNX (for onnxruntime) is pretty much scalable, no?
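A minimal sketch of what I mean, with a toy model and placeholder file names:

```python
# Minimal sketch: train wherever you like, then export the graph to ONNX so
# onnxruntime can serve it on basically any backend. Toy model, placeholder names.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
dummy = torch.randn(1, 128)

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)

# Inference via onnxruntime (CPU provider shown; swap in CUDA or CoreML providers as available)
import onnxruntime as ort
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
print(sess.run(None, {"input": dummy.numpy()})[0].shape)
```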
 

buckwheet

macrumors 6502
Mar 30, 2014
460
509
It would be nice if Apple played more in this space, but Nvidia+CUDA is so entrenched, with all the major frameworks (TensorFlow, PyTorch, etc.) supporting it. And those frameworks play well with the Nvidia 3070 in my deskside machine at a relatively low cost.
Yeah, I run a Linux box with a 2070 in it... can't really afford a 30x0 at the moment, and can rarely find one to buy anyway!

PS - In case anyone suspects I'm going to engage in some idiotic platform war here, I'm not. Nvidia+CUDA is obviously a no-brainer. I just think it makes sense for Apple to leverage the horsepower of their new machines for this purpose, and I'd love to have a machine that could tackle big music projects and train ML models the rest of the time.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
differentiable programming brings online learning to the table, which offers a lot of potential—different applications, perhaps, but lots of potential.
Google also thought Swift could be a great programming language for ML and created Swift for TensorFlow. They dropped it because everyone in the ML world uses Python.
 

buckwheet

macrumors 6502
Mar 30, 2014
460
509
Google also thought Swift could be a great programming language for ML and created Swift for TensorFlow. They dropped it because everyone in the ML world uses Python.
I'm very much aware of S4TF.
I've heard that Chris Lattner's underlying objective was to get first-class support for differentiable programming in Swift and that S4TF was a good way to do that. If that's true, then mission accomplished. If it isn't true, Swift got differentiable programming anyway. But this isn't about Python vs Swift, so I'm not sure why you bring it up. Apple could just as well announce a partnership with PyTorch at WWDC to give us up-to-the-minute Metal support for all new releases. That would keep us using Python, but it would still be super cool. And changing .device("cuda") to .device("metal") would be a pretty scalable way to work on a Mac Studio, no? That wouldn't offend any Nvidia+CUDA sensibilities, would it?
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
I seem to have misunderstood you. I thought you wanted Apple to promote Swift in the ML world when you wrote:
Apple could wrap into some more dev-friendly, higher-level API, to me—e.g., tools for doing RL-like tasks with differentiable programming in pure Swift

Apple could just as well announce a partnership with PyTorch at WWDC to give us up-to-the-minute Metal support for all new releases
That would be very cool!
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Apple could just as well announce a partnership with PyTorch at WWDC to give us up-to-the-minute Metal support for all new releases.
You may be right.
 
  • Like
Reactions: theorist9

buckwheet

macrumors 6502
Mar 30, 2014
460
509
You may be right.
Haha! Yup, I just downloaded and gave it a quick spin. My 16" M1 Pro MBP, base GPU, just beat my RTX-2070 on a simple MNIST test (3.2s vs 5.7s per epoch). Of course, the 2070 is a mid-level, previous gen GPU, but I'm still pleasantly surprised. I'd imagine I'll see different results on different tests, of course. Still, it means I now have another option for running jobs while the 2070 is busy. Super cool.

I'm looking forward to seeing some user benchmarks of M1 Ultras against higher-end cards like the 3090. It might make a Mac Studio look a bit more interesting if performance is good (as I mentioned elsewhere, it would be great to have a music production machine that could run ML jobs as well).
Of course, the 4000 series are just around the corner, and if they manage to keep prices in reason, and availability isn't a total fiasco, then the Mac Studio might lose some of its appeal again... heh...

PS — As I hoped, switching platforms is as simple as swapping "cuda" for "mps". Perfect.
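Something along these lines is all it takes; a minimal sketch, assuming a PyTorch build that ships the MPS backend:

```python
# Minimal sketch of the "cuda" -> "mps" swap (assumes a PyTorch build with the MPS backend).
import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

model = torch.nn.Linear(784, 10).to(device)   # stand-in for the real MNIST model
x = torch.randn(64, 784, device=device)
print(device, model(x).shape)
```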
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Could someone run some of the benchmark tests that Apple ran for the PyTorch blog post?
Tested with macOS Monterey 12.3, prerelease PyTorch 1.12, ResNet50 (batch size=128), HuggingFace BERT (batch size=64), and VGG16 (batch size=64)
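For reference, a rough sketch of what the ResNet50 part of that setup might look like (synthetic data, and my own guess at the loop rather than Apple's actual script):

```python
# Rough guess at a ResNet50 (batch size 128) training-step benchmark; synthetic data,
# not Apple's actual script. Timings are approximate without an explicit device sync.
import time
import torch
import torchvision

device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
model = torchvision.models.resnet50().to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

x = torch.randn(128, 3, 224, 224, device=device)
y = torch.randint(0, 1000, (128,), device=device)

for step in range(12):
    t0 = time.perf_counter()
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
    if step >= 2:  # skip warm-up steps
        print(f"step {step}: {(time.perf_counter() - t0) * 1000:.0f} ms/step")
```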
 

Boomhowler

macrumors 6502
Feb 23, 2008
324
19
Haha! Yup, I just downloaded and gave it a quick spin. My 16" M1 Pro MBP, base GPU, just beat my RTX-2070 on a simple MNIST test (3.2s vs 5.7s per epoch). Of course, the 2070 is a mid-level, previous gen GPU, but I'm still pleasantly surprised. I'd imagine I'll see different results on different tests, of course. Still, it means I now have another option for running jobs while the 2070 is busy. Super cool.

I'm looking forward to seeing some user benchmarks of M1 Ultras against higher-end cards like the 3090. It might make a Mac Studio look a bit more interesting if performance is good (as I mentioned elsewhere, it would be great to have a music production machine that could run ML jobs as well).
Of course, the 4000 series are just around the corner, and if they manage to keep prices in reason, and availability isn't a total fiasco, then the Mac Studio might lose some of its appeal again... heh...

PS — As I hoped, switching platforms is as simple as swapping "cuda" for "mps". Perfect.
Did you find the exact benchmarks that they used in the blog post, or did you create something yourself?
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
I've started benchmarking the M1 Max with PyTorch here: https://github.com/lucadiliello/pytorch-apple-silicon-benchmarks
Does increasing the memory size requirement of the project make the M series look better?

V100 only has 16GB of VRAM. But Apple Silicon can go up to 128GB of VRAM at a relatively cheap price. Perhaps this is where it can shine today?

For example, V100 16GB retailed for ~$10,000 when it first launched. An M1 Ultra with 64-core GPU and 128GB of RAM is $5800.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
For example, V100 16GB retailed for ~$10,000 when it first launched. An M1 Ultra with 64-core GPU and 128GB of RAM is $5800.
V100 is the previous generation of Nvidia GPUs. Wouldn't M1 Ultra vs. A100 be a fairer comparison?

V100 only has 16GB of VRAM. But Apple Silicon can go up to 128GB of VRAM at a relatively cheap price. Perhaps this is where it can shine today?
Can Apple's GPU be a serious alternative to Nvidia's GPU in deep learning without fp16 and bfloat16 support?

Does increasing the memory size requirement of the project make the M series look better?
Does PyTorch have a profiler like TensorFlow does?

It seems that the current version of PyTorch is faster than the new version on CPU. Does anyone have a good explanation for this?
 
Last edited:
  • Like
Reactions: jerryk

jerryk

macrumors 604
Nov 3, 2011
7,421
4,208
SF Bay Area
One of the ML researchers I follow has started posting some benchmarks...not bad!

Thanks for posting the link. Looks like Apple has a ways to go. The numbers were a bit shocking, especially compared to old GPU cards like the Nvidia 1080. That is 3 to 4 generations old.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Thanks for posting the link. Looks like Apple has a ways to go. The numbers were a bit shocking, especially compared to old GPU cards like the Nvidia 1080. That is 3 to 4 generations old.

That's a 250W desktop GPU with dedicated ML accelerator hardware vs. a 20W general-purpose laptop GPU. Perf per watt is comparable.

Once Apple releases more capable matrix coprocessors, the gap will shrink tremendously.
 