
Boomhowler

macrumors 6502
Feb 23, 2008
324
19
Hey Luca, nice work! However, when launching the benchmarking script on my M1 Max I am running into the issue described here (which I was indeed able to replicate): https://github.com/pytorch/pytorch/issues/78001.

Using exactly the same setup as in the repo. Did anyone else run into this?
I'm getting this error (which seems to be the same thing) regardless of sequence length. Running this on an M1 Max with 64 GB.

Code:
MPSNDArray.mm:782: failed assertion `[MPSNDArray, initWithBuffer:descriptor:] 
Error: buffer is not large enough. Must be 32768 bytes
 

Sterkenburg

macrumors 6502a
Oct 27, 2016
555
551
Japan
I'm getting this error (which seems to be the same thing) regardless of sequence length. Running this on an M1 Max with 64 GB.

Code:
MPSNDArray.mm:782: failed assertion `[MPSNDArray, initWithBuffer:descriptor:]
Error: buffer is not large enough. Must be 32768 bytes
Yeah, sounds very similar, I used the same machine. Happens regardless of hyperparameter settings.
 

chengengaun

Contributor
Feb 7, 2012
371
854
I'm getting this error (which seems to be the same thing) regardless of sequence length. Running this on an M1 Max with 64 GB.

Code:
MPSNDArray.mm:782: failed assertion `[MPSNDArray, initWithBuffer:descriptor:]
Error: buffer is not large enough. Must be 32768 bytes
Seems like a bug:


While I ran into this one:

 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,529
955
Apple will have a session on PyTorch and TensorFlow on Friday. 🤩

ML-metal.png
 

buckwheet

macrumors 6502
Mar 30, 2014
454
499
Thanks for posting the link. Looks like Apple has a ways to go. The numbers were a bit shocking, especially compared to old GPU cards like the Nvidia 1080. That is 3 to 4 generations old.
Yeah, CUDA is Nvidia's secret sauce, but according to the Platforms State of the Union video, Metal 3 will accelerate PyTorch MPS significantly. Looking forward to seeing the talk on Friday.
 

dgdosen

macrumors 68030
Dec 13, 2003
2,772
1,409
Seattle
How is it possible that Apple has not explained this before? Apple needs to learn to write changelogs. ;)

View attachment 2017301

Distributed training is very cool!


Does it matter if one is using macOS Monterey (or a "pre-Ventura OS" and Metal V2(?)) vs macOS Ventura and Metal V3?
Will these TensorFlow or PyTorch plugins work across different versions of Metal? Or is that all hidden behind the API surface of MPS Graph?

I'm assuming WWDC 22 is all Ventura and Metal 3 on Apple Silicon.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,529
955
Does it matter if one is using macOS Monterey (or a "pre-Ventura OS" and Metal V2(?)) vs macOS Ventura and Metal V3?
The minimum requirement is macOS 12.0 for TensorFlow and macOS 12.3 for PyTorch.
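As a rough illustration of those minimums, here is a small Python sketch that checks a macOS version string against them. The names `MIN_MACOS`, `parse_version` and `meets_minimum` are made up for this example, and the only facts it encodes are the 12.0 and 12.3 figures above; it says nothing about which ops are actually supported on the GPU.

```python
import platform

# Minimum macOS versions as quoted in the post above (assumed still current).
MIN_MACOS = {"tensorflow": (12, 0), "pytorch": (12, 3)}


def parse_version(text):
    """Turn '12.3.1' into (12, 3, 1); pad to at least (major, minor)."""
    parts = tuple(int(p) for p in text.split(".") if p.isdigit())
    return parts + (0,) * (2 - len(parts))


def meets_minimum(macos_version, framework):
    """True if macos_version is at least the minimum listed for framework."""
    return parse_version(macos_version) >= MIN_MACOS[framework]


if __name__ == "__main__":
    # platform.mac_ver()[0] is an empty string on non-macOS systems,
    # which parses to (0, 0) and therefore fails every check.
    current = platform.mac_ver()[0]
    for name in MIN_MACOS:
        print(name, "OK" if meets_minimum(current, name) else "needs a newer macOS")
```

Passing this check only means the OS is new enough for the GPU backend to load; whether a given model runs well still depends on op coverage.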

Pytorch.png


I think the performance (and reliability) of TensorFlow and PyTorch on macOS depends heavily on whether the op you want to use is well supported by the GPU, not on your version of Metal. People keep finding ops in TensorFlow that are not yet supported by the GPU.

Apple seems to have focused on improving 3D rendering and gaming with Metal 3.
 

widEyed

macrumors regular
Aug 18, 2009
175
68
Not sure Apple will ever want to go the Cloud route but I agree that they need to up the ante for the Mac Pro platform and bring some feature parity on the GPU side, lest they want it to be just a "brand statement" computer confined to a niche of professional video producers. The potential is there with the AS architecture: lots of unified memory that can be accessed by the GPU, high bandwidth, low latency. But it needs software support.

I have always been disappointed at how the quarrel with Nvidia resulted in Apple just letting go of ML/AI computing without even trying anymore. It is even more perplexing when you consider that a majority of the scientists and engineers in the field use a Mac as a work machine... I really hope AS can be the trigger for things to turn around.

Does Apple shipping silicon with neural net cores count for anything? Is it useful to researchers, or only to App Store developers who might make use of it?

Craig Federighi has said in interviews that he's always been fascinated with ML, and he hopes/expects Apple to pursue it more deeply over time.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,529
955
Does Apple shipping silicon with neural net cores count for anything? Is it useful to researchers, or only to App Store developers who might make use of it?
Apple hardware is very good at inference, but not so good at training. But Apple is getting better at training, and distributed training across multiple Mac Studios is now possible.

Does the M1/M2 SoC support native bfloat16 arithmetic?

Are there any benchmarks comparing TensorFlow and PyTorch on macOS?
 

name99

macrumors 68020
Jun 21, 2004
2,282
2,139
Apple hardware is very good at inference, but not so good at training. But Apple is getting better at training, and distributed training across multiple Mac Studios is now possible.

Does the M1/M2 SoC support native bfloat16 arithmetic?

Are there any benchmarks comparing TensorFlow and PyTorch on macOS?
Would PiM improve the situation?
What do you think of my hypothesis here:
https://www.realworldtech.com/forum/?threadid=208595&curpostid=208595
?

It would be helpful if we had some serious M2 teardowns/cross sections, but we do not.
An A16 cross section will help, maybe we'll have one in a month or so.
 

leman

macrumors Core
Oct 14, 2008
19,319
19,336

name99

macrumors 68020
Jun 21, 2004
2,282
2,139
I think that’s pretty far fetched 😅 but who knows? Could the mysterious die simply be the SoC cache or something like that?
SoC cache is on the SoC die -- it's easily visible in die shots.
 

mrsavage1

macrumors regular
Feb 1, 2010
220
0
Yeah, CUDA is Nvidia's secret sauce, but according to the Platforms State of the Union video, Metal 3 will accelerate PyTorch MPS significantly. Looking forward to seeing the talk on Friday.
Any news on this, now that Ventura has been released with Metal 3? I've been trying to find new benchmarks showing that Metal 3 significantly increases PyTorch MPS performance.
 

leman

macrumors Core
Oct 14, 2008
19,319
19,336
Any news on this, now that Ventura has been released with Metal 3? I've been trying to find new benchmarks showing that Metal 3 significantly increases PyTorch MPS performance.

Ventura hasn’t been released. And why would you expect Metal 3 to do anything for PyTorch at all? Most improvements in Metal 3 target raytracing and gaming.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,529
955
Should Apple adopt this format?
 

mrsavage1

macrumors regular
Feb 1, 2010
220
0
Ventura hasn’t been released. And why would you expect Metal 3 to do anything for PyTorch at all? Most improvements in Metal 3 target raytracing and gaming.

Metal backend for PyTorch

The new Metal backend in PyTorch version 1.12 enables high-performance, GPU-accelerated training using MPS Graph and the Metal Performance Shaders primitives.
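For anyone wanting to try that backend, here is a minimal sketch of selecting it from Python. `pick_device` is a made-up helper name; the sketch assumes PyTorch 1.12 or later for the `torch.backends.mps` checks and simply falls back to the CPU when PyTorch is not installed or MPS is not usable.

```python
def pick_device():
    """Return "mps" when the Metal backend is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all; nothing to accelerate.
        return "cpu"

    # torch.backends.mps only exists in builds that know about MPS (1.12+).
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_built() and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

The returned string can be passed straight to tensor constructors or `.to(...)`, e.g. `torch.ones(3, device=pick_device())`. This only selects the device; it does not measure whether training is actually faster on it.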

In the Metal 3 overview, PyTorch is mentioned as using Metal Performance Shaders; then, in the Metal shaders part, Apple says there's a performance boost.

Mesh shaders

This new geometry pipeline replaces vertex shaders with two new shader stages — object and mesh — that enable more flexible culling and LOD selection, and more efficient geometry shading and generation.
 

GrumpyCoder

macrumors 68020
Nov 15, 2016
2,074
2,654

mrsavage1

macrumors regular
Feb 1, 2010
220
0
That refers to the PyTorch 1.12 backend which comes with MPS out of the box. It's been available in nightly releases before though, so there should be nothing new here: https://sebastianraschka.com/blog/2022/pytorch-m1-gpu.html
How about the updates to the shaders in Metal 3, which PyTorch uses?

Mesh shaders

This new geometry pipeline replaces vertex shaders with two new shader stages — object and mesh — that enable more flexible culling and LOD selection, and more efficient geometry shading and generation.
 