Would the cheapest M1/M2 MBP be OK to run Stable Diffusion on MacOS...

Appletoni · Sep 17, 2022

GMGN said:
Yeah, I know SD is compatible with M1/M2 Macs, but I'm not sure whether the cheapest M1/M2 MBP would be enough to run it.

Or, specifically, could I get 50 images within 10 minutes with this prompt and these settings?

Prompt
ethereal mystery portal, seen by wanderer boy in middle of woods, vivid colors, fantasy, trending on artstation, artgerm, cgsociety, greg rutkwoski, alphonse mucha, unreal engine, very smooth, high detail, 4 k, concept art, brush strokes, pixiv art, sharp focus, raging dynamic sky, heavens

Seed
3733741481

Guidance scale
7

Dimensions
1024 × 1024

How many GPU/CPU/RAM or which type of M1/M2 would you recommend?

Looking at chess results and how bad they are, you should take the MacBook Pro 16-inch M1 MAX with 32 GPU cores and 64 GB RAM and 8 TB SSD or buy the stronger M2 MAX / M2 Ultra MacBook.

leman · Sep 17, 2022

Appletoni said:
Looking at chess results and how bad they are, you should take the MacBook Pro 16-inch M1 MAX with 32 GPU cores and 64 GB RAM and 8 TB SSD or buy the stronger M2 MAX / M2 Ultra MacBook.

What does stable diffusion have to do with chess results?

Xiao_Xi · Sep 17, 2022

Does Stable Diffusion use Core ML or PyTorch on Apple hardware?

If it does not use Core ML, it is normal for Stable Diffusion to be slow on Apple hardware, because PyTorch's Metal backend is still experimental. If Stable Diffusion were ported to TensorFlow, performance should be better, because TensorFlow's Metal backend is more mature.
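For anyone who wants to check which backend a local PyTorch install would actually pick, here is a minimal sketch. It assumes PyTorch 1.12 or later for the Metal ("mps") backend; the helper name `pick_torch_device` is made up for illustration and is not part of PyTorch or Stable Diffusion. If PyTorch is not installed at all, it simply reports "cpu".

```python
def pick_torch_device():
    """Return 'mps', 'cuda', or 'cpu' depending on what this machine offers."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: no PyTorch means no accelerator to report.
        return "cpu"

    # torch.backends.mps only exists in PyTorch 1.12+, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"   # Apple Silicon GPU through Metal
    if torch.cuda.is_available():
        return "cuda"  # Nvidia GPU
    return "cpu"


if __name__ == "__main__":
    print("PyTorch would run on:", pick_torch_device())
```

This only tells you which backend is available, not how fast it is; the timings posted in this thread are the better guide for that.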

mi7chy · Sep 17, 2022

altaic said:
Took 21 seconds with a peak of 14.2 GB RAM utilization and a constant 100% GPU usage on my MBP M1 Max 64GB. Using DiffusionBee, so prompt_strength isn't settable, but all the other settings were as you described. Also, I had a dozen apps open with a couple hundred windows and over a thousand tabs in Safari, so not exactly a best-case benchmarking scenario.

What's the wattage pulled from wall under load?

9.23 seconds on laptop 3060 configured for 70W. 84W total system power pulled from wall. Maybe I'll get around to upgrading from Big Sur to Monterey to try on M1 MBA.

[Attachment: Stable Diffusion WebUI - Google Chrome 9_17_2022 9_13_15 AM.png]

MysticCow · Sep 17, 2022

Xiao_Xi said:
Stable Diffusion uses PyTorch, and PyTorch's Metal backend is still experimental, so I don't expect Stable Diffusion to be very fast on Apple hardware.

They all were warned well over a year ago to switch to Metal AND THAT YOU WILL BE VERY SORRY IF YOU DO NOT SWITCH TO METAL.

Well… feel sorry, developers. You didn’t listen when it came to CodeWarrior vs Xcode and you didn’t listen here, either. History tends to rhyme.

GMGN · Sep 17, 2022

altaic said:
Took 21 seconds with a peak of 14.2 GB RAM utilization and a constant 100% GPU usage on my MBP M1 Max 64GB. Using DiffusionBee, so prompt_strength isn't settable, but all the other settings were as you described. Also, I had a dozen apps open with a couple hundred windows and over a thousand tabs in Safari, so not exactly a best-case benchmarking scenario.

Looks nice, actually!

Xiao_Xi · Sep 18, 2022

MysticCow said:
They all were warned well over a year ago to switch to Metal AND THAT YOU WILL BE VERY SORRY IF YOU DO NOT SWITCH TO METAL.

Do you think PyTorch and TensorFlow developers are like iOS developers who worry if their app doesn't run well on Apple hardware? Think again. Meta (PyTorch) and Google (TensorFlow) only care about running on large Nvidia GPUs, so PyTorch and TensorFlow have very mature CUDA backends.

Only Apple benefits from PyTorch running fast on Apple hardware, so Apple could have given PyTorch a Metal backend earlier, as they have done with TensorFlow. If Apple cared about PyTorch, it would have joined the PyTorch Foundation.

PyTorch Foundation

Learn how the PyTorch Foundation supports collaboration and growth in the deep learning ecosystem.

pytorch.org

leman · Sep 18, 2022

Xiao_Xi said:
Only Apple benefits from PyTorch running fast on Apple hardware, so Apple could have given PyTorch a Metal backend earlier, as they have done with TensorFlow.

A lot of developers prefer the Mac experience, and better support for these frameworks is a popular request. That’s also why PyTorch devs are pursuing an Apple Silicon accelerated backend (if I understand correctly, with Apple’s assistance). It will still take a while for these implementations to mature, and of course they won’t replace high-end Nvidia GPUs any time soon, but it’s something that’s being actively worked on.

And btw, I think the discussion here misses the point about viability of ML workloads on Apple Silicon. A MacBook doesn’t need to outperform an RTX 3080. The implementation just needs to be fast enough to support prototyping or ML assisted work. Serious inference will be done in the cloud anyway.

Xiao_Xi · Sep 18, 2022

leman said:
A lot of developers prefer the Mac experience, and better support for these frameworks is a popular request. That’s also why PyTorch devs are pursuing an Apple Silicon accelerated backend (if I understand correctly, with Apple’s assistance). It will still take a while for these implementations to mature, and of course they won’t replace high-end Nvidia GPUs any time soon, but it’s something that’s being actively worked on.

And btw, I think the discussion here misses the point about viability of ML workloads on Apple Silicon. A MacBook doesn’t need to outperform an RTX 3080. The implementation just needs to be fast enough to support prototyping or ML assisted work. Serious inference will be done in the cloud anyway.

Why should Meta employees care more than Apple employees about whether PyTorch runs fast on Apple hardware? Apple hasn't even joined the PyTorch Foundation. Besides, PyTorch and TensorFlow are open source projects, and anyone can improve them.

TensorFlow and PyTorch are developed primarily for data center GPUs, and any other hardware has lower priority. The TensorFlow developers were smart to decouple the kernel code from the backend, so they can let third parties worry about other backends while they take care of their main focus, the backends for TPUs and Nvidia GPUs.

PluggableDevice: Device Plugins for TensorFlow

In this post, we introduce the PluggableDevice architecture which offers a plugin mechanism for registering devices with TensorFlow without the need

blog.tensorflow.org

DearthnVader · Sep 18, 2022

altaic said:
Took 21 seconds with a peak of 14.2 GB RAM utilization and a constant 100% GPU usage on my MBP M1 Max 64GB. Using DiffusionBee, so prompt_strength isn't settable, but all the other settings were as you described. Also, I had a dozen apps open with a couple hundred windows and over a thousand tabs in Safari, so not exactly a best-case benchmarking scenario.

Around 65 seconds with 13" MBP M2 16GB. Lots of other stuff was open.....

mi7chy · Sep 18, 2022

M1 MBA @ 100 secs so 11x slower than 3060 laptop @ 9s.
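To put the timings posted so far against the original question (50 images in 10 minutes), here is a rough back-of-the-envelope sketch. It assumes the reported times are per single image, that total time scales linearly with image count, and that everyone used comparable settings, none of which the thread confirms. The labels and function names are illustrative only.

```python
# Per-image times in seconds as reported in this thread.
# Assumption: settings were comparable across posters (not confirmed).
reported_seconds = {
    "RTX 3060 laptop (70 W)": 9.23,
    "MBP M1 Max 64GB (DiffusionBee)": 21.0,
    "13-inch MBP M2 16GB": 65.0,   # poster said "around 65 seconds"
    "M1 MBA": 100.0,
}

TARGET_IMAGES = 50
TARGET_MINUTES = 10.0


def minutes_for_batch(seconds_per_image, images=TARGET_IMAGES):
    """Minutes needed for `images` pictures, assuming linear scaling."""
    return seconds_per_image * images / 60.0


def slowdown(seconds_per_image, baseline_seconds):
    """How many times slower a machine is than the baseline."""
    return seconds_per_image / baseline_seconds


def meets_target(seconds_per_image):
    """True if the machine fits the 50-images-in-10-minutes budget."""
    return minutes_for_batch(seconds_per_image) <= TARGET_MINUTES


if __name__ == "__main__":
    baseline = reported_seconds["RTX 3060 laptop (70 W)"]
    budget = TARGET_MINUTES * 60.0 / TARGET_IMAGES  # seconds allowed per image
    print(f"Budget: {budget:.1f} s per image")
    for name, secs in reported_seconds.items():
        print(
            f"{name}: {minutes_for_batch(secs):.1f} min for {TARGET_IMAGES} images, "
            f"{slowdown(secs, baseline):.1f}x the 3060 time, "
            f"meets target: {meets_target(secs)}"
        )
```

Under those assumptions the budget is 12 seconds per image, so the 21-second M1 Max result works out to 17.5 minutes for 50 images, and none of the Mac timings reported here would hit the 10-minute goal.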

[Attachment: Screen Shot 2022-09-18 at 5.58.42 AM.png]
