If you want to bring it back on topic, you could post some performance numbers between your Mac and PC setups for the same generation (prompt, seed, steps, cfg_scale, scheduler). Or maybe follow up on what happened with your PC setup once you installed CUDA.
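To make that concrete, here is a minimal sketch of such a like-for-like run, assuming the Hugging Face diffusers library and the runwayml/stable-diffusion-v1-5 checkpoint (neither is confirmed to be what anyone in this thread is running; InvokeAI exposes the same knobs under its own names). Everything that affects the output is pinned, so only the hardware differs:

```python
# Hedged sketch: a reproducible, timeable generation with every parameter pinned.
# Assumes the diffusers library and the runwayml/stable-diffusion-v1-5 weights.
import time

import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "mps"
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to(device)

# A CPU generator with a fixed seed reproduces across runs on the same machine.
generator = torch.Generator("cpu").manual_seed(42)

steps = 50
start = time.perf_counter()
image = pipe(
    "a lighthouse on a cliff at dusk, oil painting",  # fixed prompt
    num_inference_steps=steps,                        # fixed steps
    guidance_scale=7.5,                               # fixed cfg_scale
    generator=generator,
).images[0]
elapsed = time.perf_counter() - start
print(f"1 image, {steps} steps: {elapsed:.1f} s ({steps / elapsed:.2f} it/s)")
image.save("benchmark.png")
```

The scheduler should also be identical on both machines, since different samplers take very different amounts of time per step.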
 
Interesting thread! @Spaceboi Scaphandre

Is Spatial Diffusion set up for GPU compute on both NVIDIA cards and Apple Silicon? Did you post somewhere how many GPU cores your MacBook 14" has? Are you leveraging the GPU on both the MacBook and the laptop with the 3060?

The 32-core M1 Max GPU seems to compete well with the mobile 3060 in various benchmarks, and it would be interesting to see a real-world comparison of machine learning performance between these two GPUs.

My gaming PC isn't a laptop lmao. It has a desktop RTX 3060, a completely different beast from a laptop-spec 3060. And my M1 Pro is a base spec with a 14-core GPU.

I don't know anything about Spatial Diffusion lmao. I just set up Stable Diffusion and a set of weights made by weebs. I'm not an AI developer, I'm just a network administrator lmao. I do all this for fun.

But what I compare between the two is the comfort I get from working with the AI on my Mac vs the gaming desktop. I'm sure if I configured it a certain way the desktop would probably be slightly faster, but in the end I'd rather just use the Mac because of the human comfort factor of working on it. As I said, rendering on my gaming PC gets loud. Super loud. The RTX 3060 alone has to ramp its fans into overdrive as it works. Playing games with ray tracing on, like Insomniac's Spider-Man Remastered, it never got to this point. Only when using SD did it ever get this loud. Not to mention the power draw it uses to render.

Meanwhile on my MacBook Pro, I can just do it, set it and forget it, and go do other things while it quietly works in the background. Its fans do kick on as well, but nowhere near the point of being so loud. You can hear them, but they're so quiet they sound like a small desk fan, and are only audible when you're next to it. It gets an image done every minute for me, so I can make batches of 6-10 at a time, and even 20, taking longer the more steps or the bigger the image I make it. In fact, last night I had a batch of 100 images at 768x768 queued at 80 steps. I went to bed while it ran in sleep mode and I couldn't hear the fans at all from my bed, and when I woke up at 5:30, I found it had finished hours ago.
 
If you want to bring it back on topic, you could post some performance numbers between your Mac and PC setups for the same generation (prompt, seed, steps, cfg_scale, scheduler). Or maybe follow up on what happened with your PC setup once you installed CUDA.
That would be a good idea were it not that I do not own, and do not wish to own, any PCs anymore!
 
PC vs Mac, aaaah. Yeah, if I had a penny for every time I've seen this one, I could buy a 4090 Ti AND a top-spec Mac Studio :p
 
Why does a great discussion about using creative AI tools degrade into a useless discussion of gaming?
Meanwhile on my PC, it was so loud, as the RTX 3060's fans had to kick into overdrive as if it were running Cyberpunk 2077.

After all that, man, I wish Apple weren't out of touch with the game industry, because if the Mac had the same games my PC has, I'd be able to ditch Windows for good and just go full time on the Mac for everything. With how good that MacBook Pro was, I'd happily trade my PC for a Mac Studio and have that be my new gaming rig if it had access to my library.
Maybe you should read the original post; you would figure out that there are multiple topics being discussed. Just because you think it's about one topic doesn't mean that's the reality.
 
Meanwhile on my MacBook Pro, I can just do it, set it and forget it, and go do other things while it quietly works in the background. Its fans do kick on as well, but nowhere near the point of being so loud. You can hear them, but they're so quiet they sound like a small desk fan, and are only audible when you're next to it. It gets an image done every minute for me, so I can make batches of 6-10 at a time, and even 20, taking longer the more steps or the bigger the image I make it. In fact, last night I had a batch of 100 images at 768x768 queued at 80 steps. I went to bed while it ran in sleep mode and I couldn't hear the fans at all from my bed, and when I woke up at 5:30, I found it had finished hours ago.

Why so secretive about completion time and parameters? 1 min at what image size and steps? How long to complete 100 images at 768x768, 80 steps?

Here's 10 images at 768x768, 80 steps, in 8 mins 39 secs on a laptop 3060, so 100 images is <= 1 hour 26 mins in quiet mode, or 1 hour 13 mins in performance mode, which is closer to a small Honeywell TurboForce desk fan on setting 1.
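(For reference, the extrapolation is linear: 8 mins 39 secs is 519 seconds for 10 images, so 100 images comes to about 5,190 seconds, roughly 86.5 minutes, which is where the 1 hour 26 mins figure comes from.)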

 
Why so secretive about completion time and parameters? 1 min at what image size and steps? How long to complete 100 images at 768x768, 80 steps?

Here's 10 images at 768x768, 80 steps, in 8 mins 39 secs on a laptop 3060, so 100 images is <= 1 hour 26 mins in quiet mode, or 1 hour 13 mins in performance mode, which is closer to a small Honeywell TurboForce desk fan on setting 1.


I'm not being secretive. You're asking me questions I don't know the answers to. If I'm doing something wrong with the settings on my PC, let me know.
 
...

Here's 10 images at 768x768, 80 steps, in 8 mins 39 secs on a laptop 3060, so 100 images is <= 1 hour 26 mins in quiet mode, or 1 hour 13 mins in performance mode, which is closer to a small Honeywell TurboForce desk fan on setting 1.

That is nice for 9 minutes' work on a laptop. I want to try this on my desktop (10xxx CPU, 3070 card, 32 GB of memory, multiple SSDs) and compare it to my Mac Studio (32 GB, 512 GB internal drive, 512 GB external drive).
 
That is nice for 9 minutes' work on a laptop. I want to try this on my desktop (10xxx CPU, 3070 card, 32 GB of memory, multiple SSDs) and compare it to my Mac Studio (32 GB, 512 GB internal drive, 512 GB external drive).

I'm curious how well the Mac Studio handles it too, since I've got an M1 Pro and you've got an M1 Max.
 
I'm not being secretive. You're asking me questions I don't know the answers to. If I'm doing something wrong with the settings on my PC, let me know.
There is no secret sauce here. Just test it! Pick a prompt and a group of parameter settings and time how long it takes. InvokeAI probably reports generation speed in seconds per iteration (or iterations per second), which would be useful as well. We're not asking you to justify your choice of Mac over PC; I just want to get a concrete idea of the speed available.

Note that there may be a considerable difference between PC and Mac introduced by the math precision used. Half precision is slightly less coherent than full precision but much faster, so it is generally preferred where available. However, I believe Macs are forced to use full precision due to a PyTorch incompatibility.
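To illustrate that precision point, in the diffusers API (again an assumption about the toolchain, as is the checkpoint name) the precision is chosen when the weights are loaded:

```python
# Hedged sketch of the half- vs full-precision split described above,
# assuming the diffusers library.
import torch
from diffusers import StableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"  # assumed checkpoint

if torch.cuda.is_available():
    # NVIDIA: half-precision weights -- roughly half the VRAM, much faster.
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
else:
    # Apple Silicon: leave the default float32, since fp16 on the MPS
    # backend was unreliable in PyTorch at the time of this thread.
    pipe = StableDiffusionPipeline.from_pretrained(model_id).to("mps")
```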
 
Except both OpenGL and Vulkan are available for macOS.

A MoltenVK wrapper, but not direct Vulkan. And, as someone else hinted, only OpenGL 4.1 (2010) is supported, so it's missing 4.2 (2011), 4.3 (2012), 4.4 (2013), 4.5 (2014) and 4.6 (2017).
 
The M1 GPU is integrated with the CPU, so I do not think it is going to beat an Intel 9th-gen desktop with a dedicated RTX 3060 GPU. If it did, then in my opinion every graphics artist, designer, movie maker and video maker would be dropping their Windows machines and purchasing M1 MacBooks, but somehow I do not think that is happening.

My MacBook Pro M1 Max easily beats my desktop with a Ryzen 5900X & RTX 3080 when it comes to editing photos & videos with its 32-core "integrated GPU". Similarly, the PS5 and Xbox Series X play many games at 4K 60 FPS when many computers with comparable specs struggle to do so.

How powerful a GPU is, is a completely different topic from whether that GPU is part of an SoC or not. In fact, there are quite a lot of scenarios that benefit from all processing units being on the same die using unified memory.
 
Except both OpenGL and Vulkan are available for macOS.
For peak performance, Apple has their proprietary Metal, Nvidia has their proprietary CUDA. Apple’s not going to make it easy for someone to use non-Metal and get excellent results on macOS because what Apple needs is to have a continuous years long focus from developers around the world building Metal-specific tools. Not much different from the developers that have done the same for Nvidia. Metal performance over time will get better as the Metal knowledge pool grows, which will benefit any developer making apps for macOS (and, by extension, their customers).
 
My MacBook Pro M1 Max easily beats my desktop with a Ryzen 5900X & RTX 3080 when it comes to editing photos & videos with its 32-core "integrated GPU". Similarly, the PS5 and Xbox Series X play many games at 4K 60 FPS when many computers with comparable specs struggle to do so.

How powerful a GPU is, is a completely different topic from whether that GPU is part of an SoC or not. In fact, there are quite a lot of scenarios that benefit from all processing units being on the same die using unified memory.
If game console makers and Apple can make integrated CPU/GPU chips that outperform a dedicated top-of-the-range desktop GPU, then why haven't Intel or AMD been able to do what Apple, Sony and Microsoft have done with their integrated CPU/GPU chips?
 
If game console makers and Apple can make integrated CPU/GPU chips that outperform a dedicated top-of-the-range desktop GPU, then why haven't Intel or AMD been able to do what Apple, Sony and Microsoft have done with their integrated CPU/GPU chips?
Because the solutions that perform that well are largely not backwards compatible with Windows, and AMD and Intel MUST be backwards compatible with Windows and the legacy x86 instruction set, to the detriment of almost everything else.
 
If game console makers and Apple can make integrated CPU/GPU chips that outperform a dedicated top-of-the-range desktop GPU, then why haven't Intel or AMD been able to do what Apple, Sony and Microsoft have done with their integrated CPU/GPU chips?
Because no one wants to buy those solutions separately. For the record, AMD makes the SoCs in the PlayStation and Xbox, so they know how.
 
There is no secret sauce here. Just test it! Pick a prompt and a group of parameter settings and time how long it takes. InvokeAI probably reports generation speed in seconds per iteration (or iterations per second), which would be useful as well. We're not asking you to justify your choice of Mac over PC; I just want to get a concrete idea of the speed available.

Note that there may be a considerable difference between PC and Mac introduced by the math precision used. Half precision is slightly less coherent than full precision but much faster, so it is generally preferred where available. However, I believe Macs are forced to use full precision due to a PyTorch incompatibility.

So I found what I did wrong on PC: CUDA was not installed. It's now rendering faster than my Mac. However, it's still loud, and my images on PC are coming out with some really weird mutations. I find that the renders I do on the Mac are a lot more stable. There are some samplers the Mac runs faster too, such as k_lms.

So I guess it goes back to that Mac Address video: the x86 PCs are faster but jankier, while the ARM Macs are a lot more precise and stable.
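For anyone following along, k_lms is the LMS sampler from k-diffusion; in diffusers terms (an assumption, since the posters here are on InvokeAI, and the checkpoint name is also assumed), switching to it is a one-line scheduler swap:

```python
# Hedged sketch: selecting the LMS sampler (what k-diffusion-style UIs
# label "k_lms") on a diffusers pipeline.
from diffusers import LMSDiscreteScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
```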
 
If game console makers and Apple can make integrated CPU/GPU chips outperform a dedicated top of the range desktop GPU then why haven't Intel or AMD been able to do what Apple, Sony and Microsoft have been able to do with there integrated CPU/GPU chips.

They have been. The PS5 and Xbox SoCs are made by AMD. Nvidia makes custom SoCs for the likes of the Nintendo Switch (a new one is rumoured for the next iteration of the console), and they are quite capable of making more powerful ones if Windows-on-ARM devices were more common.

Some AMD Ryzen 7000 laptop CPUs are rumoured to have 16-core and 24-core integrated GPUs instead of the current 8-core and 12-core variants.
 
Dunno if we can really call the Mac a niche market anymore with how much the ARM machines have been kicking ass in the laptop market. Hell, it's been outselling the iPad for two years now.

They are a niche market for gaming when the Mac makes up less than 3% of that market. I wish it weren't true, but it's going to be a long damn time before lots of good games start appearing on the Mac, if ever.

It being technologically possible and highly desirable isn't enough to make devs actually start taking risks on Apple Silicon Mac gaming.
 
So I found what I did wrong on PC: CUDA was not installed.

CUDA is part of the driver; I've never had to install CUDA separately. Just do a minimal driver install without GeForce Experience, PhysX and USB, unless you have a VR headset.
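Either way, a quick way to confirm that PyTorch actually sees a GPU (the usual failure mode behind "CUDA was not installed") is a two-line check:

```python
# Sanity check: if both of these print False, Stable Diffusion will be
# silently running on the CPU, which would explain very slow renders.
import torch

print("CUDA available:", torch.cuda.is_available())          # NVIDIA path
print("MPS available:", torch.backends.mps.is_available())   # Apple Silicon path
```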

This thread makes little sense, so for anyone on the sidelines interested in dabbling with Stable Diffusion: it's all about the prompt input, sampler method and parameters. Here's a great place to start. Hover the mouse over an image you're interested in to see variation in prompt input, or open the image in a new tab.

https://publicprompts.art/

 
AMD will not only manufacture APUs for notebooks, but also for data centers.

Yes, I know. Nvidia also does so. When the previous member questioned why they do not, I wanted to give examples from consumer devices 👍🏼
 
So, anyone who has been doing AI image generation on their Macs as well: what prompts have you been using? I mean, you already know what I've been doing. =w=
 