
TechnoMonk

macrumors 68030
Oct 15, 2022
2,604
4,112
Ah ok. Hey, maybe you should get a PhD in the field and a professorship at a leading university teaching this stuff. But of course it's always the others that have no idea.

I didn't say you're using FCP; I said the M series is optimised for workflows similar to FCP. You're using a video workflow, and maybe, just maybe, you should check what exactly happens in these models and in the model output. How many of these models have you created yourself? How many have you published at peer-reviewed conferences? None. You can't even get your 4090 going. 'Nuff said.

There is no A8000, only an A6000. The 8000 is an RTX 8000, no A there. Good thing you know your stuff and don't have to rely on people who don't know what they're talking about. Oh wait...
For what it's worth, I do have a PhD. Sure, RTX 8000 was a typo from typing on my phone. Lol. I have checked my models; I have used my own custom models trained on A100s.
It's not hard to understand that there are issues with Nvidia, just like with Apple or any other vendor. Unlike most folks, all I care about is fixing my workflow. I use Apple Silicon for certain tasks, Nvidia for others.
 

TechnoMonk

macrumors 68030
Oct 15, 2022
2,604
4,112
lol, who even uses CoreML to run the WebUI? Using the GPU is the fastest way to generate AI images, and I have no idea what you are talking about.
You just posted screen shots of CPU cores being used by Automatic1111. Let’s see your GPU.
 
  • Haha
Reactions: sunny5

sunny5

macrumors 68000
Jun 11, 2021
1,835
1,706
You just posted screen shots of CPU cores being used by Automatic1111. Let’s see your GPU.
Seriously, do you even care to check the GPU history at all, with the blue bars? You did not. Also, the CPU is always in use, with or without the WebUI, so you just prove yourself ignorant. Two efficiency cores are really meaningless; do you really think 2 cores are enough to generate images?

Since you haven't known what you are talking about from the beginning, I'll just ignore you, as you are wasting my precious time on AI.
 

TechnoMonk

macrumors 68030
Oct 15, 2022
2,604
4,112
Seriously, do you even care to check the GPU history at all, with the blue bars? You did not. Also, the CPU is always in use, with or without the WebUI, so you just prove yourself ignorant. Since you haven't known what you are talking about from the beginning, I'll just ignore you, as you are wasting my precious time on AI.
Ok. I was on my phone, and it wasn't clear in the pic till I flipped it to landscape mode. How long did it take you to generate the image? It looks like Automatic1111 is poorly optimized for Apple Silicon, if optimized at all. How much memory do you have? Why do I use Core ML? Because I have my own inference pipeline and use dynamic batching.

Poor Performance:

Currently GPU acceleration on macOS uses a lot of memory. If performance is poor (if it takes more than a minute to generate a 512x512 image with 20 steps with any sampler) first try starting with the --opt-split-attention-v1 command line option (i.e. ./webui.sh --opt-split-attention-v1) and see if that helps. If that doesn't make much difference, then open the Activity Monitor application located in /Applications/Utilities and check the memory pressure graph under the Memory tab. If memory pressure is being displayed in red when an image is generated, close the web UI process and then add the --medvram command line option (i.e. ./webui.sh --opt-split-attention-v1 --medvram). If performance is still poor and memory pressure still red with that option, then instead try --lowvram (i.e. ./webui.sh --opt-split-attention-v1 --lowvram). If it still takes more than a few minutes to generate a 512x512 image with 20 steps with any sampler, then you may need to turn off GPU acceleration. Open webui-user.sh in Xcode and change #export COMMANDLINE_ARGS="" to export COMMANDLINE_ARGS="--skip-torch-cuda-test --no-half --use-cpu all".


This fix apparently reduces the problem, but it still doesn't look like Automatic1111 has any Apple Silicon optimizations in the code.
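For reference, a quick way to see whether the PyTorch build behind Automatic1111 can reach the Apple GPU at all is to query the MPS backend directly. A minimal sketch, assuming a recent PyTorch with MPS support; the matmul is just a placeholder workload, not anything from Automatic1111:

import torch

# Check whether the Metal Performance Shaders (MPS) backend is usable,
# which is what GPU acceleration on macOS relies on.
if torch.backends.mps.is_available():
    device = torch.device("mps")
    x = torch.rand(1024, 1024, device=device)
    print("MPS available; sample matmul mean:", (x @ x).mean().item())
else:
    # This is the situation where the CPU-only COMMANDLINE_ARGS above apply.
    print("MPS not available; the web UI would fall back to the CPU")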

 
Last edited:

leman

macrumors Core
Oct 14, 2008
19,520
19,670
Isn't the A-Series SoC already geared towards AI? They named their SoC Bionic for a reason.

Apple’s AI accelerators are geared towards energy-efficient, low-power ML inference to support app needs. These are relatively small devices, running small models. The chips @Xiao_Xi is talking about are dedicated cloud computing ML, for demanding applications. If Apple builds something like that, it would be for internal consumption (like Siri). Does it make sense? No idea.
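To make the on-device side concrete, here is a minimal coremltools sketch of how an app requests the Neural Engine for inference. The model path and input name are hypothetical placeholders, and whether the ANE actually runs the model depends on which layers it contains:

import coremltools as ct

# Load a Core ML model and restrict execution to the CPU and the Apple Neural Engine.
# "MyModel.mlpackage" and the "input" feature name are made up for illustration;
# ct.ComputeUnit.ALL would also allow the GPU.
model = ct.models.MLModel("MyModel.mlpackage",
                          compute_units=ct.ComputeUnit.CPU_AND_NE)
print(model.predict({"input": [1.0, 2.0, 3.0]}))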
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Apple’s AI accelerators are geared towards energy-efficient, low-power ML inference to support app needs. These are relatively small devices, running small models. The chips @Xiao_Xi is talking about are dedicated cloud computing ML, for demanding applications. If Apple builds something like that, it would be for internal consumption (like Siri). Does it make sense? No idea.
IMHO edge computing is where it's at. Apple is likely skating to where the puck is going.
 
  • Like
Reactions: dgdosen

leman

macrumors Core
Oct 14, 2008
19,520
19,670
IMHO edge computing is where it's at. Apple is likely skating to where the puck is going.

Sure, but does Apple want to become an edge computing provider? It might be cheaper (and simpler) for them to just buy it from somewhere else... after all, building good ML hardware for smartphones or even desktops is not the same as building good ML hardware for cloud computing.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Sure, but does Apple want to become an edge computing provider? It might be cheaper (and simpler) for them to just buy it from somewhere else... after all, building good ML hardware for smartphones or even desktops is not the same as building good ML hardware for cloud computing.
Well, 10 years ago, Apple didn't have anything that could power macOS. Maybe 10 years from now, whatever the iPhone morphs into will be good enough for ML.
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,626
5,482
IMHO edge computing is where it's at. Apple is likely skating to where the puck is going.
Depends. It's possible that the best models have to run in the cloud because of how big current and future LLMs can be.

Also, AIs aren't very latency sensitive.

And quite honestly, who knows what the future computing device actually is? It could be just a big screen that connects to a giant AI in the cloud and nothing else.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
IMHO edge computing is where it's at. Apple is likely skating to where the puck is going.
Even if Apple didn't do cloud inference, it needs to train models in the cloud. Training models is very expensive and Apple could save a lot of money by using its own chips.
 

ocimpean

macrumors newbie
Apr 27, 2023
1
0
Interesting. I was going to build a new PC with an RTX 4090 for AI, but based on what I am reading it sounds like a bad idea?

Is it not a driver issue that can simply be fixed with a software / firmware update?
I'm in a similar position. My current laptop has a 1060 with 6 GB of VRAM, an old Core i7, and 16 GB of RAM.
I'm running Automatic1111 locally with Stable Diffusion 1.5 and other models, generating text-to-image batches of at most 512x512 pixels; any more than that and I run out of memory. I was ready to build a desktop with an RTX 4090 when I decided to run Stable Diffusion on an iPad Pro with an M1 chip. Imagine my surprise when I was able to generate 1024x1024 images on the iPad. That got me thinking, and I postponed getting the 4090.
The second thing that bothers me: I'm running Alpaca 7B 4-bit locally, with NPX, and a variant via a web interface that can be adjusted to run on the GPU or the CPU, the first case using VRAM, the second using regular RAM. The models load fine, but after about 20 lines of dialogue I get the dreaded out-of-memory message in the web UI, regardless of the CPU or GPU choice. The NPX version behaves better, but memory issues arrive sooner or later.
Llama 13B cannot be used, never mind 65B.

After the surprise I got with the iPad running Stable Diffusion, I started to think that shared memory could bring more to the table than the brute power Nvidia offers, and maybe allow the AI to use 64-128 GB of RAM as a GPU resource on a Mac, which would permit the local installation of large models like Llama 65B.
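Rough arithmetic (a back-of-the-envelope sketch that only counts the quantized weights and ignores activations and the context cache) suggests why that could work:

# Approximate weight memory for LLaMA-style models at different precisions.
def weight_gib(params_billion, bits_per_weight):
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

for params in (7, 13, 65):
    for bits in (16, 4):
        print(f"{params}B @ {bits}-bit ≈ {weight_gib(params, bits):.1f} GiB")

# 65B at 4-bit comes out around 30 GiB, which would fit in 64 GB of unified
# memory but not on a 24 GB card.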

I see very knowledgeable people on this forum, so I would like to ask whether any of you have managed to load one of the large 65B models on your machine, be it a PC or a Mac. What specifications does your computer have, and how was the performance?

Thank you.
 

arinamichel911

macrumors member
May 4, 2023
54
11
As of 2023, Apple Silicon has made significant strides in terms of its support for AI frameworks such as PyTorch and TensorFlow. Apple has invested heavily in optimizing these frameworks for their hardware, and both frameworks are now fully supported on Apple Silicon. Many developers have reported significant performance improvements when running AI workloads on Apple Silicon-based Macs, especially for tasks involving image and video processing.
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,626
5,482
As of 2023, Apple Silicon has made significant strides in terms of its support for AI frameworks such as PyTorch and TensorFlow. Apple has invested heavily in optimizing these frameworks for their hardware, and both frameworks are now fully supported on Apple Silicon. Many developers have reported significant performance improvements when running AI workloads on Apple Silicon-based Macs, especially for tasks involving image and video processing.
This reads like a response from an LLM. ChatGPT?

Anyways, the age of LLM internet spam is here.

Let's cherish the remaining days we have of talking to real people on the internet.
 

arinamichel911

macrumors member
May 4, 2023
54
11
This reads like a response from an LLM. ChatGPT?

Anyways, the age of LLM internet spam is here.

Let's cherish the remaining days we have of talking to real people on the internet.
If it looks to you like a chatbot or some other AI, there are plenty of detectors available on the internet.
 

TechnoMonk

macrumors 68030
Oct 15, 2022
2,604
4,112
Does the unified memory offer some advantages, since most consumer GPUs top out at 24 GB?
Absolutely. My 64 GB M1 Max uses 40-48 GB running some inferences on which a 4090 runs out of memory. The M1 Max would be slower given the lack of RT cores and lower TFLOPS.
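A minimal sketch of why that happens, assuming a 64 GB machine with MPS available; the sizes are arbitrary and this is an illustration, not a benchmark:

import torch

# Allocate roughly 32 GiB of fp16 buffers on the Apple GPU. Unified memory lets
# this fit on a 64 GB M1 Max, while the same working set would exceed a 24 GB
# discrete card. Actual headroom depends on PyTorch's MPS allocator limits and
# whatever else is running.
assert torch.backends.mps.is_available()
chunks = [torch.empty(2**30, dtype=torch.float16, device="mps") for _ in range(16)]
total_gib = sum(t.numel() * t.element_size() for t in chunks) / 2**30
print(f"{total_gib:.0f} GiB resident in unified memory")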
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
On Thursday:

Optimize machine learning for Metal apps

Discover the latest enhancements to accelerated ML training in Metal. Find out about updates to PyTorch and TensorFlow, and learn about Metal acceleration for JAX. We'll show you how MPS Graph can support faster ML inference when you use both the GPU and Apple Neural Engine, and share how the same API can rapidly integrate your Core ML and ONNX models. For more information on using Metal for machine learning, check out “Accelerate machine learning with Metal” from WWDC22.

It seems that Apple has created a Metal backend for JAX.
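If so, checking it should be as simple as something like the following; I'm assuming the plugin ships as a pip package (reportedly jax-metal) and that jax.devices() then lists a Metal device:

import jax
import jax.numpy as jnp

# With Apple's Metal plugin installed, the default JAX backend should be the GPU.
print(jax.devices())  # expected to list a METAL device if the plugin loaded
x = jnp.arange(1_000_000, dtype=jnp.float32)
print(jnp.dot(x, x))  # runs on the Metal device when available, CPU otherwise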
 
Last edited:
  • Like
Reactions: dgdosen