One additional note I didn’t mention concerns running LLaVA. It works fine on Ollama and is spookily accurate at times (not always perfect), but I’ve had no success with LM Studio (it produces gibberish) or Jan (it doesn’t want to run). More experimentation is required to figure out what’s going on, but there is a clear difference when it comes to ease of use.
 
You have to be sure you're running the same model, i.e. same number of parameters, same quantization.

In my experience, the same Llama 3.3 70B model at different levels of quantization gave very different answers (all wrong though) for more specialized queries.
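If you want to double-check what each app is actually running, Ollama will tell you. Here's a minimal sketch using the Ollama Python client (the model tag is just an example; use whatever you've pulled):

```python
# Minimal sketch using the Ollama Python client (pip install ollama).
# The model tag is an example; substitute whatever you've pulled locally.
import ollama

info = ollama.show("llama3.3:70b")

# These are the two fields that should match across apps for a fair comparison.
print("Parameter size:    ", info.details.parameter_size)      # e.g. "70.6B"
print("Quantization level:", info.details.quantization_level)  # e.g. "Q4_K_M"
```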
 
LLaVA is a multimodal model that can describe images. It can be used as a regular LLM (like Llama), but I’ve only used it for describing images so far.
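For anyone curious, this is roughly how I call it through the Ollama Python client (a minimal sketch; the model tag and image path are just placeholders):

```python
# Minimal sketch using the Ollama Python client (pip install ollama).
# Assumes `ollama pull llava:13b` has been run; the image path is a placeholder.
import ollama

response = ollama.chat(
    model="llava:13b",
    messages=[{
        "role": "user",
        "content": "Describe this image in detail.",
        "images": ["photo.jpg"],  # path to a local image file
    }],
)
print(response["message"]["content"])
```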
 
I checked informally (as I don't use this model).
On Ollama, LLaVA is at version 1.6.
On LM Studio, it's at an earlier version.
I’ve just done a test and, before I did, LM Studio required an update.

After that, I did an image description test.

LLaVA 1.5 (7B) did work this time. This is the result:
[attachment: Image 27-01-2025 at 10.30 pm.jpeg]

Jan failed, but it’s still new, so that’s probably to be expected:
[attachment: Image 27-01-2025 at 10.29 pm.jpeg]

Ollama with LLaVA (13B) produced the best result, although it’s not 100% accurate:
[attachment: Image 27-01-2025 at 10.27 pm.jpeg]
 
Try Ollama. It seems to let me run bigger models on my M1 Max, so that's what I use. LM Studio won't even let me load a 43GB model, which is under the 75% GPU memory cap on my 64GB machine (75% of 64GB is 48GB).
[attachment 2476171]
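For anyone checking their own machine, the arithmetic is trivial (a sketch only; the 75% default cap may be configured differently on your system):

```python
# Quick check: do the model weights fit under the default GPU memory cap
# on Apple Silicon? The 75% figure is the default discussed above.
ram_gb = 64
cap_gb = ram_gb * 0.75       # 48 GB on a 64 GB machine
model_gb = 43                # model file size on disk

headroom_gb = cap_gb - model_gb   # what's left for context / KV cache
print(f"Cap: {cap_gb:.0f} GB, model: {model_gb} GB, headroom: {headroom_gb:.0f} GB")
```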
I got that for the first time yesterday when trying out a larger LLM (I think it was deepseek-r1-distill-qwen-14b, 15.71GB). I turned off the "Guardrails" to force it to load, and my MBP had a bit of a meltdown (admittedly, I was also playing about with context length at the time). It's the first time I've seen it run into issues - it even rebooted itself. I guess it's not a good idea to turn off the Guardrails!

Today I’ve been trying out the 8.84GB version of deepseek-coder-v2-lite-instruct-mlx (4bit) in LM Studio, and I’m not all that impressed to be honest. I know smaller models are going to be “hit and miss”, which is why my future will see an M4 Max Studio (hopefully with 128GB RAM) on my desk, but for now I’m just experimenting / playing around.

Today’s task was to work through scenes in my stories and extract the names of the characters so that I can put them in a list. Context length is a big problem here (proving to me that 24GB really isn’t enough!), but I’m getting around it by doing one scene at a time, roughly like the loop sketched below.
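In case it's useful to anyone, the loop looks something like this (a sketch assuming the Ollama Python client; the file paths, model tag and prompt are illustrative only):

```python
# Rough sketch of a scene-at-a-time character-extraction loop.
# Assumes the Ollama Python client (pip install ollama) and a locally
# pulled model; paths, model tag and prompt are illustrative only.
import ollama

SCENES = ["scene01.txt", "scene02.txt", "scene03.txt"]  # placeholder paths
PROMPT = ("List only the names of the characters who appear in the "
          "following scene, one name per line:\n\n")

names: set[str] = set()
for path in SCENES:
    with open(path, encoding="utf-8") as f:
        scene = f.read()
    # One scene per request keeps each prompt well inside the context window.
    response = ollama.chat(
        model="gemma2",  # example; any locally pulled model will do
        messages=[{"role": "user", "content": PROMPT + scene}],
    )
    names.update(line.strip() for line in
                 response["message"]["content"].splitlines() if line.strip())

print(sorted(names))
```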

Listing the characters is, I’d say, about 60-80% "good enough for what I need". However, occasionally I ask it questions about the scene and it’s not very useful at all. If there’s anything slightly risqué in the scene, it skirts around it, leaving a chasm between its method of avoidance and what is actually happening. If I use another model (like gemma2 in Ollama, or gemma-the-writer-n-restless-quill-10b-uncensored), I get far more useful responses.

Also, if I ask it to reason why something is happening, it’ll just provide a generic list of possibilities and will only get close to anything relevant to the events of the scene after I’ve explained what’s going on - at which point it effectively repeats what I just said.

For anything story-related, I won’t be using this version of DeepSeek.
 