Try Ollama. It lets me run bigger models on my M1 Max, so that's what I use. LM Studio won't even let me load a 43GB model, even though that's under its 75% cap on GPU-available memory for my 64GB of RAM (75% of 64GB is 48GB).
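Just to sanity-check the memory maths (the 75% figure is my understanding of LM Studio's default "Guardrails" cap, so treat it as an assumption):

```python
# Sketch of the guardrail maths above. The 75% cap is an assumption about
# LM Studio's default Guardrails setting, not an official figure.
ram_gb = 64
guardrail_gb = 0.75 * ram_gb      # 48.0 GB usable by the GPU
model_gb = 43                     # the model LM Studio refused to load

print(f"Guardrail: {guardrail_gb:.0f}GB, model: {model_gb}GB, "
      f"fits: {model_gb < guardrail_gb}")
```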
I hit that for the first time yesterday when trying out a larger LLM (I think it was deepseek-r1-distill-qwen-14b, 15.71GB). I turned off the "Guardrails" to force it to load, and my MBP had a bit of a meltdown (admittedly, I was also playing around with context length at the time). It's the first time I've seen it run into issues - it even rebooted itself. I guess it's not a good idea to turn off the Guardrails!
Today I’ve been trying out the 8.84GB version of deepseek-coder-v2-lite-instruct-mlx (4bit) in LM Studio, and to be honest I’m not all that impressed. I know smaller models are going to be “hit and miss”, which is why my future will see an M4 Max Studio (hopefully with 128GB RAM) on my desk, but for now I’m just experimenting / playing around.
Today’s task was to work through scenes in my stories to extract the names of characters so that I can put them in a list. Context length is a big problem here (proving to me that 24GB really isn’t enough!) but I’m getting around that by doing one scene at a time.
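The scene-at-a-time workaround can be sketched roughly like this - a hypothetical helper, where the `***` scene delimiter and the prompt wording are my assumptions, not anything from LM Studio or Ollama:

```python
# Hypothetical sketch: extract character names one scene at a time so each
# request stays inside a small context window. The scene delimiter and the
# prompt text are assumptions for illustration.

def split_into_scenes(story: str, delimiter: str = "\n***\n") -> list[str]:
    """Split a story into scenes on a delimiter (assumed convention)."""
    return [s.strip() for s in story.split(delimiter) if s.strip()]

def name_prompt(scene: str) -> str:
    """Build a per-scene prompt, keeping each request short."""
    return (
        "List the names of every character who appears in the scene "
        "below, one per line, with no commentary.\n\n" + scene
    )

# Each prompt would then be fed to the local model one at a time (e.g. via
# the Ollama CLI or LM Studio's chat window), and the returned names
# collected and de-duplicated across scenes afterwards.
```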
Listing the characters is, I’d say, 60%-80% “good enough for what I need”. However, when I occasionally ask it questions about the scene, it’s not very useful at all. If there’s anything slightly risqué in the scene, it skirts around it, leaving a chasm between its method of avoidance and what is actually happening. If I use another model (like gemma2 in Ollama, or gemma-the-writer-n-restless-quill-10b-uncensored), I get far more useful responses.
Also, if I ask it to reason why something is happening, it’ll just provide a generic list of possibilities and will only get close to anything relevant to the events of the scene after I’ve explained what’s going on - at which point it effectively repeats what I just said.
For anything story-related, I won’t be using this version of DeepSeek.