I just want to know when this guy's coming back.
[Attached image: Xserve_Mid_2002.jpeg]
I’m guessing never. Who would trust them again?
 
I wonder if Apple's plan/promise of becoming 100% carbon neutral by 2030 extends to server farms for AI? They use a freakish amount of power. Hopefully M4 Ultra can assist. Coupled with a LOT of solar panels up there on the roof.
 
Did Apple upgrade to M4 because the base memory was increased? 😄

But seriously, the base M4 NPU outperforms the M2 Ultra. The upgrade is well deserved.

An M2 Ultra goes up to 192GB. An M4 Max tops out at 128GB. So unless they produce an M4 Ultra or an M4 Extreme for those servers, they won't be switching due to memory. [Dammit, now you have me dreaming of a 512GB M4 Extreme in a Mac Studio.]
 
Given the scale of Nvidia GPU sales, I'd like to hope that Apple gets back into the server game, since there is a lot of money to be made on high-memory inference hardware. Assuming the M4 Ultra can come with 384 GB of memory, it would be quite competitive for the 400B+ parameter models.

Based on the M2, the Ultra variant doubles the memory of the Max chip it's built from. So 256GB is what you would expect in an M4 Ultra, and 512GB in an M4 Extreme.
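For what it's worth, here's my rough back-of-the-envelope math on what those capacities would actually hold. The parameter counts, quantization sizes, and capacity tiers are my own assumptions, not anything Apple has announced:

```python
# Back-of-the-envelope: weight memory for large LLMs at common quantizations.
# Model sizes, bytes-per-weight, and capacity tiers are illustrative assumptions.
models = {"70B": 70e9, "180B": 180e9, "405B": 405e9}
bytes_per_weight = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}
capacities_gb = [128, 192, 256, 512]  # hypothetical Max/Ultra/Extreme tiers

for name, params in models.items():
    for fmt, bpw in bytes_per_weight.items():
        gb = params * bpw / 1e9  # weights only; KV cache and activations add more
        fits = [c for c in capacities_gb if gb <= c]
        print(f"{name} @ {fmt}: ~{gb:.0f} GB of weights -> fits in {fits or 'none of these'}")
```

By that math, a 400B-class model only fits in 512GB at INT8, or in 256GB if you drop to INT4, so the hypothetical Extreme is really what the biggest models would want.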
 
They will not re-enter that space. Not because you are wrong, but because it would take far too many resources to revamp, rebuild, and maintain enterprise-level customers, especially for AI/LLM.
Will they make capable hardware? Yes, and we can all buy it and do whatever we want with it. The M-series will likely soon be complemented by some L-series or something like that (LLM is my hint), and those chips/GPUs will be specialized units to support even higher GFLOPS.
Quite interesting that there's a new rumor around Lenovo and Apple, which would go quite far in addressing the support side of things.

Waste of time. The M4 Ultra doesn't touch EPYC 5 and never will. Neither Apple nor Intel will ever touch AMD's EPYC roadmap, and Nvidia will continue living off compute with their GPGPUs, as their own Arm chips are absolute trash.
I don't think they need to compete with EPYC, since LLM workloads are primarily GPU/NPU based. Given how power-efficient Apple Silicon is, it would be quite competitive in 2U or blade deployments for LLM inference/training versus the current Nvidia SXM modules, which are already starting to require water cooling. But obviously that will depend on the performance of the libraries and how the interconnects actually perform.

Also, on the flip side, if you are looking for the best single-core performance, the M4 is currently where it's at.
 
Based on the M2, the Ultra variant doubles the memory of the Max chip it's built from. So 256GB is what you would expect in an M4 Ultra, and 512GB in an M4 Extreme.
Actually, I just looked at the available memory modules, and it looks like 192GB will probably still remain the max, since Apple uses 8 memory packages and the largest modules from Micron are only 192Gb (24GB) each. One possibility is that they could technically run more than a single chip per channel, but that might not make it this generation.
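The arithmetic behind that, for the curious (the eight-package count is my reading of the current designs, not an Apple spec):

```python
# Total unified memory = package count x per-package density.
# 192 Gb (gigabits) per package = 24 GB; package count is an assumption.
packages = 8                     # assumed memory packages per Ultra-class chip
densities_gbit = [128, 192]      # per-package densities in gigabits (Gb)

for d in densities_gbit:
    total_gb = packages * d / 8  # convert Gb -> GB and multiply out
    print(f"{packages} x {d} Gb packages = {total_gb:.0f} GB total")
# 8 x 192 Gb = 192 GB, which is why 192 GB looks like the ceiling this generation.
```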
 
I just want to know when this guy's coming back.
[Attached image: Xserve_Mid_2002.jpeg]
Just thinking about Apple-branded servers. I know they have the rack-mount Mac Pro, but I wouldn't mind an Xserve-style thing, too. Use the M4's built-in SSD for the OS and apps, and get some user-replaceable drives for data storage. Considering how small the new Mac mini is, I wonder if blade servers would be an option?

I doubt it, though. Probably too niche and too low margin to make it worthwhile for Apple. Even then, with integrated, unified memory, it would be far too expensive. I remember the last Intel Mac Pro supporting 1.5 TERAbytes of RAM, plus however many gigs of VRAM on the GPUs. Now we're topped out at 128/192 GB of unified memory.
 
I've seen this hypothetical situation referred to many times. Does anyone have some examples of AI requests that might have to leave a user's device? Not being snarky: is it possible that requests from my base M1 MBA might cross that threshold due to its 8GB of RAM? I'm not worried because I don't currently plan to use AI, but I'm intellectually curious.

M4 chips seem like a nice sweet spot for Apple, and good reason for some people to (finally) upgrade from M1.
See this informative article: https://machinelearning.apple.com/research/introducing-apple-foundation-models

So, for example, it appears that for the Summarization functionality, Mail/Messages/Notification previews will be directed to the on-device model, since specific adapters have been created and fine-tuned to make the smaller model work well for those use cases, while all other Summarization requests (e.g., a Safari web page or a full-message Mail summary) will be sent to Private Cloud Compute's larger model(s).
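Loosely, the routing described in that article looks something like this. To be clear, this is a hypothetical sketch of the decision logic only; the task names, length threshold, and function are mine, not Apple's API:

```python
# Hypothetical sketch of how a summarization request might get routed.
# The adapter set and the length cutoff are illustrative assumptions.
ON_DEVICE_ADAPTERS = {"mail_preview", "message_preview", "notification_preview"}

def route_summarization(task: str, text_length: int) -> str:
    """Return where a summarization request would plausibly run."""
    if task in ON_DEVICE_ADAPTERS and text_length < 4_000:
        return "on-device foundation model (fine-tuned adapter)"
    # Longer content (full emails, Safari pages) goes to the larger server model.
    return "Private Cloud Compute model"

print(route_summarization("notification_preview", 600))   # on-device
print(route_summarization("safari_page", 25_000))         # Private Cloud Compute
```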
 