
sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
Maybe once they make the Mac Pro, they might be able to ditch x86 servers and replace them with their own chips.
 

jasoncarle

Suspended
Jan 13, 2006
623
460
Minnesota
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
But I don't think they can replace Amazon. Amazon's server scale is beyond Apple's reach.
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
Could an Apple server chip based on the M1 be better than AWS's Graviton?
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Could an Apple server chip based on the M1 be better than AWS's Graviton?

The M1 die ? No.
Even the M1 Max. No.

If you pull just the P cores out and build an entirely new uncore portion of the die top to bottom, is it really based on the M1 anymore? No; that isn't an accurate description. If you chop off 80% of the die and replace it with something else, then you pretty much have something effectively new. It is myopic to view the M1 die as primarily just the P cores, with the rest as "window dressing".

The soldered-on RAM dies won't scale in capacity to match Graviton 2. The I/O doesn't scale. The core count doesn't scale. The internal bus network probably doesn't scale to 64+ P cores.

Could it beat Graviton 2 in some single-threaded drag race (Geekbench ST)? Probably. But is that the point in a highly multi-tenant workload setup? No.
 
  • Like
Reactions: Xiao_Xi

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Maybe once they make Mac Pro, they might be able to ditch x86 server for their own chips to replace.

Their Mac Pro with a custom chip is still going to be substantively skewed toward providing the best single-user performance as a single-user workstation.

The workload for a high-end server in a cloud environment, where you are trying to service workloads from multiple clients concurrently, is substantively different.


TLB+CLR thrash. Graviton2 gets to about 100-108 ns latency at over 16 MB. Likewise, full random is fine until past 16 MB (the log-scale graph).

lat-log-G.png





The M1 Max, with a next-generation memory subsystem, a next-generation process node (N5 instead of N7), and twice the memory channels, is in the same zone for full random above 16 MB.

Latency-M1-Max.png


Neoverse N2 will be better. Even Neoverse N1 could be done better than what Amazon did.

Ampere 80
latency-q80-mono.png




Latencies here stay under 100 ns even out to over 525 MB. Graviton2 isn't what Apple has to beat to be competitive.
Amazon probably has the least expensive server CPU costs out there; not the best.

Single-threaded drag racing at the relatively small test depths above... yes, the M1 will be substantively better. It is tuned that way. That isn't a mainstream heavy cloud-service workload, though.
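The latency graphs above come from pointer-chasing microbenchmarks. A minimal sketch of the idea in Python (interpreter overhead swamps the absolute numbers, so treat this as an illustration of the access pattern, not a substitute for the C-based tools behind those plots):

```python
import random
import time

def build_cycle(n):
    """Next-index array forming one random cycle over n slots, so the
    walk touches every slot before repeating (defeats prefetchers)."""
    order = list(range(n))
    random.shuffle(order)
    nxt = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt

def chase_ns(nxt, iters=200_000):
    """Average nanoseconds per dependent load: each access depends on
    the previous result, so latency (not bandwidth) dominates."""
    i = 0
    t0 = time.perf_counter_ns()
    for _ in range(iters):
        i = nxt[i]
    return (time.perf_counter_ns() - t0) / iters

# Compare a cache-resident working set to one far past 16 MB, e.g.:
#   small = chase_ns(build_cycle(1 << 10))
#   big   = chase_ns(build_cycle(1 << 24))
```

The dependent chain is the whole trick: because every load's address comes from the previous load's result, the CPU can't overlap the misses, and the measured time per step approaches raw memory latency once the working set outgrows the caches.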
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Anything Apple makes for the server will almost certainly be Apple-user focused first. Apple isn't going to compete with AWS to host your Postgres/web services. They're going to offer Apple Silicon Cloud to Apple users to accelerate their workflows first.

Think about businesses that used to have racks of Macs in a closet. Those will hugely benefit from an Apple Silicon Cloud.

Then think about the worker who can carry around a MacBook Air with 20 hours of battery life and tap into a 40/128 chip with a single swipe on the trackpad.

Apple can also offer application developers access to integrate cloud acceleration directly into their apps. For example, the application is installed locally on your computer but for certain functions on the app, you can have it use Apple Silicon in the cloud.

Apple Silicon isn't going to be competing with Graviton2 or Neoverse. Those are designed to be low-cost, low ST performance chips used in microservices and web servers where handling the number of tiny requests is more important than raw performance.
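The cloud-acceleration idea above (app runs locally, certain functions offload to Apple Silicon in the cloud) could look something like this sketch. Everything here is hypothetical — no such Apple API exists — and `run`, `render`, and the RPC hook are names I made up for illustration:

```python
def run(task, args, remote=None):
    """Run task(*args) on a cloud backend when one is supplied and
    reachable; otherwise fall back to the local machine's silicon."""
    if remote is not None:
        try:
            return remote(task.__name__, args)  # hypothetical RPC hook
        except OSError:
            pass  # network trouble: quietly fall back to local compute
    return task(*args)

def render(frames):
    """Stand-in for a heavy, accelerator-friendly function."""
    return [f * 2 for f in frames]

# Local only:
assert run(render, ([1, 2, 3],)) == [2, 4, 6]
# With a cloud backend, pass remote=<some RPC client callable>.
```

The design point is that the app code calls one entry point and never cares where the work ran; the fallback keeps the feature usable offline.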
 
Last edited:

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
AWS's Graviton2 uses Neoverse N1 CPU microarchitecture.
Even Neoverse N1 could be done better than what Amazon did.
What do you mean by that? Does it mean that another company could have designed a better SOC than AWS?
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
Anything Apple makes for the server will almost certainly be Apple-user focused first. Apple isn't going to compete with AWS to host your Postgres/web services. They're going to offer Apple Silicon Cloud to Apple users to accelerate their workflows first.
What could server chips for Apple's internal use look like? Could Apple be designing chips for running Apache Cassandra clusters or the App Store?
 

vigilant

macrumors 6502a
Aug 7, 2007
715
288
Nashville, TN
MLID is reliable? Don't think so.


He's just a clickbait YouTube guy who spits out lots of guesses (often contradicting himself), hypes the hits, and pretends the misses never happened.
Coming in late to this thread, but wanted to contribute.

“Moore’s Law Is Dead“ isn’t clickbait. The host often misses specific contexts because his focus is largely on gaming.

So let me break this down the best way I can. For iCloud to go 100% internal, Apple would have to rent space in existing data centers to support their users at the edge in a reliable way, which is something they 100% could do. The logistics of that, however, would cost hundreds of millions if not billions of dollars.

Apple largely tries to outsource the things others do better to those companies. I have zero doubt Apple will continue to build iCloud on AWS and GCP. Why? Why build data centers when Amazon and Google provide many of the building blocks they need on a consumption basis, in a way that scales globally?

I am BEYOND happy to be wrong on this. Truly, I am. I’d LOVE for Apple to build a server product.

Do you know what I see when I look at the various configurations of the Mac Pro, though? I see an I/O-intensive single-socket server that could easily be slid into a rack. What does the Mac Pro have? An option to get it rack mounted.

I enjoy the content from “Moore’s Law Is Dead”, for many reasons. It’s important to remember that their target audience is gamers.

If you show a Gaming enthusiast a rig like the Mac Pro that has a rack mounted option, they’ll probably assume it’s a Server.

To that point, the "Moore's Law Is Dead" host today posted a video covering AMD's new EPYC chips set to go into mass production. He says specifically that he doesn't focus on servers, but he's covering it because of the long-term ramifications for gaming.

He does have intelligent points, but he does often misunderstand what’s going on outside of gaming and tries to bring it down to his frame of reference.

Is he the most brilliant commentator? No. As someone that's watched him for a couple of months, I can tell you that his perspective is narrow; at least in my opinion, he says that pretty clearly.

If you show someone whose focus is on gaming a machine that can be rack mounted (like the Mac Pro) and has 4 M1 Maxes in it, they'll probably assume it's a server. Why? Because Fortnite probably can't use something that in many cases is faster than the NVIDIA stuff coming out of the pipe, let alone 4x that.

Apple is more likely to work with cloud providers to get pieces of their server stack working well in Kubernetes and serverless environments than to spend a ton of money building everything themselves just to avoid being on AWS and GCP.

If Apple wanted to literally own everything for service transactions, they would have built their own fab and their own manufacturing plants a long time ago. TSMC and Foxconn are both happy to take the money in high volumes and agree to Apple's terms. Outside of having one primary massive-scale data center (which I believe they built when the sapphire company they put $400 million into failed to produce, and they turned its facilities into a data center), there is little to no reason to think Apple would spend that same kind of money globally when they can do what they do with TSMC and Foxconn and rent capacity on a per-millisecond basis.
 

vigilant

macrumors 6502a
Aug 7, 2007
715
288
Nashville, TN
MLID uses three degrees of confidence (very high confidence, high confidence, and mostly confident) in his slides. So, I can expect that he misses more when he has less confidence in his info.

Anyway, for the sake of the conversation, let's pretend that MLID is correct and Apple is designing its server chips.

Does it make sense to have CPU, GPU and RAM in the same SOC for a server chip? From my limited knowledge, cloud computing requires a lot of flexibility. The requirements for a rendering farm are different to an Apache Cassandra cluster.
I largely agree; cloud computing assumes flexibility. I don't know how macOS is doing its virtualization, but I will tell you that as a workstation it's the best solution I've used, out of the many workstation laptops and Macs I've experimented with over the years.

It “could” be done. I have little doubt about that. I don’t think it makes sense for them though.

I have no doubt the hypervisor framework grown out in scale could do what AWS and GCP does. But on return on investment, unless they tell an incredibly competitive story, I just don’t see it happening.

It costs a lot of money to build out an actual competitive Cloud platform. More money than what was probably spent to secure 5nm for a couple of years from TSMC.
 

Adarna

Suspended
Jan 1, 2015
685
429
Using TSMC's chiplet tech, it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores.

The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.

| Apple silicon chip | M1 Max Jade-16C | M1 Max Jade-9C | M1 Max Jade-4C | M1 Max Jade-2C | M1 Max | M1 Max Jade-49C |
| --- | --- | --- | --- | --- | --- | --- |
| Form factor (as of 9 Nov 2021) | AMD Zen 4 EPYC 128-core CPU rival | AMD Zen 4 EPYC 128-core CPU rival | Mac Pro, iMac Pro | Mac mini Pro, Mac Pro, iMac Pro | MBP 14", MBP 16", Mac mini Pro, iMac 24", iMac Pro | 300 mm silicon wafer |
| Launch | >Speculation< | >Speculation< | Q2 or Q4 2022 | Q2 or Q4 2022 | Q4 2021 | >Speculation< |
| # of dies | 16 | 9 | 4 | 2 | 1 | 49 |
| CPU cores | 160 | 90 | 40 | 20 | 10 | 490 |
| Performance cores | 128 | 72 | 32 | 16 | 8 | 392 |
| Efficiency cores | 32 | 18 | 8 | 4 | 2 | 98 |
| GPU cores | 512 | 288 | 128 | 64 | 32 | 1568 |
| Neural Engine cores | 256 | 144 | 64 | 32 | 16 | 784 |
| Memory bandwidth | 6.4 TB/s | 3.6 TB/s | 1.6 TB/s | 800 GB/s | 400 GB/s | 19.6 TB/s |
| Max memory | 1024 GB | 576 GB | 256 GB | 128 GB | 64 GB | 3,136 GB |
| Hardware-accelerated H.264, HEVC, ProRes, and ProRes RAW engines | 16 | 9 | 4 | 2 | 1 | 49 |
| Video decode engines | 16 | 9 | 4 | 2 | 1 | 49 |
| Video encode engines | 32 | 18 | 8 | 4 | 2 | 98 |
| ProRes encode and decode engines | 32 | 18 | 8 | 4 | 2 | 98 |
| Transistors at TSMC's 5nm (2020) peak quoted density, 171.3 million/mm² | 912 billion | 513 billion | 228 billion | 114 billion | 57 billion | 5.19 trillion |
| Estimated die size | 17.004 cm² | 12.753 cm² | 8.502 cm² | 6.3765 cm² | 4.251 cm² | 30 cm² |
| Transistors at IBM's 2nm (2025) peak quoted density, 333.33 million/mm² | 1.774 trillion | 998.24 billion | 443.66 billion | 221.83 billion | 110.92 billion | 10.1 trillion |
| Estimated AMD Ryzen 9 5950X performance | 16x | 9x | 4x | 2x | 1x | 49x |
| Estimated RTX 3080 performance | 16x | 9x | 4x | 2x | 1x | 49x |
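The multi-die columns in the table above are straight linear multiples of the shipping M1 Max column. A quick sketch makes that scaling rule explicit (the dictionary keys are my own labels, and the figures are the table's, not Apple's):

```python
# Per-die baseline: the M1 Max column of the table.
M1_MAX = {
    "cpu_cores": 10,
    "p_cores": 8,
    "e_cores": 2,
    "gpu_cores": 32,
    "neural_cores": 16,
    "bandwidth_gb_s": 400,
    "max_memory_gb": 64,
    "transistors_billion": 57,
}

def scale(dies):
    """Each multi-die column is a linear multiple of one die."""
    return {k: v * dies for k, v in M1_MAX.items()}

# The Jade-16C column, reproduced by the rule:
jade_16c = scale(16)
assert jade_16c["cpu_cores"] == 160            # 128 P + 32 E
assert jade_16c["bandwidth_gb_s"] == 6400      # i.e. 6.4 TB/s
assert jade_16c["transistors_billion"] == 912
```

Linear scaling is of course the optimistic case; it assumes the interconnect, memory topology, and power delivery all scale with the die count, which is exactly what the earlier posts question.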
 
  • Like
Reactions: Xiao_Xi

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Their Mac Pro with a custom chip is still going to be substantively skewed toward providing the best single-user performance as a single-user workstation.

[...]

Latencies here stay under 100 ns even out to over 525 MB. Graviton2 isn't what Apple has to beat to be competitive. Amazon probably has the least expensive server CPU costs out there; not the best.

Latency to RAM depends on the RAM... and obviously regular DDR has lower latency than LPDDR. But why is it even that interesting? If you are hitting cache so rarely that RAM latency starts to matter your CPU performance is already down the drain anyway...
 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
Using TSMC's chiplet tech it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores together.

The 17.004cm² die could fit into the 19.7cm² Mac mini enclosure although it would need at a 1-1.4kW PSU to power

[table snipped]
What if Apple Silicon uses HBM2 or HBM3?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.​

Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p
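That 1-1.4 kW figure is consistent with just multiplying per-package power by the die count. A quick check (the 60-90 W per-die draw is my assumption for an M1 Max package under combined CPU+GPU load, not a measured Apple spec):

```python
# Linear power scaling behind the 1-1.4 kW PSU figure.
# Assumption: each M1 Max package draws roughly 60-90 W flat out.
per_die_watts = (60, 90)
dies = 16
low, high = (w * dies for w in per_die_watts)
assert (low, high) == (960, 1440)  # ≈ 1-1.4 kW, matching the quoted PSU
```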
 
  • Like
Reactions: Adarna

Adarna

Suspended
Jan 1, 2015
685
429
Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p
$16,000?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
Using TSMC's chiplet tech, it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores.

The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.​

Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p


For 16 maxed-out M1 Max SoCs that would be a bargain...!

16-way M1 Max MCM
160-core CPU (128P/32E)
512-core GPU
256-core Neural Engine
1TB LPDDR5 RAM
6.4TB/s memory bandwidth
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
AWS's Graviton2 uses Neoverse N1 CPU microarchitecture.

What do you mean by that? Does it mean that another company could have designed a better SOC than AWS?

The Ampere Altra 80 (and 128) are also based on N1, and Ampere (another company) does a better job with an internal interconnect that scales better. That is what the rest of the post covered.

[ Some folks might read Ampere and think Nvidia's codename for their current GPU microarch family. There is a company with that name.
https://amperecomputing.com/altra/

Unlike Amazon/AWS, they are quite willing to sell to individual cloud services vendors. Got a major stack of cash... you too can buy some. There is no impediment here to Apple just buying something, or commissioning a semi-custom future version that they slap their logo on for internal use. ]
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
@deconstruct60 Thank you!

AMD has two types of GPU: CDNA for scientific computing and RDNA for gaming/rendering.

Which type of GPU should Apple's server chips have: CDNA-like GPU or RDNA-like GPU?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
@deconstruct60 Thank you!

AMD has two types of GPU: CDNA for scientific computing and RDNA for gaming/rendering.

Which type of GPU should Apple's server chips have: CDNA-like GPU or RDNA-like GPU?

One would think that answer would depend on what Apple plans on doing with these hypothetical servers...?

Crunching data, probably CDNA-like; on-demand streaming gaming, gotta be RDNA-like...?
 

jjcs

Cancelled
Oct 18, 2021
317
153
I'm retired from being a Defense Contractor and been to many places and seen them with mine own eyes walking back to my section for a new Unix crypto device worked! So don't assume you know everything!
Well, I'm not retired. An M1 SOC Mini is not a competitive supercomputer node. Sorry. Too few cores, inadequate memory, limited to ethernet.... I don't think HPE is worried. Vastly different market. I thought you were joking.

If I'm parsing "mine own eyes walking back to my section for a new Unix crypto device worked!" correctly, that's not a supercomputer.
 
Last edited:

Adarna

Suspended
Jan 1, 2015
685
429
For 16 maxed-out M1 Max SoCs that would be a bargain...!

16-way M1 Max MCM
160-core CPU (128P/32E)
512-core GPU
256-core Neural Engine
1TB LPDDR5 RAM
6.4TB/s memory bandwidth
Let us double it to $32,000, then. ;) Default storage would be 8TB?
 