
sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
Maybe once they make the Mac Pro, they might be able to ditch x86 servers and replace them with their own chips.
 

jasoncarle

Suspended
Jan 13, 2006
623
460
Minnesota
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
But I don't think they can replace Amazon. Amazon's server scale is beyond Apple's reach.
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
If Apple is doing this, it is so they can run their services on their own systems, and do it more efficiently and at less cost than using outside service providers.
Could an Apple server chip based on the M1 be better than AWS's Graviton?
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Could an Apple server chip based on the M1 be better than AWS's Graviton?

The M1 die ? No.
Even the M1 Max. No.

If you pull just the P cores out and build an entirely new uncore portion of the die top to bottom, is it really based on the M1 anymore? No; that isn't an accurate description. If you chop off 80% of the die and replace it with something else, then you pretty much have something effectively new. It is myopic to view the M1 die as primarily just the P cores, with the rest as "window dressing".

The soldered-on RAM dies won't scale in capacity to match Graviton 2. The I/O doesn't scale. The core count doesn't scale. The internal bus network probably doesn't scale to 64+ P cores.

Could it beat Graviton 2 in some single-threaded drag race (Geekbench ST)? Probably. But is that the point in a highly multi-tenant workload setup? No.
 
  • Like
Reactions: Xiao_Xi

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Maybe once they make Mac Pro, they might be able to ditch x86 server for their own chips to replace.

Their Mac Pro with a custom chip is still going to be substantively skewed toward providing the best single-user performance as a single-user workstation.

The workload for a high-end server in a cloud environment, where you are trying to service workloads from multiple clients concurrently, is substantively different.


TLB+CLR thrash. Graviton2 gets to about 100-108 ns latency at over 16 MB. Likewise, full random is fine until past 16 MB (the log-scale graph).

lat-log-G.png





The M1 Max, with a next-generation memory subsystem, a next-generation process node (N5 instead of N7), and twice the memory channels, is in the same zone for full random above 16 MB.

Latency-M1-Max.png


Neoverse N2 will be better. Even Neoverse N1 could be done better than what Amazon did.

Ampere 80
latency-q80-mono.png




Latencies here stay under 100 ns even out to over 525 MB. Graviton2 isn't what Apple has to beat to be competitive.
Amazon probably has the least expensive server CPU costs out there; not the best.

Single-threaded drag racing at the relatively small test depths above... yes, the M1 will be substantively better. It is tuned that way. That isn't a mainstream heavy cloud-service workload, though.
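The latency graphs above come from pointer-chasing microbenchmarks. A minimal sketch of the idea in Python (interpreter overhead swamps the absolute numbers, so treat this as an illustration of the access pattern, not a substitute for the C-based tools behind those plots):

```python
import random
import time

def build_cycle(n):
    """Next-index array forming one random cycle over n slots, so the
    walk touches every slot before repeating (defeats prefetchers)."""
    order = list(range(n))
    random.shuffle(order)
    nxt = [0] * n
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt

def chase_ns(nxt, iters=200_000):
    """Average nanoseconds per dependent load: each access depends on
    the previous result, so latency (not bandwidth) dominates."""
    i = 0
    t0 = time.perf_counter_ns()
    for _ in range(iters):
        i = nxt[i]
    return (time.perf_counter_ns() - t0) / iters

# Compare a cache-resident working set to one far past 16 MB, e.g.:
#   small = chase_ns(build_cycle(1 << 10))
#   big   = chase_ns(build_cycle(1 << 24))
```

The dependent chain is the whole trick: because every load's address comes from the previous load's result, the CPU can't overlap the misses, and the measured time per step approaches raw memory latency once the working set outgrows the caches.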
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Anything Apple makes for the server will almost certainly be Apple-user focused first. Apple isn't going to compete with AWS to host your Postgres/web services. They're going to offer Apple Silicon Cloud to Apple users to accelerate their workflows first.

Think about businesses that used to have racks of Macs in a closet. Those will hugely benefit from an Apple Silicon Cloud.

Then think about the worker who can carry around a MacBook Air with 20 hours of battery life and tap into a 40/128 chip with a single swipe on the trackpad.

Apple can also offer application developers access to integrate cloud acceleration directly into their apps. For example, the application is installed locally on your computer but for certain functions on the app, you can have it use Apple Silicon in the cloud.

Apple Silicon isn't going to be competing with Graviton2 or Neoverse. Those are designed to be low-cost, low ST performance chips used in microservices and web servers where handling the number of tiny requests is more important than raw performance.
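The cloud-acceleration idea above (app runs locally, certain functions offload to Apple Silicon in the cloud) could look something like this sketch. Everything here is hypothetical — no such Apple API exists — and `run`, `render`, and the RPC hook are names I made up for illustration:

```python
def run(task, args, remote=None):
    """Run task(*args) on a cloud backend when one is supplied and
    reachable; otherwise fall back to the local machine's silicon."""
    if remote is not None:
        try:
            return remote(task.__name__, args)  # hypothetical RPC hook
        except OSError:
            pass  # network trouble: quietly fall back to local compute
    return task(*args)

def render(frames):
    """Stand-in for a heavy, accelerator-friendly function."""
    return [f * 2 for f in frames]

# Local only:
assert run(render, ([1, 2, 3],)) == [2, 4, 6]
# With a cloud backend, pass remote=<some RPC client callable>.
```

The design point is that the app code calls one entry point and never cares where the work ran; the fallback keeps the feature usable offline.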
 
Last edited:

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
AWS's Graviton2 uses Neoverse N1 CPU microarchitecture.
Even Neoverse N1 could be done better than what Amazon did.
What do you mean by that? Does it mean that another company could have designed a better SOC than AWS?
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
Anything Apple makes for the server will almost certainly be Apple-user focused first. Apple isn't going to compete with AWS to host your Postgres/web services. They're going to offer Apple Silicon Cloud to Apple users to accelerate their workflows first.
What could server chips for Apple's internal use look like? Could Apple be designing chips for running Apache Cassandra clusters or the App Store?
 

vigilant

macrumors 6502a
Aug 7, 2007
715
288
Nashville, TN
MLID is reliable? Don't think so.


He's just a clickbait YouTube guy who spits out lots of guesses (often contradicting himself), hypes the hits, and pretends the misses never happened.
Coming in late to this thread, but wanted to contribute.

“Moore’s Law Is Dead“ isn’t clickbait. The host often misses specific contexts because his focus is largely on gaming.

So let me break this down the best way I can. For iCloud to go 100% internal, Apple would have to rent space in existing data centers to support their users at the edge in a reliable way, which is something they 100% could do. The logistics of that, however, would cost hundreds of millions if not billions of dollars.

Apple largely tries to outsource the things others do better to those companies. I have zero doubt Apple will continue to build iCloud on AWS and GCP. Why? Why build data centers when Amazon and Google provide many of the building blocks they need on a consumption basis, in a way that scales globally?

I am BEYOND happy to be wrong on this. Truly, I am. I’d LOVE for Apple to build a server product.

Do you know what I see when I look at the various configurations of the Mac Pro, though? I see an I/O-intensive single-socket server that could easily be slid into a rack. What does the Mac Pro have? An option to get it rack mounted.

I enjoy the content from “Moore’s Law Is Dead”, for many reasons. It’s important to remember that their target audience is gamers.

If you show a Gaming enthusiast a rig like the Mac Pro that has a rack mounted option, they’ll probably assume it’s a Server.

To that point, the "Moore's Law Is Dead" host today posted a video covering AMD's new EPYC chips set to go into mass production. He says specifically that he doesn't focus on servers, but he's covering it because of the long-term ramifications for gaming.

He does have intelligent points, but he does often misunderstand what’s going on outside of gaming and tries to bring it down to his frame of reference.

Is he the most brilliant commentator? No. As someone that's watched him for a couple of months, I can tell you that his perspective is narrow; at least in my opinion, he says that pretty clearly.

If you show someone whose focus is on gaming a machine that can be rack mounted (like the Mac Pro) and has 4 M1 Maxes in it, they'll probably assume it's a server. Why? Because Fortnite probably can't use something that in many cases is faster than the NVIDIA stuff coming out of the pipe, let alone 4x that.

Apple is more likely to work with cloud providers to get pieces of their server stack working well in Kubernetes and serverless environments than to spend a ton of money building everything themselves just to avoid being on AWS and GCP.

If Apple wanted to literally own everything for service transactions, they would have built their own fab and their own manufacturing plants a long time ago. TSMC and Foxconn are both happy to take the money in high volumes and agree to Apple's terms. Outside of having one primary massive-scale data center (which I believe they built when the sapphire company they put $400 million into failed to produce, and they turned its facilities into a data center), there is little to no reason to think Apple would spend that same kind of money globally when they can do what they do with TSMC and Foxconn and rent capacity on a per-millisecond basis.
 

vigilant

macrumors 6502a
Aug 7, 2007
715
288
Nashville, TN
MLID uses three degrees of confidence (very high confidence, high confidence, and mostly confident) in his slides. So, I can expect that he misses more when he has less confidence in his info.

Anyway, for the sake of the conversation, let's pretend that MLID is correct and Apple is designing its server chips.

Does it make sense to have CPU, GPU and RAM in the same SOC for a server chip? From my limited knowledge, cloud computing requires a lot of flexibility. The requirements for a rendering farm are different to an Apache Cassandra cluster.
I largely agree; cloud computing assumes flexibility. I don't know how macOS is doing its virtualization, but I will tell you that as a workstation it's the best solution I've used, out of the many workstation laptops and Macs I've experimented with over the years.

It “could” be done. I have little doubt about that. I don’t think it makes sense for them though.

I have no doubt the hypervisor framework grown out in scale could do what AWS and GCP does. But on return on investment, unless they tell an incredibly competitive story, I just don’t see it happening.

It costs a lot of money to build out an actual competitive Cloud platform. More money than what was probably spent to secure 5nm for a couple of years from TSMC.
 

Adarna

Suspended
Jan 1, 2015
685
429
Using TSMC's chiplet tech, it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores.

The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.

| Apple silicon chip | M1 Max Jade-16C | M1 Max Jade-9C | M1 Max Jade-4C | M1 Max Jade-2C | M1 Max | M1 Max Jade-49C |
| --- | --- | --- | --- | --- | --- | --- |
| Form factor (as of 9 Nov 2021) | AMD Zen 4 EPYC 128-core CPU rival | AMD Zen 4 EPYC 128-core CPU rival | Mac Pro, iMac Pro | Mac mini Pro, Mac Pro, iMac Pro | MBP 14", MBP 16", Mac mini Pro, iMac 24", iMac Pro | 300 mm silicon wafer |
| Launch | >Speculation< | >Speculation< | Q2 or Q4 2022 | Q2 or Q4 2022 | Q4 2021 | >Speculation< |
| # of dies | 16 | 9 | 4 | 2 | 1 | 49 |
| CPU cores | 160 | 90 | 40 | 20 | 10 | 490 |
| Performance cores | 128 | 72 | 32 | 16 | 8 | 392 |
| Efficiency cores | 32 | 18 | 8 | 4 | 2 | 98 |
| GPU cores | 512 | 288 | 128 | 64 | 32 | 1568 |
| Neural Engine cores | 256 | 144 | 64 | 32 | 16 | 784 |
| Memory bandwidth | 6.4 TB/s | 3.6 TB/s | 1.6 TB/s | 800 GB/s | 400 GB/s | 19.6 TB/s |
| Max memory | 1024 GB | 576 GB | 256 GB | 128 GB | 64 GB | 3,136 GB |
| Hardware-accelerated H.264, HEVC, ProRes, and ProRes RAW engines | 16 | 9 | 4 | 2 | 1 | 49 |
| Video decode engines | 16 | 9 | 4 | 2 | 1 | 49 |
| Video encode engines | 32 | 18 | 8 | 4 | 2 | 98 |
| ProRes encode and decode engines | 32 | 18 | 8 | 4 | 2 | 98 |
| Transistors at TSMC's 5nm (2020) peak quoted density, 171.3 million/mm² | 912 billion | 513 billion | 228 billion | 114 billion | 57 billion | 5.19 trillion |
| Estimated die size | 17.004 cm² | 12.753 cm² | 8.502 cm² | 6.3765 cm² | 4.251 cm² | 30 cm² |
| Transistors at IBM's 2nm (2025) peak quoted density, 333.33 million/mm² | 1.774 trillion | 998.24 billion | 443.66 billion | 221.83 billion | 110.92 billion | 10.1 trillion |
| Estimated AMD Ryzen 9 5950X performance | 16x | 9x | 4x | 2x | 1x | 49x |
| Estimated RTX 3080 performance | 16x | 9x | 4x | 2x | 1x | 49x |
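The multi-die columns in the table above are straight linear multiples of the shipping M1 Max column. A quick sketch makes that scaling rule explicit (the dictionary keys are my own labels, and the figures are the table's, not Apple's):

```python
# Per-die baseline: the M1 Max column of the table.
M1_MAX = {
    "cpu_cores": 10,
    "p_cores": 8,
    "e_cores": 2,
    "gpu_cores": 32,
    "neural_cores": 16,
    "bandwidth_gb_s": 400,
    "max_memory_gb": 64,
    "transistors_billion": 57,
}

def scale(dies):
    """Each multi-die column is a linear multiple of one die."""
    return {k: v * dies for k, v in M1_MAX.items()}

# The Jade-16C column, reproduced by the rule:
jade_16c = scale(16)
assert jade_16c["cpu_cores"] == 160            # 128 P + 32 E
assert jade_16c["bandwidth_gb_s"] == 6400      # i.e. 6.4 TB/s
assert jade_16c["transistors_billion"] == 912
```

Linear scaling is of course the optimistic case; it assumes the interconnect, memory topology, and power delivery all scale with the die count, which is exactly what the earlier posts question.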
 
  • Like
Reactions: Xiao_Xi

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Their Mac Pro with a custom chip is still going to be substantively skewed toward providing the best single-user performance as a single-user workstation.

[...]

Latencies here stay under 100 ns even out to over 525 MB. Graviton2 isn't what Apple has to beat to be competitive. Amazon probably has the least expensive server CPU costs out there; not the best.

Latency to RAM depends on the RAM... and obviously regular DDR has lower latency than LPDDR. But why is it even that interesting? If you are hitting cache so rarely that RAM latency starts to matter your CPU performance is already down the drain anyway...
 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
Using TSMC's chiplet tech it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores together.

The 17.004cm² die could fit into the 19.7cm² Mac mini enclosure although it would need at a 1-1.4kW PSU to power

[table snipped]
What if Apple Silicon uses HBM2 or HBM3?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.​

Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p
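That 1-1.4 kW figure is consistent with just multiplying per-package power by the die count. A quick check (the 60-90 W per-die draw is my assumption for an M1 Max package under combined CPU+GPU load, not a measured Apple spec):

```python
# Linear power scaling behind the 1-1.4 kW PSU figure.
# Assumption: each M1 Max package draws roughly 60-90 W flat out.
per_die_watts = (60, 90)
dies = 16
low, high = (w * dies for w in per_die_watts)
assert (low, high) == (960, 1440)  # ≈ 1-1.4 kW, matching the quoted PSU
```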
 
  • Like
Reactions: Adarna

Adarna

Suspended
Jan 1, 2015
685
429
Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p
$16,000?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
Using TSMC's chiplet tech, it is possible to mask-stitch 9-16 M1 Max dies together to reach 90-160 CPU cores.

The 17.004 cm² die could fit into the Mac mini's 19.7 cm × 19.7 cm enclosure, although it would need a 1-1.4 kW PSU to power it.​

Make that Mac mini taller, say 9.8", and you would have room for that PSU & a heat sink filling the remaining interior volume; something about the size of the G4 Cube...! ;^p


For 16 maxed-out M1 Max SoCs that would be a bargain...!

16-way M1 Max MCM
160-core CPU (128P/32E)
512-core GPU
256-core Neural Engine
1TB LPDDR5 RAM
6.4TB/s memory bandwidth
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
AWS's Graviton2 uses Neoverse N1 CPU microarchitecture.

What do you mean by that? Does it mean that another company could have designed a better SOC than AWS?

The Ampere Altra 80 (and 128) are also based on N1, and Ampere (another company) does a better job with an internal interconnect that scales better. That is what the rest of the post covered.

[ Some folks might read Ampere and think Nvidia's codename for their current GPU microarch family. There is a company with that name.
https://amperecomputing.com/altra/

Unlike Amazon/AWS, they are quite willing to sell to individual cloud services vendors. Got a major stack of cash... you too can buy some. There is no impediment here to Apple just buying something, or commissioning a semi-custom future version that they slap their logo on for internal use. ]
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,628
1,101
@deconstruct60 Thank you!

AMD has two types of GPU: CDNA for scientific computing and RDNA for gaming/rendering.

Which type of GPU should Apple's server chips have: CDNA-like GPU or RDNA-like GPU?
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
@deconstruct60 Thank you!

AMD has two types of GPU: CDNA for scientific computing and RDNA for gaming/rendering.

Which type of GPU should Apple's server chips have: CDNA-like GPU or RDNA-like GPU?

One would think that answer would depend on what Apple plans on doing with these hypothetical servers...?

Crunching data, probably CDNA-like; on-demand streaming gaming, gotta be RDNA-like...?
 

jjcs

Cancelled
Oct 18, 2021
317
153
I'm retired from being a Defense Contractor and been to many places and seen them with mine own eyes walking back to my section for a new Unix crypto device worked! So don't assume you know everything!
Well, I'm not retired. An M1 SOC Mini is not a competitive supercomputer node. Sorry. Too few cores, inadequate memory, limited to ethernet.... I don't think HPE is worried. Vastly different market. I thought you were joking.

If I'm parsing "mine own eyes walking back to my section for a new Unix crypto device worked!" correctly, that's not a supercomputer.
 
Last edited:

Adarna

Suspended
Jan 1, 2015
685
429
For 16 maxed-out M1 Max SoCs that would be a bargain...!

16-way M1 Max MCM
160-core CPU (128P/32E)
512-core GPU
256-core Neural Engine
1TB LPDDR5 RAM
6.4TB/s memory bandwidth
Let us double it to $32,000, then. ;) Default storage would be 8TB?
 