Status
Not open for further replies.

Appletoni

Suspended
Original poster
Mar 26, 2021
443
177
For the record the 3995 won't use over 280W of power.
Power-64Core.png

Yes the "consumer" 5000 series chips can pull more than rated TDP. It is probably why AMD doesn't include a cooling solution and recommends watercooling the 5950/5900.
Thx. Hopefully Apple can build such a big chip into the new MacBook Pro 16-inch.

Let’s talk about GPU too.
1623793114480.jpeg

It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.
 

Kung gu

Suspended
Oct 20, 2018
1,379
2,434
no
Thx. Hopefully Apple can build such a big chip into the new MacBook Pro 16-inch.

Let’s talk about GPU too.
View attachment 1793768
It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.
No, Apple won't put a 200+ watt CPU in a 16" MBP, and no, the 16" won't come with 1000+ GPU cores.

Sorry to burst your bubble.
 
  • Like
Reactions: eltoslightfoot

Homy

macrumors 68030
Jan 14, 2006
2,510
2,462
Sweden
Thx. Hopefully Apple can build such a big chip into the new MacBook Pro 16-inch.

Let’s talk about GPU too.
View attachment 1793768
It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.

I thought chess players had a broad imagination and the ability to plan and calculate many future moves and scenarios in their game, seeing things from different perspectives instead of staring blindly at chess scores.

No, the MBP 16" doesn't need 2064 GPU cores to catch up with a 2060 or 3060. Just 32 GPU cores would be on par with a Radeon 5700 XT, RTX 2070 Super, or even a 2080 or 1080 Ti in many cases, such as games rather than chess benchmarks. But if chess is all that matters to you, then you already know the answer. Apple won't make a 2064-core GPU to satisfy chess players. There you have it! Stop wondering, buy those Asus chess computers, and live happily ever after. The M1 is simply not made for you or for chess benchmarks. I'm afraid even an M10 will disappoint you.
 
Last edited:

Fomalhaut

macrumors 68000
Oct 6, 2020
1,993
1,724
I thought chess players had a broad imagination and the ability to plan and calculate many future moves and scenarios in their game, seeing things from different perspectives instead of staring blindly at chess scores.

No, the MBP 16" doesn't need 2064 GPU cores to catch up with a 2060 or 3060. Just 32 GPU cores would be on par with a Radeon 5700 XT, RTX 2070 Super, or even a 2080 or 1080 Ti in many cases, such as games rather than chess benchmarks. But if chess is all that matters to you, then you already know the answer. Apple won't make a 2064-core GPU to satisfy chess players. There you have it! Stop wondering, buy those Asus chess computers, and live happily ever after. The M1 is simply not made for you or for chess benchmarks. I'm afraid even an M10 will disappoint you.

Some people insist on using the wrong tool for the job, and then love to complain about it…

1623810741303.jpeg
 

MarkC426

macrumors 68040
May 14, 2008
3,699
2,097
UK
I thought chess players had a broad imagination and the ability to plan and calculate many future moves and scenarios in their game, seeing things from different perspectives instead of staring blindly at chess scores.

No, the MBP 16" doesn't need 2064 GPU cores to catch up with a 2060 or 3060. Just 32 GPU cores would be on par with a Radeon 5700 XT, RTX 2070 Super, or even a 2080 or 1080 Ti in many cases, such as games rather than chess benchmarks. But if chess is all that matters to you, then you already know the answer. Apple won't make a 2064-core GPU to satisfy chess players. There you have it! Stop wondering, buy those Asus chess computers, and live happily ever after. The M1 is simply not made for you or for chess benchmarks. I'm afraid even an M10 will disappoint you.
I assumed anyone serious about chess would use a proper 'physical' board...?
 
  • Like
  • Haha
Reactions: Homy and dmccloud

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
Using a computer to practice and to play long-distance games is a given these days, from non-pros like myself on up to the grandmasters. And yes, they'd use those sweet benchmarks (not as benchmarks, but as real positions) to figure out new attacks and defenses. Chess has actually evolved quite a bit because of that.
 

yaxomoxay

macrumors 604
Mar 3, 2010
7,439
34,276
Texas
I thought chess players had a broad imagination and the ability to plan and calculate many future moves and scenarios in their game, seeing things from different perspectives instead of staring blindly at chess scores.
Not really. Main thing is pattern recognition.
 
  • Like
Reactions: yitwail

leman

macrumors Core
Oct 14, 2008
19,522
19,679
I thought chess players had a broad imagination and the ability to plan and calculate many future moves and scenarios in their game, seeing things from different perspectives instead of staring blindly at chess scores.

I have to disappoint you. Specialized cognitive skills are rarely transferable. Which is fascinating in itself…
 
  • Like
Reactions: Homy

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Why exactly does LC0 run slowly on M1?

Someone would need to do a code review to give an in-depth explanation, but the initial observation is that LC0 is not developed or tested for M1 (or for Mac, for that matter), does not use the correct APIs, and most likely does not use the GPU at all.
 

Appletoni

Suspended
Original poster
Mar 26, 2021
443
177
Someone would need to do a code review to give an in-depth explanation, but the initial observation is that LC0 is not developed or tested for M1 (or for Mac, for that matter), does not use the correct APIs, and most likely does not use the GPU at all.
It's not worth writing special code for the M1 when the performance will suck anyway. The M1 GPU (like all iGPUs) is not powerful. Writing special code for the M1 GPU would probably make it 2 to 5x faster, but it would still be slow compared to a real GPU.

It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.
1623997558338.jpeg
 

Fomalhaut

macrumors 68000
Oct 6, 2020
1,993
1,724
It's not worth writing special code for the M1 when the performance will suck anyway. The M1 GPU (like all iGPUs) is not powerful. Writing special code for the M1 GPU would probably make it 2 to 5x faster, but it would still be slow compared to a real GPU.

It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.
View attachment 1794795
The M1 seems to be performing quite well against other laptops with “real” GPUs in this AnandTech report: https://www.anandtech.com/show/16252/mac-mini-apple-m1-tested/3

It appears to be roughly equivalent to a GTX 1650 or RX 560X, which by all accounts are real GPUs and not figments of the imagination.

It is an entry level mobile chip that is competing well against integrated GPUs or mid-level portable dGPUs. What do you expect?
 

nieks

macrumors 6502
Apr 7, 2016
401
332
The Netherlands
And here I was, thinking this topic had run its course weeks ago.

So the topic starts on April 24th, with Appletoni essentially stating that the M1 sucks because it won't properly run the test he/she wants it to run.

And finally, 2 months and 18 pages of discussion later, he/she pops the question:

Why exactly does LC0 run slowly on M1?

Maybe you @Appletoni should have asked this 2 months ago?
And when literally everybody explains to you that the M1 needs code written for this type of chip, you just state:

It's not worth writing special code for the M1 when the performance will suck anyway.

Your tests are NOT made for the M1. Therefore, they will run quite badly on computers with this chip. That's not a fault of the M1 chip design; that's a fault of the design of the test. That's your fault, for trying to make the M1 do something it is not optimized for.

To illustrate, think of the following metaphor:
Kitchen #1 only has an oven in it. It is perfectly fine to prepare food in this oven.
Kitchen #2 only has a dishwasher. It is not suited to preparing food. The food will only get wet in there. Is that the fault of the dishwasher? No, it's your fault, because you are trying to do something in that kitchen for which that kitchen is not optimized.
 

dmccloud

macrumors 68040
Sep 7, 2009
3,146
1,902
Anchorage, AK
It's not worth writing special code for the M1 when the performance will suck anyway. The M1 GPU (like all iGPUs) is not powerful. Writing special code for the M1 GPU would probably make it 2 to 5x faster, but it would still be slow compared to a real GPU.

It looks like the MacBook Pro 16-inch needs to come with a new chip, which has 1032 or 2064 GPU cores to catch up.
Or maybe Apple will build the GPU with 500 Tensor cores.
View attachment 1794795

Get a version of that test recompiled for the M1 that takes advantage of the GPU cores on the SoC, then come back. Until that time, EVERYTHING you could possibly bring to the discussion regarding LC0 is garbage at best. The existing LC0 code was specifically written to take advantage of x86-specific instructions and features that are not present (or honestly even needed) on Apple Silicon. Because that code will always attempt to utilize those features instead of taking advantage of the M1's unique feature set and capabilities, the benchmark will NEVER be accurate in its current state.
 

Leifi

macrumors regular
Nov 6, 2021
128
121
Do we see a pattern here? When Apple silicon is lacking in performance in benchmarks like chess engines, which truly hammer the hardware with multithreaded load, the same people here blame the messenger and claim that only certain "use cases" and benchmarks are meaningful...
 
Last edited by a moderator:
  • Like
Reactions: Appletoni

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
@Leifi The issue being referred to in this thread has zero to do with multi-threaded performance. The issue has to do with GPGPU, and, as has been extensively explained, goes away when you change a setting in the configuration so that the working set is the right size for the caches.
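
The point about matching the working set to the caches can be sketched with some simple arithmetic. This is only an illustration: the function names, the 8 MiB cache size, and the 64 KiB per-position footprint are made-up assumptions, not values taken from LC0 or from the M1.

```python
def working_set_bytes(batch_size: int, bytes_per_position: int) -> int:
    """Total memory touched by one batch (hypothetical, simplified model)."""
    return batch_size * bytes_per_position


def largest_batch_that_fits(cache_bytes: int, bytes_per_position: int) -> int:
    """Largest batch size whose working set still fits in the given cache."""
    if bytes_per_position <= 0:
        raise ValueError("bytes_per_position must be positive")
    return cache_bytes // bytes_per_position


if __name__ == "__main__":
    # Assumed numbers, for illustration only.
    cache = 8 * 1024 * 1024      # 8 MiB cache (assumption)
    per_position = 64 * 1024     # 64 KiB per evaluated position (assumption)

    batch = largest_batch_that_fits(cache, per_position)
    print("largest batch that fits:", batch)
    print("working set at that batch:", working_set_bytes(batch, per_position))
    # One position more and the working set no longer fits in the cache.
    print("fits with one more:", working_set_bytes(batch + 1, per_position) <= cache)
```

Under these assumed numbers, a batch setting above the computed limit would spill out of the cache, which is the kind of configuration mismatch described above.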
 
Last edited by a moderator:
  • Like
Reactions: thekev

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Do we see a pattern here? When Apple silicon is lacking in performance in benchmarks like chess engines, which truly hammer the hardware with multithreaded load, the same people here blame the messenger and claim that only certain "use cases" and benchmarks are meaningful...


Not really. Apple Silicon performs exceedingly well in most real-world (or synthetic) tests. The problem with these chess engine tests is not that they "hammer the hw multithreaded big-time". The problem is that they either don't fully support the Apple platform (e.g. no GPU acceleration, no native SIMD) or are not well optimized and/or not tested. The same goes for benchmarks like Cinebench, which we know rely on compatibility layers that generate suboptimal code.

When discussing performance, one must carefully analyze the reasons why something is slow/fast. Where is the bottleneck? Is it running the optimal path? You cannot assess hardware performance without talking about these things.
 

Leifi

macrumors regular
Nov 6, 2021
128
121
Those benchmarks discussed are using the same code and compiling it using native compilers on M1. It is obvious to serious programmers that the main issue with Apple silicon is the sole focus on single-threaded performance (and some specific benchmarks for video/geekbench type performance). Heck, Apple silicon does not even support technologies like hyperthreading to use cores more efficiently with more threads; this is a huge limitation for hpf workloads like chess etc.

Apple's NEON instructions are also pretty weak, compared to newer ARMv9 CPUs..

Optimizations will only take you so far.. And lots of optimizations for NEON, Metal etc. have already been tried, and it is still not really very fast compared to, for example, an old 4700U....

This has been discussed extensively on developer forums for chess engines, and lots of benchmarks have been done on M1 silicon which show it is still lacking compared to cheaper alternatives from AMD..
 
Last edited:

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Those benchmarks discussed are using the same code and compiling it using native compilers on M1. It is obvious to serious programmers that the main issue with Apple silicon is the sole focus on single-threaded performance (and some specific benchmarks for video/geekbench type performance). Heck, Apple silicon does not even support technologies like hyperthreading to use cores more efficiently with more threads; this is a huge limitation for hpf workloads like chess etc.

Apple's NEON instructions are also pretty weak, compared to newer ARMv9 CPUs..

Optimizations will only take you so far.. And lots of optimizations for NEON, Metal etc. have already been tried, and it is still not really very fast compared to, for example, an old 4700U....

This has been discussed extensively on developer forums for chess engines, and lots of benchmarks have been done on M1 silicon which show it is still lacking compared to cheaper alternatives from AMD..

What?

Hyperthreading is only necessary when you have so poorly designed your instruction dispatch that you can’t keep your ALUs busy. M1 has tremendous IPC. Its execution units are always busy. Adding hyperthreading would slow the CPU down because it would cause unnecessary context switching.

As per forum rules, please cite your source for the statement beginning “it is obvious…”
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Those benchmarks discussed are using the same code and compiling it using native compilers on M1.

Code which is not optimal for Apple Silicon or even does not do the same thing (as in case of GPU-based benchmarks).

It is obvious to serious programmers that the main issue with Apple silicon is the sole focus on single-threaded performance (and some specific benchmarks for video/geekbench type performance).

I suppose you are one of those "serious programmers"? Then go check out how Apple Silicon performs for multi-threaded code builds or number-crunching. The result might surprise you.

Heck, Apple silicon does not even support technologies like hyperthreading to use cores more efficiently with more threads; this is a huge limitation for hpf workloads like chess etc.

Complete and utter nonsense. Apple does not support hyper threading because they don't need hyper threading. Their multi-core performance is excellent as it is without artificially inflating hardware thread count.


Apple's NEON instructions are also pretty weak, compared to newer ARMv9 CPUs..

Which ARMv9 CPUs? Where can I buy one? Why do you talk about that anyway? Apple's SIMD implementation offers the same max throughput as any modern x86 CPU via AVX256, with Apple having a more flexible implementation (better ILP for many cases).

Optimizations will only take you so far.. And lots of optimizations for NEON, Metal etc. have already been tried, and it is still not really very fast compared to, for example, an old 4700U.... This has been discussed extensively on developer forums for chess engines, and lots of benchmarks have been done on M1 silicon which show it is still lacking compared to cheaper alternatives from AMD..

Now you are just saying random things. What is so special about chess that M1 breaks the charts with scientific number crunching but is somehow unable to run chess engines?
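
The SIMD throughput comparison above can be checked with back-of-the-envelope arithmetic. This is a rough sketch, not a benchmark: it assumes four 128-bit NEON pipes for the M1 performance core and two 256-bit AVX2 pipes for a typical x86 core, and the function name is made up for illustration. Real throughput also depends on instruction mix, clocks, and memory.

```python
def peak_simd_bits_per_cycle(pipes: int, vector_width_bits: int) -> int:
    """Peak vector bits processed per cycle: number of pipes times vector width."""
    return pipes * vector_width_bits


if __name__ == "__main__":
    # Assumed pipe counts and widths, for illustration only.
    m1_neon = peak_simd_bits_per_cycle(pipes=4, vector_width_bits=128)
    x86_avx2 = peak_simd_bits_per_cycle(pipes=2, vector_width_bits=256)

    print("M1 NEON (assumed 4 x 128-bit):", m1_neon)
    print("x86 AVX2 (assumed 2 x 256-bit):", x86_avx2)
    print("same peak width per cycle:", m1_neon == x86_avx2)
```

Under those assumptions both come out to the same peak width per cycle, and having four narrower pipes is what gives the extra flexibility for code that does not vectorize to full 256-bit width.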
 

Leifi

macrumors regular
Nov 6, 2021
128
121
What?

Hyperthreading is only necessary when you have so poorly designed your instruction dispatch that you can’t keep your ALUs busy. M1 has tremendous IPC. Its execution units are always busy. Adding hyperthreading would slow the CPU down because it would cause unnecessary context switching.

As per forum rules, please cite your source for the statement beginning “it is obvious…”
As per forum rules: you claim a lot of things about SMT without a source, so let's start there. What is your source for the claim that SMT support hurts CPU performance (given the same die size and the same number of physical cores) and only helps "poorly designed" code? :)

As for your question to me: there are already references in this thread to multiple GitHub discussions etc., where you can look up discussions with the top-tier chess engine developers about the performance of Apple silicon. For example:


etc.


Can you name one top programmer of chess engines who would support a claim that Apple silicon currently beats cheaper AMD CPUs?
 
Last edited:

Leifi

macrumors regular
Nov 6, 2021
128
121
Now you are just saying random things. What is so special about chess that M1 breaks the charts with scientific number crunching but is somehow unable to run chess engines?

Not only chess... Just look at openbenchmarking... Chess is just one demanding use case where the performance of Apple silicon is slower than much cheaper alternatives..

Average CPU perf on https://openbenchmarking.org/


OpenBench.jpg


 

Appletoni

Suspended
Original poster
Mar 26, 2021
443
177
@Leifi The issue being referred to in this thread has zero to do with multi-threaded performance. The issue has to do with GPGPU, and, as has been extensively explained, goes away when you change a setting in the configuration so that the working set is the right size for the caches.
But it has to do with multi-threaded… and CPU performance, which is very low in a lot of tasks.
 
  • Like
Reactions: Leifi