Status
Not open for further replies.

throAU

macrumors G3
Feb 13, 2012
9,204
7,355
Perth, Western Australia
But that the prospect of increasing Stockfish speed by 200-300% on Apple's ARMv8.4 CPU is non-existent should be obvious to you if you even have a basic grasp of the coding you refer to.
You keep saying that, yet refuse to acknowledge that properly utilising hardware instructions can have an order of magnitude speedup, and fitting your algorithm to hardware can have similar benefits.

I've personally seen (in my earlier programming days) a 50-100x speed up by simply changing a buffer size for file IO (to a hard drive).

AES instructions on Intel, for example, can yield up to a 30x performance improvement for crypto. Not 30%. 30x!

Bad code can be very slow. Even "good" code running in an environment it is not written for can expose problems or fail to exploit the machine it is running on resulting in bad performance.
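To make the buffer-size point concrete, here is a minimal sketch (not the code from back then; the file name and buffer sizes are made up for illustration) that times the same file read with a tiny buffer and then with a large one:

```cpp
// Minimal sketch of the buffer-size effect on file I/O: read the same file
// first through a tiny buffer, then through a large one, and time each pass.
// The file name and buffer sizes are illustrative, not the original code.
#include <chrono>
#include <cstdio>
#include <vector>
#include <fcntl.h>
#include <unistd.h>

static long long time_read(const char* path, size_t buf_size) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    std::vector<char> buf(buf_size);
    auto start = std::chrono::steady_clock::now();
    ssize_t n;
    while ((n = read(fd, buf.data(), buf.size())) > 0) {
        // consume the data here; left empty for brevity
    }
    auto end = std::chrono::steady_clock::now();
    close(fd);
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}

int main() {
    const char* path = "big_test_file.bin";   // hypothetical large file
    std::printf("512 B buffer: %lld ms\n", time_read(path, 512));
    std::printf("1 MiB buffer: %lld ms\n", time_read(path, 1 << 20));
}
```

With a tiny buffer you pay for a syscall every few hundred bytes, which is where that kind of 50-100x gap on a hard drive comes from (for a fair test, use a cold file each pass so the OS page cache doesn't flatter the second run).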
 

Taz Mangus

macrumors 604
Mar 10, 2011
7,815
3,504
It's a moot OT discussion, and yes I do Rust, Go, C/C++, Ada, haskl, Java, x86, sparc, 6502, 680xx, Z/OS HLASM, Ruby, Python, C# etc. But that the prospect of increasing Stockfish speed by 200-300% on Apple's ARMv8.4 CPU is non-existent should be obvious to you if you even have a basic grasp of the coding you refer to.

Why are you here? According to you, Apple hardware is overpriced. According to you, Apple hardware is inferior to its competition. You just sound like an angry person; that is how your posts come across.
 
  • Like
Reactions: throAU

Leifi

macrumors regular
Nov 6, 2021
128
121
This is one of the countless reasons why anyone who knows anything about benchmarking methodology distrusts Phoronix. The guy who runs that site operates purely on a quantity-over-quality basis, whether it's his news articles or how he designs his benchmark suite. It isn't useful data.

Yeh we sort of know already that the "only" benchmarks to be trusted by some people here are those that some tweaked Apple-version "wins" :)
 
  • Haha
Reactions: tabbytab10

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Yeh we sort of know already that the "only" benchmarks to be trusted by some people here are those that some tweaked Apple-version "wins" :)
You keep saying this and yet you continually fail to provide a single chess benchmark to support your claims.
 

Taz Mangus

macrumors 604
Mar 10, 2011
7,815
3,504
Yeh we sort of know already that the "only" benchmarks to be trusted by some people here are those that some tweaked Apple-version "wins" :)
You come across as if you don't understand basic software engineering. Why are you here? You are just ranting nonsense.
 

bcortens

macrumors 65816
Aug 16, 2007
1,324
1,796
Canada
Yeh we sort of know already that the "only" benchmarks to be trusted by some people here are those that some tweaked Apple-version "wins" :)
and yet you post open source code which has clearly had time and attention given to optimizing for x86_64?
So is this a heads I win tails you lose kind of situation?
 

throAU

macrumors G3
Feb 13, 2012
9,204
7,355
Perth, Western Australia
Yeh we sort of know already that the "only" benchmarks to be trusted by some people here are those that some tweaked Apple-version "wins" :)

No, people who actually have a verifiable, sane benchmark methodology.

I think we've established: if you want to run chess apps, buy a cheaper Intel box.

That's not Apple's target market; the Mac version looks to have been unmaintained for several years and pre-dates Apple Silicon in Macs.


With that in mind, do you have any other examples of how badly the M1 sucks?
 

Appletoni

Suspended
Original poster
Mar 26, 2021
443
177
Why was it a bad decision? Eight P cores is good enough for Apple to have taken the crown for laptop multithreaded performance by a very comfortable margin.

You seem a bit confused about heat. Each M1 P core can use about 5W, so someone looking to design a system with eighty of them on a single chip would be looking at trying to supply and cool 400W. Absolutely insane in a desktop, just forget about it in a laptop, it's a terrible idea. And there wouldn't be any die area left over for GPU cores.
Maybe not 400W but look at other high end devices. The consumption can be very high. I believe that Apple can build 40 performance cores inside a 16, 18 and 20-inch MacBook Pro Plus with better passive cooling and better fans. The new performance cores will be 3nm.
 

januarydrive7

macrumors 6502a
Oct 23, 2020
537
578
I would be more happy If Apple spent some time trying to improve their CPUs, GPUs, and stop being so d*mn overpriced, proprietary, and un-open .... More honesty and less dishonest sales-pitches, and less uninformative misleading graphs when presenting new stuff.. Thanks.... Please :)
"I hate that apple is un-open"
...posts opensource files provided by apple...
Clearly... :)

Do you even know what stuff like this is:
🤔
 

Taz Mangus

macrumors 604
Mar 10, 2011
7,815
3,504
Maybe not 400W but look at other high end devices. The consumption can be very high. I believe that Apple can build 40 performance cores inside a 16, 18 and 20-inch MacBook Pro Plus with better passive cooling and better fans. The new performance cores will be 3nm.
You are just spouting nonsense.
 

Leifi

macrumors regular
Nov 6, 2021
128
121
With that in mind, do you have any other examples of how badly the M1 sucks?
Why are you here? According to you, Apple hardware is overpriced. According to you, Apple hardware is inferior to its competition. You just sound like an angry person; that is how your posts come across.
and yet you post open source code which has clearly had time and attention given to optimizing for x86_64?
So is this a heads I win tails you lose kind of situation?
You come across as if you don't understand basic software engineering. Why are you here? You are just ranting nonsense.
Lol, did you google some random stuff and put together a list? 🤣 "haskl"... like this one?

I guess the M1 at least wins the personal-attack benches here anyway :)
 
  • Like
Reactions: Appletoni

throAU

macrumors G3
Feb 13, 2012
9,204
7,355
Perth, Western Australia
I believe that Apple can build 40 performance cores inside a 16, 18 and 20-inch MacBook Pro Plus with better passive cooling and better fans

What do you base this belief on? What are you smoking?

40 performance cores of similar design, with similar overhead for busses, etc. would make the die... physically the largest desktop/laptop die on the market (like... 1100-1200 mm^2 plus?) with super-bad yield rates (thus much higher price), much higher power consumption and the requirement to down-clock it (impacting single thread performance - which does matter) to fit within the thermal/power envelope of the target devices.

Quite likely the on-die fabric would be overwhelmed by the bandwidth contention with that many cores on board, and it's also quite likely that there wouldn't be sufficient DRAM bandwidth to feed it in any case.

And at the end of the day... they're winning with 8 P cores.


And surely, if Apple can do it, Intel or AMD with their far greater CPU manufacturing experience could do it also. Where's the 40 core intel laptop CPU (or desktop for that matter)?

🤔
 

hans1972

Suspended
Apr 5, 2010
3,759
3,398
Interesting opinion.
Taking a quick look on the internet, I found this:



It looks more like it's Apple's fault.

Your link really shows how bad things can be on the Mac if you are using (low-level) software developed for Windows or Linux.

People writing these libraries usually don't have a Mac, and they don't try to do things the Mac way; instead they try to make it cross-platform. That's particularly bad for the Mac, which is often quite different from Intel, Windows and Linux.

Matrix calculations are a good example. By using macOS-specific APIs the number of operations increased by about 4.3x.

Mac mini (M1), cross-platform library: 6492
Mac mini (M1), macOS M1-specific API: 27676
AMD Ryzen 5900X, cross-platform library: 22568
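For readers who want to see what "using the macOS-specific API" means in practice, here is a minimal sketch, assuming the Accelerate framework's cblas_sgemm for the matrix multiply; the matrix size and timing harness are illustrative and are not the benchmark behind the numbers above:

```cpp
// Minimal sketch: a single-precision matrix multiply through Apple's
// Accelerate framework (cblas_sgemm). On M1 this routes to the optimised
// vector/matrix hardware rather than a generic cross-platform code path.
// Matrix size and timing are illustrative only.
#include <Accelerate/Accelerate.h>
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    const int n = 1024;                                   // n x n matrices
    std::vector<float> a(n * n, 1.0f), b(n * n, 2.0f), c(n * n, 0.0f);

    auto start = std::chrono::steady_clock::now();
    // C = 1.0 * A * B + 0.0 * C, row-major layout
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                n, n, n,
                1.0f, a.data(), n,
                b.data(), n,
                0.0f, c.data(), n);
    auto end = std::chrono::steady_clock::now();

    auto us = std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
    std::printf("%dx%d sgemm took %lld us\n", n, n, (long long)us);
}
```

Build with something like `clang++ -O2 matmul.cpp -framework Accelerate`; a cross-platform BLAS would be the drop-in comparison point.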
 

Taz Mangus

macrumors 604
Mar 10, 2011
7,815
3,504
Where's the 40 core intel laptop CPU?

🤔
Might be hard to fit the fan in the case.

(Image: Rolls-Royce UltraFan illustration)
 
  • Haha
Reactions: ddhhddhh2

Leifi

macrumors regular
Nov 6, 2021
128
121
And surely, if Apple can do it, Intel or AMD with their far greater CPU manufacturing experience could do it also. Where's the 40 core intel laptop CPU (or desktop for that matter)?

🤔

Xeon Platinum 9282: 56 cores, 112 threads with SMT.
 
  • Like
Reactions: Appletoni

hans1972

Suspended
Apr 5, 2010
3,759
3,398
Please go read the referenced links to the talkchess.com discussion. Apple's GPU is indeed used (through OpenCL libraries). Metal is not possible to use efficiently for these tensor AI backends.

OpenCL is deprecated on the Mac and the performance is quite bad even on Intel Macs.

To get the best result from an M1 Mac you must use Metal compute and the Accelerate API for matrix calculations.
It seems to me they haven't optimised their software for the M1 Macs. I'm not saying an M1 Mac would beat any PC or Linux machine but the result would be much better.
 

throAU

macrumors G3
Feb 13, 2012
9,204
7,355
Perth, Western Australia
Xeon Platinum 9282: 56 cores, 112 threads with SMT.
  1. That's not a desktop CPU, it's a datacenter part
  2. Let's see the clock speed
  3. Let's see the cost (it will be 1-3x the cost of a 16" MBP or more)
(Screenshot: Intel Xeon Platinum 9282 specifications)



400 watts, hey? Even datacenters don't want that, hence it's having its lunch money taken by AMD EPYC.
 

Taz Mangus

macrumors 604
Mar 10, 2011
7,815
3,504
Please go read the referenced links to the talkchess.com discussion. Apple's GPU is indeed used (through OpenCL libraries).
All this time you knew that the chess game benchmark was using a deprecated library, one that gives poor results even when it is used.

Metal is not possible to use efficiently for these tensor AI backends.
Then that chess benchmark is poorly coded to run on Apple silicon.
 

Leifi

macrumors regular
Nov 6, 2021
128
121
You keep saying that, yet refuse to acknowledge that properly utilising hardware instructions can have an order of magnitude speedup, and fitting your algorithm to hardware can have similar benefits.

Yes, I refuse to accept your notion that you could increase the speed (program output intact) by 2-3 times with the kind of tweaks that have been thrown around here. I have seen no indication that you even could improve anything in those benches at all to get a significant boost on Apple silicon.

You are also completely neglecting the fact that there may be similar optimizations that could be done for other architectures as well.

I am happy to be proven wrong. But let's get real here: as soon as real code, or compiling, or "proving" something is mentioned, out come excuses like "I don't actually even have Apple silicon", or "I could 'probably' do it if I got paid", or "It is possible but I don't care", yadda yadda yadda.

Just empty talk, I am afraid.

You claim others "suck" as developers and programmers and don't know coding, yet you have little to showcase yourself.
 
Last edited:
  • Like
Reactions: Appletoni

hans1972

Suspended
Apr 5, 2010
3,759
3,398
Can you be specific, please? Exactly what optimizations do you think can be done on an M1 that aren't already done in the Stockfish and cFish C compiles with NEON? And how much do you think could be gained, exactly, in theory and in practice?

  1. Become Mac developers and develop using a Mac
  2. Don't use the architecture which works on Intel/AMD/NVidia and Windows/Linux
  3. Use Xcode and don't rely on cross-platform C/C++ too much (they tend not to be optimised for the Mac at all)
  4. Use Grand Central Dispatch, Metal, Accelerate and ML Compute (a GCD sketch follows below)
  5. Don't use OpenCL
  6. Don't use TensorFlow (yet)
I don't think anyone would know how much it would help, since none are doing it the correct way for the Mac.
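As a rough illustration of point 4, here is a minimal sketch of using Grand Central Dispatch from C++ to spread independent slices of work across the cores; the workload (summing squares) is a made-up stand-in, not anything from Stockfish:

```cpp
// Minimal sketch: fan independent slices of work out across the cores with
// Grand Central Dispatch (dispatch_apply_f) instead of hand-rolled threads.
// The summing-of-squares workload is a placeholder for real per-slice work.
#include <dispatch/dispatch.h>
#include <algorithm>
#include <cstdio>
#include <vector>

struct Job {
    const std::vector<double>* input;
    std::vector<double>* partial;   // one result slot per slice
    size_t slice_size;
};

static void work(void* ctx, size_t slice) {
    Job* job = static_cast<Job*>(ctx);
    size_t begin = slice * job->slice_size;
    size_t end = std::min(begin + job->slice_size, job->input->size());
    double sum = 0.0;
    for (size_t i = begin; i < end; ++i)
        sum += (*job->input)[i] * (*job->input)[i];
    (*job->partial)[slice] = sum;
}

int main() {
    std::vector<double> data(1 << 20, 1.5);
    const size_t slices = 8;
    std::vector<double> partial(slices, 0.0);
    Job job{&data, &partial, (data.size() + slices - 1) / slices};

    // libdispatch schedules the slices across the P and E cores for us.
    dispatch_apply_f(slices,
                     dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0),
                     &job, work);

    double total = 0.0;
    for (double p : partial) total += p;
    std::printf("sum of squares: %f\n", total);
}
```

It compiles with plain `clang++ -O2` on macOS (libdispatch is part of the system library), which is the kind of Mac-native path a generic cross-platform build typically never takes.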
 
  • Like
Reactions: Homy and throAU

throAU

macrumors G3
Feb 13, 2012
9,204
7,355
Perth, Western Australia
More like 10x… Probably 40K and up.

Intel did drop their datacenter pricing by a lot a couple of years back, but yes, I was being conservative. Their 28-core datacenter parts were 10-15k USD a while ago, depending on whether you wanted the crippled-memory version or not.

And yes, comparing against a 10-40k server part running at 400 watts is, again, totally outside the scope of reality when expecting Apple to "stuff a 40 core part in a 16-18-20-inch laptop" at any sort of "not over-priced" purchase price (if at all).

That 56-core Xeon also has no GPU cores, no ML cores, no video transcoder units, etc. It will likely be beaten on video decode/transcode by the afterburner unit inside the M1 Pro/Max.
 