
mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
Based on hashcat GPU compute workload.

M1 (generated on a MacBook Pro M1 before Big Sur 11.5; 11.5 and later broke OpenCL)
Code:
OpenCL API (OpenCL 1.2 (Oct 29 2020 19:50:08)) - Platform #1 [Apple]
====================================================================
* Device #1: Apple M1, 10858/10922 MB (1024 MB allocatable), 8MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 0 - MD5

Speed.#1.........:  2835.2 MH/s (4.50ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 100 - SHA1

Speed.#1.........:  1024.9 MH/s (12.53ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 1400 - SHA2-256

Speed.#1.........:   306.7 MH/s (41.89ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 1700 - SHA2-512

Speed.#1.........:   101.2 MH/s (63.51ms) @ Accel:256 Loops:512 Thr:256 Vec:1

Hashmode: 22000 - WPA-PBKDF2-PMKID+EAPOL (Iterations: 4095)

Speed.#1.........:    51103 H/s (61.31ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 1000 - NTLM

Speed.#1.........:  4842.9 MH/s (2.62ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 3000 - LM

Speed.#1.........:   242.3 MH/s (53.02ms) @ Accel:1024 Loops:1024 Thr:64 Vec:1

Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS

Speed.#1.........:  2921.3 MH/s (4.29ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 5600 - NetNTLMv2

Speed.#1.........:   205.6 MH/s (62.48ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 1500 - descrypt, DES (Unix), Traditional DES

Speed.#1.........: 16953.0 kH/s (94.89ms) @ Accel:128 Loops:1024 Thr:64 Vec:1

Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)

Speed.#1.........:  1232.9 kH/s (9.96ms) @ Accel:256 Loops:1000 Thr:256 Vec:1

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:     1987 H/s (48.51ms) @ Accel:64 Loops:32 Thr:8 Vec:1

Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)

Speed.#1.........:    16071 H/s (78.85ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

Hashmode: 7500 - Kerberos 5, etype 23, AS-REQ Pre-Auth

Speed.#1.........: 32673.1 kH/s (49.20ms) @ Accel:512 Loops:256 Thr:64 Vec:1

Hashmode: 13100 - Kerberos 5, etype 23, TGS-REP

Speed.#1.........: 32509.4 kH/s (49.44ms) @ Accel:512 Loops:256 Thr:64 Vec:1

Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)

Speed.#1.........:     8709 H/s (61.35ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 12899)

Speed.#1.........:     2165 H/s (57.26ms) @ Accel:128 Loops:256 Thr:256 Vec:1

Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 1023)

Speed.#1.........:    27649 H/s (56.21ms) @ Accel:128 Loops:255 Thr:256 Vec:1

Hashmode: 11600 - 7-Zip (Iterations: 16384)

Speed.#1.........:    42556 H/s (73.73ms) @ Accel:256 Loops:4096 Thr:256 Vec:1

Hashmode: 12500 - RAR3-hp (Iterations: 262144)

Speed.#1.........:     9108 H/s (43.12ms) @ Accel:128 Loops:16384 Thr:256 Vec:1

Hashmode: 13000 - RAR5 (Iterations: 32799)

Speed.#1.........:     3720 H/s (52.58ms) @ Accel:256 Loops:512 Thr:256 Vec:1

Hashmode: 6211 - TrueCrypt RIPEMD160 + XTS 512 bit (Iterations: 1999)

Speed.#1.........:    29960 H/s (53.63ms) @ Accel:128 Loops:512 Thr:256 Vec:1

Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 24569)

Speed.#1.........:    17158 H/s (30.46ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 499)

Speed.#1.........:   242.1 kH/s (51.14ms) @ Accel:256 Loops:499 Thr:256 Vec:1

Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 200459)

Speed.#1.........:      461 H/s (69.47ms) @ Accel:256 Loops:512 Thr:256 Vec:1

3060 100W
Code:
OpenCL API (OpenCL 3.0 CUDA 11.4.101) - Platform #1 [NVIDIA Corporation]
========================================================================
* Device #1: NVIDIA GeForce RTX 3060 Laptop GPU, 5376/6144 MB (1536 MB allocatable), 30MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

-------------------
* Hash-Mode 0 (MD5)
-------------------

Speed.#1.........: 24153.8 MH/s (41.09ms) @ Accel:128 Loops:1024 Thr:256 Vec:8

----------------------
* Hash-Mode 100 (SHA1)
----------------------

Speed.#1.........:  7706.9 MH/s (64.74ms) @ Accel:512 Loops:1024 Thr:32 Vec:1

---------------------------
* Hash-Mode 1400 (SHA2-256)
---------------------------

Speed.#1.........:  3305.9 MH/s (75.53ms) @ Accel:256 Loops:1024 Thr:32 Vec:1

---------------------------
* Hash-Mode 1700 (SHA2-512)
---------------------------

Speed.#1.........:   983.8 MH/s (63.30ms) @ Accel:128 Loops:64 Thr:256 Vec:1

-------------------------------------------------------------
* Hash-Mode 22000 (WPA-PBKDF2-PMKID+EAPOL) [Iterations: 4095]
-------------------------------------------------------------

Speed.#1.........:   379.6 kH/s (78.84ms) @ Accel:16 Loops:1024 Thr:256 Vec:1

-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------

Speed.#1.........: 43046.2 MH/s (22.75ms) @ Accel:128 Loops:1024 Thr:256 Vec:8

---------------------
* Hash-Mode 3000 (LM)
---------------------

Speed.#1.........: 23107.1 MH/s (42.92ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

--------------------------------------------
* Hash-Mode 5500 (NetNTLMv1 / NetNTLMv1+ESS)
--------------------------------------------

Speed.#1.........: 23197.9 MH/s (42.77ms) @ Accel:256 Loops:1024 Thr:128 Vec:2

----------------------------
* Hash-Mode 5600 (NetNTLMv2)
----------------------------

Speed.#1.........:  1712.8 MH/s (72.85ms) @ Accel:16 Loops:1024 Thr:256 Vec:1

--------------------------------------------------------
* Hash-Mode 1500 (descrypt, DES (Unix), Traditional DES)
--------------------------------------------------------

Speed.#1.........:   943.3 MH/s (66.09ms) @ Accel:8 Loops:1024 Thr:256 Vec:1

------------------------------------------------------------------------------
* Hash-Mode 500 (md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5)) [Iterations: 1000]
------------------------------------------------------------------------------

Speed.#1.........:  9898.3 kH/s (85.98ms) @ Accel:128 Loops:1000 Thr:256 Vec:1

----------------------------------------------------------------
* Hash-Mode 3200 (bcrypt $2*$, Blowfish (Unix)) [Iterations: 32]
----------------------------------------------------------------

Speed.#1.........:    22963 H/s (59.49ms) @ Accel:128 Loops:32 Thr:11 Vec:1

--------------------------------------------------------------------
* Hash-Mode 1800 (sha512crypt $6$, SHA512 (Unix)) [Iterations: 5000]
--------------------------------------------------------------------

Speed.#1.........:   154.8 kH/s (83.28ms) @ Accel:256 Loops:1024 Thr:256 Vec:1

--------------------------------------------------------
* Hash-Mode 7500 (Kerberos 5, etype 23, AS-REQ Pre-Auth)
--------------------------------------------------------

Speed.#1.........:   512.6 MH/s (60.85ms) @ Accel:128 Loops:256 Thr:32 Vec:1

-------------------------------------------------
* Hash-Mode 13100 (Kerberos 5, etype 23, TGS-REP)
-------------------------------------------------

Speed.#1.........:   503.5 MH/s (61.93ms) @ Accel:256 Loops:128 Thr:32 Vec:1

---------------------------------------------------------------
* Hash-Mode 15300 (DPAPI masterkey file v1) [Iterations: 23999]
---------------------------------------------------------------

Speed.#1.........:    65961 H/s (77.07ms) @ Accel:16 Loops:1024 Thr:256 Vec:1

---------------------------------------------------------------
* Hash-Mode 15900 (DPAPI masterkey file v2) [Iterations: 12899]
---------------------------------------------------------------

Speed.#1.........:    38023 H/s (62.97ms) @ Accel:16 Loops:256 Thr:256 Vec:1

------------------------------------------------------------------
* Hash-Mode 7100 (macOS v10.8+ (PBKDF2-SHA512)) [Iterations: 1023]
------------------------------------------------------------------

Speed.#1.........:   465.5 kH/s (60.90ms) @ Accel:64 Loops:63 Thr:256 Vec:1

---------------------------------------------
* Hash-Mode 11600 (7-Zip) [Iterations: 16384]
---------------------------------------------

Speed.#1.........:   368.8 kH/s (79.38ms) @ Accel:16 Loops:4096 Thr:256 Vec:1

------------------------------------------------
* Hash-Mode 12500 (RAR3-hp) [Iterations: 262144]
------------------------------------------------

Speed.#1.........:    48321 H/s (78.70ms) @ Accel:8 Loops:16384 Thr:256 Vec:1

--------------------------------------------
* Hash-Mode 13000 (RAR5) [Iterations: 32799]
--------------------------------------------

Speed.#1.........:    41786 H/s (91.44ms) @ Accel:32 Loops:512 Thr:256 Vec:1

-----------------------------------------------------------------------
* Hash-Mode 6211 (TrueCrypt RIPEMD160 + XTS 512 bit) [Iterations: 1999]
-----------------------------------------------------------------------

Speed.#1.........:   289.8 kH/s (48.42ms) @ Accel:32 Loops:128 Thr:256 Vec:1

-----------------------------------------------------------------------------------
* Hash-Mode 13400 (KeePass 1 (AES/Twofish) and KeePass 2 (AES)) [Iterations: 24569]
-----------------------------------------------------------------------------------

Speed.#1.........:    48651 H/s (52.35ms) @ Accel:16 Loops:512 Thr:256 Vec:1

----------------------------------------------------------------
* Hash-Mode 6800 (LastPass + LastPass sniffed) [Iterations: 499]
----------------------------------------------------------------

Speed.#1.........:  2483.2 kH/s (88.76ms) @ Accel:32 Loops:499 Thr:256 Vec:1

--------------------------------------------------------------------
* Hash-Mode 11300 (Bitcoin/Litecoin wallet.dat) [Iterations: 200459]
--------------------------------------------------------------------

Speed.#1.........:     4817 H/s (69.31ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
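The two outputs above mix H/s, kH/s and MH/s, so comparing them by eye is error-prone. Below is a minimal sketch for pulling a plain hashes-per-second number out of a `Speed.#N` line. The helper name `parse_speed` and the regular expression are my own and are not part of hashcat; the two sample lines are the MD5 results copied from the outputs above.

```python
import re

# Multipliers for the unit prefixes hashcat prints in front of "H/s".
UNIT = {"": 1.0, "k": 1e3, "M": 1e6, "G": 1e9}

# Matches e.g. "Speed.#1.........:  2835.2 MH/s" and captures number and prefix.
SPEED_RE = re.compile(r"Speed\.#\d+\.*:\s*([\d.]+)\s*([kMG]?)H/s")

def parse_speed(line):
    """Return hashes per second from one 'Speed.#N' line, or None if no match."""
    m = SPEED_RE.search(line)
    if m is None:
        return None
    return float(m.group(1)) * UNIT[m.group(2)]

# MD5 lines copied from the M1 and RTX 3060 Laptop outputs above.
m1 = parse_speed("Speed.#1.........:  2835.2 MH/s (4.50ms) @ Accel:256 Loops:1024 Thr:256 Vec:1")
rtx = parse_speed("Speed.#1.........: 24153.8 MH/s (41.09ms) @ Accel:128 Loops:1024 Thr:256 Vec:8")
print(f"MD5: 3060 / M1 = {rtx / m1:.1f}x")
```

The same function can be run over every `Speed.#1` line of both outputs to get a per-mode ratio. Lines that are not speed lines, such as the `Hashmode:` headers, return `None` and can be skipped.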
 

Gnattu

macrumors 65816
Sep 18, 2020
1,107
1,672
Apple's OpenCL driver is known to be buggy and unreliable, and this could just be an example of it. In fact, hashcat does not even run as of 6.2.4 because it is not compatible with Apple's OpenCL driver (you have to use an older version).
 

crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
Apple's OpenCL driver is known to be buggy and unreliable, and this could just be an example of it. In fact, hashcat does not even run as of 6.2.4 because it is not compatible with Apple's OpenCL driver (you have to use an older version).

True, though it has to be said that raw compute is not the strongest element of Apple’s GPU design. Theoretically, the M1 Max GPU’s FP32 performance is about 20% lower than the 3060’s, if I remember the relevant numbers right. Still better than what’s posted above, unsurprisingly.
 

mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
(you have to use an older version).

Older version of what, OpenCL or hashcat? I'm on hashcat 6.1.1, the same version that used to work, and it no longer works for me on Big Sur 11.6.
 

Gnattu

macrumors 65816
Sep 18, 2020
1,107
1,672
Older version of what, OpenCL or hashcat? I'm on hashcat 6.1.1, the same version that used to work, and it no longer works for me on Big Sur 11.6.
I can still run hashcat 6.1.1 on macOS 11.6, but 6.2.4 no longer runs. Apple's OpenCL support is just broken after the deprecation announcement.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
I can still run hashcat 6.1.1 on macOS 11.6, but 6.2.4 no longer runs. Apple's OpenCL support is just broken after the deprecation announcement.

I do find it slightly amusing that one of the supposed early breaks between Apple and Nvidia was the latter’s lack of good OpenCL support and now Nvidia has finally upgraded their support of OpenCL while Apple is all in on Metal and deprecated OpenCL completely.
 

mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
Maybe it's better off broken so you can't see the real GPU compute performance of M1 Pro and M1 Max vs marketing claims.

Curious what GPU workload they used to make this determination.

[Attached image: 1634590869441.png]
 

Gnattu

macrumors 65816
Sep 18, 2020
1,107
1,672
Maybe it's better off broken so you can't see the real GPU compute performance of M1 Pro and M1 Max vs marketing claims.
Metal does offer compute APIs, but the set of applications that support a Metal backend and have a relevant workload is still limited. By limited I mean the number of open source projects supporting Metal. A LOT of commercial software runs extremely well, though.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
M1 scores 5k in Wild Life Extreme. RTX 3070 mobile scores ~ 21k in Wild Life Extreme. Given the massive bandwidth improvements, the M1 Max will score ~ 20-25K in Wild Life Extreme. There you have it.

To make it clear: I don't care a bit about your obscure benchmarks using niche software that uses outdated technologies. Rewrite hashcat for Metal and modern computing guidelines, then we can talk.
 

mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
To make it clear: I don't care a bit about your obscure benchmarks using niche software that uses outdated technologies. Rewrite hashcat for Metal and modern computing guidelines, then we can talk.

Do you even know what you're talking about? OpenCL and CUDA are the industry standard for GPU compute projects. Similar to Vulkan for cross platform gaming. Metal what?
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Do you even know what you're talking about? OpenCL and CUDA are the industry standard for GPU compute projects.

Not on macOS.

I mean, I am sympathetic to your use case and I understand that you want your software to perform well. However, if your software is not written with a particular platform in mind, it is likely missing major performance gains possible on that platform. It is completely pointless to judge the performance of these new chips by using software that does not properly use the technology offered by those chips.
 

mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
jimmystar889 on reddit was nice enough to share the results from MBP 16" M1 Max 32-core iGPU. It's about 4x faster than MBP M1 as was claimed. Benchmark didn't complete but enough to validate the performance claim.

OpenCL API (OpenCL 1.2 (Oct 1 2021 19:40:58)) - Platform #1 [Apple]
* Device #1: Apple M1 Max, 21781/21845 MB (4096 MB allocatable), 32MCU
Benchmark relevant options:
* --optimized-kernel-enable
Hashmode: 0 - MD5
Speed.#1.........: 11539.2 MH/s (4.42ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 100 - SHA1
Speed.#1.........: 4164.4 MH/s (12.35ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 1400 - SHA2-256
Speed.#1.........: 1249.6 MH/s (41.11ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 1700 - SHA2-512
Speed.#1.........: 410.4 MH/s (62.59ms) @ Accel:256 Loops:512 Thr:256 Vec:1
Hashmode: 22000 - WPA-PBKDF2-PMKID+EAPOL (Iterations: 4095)
Speed.#1.........: 206.9 kH/s (60.49ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 1000 - NTLM
Speed.#1.........: 19841.6 MH/s (2.58ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 3000 - LM
Speed.#1.........: 1374.9 MH/s (37.36ms) @ Accel:1024 Loops:1024 Thr:64 Vec:1
Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS
Speed.#1.........: 12153.6 MH/s (4.21ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 5600 - NetNTLMv2
Speed.#1.........: 832.5 MH/s (61.79ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 1500 - descrypt, DES (Unix), Traditional DES
Speed.#1.........: 59284.9 kH/s (54.21ms) @ Accel:64 Loops:1024 Thr:64 Vec:1
Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)
Speed.#1.........: 5075.7 kH/s (9.76ms) @ Accel:256 Loops:1000 Thr:256 Vec:1
Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)
Speed.#1.........: 7811 H/s (49.30ms) @ Accel:64 Loops:32 Thr:8 Vec:1
Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)
UNSUPPORTED (log once): createKernel: newComputePipelineState failed
clCreateKernel(): CL_INVALID_KERNEL
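
To put a number on the "about 4x" claim, here is a rough sketch that divides each M1 Max result above by the matching M1 result from the first post. The figures are copied by hand from the two outputs, with each pair kept in the same unit; the dictionary layout and the choice of the median as a summary are my own assumptions, not anything hashcat reports.

```python
from statistics import median

# (M1, M1 Max) speeds copied from the two Apple outputs in this thread.
# Each pair uses the same unit, so the ratio is unit-free.
speeds = {
    "MD5":       (2835.2, 11539.2),   # MH/s
    "SHA1":      (1024.9, 4164.4),    # MH/s
    "SHA2-256":  (306.7, 1249.6),     # MH/s
    "SHA2-512":  (101.2, 410.4),      # MH/s
    "WPA":       (51103, 206900),     # H/s (206.9 kH/s on the M1 Max)
    "NTLM":      (4842.9, 19841.6),   # MH/s
    "LM":        (242.3, 1374.9),     # MH/s
    "NetNTLMv1": (2921.3, 12153.6),   # MH/s
    "NetNTLMv2": (205.6, 832.5),      # MH/s
    "descrypt":  (16953.0, 59284.9),  # kH/s
    "md5crypt":  (1232.9, 5075.7),    # kH/s
    "bcrypt":    (1987, 7811),        # H/s
}

# Speedup of the M1 Max over the M1 for every mode that completed.
ratios = {mode: big / small for mode, (small, big) in speeds.items()}

for mode, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{mode:10s} {r:.2f}x")
print(f"median     {median(ratios.values()):.2f}x")
```

Most modes land close to 4x, which lines up with the 8MCU and 32MCU figures in the two device lines. LM and descrypt are the outliers in either direction.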
 

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
jimmystar889 on reddit was nice enough to share the results from MBP 16" M1 Max 32-core iGPU. It's about 4x faster than MBP M1 as was claimed. Benchmark didn't complete but enough to validate the performance claim.
Thanks for this. I can finally relax. I'm a little concerned it doesn't count as it wasn't done by Phoronix.
 

mi7chy

macrumors G4
Original poster
Oct 24, 2014
10,625
11,298
But they didn’t run it. Can we really trust anyone else? I think not. I’ve heard they’re optimizing their “yes” test. Open one shell per core and type yes. Take that Anandtech.

Phoronix hosts the results database so anyone besides them can submit results. The more the merrier to validate consistency instead of relying on one data point.
 

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Phoronix hosts the results database so anyone besides them can submit results. The more the merrier to validate consistency instead of relying on one data point.
Disagree. Clearly they are the most supreme benchmarkers in the world. I don’t think any lowly person could match their vast knowledge. It’s too risky. Let’s assume this vital hashcat benchmark is suspect until they confirm the results or the results show the M1 in a poor light.
 