
project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
I'm interested in how the M1 performs on data science workloads, specifically the Python ecosystem of numerical computing.

Currently Intel provides libraries like the Intel MKL, which help software like Python take advantage of Intel CPU features for things like matrix multiplication, FFTs, neural networks, etc.

Is there something like this for Apple M1 that open source software like Python can take advantage of?

How do Python's numpy and scipy perform on the M1?
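For anyone comparing, here is a minimal sketch of how to check which BLAS/LAPACK a given NumPy build is linked against and get a rough matmul timing (what np.show_config() reports depends on how the particular wheel was built):

import time
import numpy as np

# Show which BLAS/LAPACK libraries this NumPy build is linked against.
# Depending on how the wheel was built, an M1 Mac may report Apple's
# Accelerate (vecLib) or OpenBLAS; Intel builds often report MKL.
np.show_config()

# Rough matrix-multiplication timing to compare backends.
n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c = a @ b
print(f"{n}x{n} matmul: {time.perf_counter() - t0:.3f} s")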
 

Gnattu

macrumors 65816
Sep 18, 2020
1,106
1,668

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
Thanks - that is a useful page.

I wonder if Python will take advantage of Apple's Accelerate?
Why would anyone want to use Macs for data science? College project? Maybe. Real work? No way. The ecosystem is too limited. When your best hardware option is a laptop (or even a desktop at some point) it's just too limiting.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,060
If you're interested in the deep learning subset of data science, you might find this thread interesting:
 

project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
If you're interested in the deep learning subset of data science, you might find this thread interesting:
thanks - that thread is very interesting
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
I'm interested in how the M1 performs on data science workloads, specifically the Python ecosystem of numerical computing.

I don’t know about Python, but M1 is absolutely ridiculous in R and Stan. Also, if you work with matrices and your software takes advantage of Accelerate, you get the benefit of Apple’s dedicated matrix units.

Why would anyone want to use Macs for data science? College project? Maybe. Real work? No way. The ecosystem is too limited. When your best hardware option is a laptop (or even a desktop at some point) it's just too limiting.

Because they are the fastest portable hardware around for this type of workload? And what do you mean „ecosystem too limited“? You use the laptop for development and prototyping; the real work happens on a supercomputer running Linux. It also depends on the scale of your data. Not everyone doing data science works with TBs of data. Our datasets are much smaller, and using a laptop to process them is very feasible. Especially if that laptop is as fast as a large desktop workstation.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
And herein lies the problem: The supercomputer most likely runs CUDA, the Mac doesn‘t

Why is this a problem? For most tasks, the API will choose the appropriate backend. I mean, we have folks prototyping with PyTorch and Tensorflow on their Macs and then deploying to the cluster — the code uses the CPU on the local machine and CUDA on the cluster.

Of course, if you rely on low-level programming via CUDA directly, then sure, Mac is probably not the best platform.
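As a minimal sketch of that kind of device-agnostic PyTorch code (the MPS branch assumes a newer PyTorch build that ships the Metal backend; otherwise it simply falls back to CPU locally and picks CUDA on the cluster):

import torch

# Pick whatever accelerator is present: CUDA on the cluster,
# Apple's MPS backend on an M1 Mac (newer PyTorch builds), else CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model = torch.nn.Linear(128, 10).to(device)
x = torch.randn(32, 128, device=device)
print(device, model(x).shape)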
 

Romain_H

macrumors 6502a
Sep 20, 2021
520
438
Why is this a problem? For most tasks, the API will choose the appropriate backend. I mean, we have folks prototyping with PyTorch and Tensorflow on their Macs and then deploying to the cluster — the code uses the CPU on the local machine and CUDA on the cluster.

Of course, if you rely on low-level programming via CUDA directly, then sure, Mac is probably not the best platform.
Indeed. In my case… no luck. Still probably porting to Metal, since overall the dev experience is superior, so research and development may proceed faster. Once the algo is stable I may have to port back to CUDA.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
Indeed. In my case… no luck. Still probably porting to Metal, since overall the dev experience is superior, so research and development may proceed faster. Once the algo is stable I may have to port back to CUDA.

CUDA and MSL are similar enough that using some macros and strategic planning might allow you to use the same kernel code for both. BTW that’s how Apple is porting Blender Cycles to Metal.
 

Romain_H

macrumors 6502a
Sep 20, 2021
520
438
CUDA and MSL are similar enough that using some macros and strategic planning might allow you to use the same kernel code for both. BTW that’s how Apple is porting Blender Cycles to Metal.
Well, it's not that easy. Plus it's not only the GPGPU code; there's quite a bit of CPU code around… hitherto I used Qt for that, but I'm not sure if I'll continue down that path. Embedding CUDA / Metal in Qt projects is not necessarily straightforward.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
Well, it's not that easy. Plus it's not only the GPGPU code; there's quite a bit of CPU code around… hitherto I used Qt for that, but I'm not sure if I'll continue down that path. Embedding CUDA / Metal in Qt projects is not necessarily straightforward.

No, it’s definitely not. Although I must admire Nvidia’s evil marketing genius a bit - by making CUDA so “easy” to use and locking people into the NVCC mixed-code paradigm, they made properly disentangling CPU/GPU code very painful, which locks people even more into their platform.
 

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
I don’t know about Python, but M1 is absolutely ridiculous in R and Stan. Also, if you work with matrices and your software takes advantage of Accelerate, you get the benefit of Apple’s dedicated matrix units.



Because they are the fastest portable hardware around for this type of workload? And what do you mean „ecosystem too limited“? You use the laptop for development and prototyping; the real work happens on a supercomputer running Linux. It also depends on the scale of your data. Not everyone doing data science works with TBs of data. Our datasets are much smaller, and using a laptop to process them is very feasible. Especially if that laptop is as fast as a large desktop workstation.
You missed the part where supercomputers running Linux (and all sorts of workstations, servers, server farms, cloud services, etc.) are x86-based. Developing software for them on machines with a different architecture is amateurish at best (if possible at all).
 

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
CUDA and MSL are similar enough that using some macros and strategic planning might allow you to use the same kernel code for both. BTW that’s how Apple is porting Blender Cycles to Metal.
And why would anyone want to go through all these hassles in the first place? There are tons of x86 and CUDA hardware options available, with stable software stacks, excellent developer tools, and vast developer communities. Compare this to one forum thread on MR for M1-based Macs and the choice must be clear to anyone.
 

jerryk

macrumors 604
Nov 3, 2011
7,421
4,208
SF Bay Area
And why would anyone want to go through all these hassles in the first place? There are tons of x86 and CUDA hardware options available, with stable software stacks, excellent developer tools, and vast developer communities. Compare this to one forum thread on MR for M1-based Macs and the choice must be clear to anyone.
That is the conclusion I have come to. I do most of my ML work on a Windows deskside machine with Nvidia RTX GPUs. I can load the CUDA toolkit and frameworks like TensorFlow or PyTorch and just go. If I get stuck on some issue, there are a lot of other people running the same software and hardware stack. And when I finish doing basic model configuration on my desktop, I can push to a cloud environment as required with minimal changes.

With this said, I am finding I don't use the desktop machine as much as I once did. I now do a lot of preliminary work in Colab in the cloud. It's free even with GPU support. And I can design and train models anywhere that I have internet access.
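A quick way to confirm what TensorFlow actually sees in either environment (a minimal sketch, nothing Colab-specific):

import tensorflow as tf

# List the accelerators visible to TensorFlow: the RTX cards on the desktop,
# the assigned GPU in a Colab GPU runtime, or an empty list on CPU-only setups.
gpus = tf.config.list_physical_devices("GPU")
print(gpus if gpus else "no GPU visible to TensorFlow")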
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
You missed the part where supercomputers running Linux (and all sorts of workstations, servers, server farms, cloud services, etc.) are x86-based. Developing software for them on machines with a different architecture is amateurish at best (if possible at all).

Why would you say that? Developing software that ships on a different architecture is actually a standard situation. Especially if you are talking about something as implementation-dependent as data science libraries or GPGPU. There are only a few relevant architectural differences between x86-64 and Aarch64, which are all very well documented and can be easily taken care of with some basic planning. Not that it matters for most people doing data science, as they are going to use abstractions provided by high-level languages and APIs in the first place.

And sure, you might call it amateurish, but that's how stuff works in real life. All software I wrote in the last ten years or so (using x86 as my dev platform) compiles and works without fail for x86-32, x86-64 and Aarch64.
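To illustrate with a trivial sketch: from high-level code, about the only architecture details you can even observe are the machine name and byte order, and the same script runs unchanged on both platforms.

import platform
import sys

# The same script runs unmodified on x86-64 and Aarch64; the visible
# differences from this level are essentially the machine name and the
# byte order (both platforms are little-endian).
print(platform.machine())   # e.g. 'x86_64' on Intel, 'arm64' on an M1 Mac
print(sys.byteorder)        # 'little' on both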

And why would anyone want to go through all these hassles in the first place? There are tons of x86 and CUDA hardware options available, with stable software stacks, excellent developer tools, and vast developer communities. Compare this to one forum thread on MR for M1-based Macs and the choice must be clear to anyone.

I never claimed that one would. I certainly would not. If my job were to develop CUDA software (and I hope I never get there), I would get myself a laptop with an Nvidia GPU. I was merely commenting on a specific post.

Of course, since I don't work with CUDA and none of the tools I use rely on Nvidia's tech, M1 Macs are pretty much the best hardware platform on the market for me right now. They're extremely portable with excellent battery life, unmatched usability, and performance that rivals large desktop workstations (in the workflows I care about), which makes them, as you say, a choice clear to anyone. I mean, why would I choose an x86 platform that ends up being 30-40% slower for my work and has half the usable battery life?
 

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
Why would you say that? Developing software that ships on a different architecture is actually a standard situation. Especially if you are talking about something as implementation-dependent as data science libraries or GPGPU. There are only a few relevant architectural differences between x86-64 and Aarch64, which are all very well documented and can be easily taken care of with some basic planning. Not that it matters for most people doing data science, as they are going to use abstractions provided by high-level languages and APIs in the first place.

And sure, you might call it amateurish, but that's how stuff works in real life. All software I wrote in the last ten years or so (using x86 as my dev platform) compiles and works without fail for x86-32, x86-64 and Aarch64.



I never claimed that one would. I certainly would not. If my job were to develop CUDA software (and I hope I never get there), I would get myself a laptop with an Nvidia GPU. I was merely commenting on a specific post.

Of course, since I don't work with CUDA and none of the tools I use rely on Nvidia's tech, M1 Macs are pretty much the best hardware platform on the market for me right now. They're extremely portable with excellent battery life, unmatched usability, and performance that rivals large desktop workstations (in the workflows I care about), which makes them, as you say, a choice clear to anyone. I mean, why would I choose an x86 platform that ends up being 30-40% slower for my work and has half the usable battery life?
Developing software that ships on a different architecture is actually a standard situation.
No, it is not. It is only typical for developing software for devices that can't themselves be used for software development (like smartphones).
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
Developing software that ships on a different architecture is actually a standard situation.
No, it is not. It is only typical for developing software for devices that can't themselves be used for software development (like smartphones).

Well, duh. And that's exactly why it's a standard situation. Much of the software developed in recent years has been for smartphones.

Anyway, are you developing your software on the supercomputer directly? Or are you developing it on a local workstation that uses a different CPU and OS? How do you think software for all these PowerPC and ARM supercomputers is developed?
 

GrumpyCoder

macrumors 68020
Nov 15, 2016
2,126
2,706
If I get stuck on some issue, there are a lot of other people running the same software and hardware stack. And when I finish doing basic model configuration on my desktop, I can push to a cloud environment as required with minimal changes.
Two interesting things here:

1. It's not just people using the same software/hardware that can help. I do research, so I'm not shipping products, and I regularly have to check what other research groups in the world do. Being able to check out a repo, build it, and run it two minutes later is a huge factor. On macOS I have to fiddle around to get things going, and it's not a small task.

2. For pushing off to the cloud/clusters, Apple doesn't scale (yet?). I can easily deploy to 500 or 1,000 GPUs and more. Not only can I deploy, but I can easily buy these systems and have them ready to go in no time. And then there's the whole Nvidia software stack: no matter whether I want to do autonomous driving, robotics, physical sensor development, medicine, genetics, biology, whatever... Nvidia has a tool for everything that makes life so much easier and saves a ton of time (and I'm not talking about only a few days here).

Nvidia isn't different from Apple; both are trying to lock you into their ecosystem and keep you there. And while Apple is leading for video/photo/music work, anything for gaming, science, and simulation is Nvidia's turf. Apple could hire 50,000 world-leading engineers and wouldn't get close in the next decade. That ship has sailed.
 

mi7chy

macrumors G4
Oct 24, 2014
10,622
11,294
Cross-platform software repository support is still iffy on M1. I had hashcat installed via Homebrew, and although it was an older version (6.1.1 vs. the current 6.2.4), it was working. Found out Homebrew finally updated it to 6.2.4, but after updating, it no longer runs, failing with "no devices found/left".

Update: Successfully downgraded to 6.1.1 and it's semi-working again, using the comment suggestions at the link below.

https://dae.me/blog/2516/downgrade-any-homebrew-package-easily/

Now getting "clCreateKernel(): CL_INVALID_KERNEL" after hashmode 1800 when running the benchmark, so I probably need to downgrade from Monterey to Big Sur.

Update 2: hashcat 6.1.1 fully working with Big Sur 11.6.1. Hopefully this unbreaks other OpenCL apps.
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Why would anyone want to use Macs for data science? College project? Maybe. Real work? No way. The ecosystem is too limited. When your best hardware option is a laptop (or even a desktop at some point) it's just too limiting.
In my experience, Windows sucks for a lot of development because a lot of libraries and packages assume you're running some kind of Unix shell, which works perfectly fine on Linux and macOS. Yes, there's WSL, but it's a pain to set up and not "native".

Then there's Linux, which sucks as a general operating system.

For me, the only choice for development is macOS.
 