
Menneisyys2

macrumors 603
Original poster
Jun 7, 2011
6,003
1,106
[Attached screenshot: Screenshot 2022-01-08 at 21.31.24.png - the benchmark results chart]


Using the multiprocessing benchmark code at the very end of https://realpython.com/python-concurrency/ (look for the code block in the “CPU-Bound multiprocessing Version” section - direct link: https://realpython.com/python-concurrency/#cpu-bound-multiprocessing-version ), I made the tests ten and a hundred times longer by changing the original “20” in “numbers = [5_000_000 + x for x in range(20)]” to 200 and 2000, respectively, and ran the same tests.
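
For reference, here's roughly what the benchmark looks like - a sketch reconstructed from the Real Python article, with the range() change applied (see the linked article for the exact original code):

import multiprocessing
import time


def cpu_bound(number):
    # Pure-Python busy work: sum of squares up to `number`.
    return sum(i * i for i in range(number))


def find_sums(numbers):
    # One worker process per CPU core by default; the list is split across them.
    with multiprocessing.Pool() as pool:
        pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    # The article uses range(20); I changed this to 200 and 2000.
    numbers = [5_000_000 + x for x in range(2000)]
    start = time.time()
    find_sums(numbers)
    print(f"Duration {time.time() - start:.1f} seconds")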

All four Macs were up to date (macOS 12.1), and the same Python 3.10.1 was used everywhere. Apart from PyCharm (running the Python code), only Activity Monitor (macOS) / Task Manager (Windows) was running, to keep an eye on the process count.

The chart (see the attached screenshot) first shows the four models (early 2015 13" MBP / 2018 13" MBP / 2018 MacMini / 2021 MBP 16”) and their CPU configurations. The last four rows show Parallels Pro (latest version!) running on the 2018 MacMini. I’ll compare these figures to some real WinTel figures at some point to see how much worse they are.

As you can see, the 2021 MBP 16” is almost three times faster than the second-fastest machine, the 2018 MacMini; six times faster than the 2018 13" MBP; and more than ten times faster than the early 2015 13" MBP.

Again, this is a CPU-bound multiprocessing benchmark, which heavily (!!!) profits from multiple cores, as it distributes the workload evenly over every single core. Tasks NOT using multiple / all cores (for example web browsing, or non-optimized / non-optimizable code) will NOT show so drastic a speed difference!


All values are in seconds.

Also note that the current Python 3.10.1 runtime is universal; that is, it also includes an ARM binary. (Strange: macOS still forced me to install Rosetta when installing it - dunno why?!)

UPDATE (08/Jan/2022 20:40CET): on my 16” MBP, I installed the current Parallels version (in Trial mode) and let it install the default Win11 ARM. I ran the single pending update in it; with that, it became Windows 11 Home 21H2, build 22000.318. I installed PyCharm CE and Python 3.10.1 (all the latest releases, x86 only).

The benchmark itself ran far better than I anticipated: the 16” was definitely (1.5…2 times) faster than the MacMini. However, PyCharm was really sluggish - borderline useless - while on the MacMini (which runs the x86 build natively) it’s perfectly usable, not much worse than the macOS-native PyCharm running on the macOS host. All in all, while the execution environment (3.10.1) itself definitely surprised me (positively), the downsides of x86 emulation were more than evident WRT PyCharm.

I’ve added the results in a new row starting with “2021 MBP 16” + current Parallels (in Trial mode), Win11 ARM + x86 Python”.


UPDATE (09/Jan/2022 02:04CET): Note: this isn’t related to Python benchmarking but to macOS ARM vs x86 + Rosetta 2 emulation. I tested some files from the 8K MKV file archive at https://drive.google.com/drive/folders/1TSdV36G_npDtjRJze54GEYpdxpBt7nCK (also linked from https://www.avsforum.com/threads/8k-demo-videos.3107418/ ). The x86 VLC version running under Rosetta consistently exhibited about 1.2…1.33 times the CPU usage of the ARM version. I tested with both 30 and 60 fps MKV files. Not even the x86 version dropped any frames while playing the 60p video. Some results (avg CPU usage):

8k30 (“First 8K Video From Space~Orig.mkv”): ARM: 27%, x86: 36%
8k60 (“Bulgaria 8K Hdr 60P Fuhd”): ARM: 37%, x86: 45%

UPDATE (09/Jan/2022 04:22CET): I’ve just posted a separate thread on hacking PyCharm to use the ARM JDK instead of the built-in x86 one for a MAJOR speedup - now it seems to be totally usable!!!

 
Last edited:

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
People who care about performance (particularly multi-core performance) don't use Python. Also, you are not comparing the 2021 MBP with a comparable Intel Mac; your Intel Macs are all pretty low-end performance-wise.

I am also a little confused about why someone would care about Python performance in a Windows 10 VM on a Mac. Python is a cross-platform scripting language.
 

Menneisyys2

macrumors 603
Original poster
Jun 7, 2011
6,003
1,106
I am also a little confused about why someone would care about Python performance in a Windows 10 VM on a Mac. Python is a cross-platform scripting language.
I just wanted to know how much of a perf. hit Parallels causes and whether changing the number of emulated CPU cores has a positive effect.
 

Menneisyys2

macrumors 603
Original poster
Jun 7, 2011
6,003
1,106
People who care about performance (particularly multi-core performance) don't use Python.

I used this Python example because it's platform-independent ( ! ), available in source code form, and can easily be modded.

Also, you are not comparing the 2021 MBP with a comparable Intel Mac; your Intel Macs are all pretty low-end performance-wise.
Well, apart from the extra-expensive Mac Pros, the 2018 MacMini isn't THAT bad - the 2019 16" isn't much better (if at all), perf.-wise...

Of course it would have been best to compare to the latest-and-speediest Mac Pro, I just don't have access to it, unlike those other three Macs.
 

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
I just wanted to know how much of a perf. hit Parallels causes and whether changing the number of emulated CPU cores has a positive effect.
Fair point; performance in VMs is always worse than on the bare metal the VM runs on, but it is useful to know by how much.
 

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
I used this Python example because it's platform-independent ( ! ), available in source code form, and can easily be modded.

Well, apart from the extra-expensive Mac Pros, the 2018 MacMini isn't THAT bad - the 2019 16" isn't much better (if at all), perf.-wise...

Of course it would have been best to compare to the latest-and-speediest Mac Pro, I just don't have access to it, unlike those other three Macs.

The problem is that there is a lot going on between the Python code and the CPU. Python also doesn't support multithreading within the same process and forking off processes is very expensive resource wise.

True, Mac Pro does offer the fastest Intel Mac performance at a huge cost but the 27" iMacs are more reasonable and the best performing Intel Macs that don't cost the same as a car.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
@Menneisyys2 Can you create a public repo with your modified script and a markdown file with your results?

The more results this benchmark has, the more valuable it becomes.
 
  • Like
Reactions: Menneisyys2

Menneisyys2

macrumors 603
Original poster
Jun 7, 2011
6,003
1,106
@Menneisyys2 Can you create a public repo with your modified script and a markdown file with your results?

The more results this benchmark has, the more valuable it becomes.
I added them as attachments to this post. They're the same as the code in the article, except for the 20 -> 200 and 2000 changes. Note: I had to change the file extension from .py to .txt to be able to attach them.
 

Attachments

  • 2000.txt
    423 bytes · Views: 87
  • 200.txt
    422 bytes · Views: 81
  • Like
Reactions: Xiao_Xi

leman

macrumors Core
Oct 14, 2008
19,521
19,678
People who care about performance (particularly multi-core performance) don't use Python.

I have to protest! For example, most of my work is done in R. Sure, I could rewrite all of our pipeline in C++ and get a speedup of at least 100x, but it would likely take me years and make everything unmaintainable and undeployable, not to mention breaking workflows for my entire group (who are scientists, not programmers). There are very few organizations that care about performance only; it's usually just one among many other constraints and concerns.

Given that it is not realistic for us to use a different ecosystem, we are very happy that these new Mac laptops run our scripts 3-4x quicker than their Intel predecessors.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
I could rewrite all of our pipeline in C++ and get a speed up of at least 100x, but it will likely take me years and make everything unmaintainable and undeployable
Every data scientist faces the eternal two-language programming problem.

People who care about performance (particularly multi-core performance) don't use Python.
This benchmark may not show the true potential of a computer, but it helps to build a better picture of it.
 
  • Like
Reactions: ahurst

mi7chy

macrumors G4
Oct 24, 2014
10,625
11,296
M1 and Alder Lake seem to do well on these types of workloads. Would like to see Alder Lake results for comparison.

AMD 5950x isn't much faster.

200: 4.4s
2000: 39.5s
 
Last edited:

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
I have to protest! For example, most of my work is done in R. Sure, I could rewrite all of our pipeline in C++ and get a speedup of at least 100x, but it would likely take me years and make everything unmaintainable and undeployable, not to mention breaking workflows for my entire group (who are scientists, not programmers). There are very few organizations that care about performance only; it's usually just one among many other constraints and concerns.

Given that it is not realistic for us to use a different ecosystem, we are very happy that these new Mac laptops run our scripts 3-4x quicker than their Intel predecessors.
All my work is done in C++ though a decent amount of my team's code base is Python (and there is even some R). We are gradually migrating some of the Python to C++.

I would not call C++ unmaintainable or undeployable but I am happy to concede that it is more challenging to work with than Python.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
Some have argued that the solution to this problem is Swift, though Swift for TensorFlow…
Python has some limitations regarding automatic differentiation, so devs use new languages such as Swift or Julia to experiment with automatic differentiation.

Julia devs also claim that Julia solves the two-language problem.
 

ikramerica

macrumors 68000
Apr 10, 2009
1,658
1,961
Every data scientist faces the eternal two-language programming problem.


This benchmark may not show the true potential of a computer, but it helps to build a better picture of it.
Not only that, but I found the exact same result comparing CPU-bound rendering in ArchiCAD using a 16” M1 Pro vs an older-generation 4-core MBP: 3x faster rendering the same view with the same settings on the same model.

And ArchiCAD is not native yet. So that’s using Rosetta 2.
 

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
Python has some limitations regarding automatic differentiation, so devs use new languages such as Swift or Julia to experiment with automatic differentiation.

Julia devs also claim that Julia solves the two-language problem.
Well, both Swift and Julia use the same LLVM toolchain that the Clang C++ compiler also uses to build platform-native binaries. OTOH, Swift, like C++, is statically typed, whereas Julia, like Python, appears to be dynamically typed. Static typing typically provides better runtime performance.
 
  • Like
Reactions: Xiao_Xi

ahurst

macrumors 6502
Oct 12, 2021
410
815
People who care about performance (particularly multi-core performance) don't use Python.
As a scientific researcher I absolutely care about how well Python performs. I mean, unless some code is horrendously slow and easy to optimize in Cython, we’re going to write our analysis pipelines in Python using Numpy and Scipy where possible.

All sorts of data science and scientific workflows are Python-based (for good reason), so it’s a huge practical benefit when those workflows run fast!
 

resoverlord

macrumors newbie
Nov 2, 2021
18
34
As a scientific researcher I absolutely care about how well Python performs. I mean, unless some code is horrendously slow and easy to optimize in Cython, we’re going to write our analysis pipelines in Python using Numpy and Scipy where possible.

All sorts of data science and scientific workflows are Python-based (for good reason), so it’s a huge practical benefit when those workflows run fast!
People who say Python is slow are repeating something that was true more than 10 years ago. Between pandas, NumPy, and SciPy (all of which have compiled C libraries), Python is becoming (or arguably already is) the language of choice for data scientists, stock traders, and others, because of both how robust it is and how fast it is.
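
To illustrate (a minimal sketch, assuming NumPy is installed; this snippet isn't part of the benchmark in this thread): the same sum-of-squares arithmetic, pushed into NumPy's compiled loops, runs far faster than the pure-Python version:

import time

import numpy as np

n = 1_000_000

# Pure-Python loop, like the cpu_bound() function in the benchmark above.
start = time.time()
py_result = sum(i * i for i in range(n))
print(f"pure Python: {time.time() - start:.3f} s")

# Same arithmetic in NumPy's compiled C loops.
start = time.time()
a = np.arange(n, dtype=np.int64)
np_result = int((a * a).sum())
print(f"NumPy:       {time.time() - start:.3f} s")

assert py_result == np_result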
 

resoverlord

macrumors newbie
Nov 2, 2021
18
34
The problem is that there is a lot going on between the Python code and the CPU. Python also doesn't support multithreading within the same process and forking off processes is very expensive resource wise.

True, Mac Pro does offer the fastest Intel Mac performance at a huge cost but the 27" iMacs are more reasonable and the best performing Intel Macs that don't cost the same as a car.
Python supports multithreading. You do have to contend with the GIL, but it is supported.
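
A minimal sketch of the distinction (reusing the benchmark's cpu_bound function in a hypothetical timing harness, not from this thread): threads all run concurrently, but the GIL lets only one of them execute Python bytecode at a time, so CPU-bound code gets little speedup; separate processes each have their own interpreter and GIL, so they scale with the core count:

import multiprocessing
import threading
import time


def cpu_bound(number):
    return sum(i * i for i in range(number))


def run_with_threads(numbers):
    # Threads share one interpreter (and one GIL): CPU-bound work is serialized.
    threads = [threading.Thread(target=cpu_bound, args=(n,)) for n in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_with_processes(numbers):
    # Separate interpreters, one GIL each: CPU-bound work scales with cores.
    with multiprocessing.Pool() as pool:
        pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    numbers = [5_000_000] * 8
    for label, fn in (("threads", run_with_threads), ("processes", run_with_processes)):
        start = time.time()
        fn(numbers)
        print(f"{label}: {time.time() - start:.1f} s")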
 
  • Like
Reactions: ahurst

mi7chy

macrumors G4
Oct 24, 2014
10,625
11,296
The libraries I mentioned are written in C and accessed via python code. Artificial benchmarks aren’t telling the whole story.

It's not really Python then if it's calling C code. Almost forty years ago we'd call assembly language or machine language code from slow AppleSoft BASIC. That's not AppleSoft BASIC either.
 

Menneisyys2

macrumors 603
Original poster
Jun 7, 2011
6,003
1,106
Added a section (see "UPDATE (09/Jan/2022 02:04CET)") on VLC's ARM vs x86 + Rosetta 2 emulation results on the same 16" base model.
 