Graphic Performance of ARM Mac's - Steam/Cities Skylines

bill-p · Nov 12, 2020

Metal doesn't automagically mean better performance with Rosetta 2:

Before you port your app, test run it under Rosetta translation. When you run an app linked against macOS 10.15 or earlier under Rosetta translation, Metal supports backward compatibility through software workarounds for common programming errors; these workarounds trade some GPU performance for behavior that’s more consistent with Intel-based Macs.

Source: https://developer.apple.com/documentation/metal/porting_your_metal_code_to_apple_silicon

As for how OpenGL is implemented, I think it makes more sense for Apple to actually write a native OpenGL driver rather than an OpenGL-to-Metal wrapper if their goal is to showcase the performance level of the M1.

Note that it's not just OpenGL. OpenCL is also available when targeting the GPU. So I think it's safe to assume Apple also wrote OpenGL and OpenCL drivers for M1... while they were working on Rosetta 2. And all for the sake of "backward compatibility" so that their customers will not detect the slightest hint of something "missing".

I'm sure there will still be compatibility issues and a performance penalty with Rosetta 2. But Apple's effort really deserves a pat on the back here. They could have been like Microsoft: announce a half-complete emulator that doesn't really work, and when it does, it runs like arse.

leman · Nov 12, 2020

bill-p said:
As for how OpenGL is implemented, I think it makes more sense for Apple to actually write a native OpenGL driver rather than an OpenGL-to-Metal wrapper if their goal is to showcase the performance level of the M1.

Note that it's not just OpenGL. OpenCL is also available when targeting the GPU. So I think it's safe to assume Apple also wrote OpenGL and OpenCL drivers for M1... while they were working on Rosetta 2. And all for the sake of "backward compatibility" so that their customers will not detect the slightest hint of something "missing".

Look at it from this perspective: would you rather occupy a team of grizzled, highly paid GPU driver veterans for a couple of months to write drivers for a morally obsolete framework that you have already deprecated, or would you just hire some interns for a couple of weeks and ask them to put together available open source components, while your grizzled driver veterans are working on the Metal drivers (you know, stuff you actually care about)? Figuratively speaking of course, I doubt they used actual interns. But frankly, I would trust myself to write a semi-functional core OpenGL wrapper for Metal in about a month, and I have absolutely no clue about driver development.

Anyway, there is actually more indirect evidence that you are incorrect. OpenCL running on Apple Silicon does not expose the CPU device. If they had written an OpenCL implementation for M1, why wouldn't they expose the CPU? Those cores have some pretty powerful vector units. But if you are running a wrapper on top of Metal and are just recompiling OpenCL kernels to the Metal shading language... well, only exposing the GPU device suddenly makes more sense.

Anyway, as @Ritsuka has already mentioned, this dispute can be easily solved by sampling the OpenGL call stack for an application running on M1.

bill-p · Nov 12, 2020

leman said:
Look at it from this perspective: would you rather occupy a team of grizzled, highly paid GPU driver veterans for a couple of months to write drivers for a morally obsolete framework that you have already deprecated, or would you just hire some interns for a couple of weeks and ask them to put together available open source components, while your grizzled driver veterans are working on the Metal drivers (you know, stuff you actually care about)? Figuratively speaking of course, I doubt they used actual interns. But frankly, I would trust myself to write a semi-functional core OpenGL wrapper for Metal in about a month, and I have absolutely no clue about driver development.

Anyway, there is actually more indirect evidence that you are incorrect. OpenCL running on Apple Silicon does not expose the CPU device. If they had written an OpenCL implementation for M1, why wouldn't they expose the CPU? Those cores have some pretty powerful vector units. But if you are running a wrapper on top of Metal and are just recompiling OpenCL kernels to the Metal shading language... well, only exposing the GPU device suddenly makes more sense.

Anyway, as @Ritsuka has already mentioned, this dispute can be easily solved by sampling the OpenGL call stack for an application running on M1.

A wrapper has potential compatibility issues and is overall much slower than a native driver. Does Apple want to run a wrapper on top of a translation/emulation layer (Rosetta 2)? I highly doubt that. They wouldn't be able to claim a 5x improvement then.

And OpenCL is GPU only:

OpenCL is deprecated, but is available on Apple silicon when targeting the GPU. The OpenCL CPU device is not available to arm64 apps.

Source: https://developer.apple.com/documentation/xcode/porting_your_macos_apps_to_apple_silicon

I think it makes more sense overall for them to provide OpenGL and OpenCL at the driver level. That way, the next Apple Silicon chips will also have great performance and compatibility during the transition. Also it makes sense that it's only available to the GPU with driver implementation. If it was a wrapper, why not provide wrapping for CPU functionality as well? OpenCL compute shaders are much more generic than OpenGL shaders... so it's actually not that much more work to compile them to work on ARM64 if you're already doing a wrapper.

The hint for me was Fusion 360. Last I checked, Autodesk is lazy and has not updated it to run on Metal yet. We can see how well it ran during the demo. That level of performance with a wrapper is not possible for interns.

jeanlain · Nov 12, 2020

bill-p said:
A wrapper has potential compatibility issues and is overall much slower than a native driver.

As far as speed is concerned, I'm not sure that's the case considering the inefficiency of OpenGL. MoltenGL is supposed to be faster than native OpenGL, IIRC.

leman · Nov 12, 2020

bill-p said:
A wrapper has potential compatibility issues and is overall much slower than a native driver.

I find this statement highly dubious. You are arguing that implementing an incredibly complex driver from scratch will yield a more stable interface than implementing it on top of an already stable and well behaved base. What is your basis for claiming that? Don’t forget that one of the reasons Apple deprecated OpenGL was because the drivers were notoriously difficult to write and maintain.

bill-p said:
And OpenCL is GPU only:

Source: https://developer.apple.com/documentation/xcode/porting_your_macos_apps_to_apple_silicon

That’s what I said, yes. Intel Macs support OpenCL for both the CPU and the GPU.

bill-p said:
If it was a wrapper, why not provide wrapping for CPU functionality as well? OpenCL compute shaders are much more generic than OpenGL shaders... so it's actually not that much more work to compile them to work on ARM64 if you're already doing a wrapper.

No, it’s not much work, but still considerably more work than using an open source tool to translate an OpenCL kernel to a Metal compute shader and calling it a day.

The very fact that OpenCL on ARM does not expose the CPU clearly indicates that it’s not a fully fledged OpenCL implementation but a minimal compatibility layer.

bill-p said:
That level of performance with a wrapper is not possible for interns.

We are literally talking about a simple piece of software that tracks the OpenGL state machine and remaps gl calls into Metal API calls. There is nothing complicated about it. The performance impact will be solely the bookkeeping overhead inherent to the state machine - you know, the reason why OpenGL was canned.

Yebubbleman · Nov 12, 2020

leman said:
OpenGL is implemented on top of Metal on ARM Big Sur, so I wouldn't expect any noteworthy performance penalty. Of course, OpenGL is a less efficient API, but that is also true for Intel-based Macs. The WWDC presentations, if I remember correctly, mentioned that they are adding a compatibility layer to both Metal and OpenGL applications to guard against common API misuse.

But regardless of what API your graphical application uses — it is likely to get a speed boost under Apple Silicon simply because the GPU is faster. Even if there is some penalty here and there.

You're missing my point; Apple has said that 64-bit Intel games and apps that employ Metal will not see speed hits on GPU performance the way that 64-bit Intel games and apps that don't employ Metal will see. That's why Shadow of the Tomb Raider was shown off at the WWDC20 keynote. It looked flawless in Rosetta 2 and the reason why is that it used Metal.

bwillwall · Nov 12, 2020

Wait for somebody to test it before you buy one. We don’t know if complex games will all work on Rosetta 2 properly.

bill-p · Nov 12, 2020

jeanlain said:
As far as speed is concerned, I'm not sure that's the case considering the inefficiency of OpenGL. MoltenGL is supposed to be faster than native OpenGL, IIRC.

That's the claim. Reality is different and varies case by case. Read this blog post from the Roblox developers:

3 Years of Metal - Roblox Blog

Three years ago, we ported our renderer to Metal. It didn’t take much time, it was a blast and it worked really well on iOS.

blog.roblox.com

Also, it's worth noting that OpenGL ES is not the same as OpenGL. OpenGL is vastly more complex. MoltenGL only supports up to OpenGL ES 2.0.

leman said:
I find this statement highly dubious. You are arguing that implementing an incredibly complex driver from scratch will yield a more stable interface than implementing it on top of an already stable and well behaved base. What is your basis for claiming that? Don’t forget that one of the reasons Apple deprecated OpenGL was because the drivers were notoriously difficult to write and maintain.

That’s what I said, yes. Intel Macs support OpenCL for both the CPU and the GPU.

No, it’s not much work, but still considerably more work than using an open source tool to translate an OpenCL kernel to a Metal compute shader and calling it a day. The very fact that OpenCL on ARM does not expose the CPU clearly indicates that it’s not a fully fledged OpenCL implementation but a minimal compatibility layer.

We are literally talking about a simple piece of software that tracks the OpenGL state machine and remaps gl calls into Metal API calls. There is nothing complicated about it. The performance impact will be solely the bookkeeping overhead inherent to the state machine - you know, the reason why OpenGL was canned.

OpenGL drivers were not developed by Apple. For recent Mac computers, they were written by Intel and AMD. And I have to add: neither Intel nor AMD is known for writing good OpenGL drivers. nVidia provides the best OpenGL performance by far, and it shows: the OpenGL vs. Metal performance gap is not significant on Mac computers with nVidia GPUs, as opposed to more recent Macs with Intel or AMD GPUs. That's even considering that Metal is lower level, so its performance advantage comes from having less overhead.

By the same token, what is your basis for claiming that a translation layer is good enough and has no performance or compatibility drawback?

OpenGL may not be faster than Metal by enough for this kind of thing to matter, but... OpenCL was faster than Metal in some cases:

OpenCL versus Metal > DaVinci Resolve

real world speed test results for performance minded Macintosh users

barefeats.com

And at this point, your argument is this: since Metal is stable, translating OpenGL to Metal is more stable than writing an OpenGL driver from scratch.

I don't think that's how it works. You yourself stated that OpenGL is far too complex. So is writing a translation layer and mapping all OpenGL functions to some equivalence in Metal easier? Or is it easier to just write the drivers, even if it's half-assed?

Actually, here's a more realistic question: if a game has a rendering bug, is it because of the game itself? Or because of OpenGL? Or because of Metal? How do you even debug that?

That same question would be far easier to address if Apple was working on an OpenGL driver.

Same goes for OpenCL. And in fact, what is this open-source tool to translate OpenCL to Metal you are talking about? I'd love to know since porting OpenCL is still very much a manual process for me.

jeanlain · Nov 12, 2020

I couldn't find the link, but some have claimed that OpenGL ES has been implemented on top of Metal since the A11 (iPhone 8, I believe).
I'm with leman: I would be surprised if Apple bothered to write drivers for a deprecated API. The fact that there is no software fallback for OpenGL, just like Metal, supports the hypothesis that OpenGL on the M1 is a wrapper around Metal.
And why should Apple care about OpenGL performance? It's deprecated! Apple wants OpenGL to die. They want 32-bit to die. Apple broke compatibility with many apps and games by releasing Catalina. Do you think they care about OpenGL performance?

Using a Metal wrapper would also allow Intel OpenGL apps to run on the M1 without much effort, as Metal calls are passed directly to the GPU. Apple never said that OpenGL calls in an Intel app are passed directly to the Apple GPU.

We'll know when we get our hands on these new Macs.

bill-p · Nov 12, 2020

You mean this?

An Apple sandwich

Is graphics more reliable on newer iPhones?

medium.com

There's a partial crashlog that he posted when OpenGL ES crashed on the iPhone X/8, and guess what? It didn't reference any call to Metal.

In fact, his whole premise was that because his app crashed on Metal in the same manner, it must be crashing in OpenGL ES because OGLES is being wrapped to Metal. There are two comments that would agree with me. Honestly, there's not much for us to see there.

Also as above: it's much harder to write a wrapper than you think. When you have crashes, sure, you can fix them. But if you run into a rendering error? There isn't any easy way to know what caused the issue. Did the original code always have this bug? Is it because OpenGL is wrong? Is it because your wrapper is doing something wrong? Or hell, is it because Metal is doing something wrong? And actually, do you even want to deal with that on top of potentially wondering if it's also an Intel > ARM translation issue?

Writing a driver is easier: if a rendering error occurs, it's either the original code (or translated code), or the driver. Plain and simple.

Also, OpenGL is very complex, so why does that make writing a wrapper any easier than writing a driver? Just to note: there are other projects that attempt to wrap OpenGL to some other APIs as well. Those never got too far. MoltenGL only supports up to OpenGL ES 2.0. There's also ANGLE, which currently has only partial OpenGL ES 3.0 support.

If writing a wrapper was such an easy task, you'd think those folks would have been able to achieve full compatibility ages ago.

On the other hand, I don't see GPU manufacturers running into any issue writing new OpenGL drivers. Hell, Mesa exists, and it kinda proves my point.

bossy22 · Nov 13, 2020

AltecX said:
That would mean it will also benchmark similar to Intels new Xe graphics.

I am not sure. In raw benchmarking maybe, but it'll be so optimised for Big Sur/FCPX that it'll blow Xe out of the water.

leman · Nov 13, 2020

bill-p said:
You mean this?

An Apple sandwich

Is graphics more reliable on newer iPhones?

medium.com

There's a partial crashlog that he posted when OpenGL ES crashed on the iPhone X/8, and guess what? It didn't reference any call to Metal.

The article you linked literally discusses that OpenGL on iOS is being implemented on top of Metal. And yes the crashlog directly references `AppleMetalGLRenderer 0x00000001a0d1f0f4`. If they use a Metal wrapper on ARM iOS, why would they use a direct driver on ARM macOS?
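Checking a log for that kind of frame can be done with a small filter. The following is a rough Python sketch, not a real crash-report parser: the frame-line layout, the `frames_from_image` helper, and every line of the sample excerpt other than the `AppleMetalGLRenderer` address quoted above are assumptions made up for illustration.

```python
import re

# Assumed frame layout: "<index> <image name> <hex address> ...".
# Real crash reports carry more columns; this only looks at the first three.
FRAME_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)")


def frames_from_image(log_text, needle):
    """Return (index, image, address) for frames whose image name contains `needle`."""
    hits = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match and needle.lower() in match.group(2).lower():
            hits.append((int(match.group(1)), match.group(2), match.group(3)))
    return hits


# Hypothetical excerpt: only the AppleMetalGLRenderer address comes from the
# linked article; the frame indices and the ExampleApp line are invented.
SAMPLE_LOG = """\
Thread 0 Crashed:
0   AppleMetalGLRenderer  0x00000001a0d1f0f4
1   ExampleApp            0x0000000100004a10
"""

if __name__ == "__main__":
    for index, image, address in frames_from_image(SAMPLE_LOG, "Metal"):
        print(f"frame {index}: {image} at {address}")
```

The same filter could be pointed at a sampled call stack from an OpenGL app on the M1, which is the test suggested earlier in the thread.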

bill-p said:
Also as above: it's much harder to write a wrapper than you think. When you have crashes, sure, you can fix them. But if you run into a rendering error?

bill-p said:
I don't think that's how it works. You yourself stated that OpenGL is far too complex. So is writing a translation layer and mapping all OpenGL functions to some equivalence in Metal easier? Or is it easier to just write the drivers, even if it's half-assed?

I can claim fairly good familiarity with both the OpenGL and Metal APIs. Yes, OpenGL is more complex, due to its more complex specification and the interactions of its internal API states. It's a bad API from the perspective of API design. But the "rules" for rendering between Metal and OpenGL are the same. A wrapper would simply take care of tracking that complex OpenGL state and generating Metal calls. It's not a difficult thing to do if you understand the APIs. The biggest challenge is the shaders, but that is a solved problem, with good-quality shader transpilers existing and being actively used in production.

Basically, you have two problems to solve when implementing OpenGL. First is implementing the OpenGL state machine and its API. Second is mapping the changes in the OpenGL state machine to your rendering hardware. Using Metal (a robust, simple and predictable interface) for the second step is much less work than writing a full low-level driver. And the reliability is likely to be better, because, well, Metal is robust, simple and predictable. It is harder to introduce a bug when you work with a well-behaved software layer that even provides debugging support than when you try to program a hardware GPU interface directly.
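To make that two-step split concrete, here is a toy sketch in Python. It is not Apple's implementation and it calls no real OpenGL or Metal API: `GLStateTracker`, `FakeMetalBackend`, and all their methods are invented names. The tracker plays the role of the frontend that holds mutable OpenGL-style state, and the backend stands in for an explicit API where state is baked into pipeline objects.

```python
class FakeMetalBackend:
    """Stand-in for a Metal-like backend: it only records explicit commands."""

    def __init__(self):
        self.commands = []
        self.pipeline_cache = {}

    def pipeline_for(self, key):
        # Build a pipeline object once per distinct state combination.
        if key not in self.pipeline_cache:
            self.pipeline_cache[key] = f"pipeline#{len(self.pipeline_cache)}"
            self.commands.append(("create_pipeline", key))
        return self.pipeline_cache[key]

    def set_pipeline(self, pipeline):
        self.commands.append(("set_pipeline", pipeline))

    def draw(self, vertex_count):
        self.commands.append(("draw", vertex_count))


class GLStateTracker:
    """Step 1: track OpenGL-style state. Step 2: translate it at draw time."""

    def __init__(self, backend):
        self.backend = backend
        self.state = {"program": None, "blend": False, "depth_test": False}
        self.bound_pipeline = None

    def use_program(self, program):
        self.state["program"] = program

    def enable(self, cap):
        self.state[cap] = True

    def disable(self, cap):
        self.state[cap] = False

    def draw_arrays(self, vertex_count):
        # State changes are cheap bookkeeping; the backend only sees the
        # combined state when a draw actually happens.
        key = (self.state["program"], self.state["blend"], self.state["depth_test"])
        pipeline = self.backend.pipeline_for(key)
        if pipeline != self.bound_pipeline:
            self.backend.set_pipeline(pipeline)
            self.bound_pipeline = pipeline
        self.backend.draw(vertex_count)
```

In this sketch, two draws with unchanged state reuse the cached pipeline, and toggling `blend` between draws creates and binds a second one. A real wrapper would track far more state, but the shape of the bookkeeping is the same.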

bill-p said:
Writing a driver is easier: if a rendering error occurs, it's either the original code (or translated code), or the driver. Plain and simple.

I think you are trivializing the issue. With a wrapper, the fault lies either in the application, the OpenGL frontend, or the Metal backend. With a native driver, the fault lies either in the application, the OpenGL frontend, or the driver backend. The only difference is that you already have high quality backend implementation if you are writing a wrapper.

You seem to think that an OpenGL driver is this monolithic thing, but it's not. As I said before, OpenGL is a bad API. It doesn't match the hardware behavior. Every OpenGL implementation needs a complex frontend that translates the OpenGL state into the hardware state. Metal also needs a frontend of course, but it's much simpler, since the Metal API much more closely reflects the states of the actual hardware.

bill-p said:
Also, it's worth noting that OpenGL ES is not the same as OpenGL. OpenGL is vastly more complex. MoltenGL only supports up to OpenGL ES 2.0.

Apple only supports a frozen, limited subset of OpenGL which is not that far off from OpenGL ES.

bill-p said:
By the same token, what is your basis for claiming that a translation layer is good enough and has no performance or compatibility drawback?

It will obviously have performance overhead. But that overhead will be fairly minimal since Metal itself is low-overhead. As I wrote above, the trick is translating OpenGL state changes into something hardware can work with. That is where most of the overhead comes from. If you already need to walk through multiple lookup tables and validate a bunch of partial states, an additional function call or two doesn't play a big role. There also might be compatibility issues, but nobody cares much as long as it's not too bad.

bill-p said:
OpenGL may not be faster than Metal by enough for this kind of thing to matter, but... OpenCL was faster than Metal in some cases:

OpenCL versus Metal > DaVinci Resolve

real world speed test results for performance minded Macintosh users

barefeats.com

Implementation details, and of course, the hardware they use did ship with a native OpenCL driver.

bill-p said:
Also, OpenGL is very complex, so why does that make writing a wrapper any easier than writing a driver?

Because you literally have to do half the work. And it's the other half (writing the hardware interface) that is really error-prone and laborious.

bill-p said:
Just to note: there are other projects that attempt to wrap OpenGL to some other APIs as well. Those never got too far.

Depends on the scope of the project I suppose. And on the interest. These wrappers focus on OpenGL ES because desktops generally don't support OpenGL ES, and it's a nice thing to have because you can write code that targets both desktop and mobile.

bill-p said:
Same goes for OpenCL. And in fact, what is this open-source tool to translate OpenCL to Metal you are talking about? I'd love to know since porting OpenCL is still very much a manual process for me.

I already posted a link, but here you go again:

https://github.com/KhronosGroup/SPIRV-Cross

OpenCL kernels themselves can be compiled to SPIR-V using clang; it's one of the supported targets.

bill-p · Nov 13, 2020

Well, I guess we'll have to wait and see what solution Apple ships with M1.

I'll concede that I was wrong about OpenGL ES implementation on iOS. It does seem that Apple started wrapping OpenGL ES to Metal around the time the iPhone 8 was introduced. But this solution doesn't seem to have high compatibility.

And as you said, it probably doesn't matter in the grand scheme. OpenGL is deprecated and Metal should replace it.

All of this to me just means Apple may not be highly committed to OpenGL compatibility with older apps. While this may be fine for apps that will be updated soon (if they have not already been updated), there are dinosaurs that may not age so gracefully. Older games, CAD apps, some obscure 3D applications may not work so well with Rosetta 2 then.

App compatibility was one of the pain points of the last transition, and I'm hoping it won't be as bad this time. But... again, we'll see.

leman · Nov 13, 2020

bill-p said:
All of this to me just means Apple may not be highly committed to OpenGL compatibility with older apps. While this may be fine for apps that will be updated soon (if they have not already been updated), there are dinosaurs that may not age so gracefully. Older games, CAD apps, some obscure 3D applications may not work so well with Rosetta 2 then.

Many of those apps probably don’t work on Catalina. The 32-bit cut hit legacy software hard.

Personally, I’m quite confident that if it runs well on Catalina, it will most likely run under Rosetta on M1 Big Sur.

Homy · Nov 13, 2020

One upside of M1 is that when you choose more RAM at purchase you automatically upgrade your VRAM too. So an iMac with 32 GB can have 20-30 GB VRAM.

bill-p · Nov 13, 2020

leman said:
Many of those apps probably don’t work on Catalina. The 32-bit cut hit legacy software hard.

Personally, I’m quite confident that if it runs well on Catalina, it will most likely run under Rosetta on M1 Big Sur.

Yeah, some of the apps I use tend to make use of OpenCL and OpenGL. To name a few:
Autodesk Fusion 360
Capture One Pro

And then there are the obscure Java-based ones as well.

I have hopes they will work well. At least Fusion 360 was demoed. But it remains to be seen how the others will work. Their developers are unlikely to update the apps this holiday... and it may be a year or two before they even try to support M1 natively, so Rosetta 2 is all I can count on.

Homy · Nov 14, 2020

In graphics (not compute) tasks, the M1 could be on par with the Radeon Pro 570X (and way ahead of the 560X). The TBDR architecture of the M1 (for which Metal has been tailored) benefits graphics more than it benefits compute. Also, Apple GPUs can use 16-bit AND 32-bit numbers in shaders, for precision and to boost efficiency, which PC GPUs can't.

That's great! I suspected that since it can render more pixels/s:

M1 41 GPixel/s, 82 GTexel/s
Pro 560X 16.06 GPixel/s, 64.26 GTexel/s
Pro 570X 35.36 GPixel/s, 123.8 GTexel/s
Pro 580X 38.4 GPixel/s, 172.8 GTexel/s
Pro 5300 52.8 GPixel/s, 132 GTexel/s
Pro 5500 XT 56.22 GPixel/s, 168.7 GTexel/s
Pro 5700 86.4 GPixel/s, 194.4 GTexel/s
Pro 5700 XT 95.94 GPixel/s, 239.8 GTexel/s
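For anyone who wants the ratios rather than the raw numbers, a few lines of Python will do the division. This is only arithmetic on the figures listed above; `FILL_RATES` and `relative_to` are names made up for this sketch, and fill rate alone says nothing definite about real game performance.

```python
# (GPixel/s, GTexel/s) exactly as listed in the post above.
FILL_RATES = {
    "M1": (41.0, 82.0),
    "Pro 560X": (16.06, 64.26),
    "Pro 570X": (35.36, 123.8),
    "Pro 580X": (38.4, 172.8),
    "Pro 5300": (52.8, 132.0),
    "Pro 5500 XT": (56.22, 168.7),
    "Pro 5700": (86.4, 194.4),
    "Pro 5700 XT": (95.94, 239.8),
}


def relative_to(baseline, rates=FILL_RATES):
    """Return {gpu: (pixel_ratio, texel_ratio)} of `baseline` versus each other GPU."""
    base_px, base_tx = rates[baseline]
    return {
        name: (round(base_px / px, 2), round(base_tx / tx, 2))
        for name, (px, tx) in rates.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, (px, tx) in relative_to("M1").items():
        print(f"M1 vs {name}: {px:.2f}x pixel rate, {tx:.2f}x texel rate")
```

On these numbers the M1 is ahead of the 570X on pixel rate but behind it on texel rate, so "on par" depends on which of the two a given workload leans on.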

vladi · Nov 14, 2020

Homy said:
One upside of M1 is that when you choose more RAM at purchase you automatically upgrade your VRAM too. So an iMac with 32 GB can have 20-30 GB VRAM.

if you run Chrome you can kiss goodbye at least 8GB

Sanpete · Nov 14, 2020

Homy said:
One upside of M1 is that when you choose more RAM at purchase you automatically upgrade your VRAM too. So an iMac with 32 GB can have 20-30 GB VRAM.

So that's why it costs double!

Maximara · Nov 16, 2020

aliensporebomb said:
Why no transparency about clock speed? Does clock speed not matter anymore with this new design?

Because it is so different that clock speed doesn't mean jack.

Maximara · Nov 16, 2020

Shivetya said:
I have a near complete library of Paradox Games and I certainly won't go to AS if I have to give them up. I am not a believer in having two machines to do what one should do. So I am waiting to see what reviews bring. I would hope Big Sur has a method to ID programs that should work or won't work while still on Intel

I remember reading (back in 2018) that many Paradox games were 32-bit, which means Intel Macs that require 10.15 or higher won't run them. There is a program called Go64 which looks inside programs and flags those with partial 32-bit code (which could cause problems) with a warning. Quite frankly, I wish programmers had taken the hints that Apple was going to kill 32-bit (I seem to recall this speculation back in 2015) and future-proofed their code.

In some cases (such as Sim3) the developers were so lazy that the only way to get the program to 64-bit was to basically rewrite the thing.
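As a rough illustration of what "looking inside a program" can mean, the first four bytes of a Mach-O executable already say whether it is 32-bit, 64-bit, or a universal (fat) binary. The Python sketch below only reads that magic number; it is an assumption-laden toy and not a description of how Go64 works internally. The function names are invented here, and the magic values are the ones defined in the `mach-o/loader.h` and `mach-o/fat.h` headers.

```python
import struct

# Magic numbers as they appear when the first 4 bytes are read big-endian.
MAGIC_KINDS = {
    0xFEEDFACE: "32-bit",     # MH_MAGIC
    0xCEFAEDFE: "32-bit",     # MH_CIGAM (byte-swapped)
    0xFEEDFACF: "64-bit",     # MH_MAGIC_64
    0xCFFAEDFE: "64-bit",     # MH_CIGAM_64 (byte-swapped)
    0xCAFEBABE: "universal",  # FAT_MAGIC
    0xBEBAFECA: "universal",  # FAT_CIGAM (byte-swapped)
}


def classify_macho_header(header: bytes) -> str:
    """Classify a file by its leading magic number; 'unknown' if not recognised."""
    if len(header) < 4:
        return "unknown"
    (magic,) = struct.unpack(">I", header[:4])
    return MAGIC_KINDS.get(magic, "unknown")


def classify_file(path: str) -> str:
    """Read the first 4 bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return classify_macho_header(handle.read(4))
```

Two caveats: a "universal" result still needs each slice inspected to find any 32-bit code, and Java class files share the `0xCAFEBABE` magic, so this check alone can misfire on them.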

Jaekae · Nov 16, 2020

The new entry-level M1 Macs seem as fast as the high-end versions of the latest Intel models when using Rosetta 2.

Maximara · Nov 16, 2020

bill-p said:
You're thinking of CrossOver/WINE.

Since WINE is not emulation, but rather kind of a mix between translation and wrappers, theoretically, it should work without fanfare with Rosetta 2.

And... well, we'll see. I have some games on GoG that use CrossOver in order to work on a Mac, so I can give them a try.

That is not how WINE works. It is effectively a Windows x86 to macOS x86 interpreter. Moreover, as of this writing and outside of CodeWeavers, it isn't compatible with 64-bit-only macOS. It has the same issues that VirtualBox does: there is nothing in it that allows for x86 to ARM, and it is making calls that Rosetta 2 doesn't handle.

iMacedonian · Nov 17, 2020

So what’s the verdict. Can Cities Skylines run on M1? That’s the only thing I’m waiting to find out before I make a purchase

Tafkaeken · Nov 17, 2020

iMacedonian said:
So what’s the verdict. Can Cities Skylines run on M1? That’s the only thing I’m waiting to find out before I make a purchase

Steam worked, but none of the games seemed to start in the live unboxing "Heads of tech" is doing on YouTube. It could be that the Steam client isn't a fan of Rosetta's translation at first start.
