This is fine for developing the early stages of apps but is not actually all that useful for performance testing (so you do all that on device). It works for iOS as there is generally no architecture-specific code. However, it is not good for testing architecture transitions: you are still running on x86. It's much easier to imagine apps on macOS using architecture-specific code due to their longer history. They need real ARM hardware to test on.
I was imagining it being much slower than the actual device, yes. But I don't really see why performance testing would matter. A dev box wouldn't really reflect shipping products' performance characteristics either.
But you can already run ARM code on your Mac, and it's actually not that bad performance-wise. As part of my course on computer architecture I've written ARM assembly, and it ran at pretty respectable speeds on my Mac through QEMU. And sure, the iOS Simulator is x86. - Do you know if the Android Emulator is as well? I mean, I know that Android can run on x86, but the emulator allows using OS images from devices that are definitely ARM. - That runs fine as well.
I don't see a big need for performance testing. You can't really meaningfully use performance testing on a dev platform to determine anything about the shipping product. - Other than perhaps comparing different algorithms' relative performance on the architecture, but really that shouldn't be so different to x86. Let the compiler do most of the heavy lifting.
The only time that could be different is if you're handcrafting assembly. In that case you could test on another ARM platform like a Raspberry Pi to get general hints about execution time. If that's not good enough because you want to target Apple's specific ARM implementation, a dev box wouldn't be good enough either: it might not, for example, have the same caches as the shipping product, so your hyper-specific testing wouldn't reflect the real thing.
But you shouldn't really handcraft assembly anyway. If you know you want to use a specific x86 instruction, it's much better to use your compiler's intrinsics or builtins. In some cases the compiler can then generate a fallback for you when compiling for platforms that don't have the instruction.
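To make that concrete, here's a minimal sketch using population count as the example instruction. It assumes GCC or Clang, where `__builtin_popcountll` is a real builtin: when the target has a hardware popcount instruction enabled the compiler can emit it, and otherwise it emits a software fallback. The function name `count_bits` and the manual loop for other compilers are just my illustration, not anything from a specific project.

```c
#include <stdint.h>

/* Count the set bits in x.
 * Assumption: on GCC/Clang we use __builtin_popcountll and let the
 * compiler choose between a hardware instruction and a software
 * fallback for the target. On other compilers we fall back to a
 * portable loop ourselves. */
int count_bits(uint64_t x) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(x);
#else
    int n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
#endif
}
```

The same source compiles for x86 and ARM without any changes, which is the whole point: calling `count_bits(0xFF)` gives 8 on either one, and you never wrote a line of assembly.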
Here's an actual performance comparison, run on my Mac. It's testing the Collatz conjecture on 10 million numbers and finding the largest stopping time. Left side is ARM, right side is x86.
So yeah, you can't expect the actual speed of the final product to be this, but you also couldn't do that with a dev box. This is fine for making sure your programs work, and it's plenty fast for that.
The reason there are two results for each is that I ran both a recursively defined version and a loop version. - Both are compiled with GCC at -O3.
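For anyone who wants to try something similar, here's a rough sketch of what the two versions of such a test could look like. This is not the exact code I ran, just an illustration: the function names and the `steps_fn` typedef are made up for the example, and I'm assuming 64-bit integers so the intermediate values don't overflow for starting numbers up to 10 million.

```c
#include <stdint.h>

/* Recursive version: number of steps until n reaches 1. */
unsigned collatz_steps_recursive(uint64_t n) {
    if (n <= 1) return 0;
    return 1 + collatz_steps_recursive((n % 2 == 0) ? n / 2 : 3 * n + 1);
}

/* Loop version: same computation without recursion. */
unsigned collatz_steps_loop(uint64_t n) {
    unsigned steps = 0;
    while (n > 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}

/* Pointer type so the same driver can run either version. */
typedef unsigned (*steps_fn)(uint64_t);

/* Largest stopping time among the starting values 1..limit. */
unsigned longest_stopping_time(uint64_t limit, steps_fn steps) {
    unsigned best = 0;
    for (uint64_t n = 1; n <= limit; n++) {
        unsigned s = steps(n);
        if (s > best) best = s;
    }
    return best;
}
```

You'd call `longest_stopping_time(10000000, collatz_steps_loop)` and the same with `collatz_steps_recursive` from a `main`, time both, and build the file once for each architecture. Running the ARM binary on an x86 Mac would then go through an emulator like QEMU.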