
leman

macrumors Core
Oct 14, 2008
19,516
19,664
If you're going to hand-optimise your code for a specific processor implementation, you need access to the specific hardware (meatspace or virtual) for testing/tweaking at some point... and the issue is just as true of x86 if you're comparing i3 vs. i9 vs. Xeon vs. AMD. Apple can't wave a magic wand and make that not so. However - 90% of projects shouldn't need that, and even when they do, 90% of the development work can still be CPU independent.

...heck, a lot of today's work is being done in Python or Javascript anyway...

It's tempting to say "OK, we're losing x86 compatibility, but we're gaining ARM compatibility!" but, really, just as the need to have an x86 laptop/desktop to develop for x86 targets can be overstated, the advantages of having an ARM laptop/desktop can also be overstated. The vast majority of modern development work is CPU independent (esp. if 64 bits, little-endian is a given and you're targeting modern OSs with good hardware abstraction frameworks) and the fractions-of-a-project that are the exceptions to that rule can't/shouldn't realistically be done without access to the real hardware.

You are right of course about all these things. As you say, having an ARM machine is probably less relevant for a normal developer who deploys to ARM, but it is crucial for library developers. Given that CPU semantics (e.g. with regard to memory ordering) differ between x86 and ARM, it is really helpful if your local machine has the same semantics when you are trying to debug complex multi-threaded code.

I suppose the point I am trying to make is that new Macs will offer easy access to reasonably performant ARM hardware, and that this will dramatically improve the software ecosystem for ARM, helping it gain traction in the server market.
 

theluggage

macrumors G3
Jul 29, 2011
8,009
8,443
Given that CPU semantics (e.g. with regard to memory ordering) differ between x86 and ARM

Not sure what you mean there - they're both 64-bit, little-endian (well, technically ARM has big- and little-endian modes, but unless Apple are very, very 'courageous' macOS will run little-endian, as do, AFAIK, most current Linux distros)... and you'll most likely be running the same compiler/interpreter too (e.g. clang or gcc), so structure alignment shouldn't be an issue. Whatever - nothing will ever beat the actual target hardware for fine tuning.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Not sure what you mean there - they're both 64-bit, little-endian (well, technically ARM has big- and little-endian modes, but unless Apple are very, very 'courageous' macOS will run little-endian, as do, AFAIK, most current Linux distros)... and you'll most likely be running the same compiler/interpreter too (e.g. clang or gcc), so structure alignment shouldn't be an issue. Whatever - nothing will ever beat the actual target hardware for fine tuning.

I mean things like x86-64 using a strong memory model while ARM64 uses a weak one. This can have a big impact if you are a library writer working on fine-grained multi-threaded code. I am sure there are other little bits that differ between architectures.

There is no question that fine-tuning can't really be done without access to specific target hardware. But we are not talking about fine-tuning here, we are talking about development itself. Another example: should upcoming Apple hardware support SVE, you can prototype and test the code using SVE intrinsics. When you then deploy to a HPC cluster, this code will automatically take advantage of the target hardware. If you must, this is the point at which you can fine-tune. But having an ARM laptop would speed up the development process a lot, since you don't need to constantly SSH to a remote machine to run your tests.
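To make the SVE point concrete: whether Apple's chips will expose SVE is an assumption here (they currently ship NEON), but SVE code is vector-length-agnostic by design, so something like the sketch below, prototyped on a laptop, would run unchanged on a cluster with wider vectors. This is an untested sketch using the standard ACLE intrinsics (it needs an SVE-enabled toolchain, e.g. `-march=armv8-a+sve`); the function and variable names are illustrative:

```cpp
#include <arm_sve.h>   // ACLE SVE intrinsics
#include <cstddef>

// Vector-length-agnostic float addition: dst[i] = a[i] + b[i].
// svcntw() reports how many 32-bit lanes this particular CPU has, and the
// while-less-than predicate masks off the tail, so the same binary adapts
// to any SVE vector width without recompilation.
void add_arrays(float* dst, const float* a, const float* b, std::size_t n) {
    for (std::size_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_u64(i, n);           // active-lane predicate
        svfloat32_t va = svld1_f32(pg, a + i);           // predicated loads
        svfloat32_t vb = svld1_f32(pg, b + i);
        svst1_f32(pg, dst + i, svadd_f32_x(pg, va, vb)); // add and store
    }
}
```

Fine-tuning for a specific implementation's vector width or cache sizes could then happen on the target hardware, as you say.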
 

ADGrant

macrumors 68000
Mar 26, 2018
1,689
1,059
You are right of course about all these things. As you say, having an ARM machine is probably less relevant for a normal developer who deploys to ARM, but it is crucial for library developers. Given that CPU semantics (e.g. with regard to memory ordering) differ between x86 and ARM, it is really helpful if your local machine has the same semantics when you are trying to debug complex multi-threaded code.

I suppose the point I am trying to make is that new Macs will offer easy access to reasonably performant ARM hardware, and that this will dramatically improve the software ecosystem for ARM, helping it gain traction in the server market.

Well, a server-side developer deploying to ARM clusters is probably deploying to Linux/Docker, so the developer is going to want to run Linux ARM VMs and Docker images on his or her local Mac. That's one of the advantages of using a Mac for development right now: it's a lot like using Linux, but you have all the Mac apps and the integration with the Apple ecosystem.
 

vigilant

macrumors 6502a
Aug 7, 2007
715
288
Nashville, TN
This is out of my wheelhouse, so I'm just asking, but here's what I had in mind:

Suppose a developer is trying to optimize code for maximum performance, and he's comparing two ways of getting the same thing done, which we'll call code block A and code block B. It so happens that (without the dev's knowledge*), block A makes use of Apple's extended instruction set, while block B does not. As a consequence, code incorporating block A is faster on the AS Mac Pro than that using block B.

However, when deployed on the cluster (which does not have Apple's extended instruction set), B is faster than A.

Hence the AS Mac Pro would not be a good tool for doing such development testing unless it were possible to implement that same extended instruction set on the ARM cluster itself.

*You wrote that the extensions aren't visible to the user, suggesting to me that devs could be unaware which instructions are actually used to implement their code.

***************
Come to think of it, if you're spending tens or hundreds of millions for a new supercomputer, wouldn't it make sense to purchase, along with it, several custom workstations (for development work) that use the same processors, and as close as possible to the same architecture, as the supercomputer itself? It seems that's the only way you could do processor-specific (e.g., accounting for the sizes of the various levels of cache) and architecture-specific optimizations using workstations.

The extensions are hidden from the user, yes. They are only called from Apple's own frameworks. If you are looking to do HPC on a server, you would have to assume the Accelerate framework wouldn't be there, since the framework is made by Apple for Apple platforms and isn't intended to run on something like Linux.
 