
leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
Isn't a lot of scientific computing using clusters now, where a job unexpectedly going multithreaded would be a problem? I've seen people queuing one simulation/process per core, for example (with different parameters). Having each of those processes spawn 16 threads at some point would be hilarious (I don't think it would crash, just that there would be a lot of frivolous context switches).

Depends on how your system works, I suppose. The supercomputer I am currently working with relies on VM provisioning, so clusters report as many cores as the VM is configured with, even if multiple nodes end up running on the same hardware.
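For anyone curious how a job even decides how many threads to spawn: most libraries just ask the OS for the visible core count, which on a VM is whatever the VM was configured with. A minimal Python sketch to see both numbers (note that sched_getaffinity is Linux-only):

```python
import os

# What most threading libraries default to: every core the OS reports.
print("os.cpu_count():", os.cpu_count())

# What this process is actually allowed to run on (cgroup/affinity aware).
# On a scheduler-managed node this can be far smaller than the VM/host total.
print("sched_getaffinity:", len(os.sched_getaffinity(0)))
```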
 
  • Like
Reactions: Andropov

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
Isn't a lot of scientific computing using clusters now, where a job unexpectedly going multithreaded would be a problem? I've seen people queuing one simulation/process per core, for example (with different parameters). Having each of those processes spawn 16 threads at some point would be hilarious (I don't think it would crash, just that there would be a lot of frivolous context switches).

Oh I’ve definitely had that blow up on me when I forgot to kill numpy‘s ability to spawn multiple threads and I ended up completely clogged. I had to kill the job (which was a test run of dozens of simulations each of which was trying to fill the rest of the cores with numpy threads) and start over otherwise it would’ve taken forever. I think the numpy I was using on my dev machine was compiled not to do that, but the cluster numpy was compiled to spawn as many threads as it could.
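For anyone running into the same thing: one way to rein it in (a rough sketch, assuming the threadpoolctl package is installed) is to cap the BLAS/OpenMP pools that numpy's linear algebra routines spin up:

```python
import numpy as np
from threadpoolctl import threadpool_limits  # third-party package

rng = np.random.default_rng(0)
a = rng.standard_normal((2000, 2000))

# Keep this simulation on a single core; without the limit, the BLAS backend
# may fan out across every core the node reports.
with threadpool_limits(limits=1):
    evals = np.linalg.eigvalsh(a @ a.T)

print(evals[:3])
```

Setting OMP_NUM_THREADS / OPENBLAS_NUM_THREADS / MKL_NUM_THREADS before launch achieves the same thing without touching the code.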
 
Last edited:

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
RE multiples: oh, they're going to. From here on out, the scaling will mostly be in E cores for the next 3-5 years, going by the rumors (certainly for the top SKU). I'm talking about 8/32 P/E ratios for Arrow Lake, which is 15th gen, and 8/24 for Raptor Lake and Meteor Lake, which are 13th and 14th generation respectively.

For the "SoC" packages that keep the iGPU, that won't help them.


[Attached image: Intel-Alder-Lake-Dies-768x410.jpg]




The 6+0 die saves space (~53 mm²), but it isn't saving a competitive iGPU addition's worth of space. If Intel isn't bringing the iGPU up on the mobile products, then it is building a "dinosaur". Even in the bulk of mainstream desktop sales it is a shrinking pie. But the primary purpose of the E cores is to put the die size on a diet (the 8-core, backported Gen 11 die has some of the same problems as Gen 12; pretty likely Intel initially imagined having something smaller available but had to 'settle').

If Intel wanted to do a competitor to the mainstream Ryzen desktop (and low-end Threadripper) that was either completely GPU-less or had a "stuck in time" EU core allocation (smaller and smaller with each process shrink iteration), then that is a path, but probably one detached from the primary markets that Apple is going after.

Intel is moving into being a discrete GPU seller. Yeah, they probably will want some "CPU" SKU product that sells more dGPUs.

Versus what Apple is doing with die space allocation:

[Attached image: Die-Sizes.jpg]



The 8+8+iGPU die is in a similar ballpark to a Pro in size, but the graphics performance is not even close.

In the mid-to-upper range, "box with slots" workstation market, an iGPU-less allocation with a maxed-out count of mid-size "E" cores would get traction. That actually makes some sense, as some of the commentary from Intel's Hot Chips session on Xeon SP Gen 4 (Sapphire Rapids) said they are going to move away from some desktop benchmarks as guides for where the server product is going.


It doesn't seem like the top model will be gaining any more P cores for some time. They are basically doing the inverse of Apple, not that the core nomenclature necessarily means anything in an absolute scale, an Apple Firestorm core will probably be more energy efficient than whatever Intel has for several years to come, but you get the idea.

If Intel is trying to keep up with Apple, then it is probably going to be a 'fail'. That's because it isn't just P cores that Apple is focusing on. Going with a relatively high multiple of E cores means they likely lose on the GPU front. QuickSync used to be the leading video de/encoder; that's slipping through their fingers too with "more E cores".


70% of Intel's 'client computing' business is selling laptop processors. More E cores isn't going to save their laptop business.
 

MayaUser

macrumors 68040
Nov 22, 2021
3,178
7,204
Since the rumours are pointing more and more to the bigger iMac and Mac mini coming after June... is there any chance that the M1 Pro and Max will be based on the A15 cores and not the A14, since those will be so old six months from now?!
 

Krevnik

macrumors 601
Sep 8, 2003
4,101
1,312
Intel can’t “join the ARM” party. They need their x86 monopoly to be able to afford their own fabs. In a world where Intel is just another ARM-compatible chipmaker, they can’t have their own fabs.

Since Intel has stated they are looking to become a fabrication service for third parties, similar to TSMC, this seems to be less of a problem now than it would have been before the current CEO took over.

But with AMD putting downward pressure on Intel chip prices and eating into Intel market share, there’s no guarantee that dominant position would last anyways. Opening up their fabs is a hedge in case their efforts against AMD fail, and the result is a more durable duopoly than there has been in the past.

EDIT: I would also think having your own fabs would offer a price advantage through vertical integration, rather than having to pay TSMC/Samsung’s margins along with your own.
 

huge_apple_fangirl

macrumors 6502a
Aug 1, 2019
769
1,301
EDIT: I would also think having your own fabs would offer a price advantage through vertical integration, rather than having to pay TSMC/Samsung’s margins along with your own.
It is an advantage when you have a monopoly on the dominant ISA and have the wafer volumes to sustain those fabs. If there is a competitive chip market (like ARM) it’s better for everyone to pool their volumes together for one company to fab (eg TSMC) for better economies of scale. This becomes even more important as new nodes get more and more expensive.

As for Intel becoming a foundry, we shall see. They’ve talked about doing this before and screwed all their foundry customers over with the 10nm disaster. Although this time they may have the advantage of Uncle Sam “suggesting” that government contracts include chips fabbed by a trusted foundry- AKA Intel. Plus CHIPS Act $$.
 

JouniS

macrumors 6502a
Nov 22, 2020
638
399
Isn't a lot of scientific computing using clusters now, where a job unexpectedly going multithreaded would be a problem? I've seen people queuing one simulation/process per core, for example (with different parameters). Having each of those processes spawn 16 threads at some point would be hilarious (I don't think it would crash, just that there would be a lot of frivolous context switches).
There is decades of culture in using shared hardware. The fundamental assumption is that the user is responsible. They are expected to know what they are doing, and they should take other users' needs into account.

If a job can launch multiple threads, the user is expected to know that. And to work defensively by setting the maximum number of threads using environment variables or command line options. If the multithreading is enabled automatically by the compiler, the user is expected to know that as well. They can't just start running random binaries of unknown origin.

Multi-user systems often have various safety mechanisms. Newer systems may be based on virtual machines that only expose the resources the job is allowed to use. On older systems, a supervisor process may kill jobs that try to use more resources than they requested. But even when such mechanisms are present, it's the user's responsibility to ensure that the job only uses the resources it requested.
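As an illustration of "work defensively by setting the maximum number of threads": a sketch (SLURM_CPUS_PER_TASK is SLURM-specific; other schedulers expose different variables) that pins the math libraries' thread pools to the allocation before importing anything heavy:

```python
import os

# Use however many CPUs the scheduler actually granted, defaulting to 1.
n = os.environ.get("SLURM_CPUS_PER_TASK", "1")
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS",
            "MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS"):
    os.environ.setdefault(var, n)  # don't override anything set explicitly

import numpy as np  # imported afterwards, so it inherits the limits above
```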
 

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
There is decades of culture in using shared hardware. The fundamental assumption is that the user is responsible. They are expected to know what they are doing, and they should take other users' needs into account.

If a job can launch multiple threads, the user is expected to know that. And to work defensively by setting the maximum number of threads using environment variables or command line options. If the multithreading is enabled automatically by the compiler, the user is expected to know that as well. They can't just start running random binaries of unknown origin.

Multi-user systems often have various safety mechanisms. Newer systems may be based on virtual machines that only expose the resources the job is allowed to use. On older systems, a supervisor process may kill jobs that try to use more resources than they requested. But even when such mechanisms are present, it's the user's responsibility to ensure that the job only uses the resources it requested.
True but accidents happen ;)


Thankfully I had the node to myself. But yes the test run still had to be killed.
 
  • Like
Reactions: Andropov

Krevnik

macrumors 601
Sep 8, 2003
4,101
1,312
It is an advantage when you have a monopoly on the dominant ISA and have the wafer volumes to sustain those fabs. If there is a competitive chip market (like ARM) it’s better for everyone to pool their volumes together for one company to fab (eg TSMC) for better economies of scale. This becomes even more important as new nodes get more and more expensive.

I agree you need to have the volumes to sustain the fabs, which is required to get the ball rolling, but not the monopoly part. The ability to take margins from multiple parts of the manufacturing chain is one of the ways you create monopolies. But you do need to be big enough to feed the beast to go down that route.

Your statement also requires that TSMC is giving all parties a good deal, rather than using their position to extract higher margins from the entities competing with each other (since in your scenario TSMC doesn’t really have competition itself). So long as TSMC has a node advantage, they don’t have a lot of incentive to pass along the benefits of their ability to scale to customers. Meanwhile, a competitor with their own fab doesn’t have to play the same game. Samsung, for example, has the flexibility to use the bigger margins how they see fit: invest in fab tech, cut prices, etc. They have more options than they would if they were also a TSMC customer.

As for Intel becoming a foundry, we shall see. They’ve talked about doing this before and screwed all their foundry customers over with the 10nm disaster. Although this time they may have the advantage of Uncle Sam “suggesting” that government contracts include chips fabbed by a trusted foundry- AKA Intel. Plus CHIPS Act $$.

I don’t think Intel has as much choice looking at the future. Mobile is where the market is today, desktops continue to shrink, and AMD is making inroads on laptops and servers. Intel’s health in the long term is effectively dependent on either shedding the fabs (which is one of their biggest assets currently), or leveraging the fabs to keep them fed and continue to invest in them.

While it won’t be easy, I think the best tack Intel can make is to invest in their fabs and compete with TSMC, even if it means losing the grip on the x86 market they’ve enjoyed. Just from the perspective of what would it take to keep Intel “healthy” into the next 2 decades.
 
  • Like
Reactions: BigPotatoLobbyist

pshufd

macrumors G4
Oct 24, 2013
10,151
14,574
New Hampshire
AMD reported this afternoon and they're showing margins at 50%. Seems like EPYC is doing really well in data centers. I saw some numbers of 52% to 53% for Intel. Intel has the single-core performance crown right now, but they have to do well on efficiency too.
 
  • Like
Reactions: BigPotatoLobbyist

huge_apple_fangirl

macrumors 6502a
Aug 1, 2019
769
1,301
I agree you need to have the volumes to sustain the fabs, which is required to get the ball rolling, but not the monopoly part. The ability to take margins from multiple parts of the manufacturing chain is one of the ways you create monopolies. But you do need to be big enough to feed the beast to go down that route.
You don't need a literal monopoly but you need a big moat. And being one of many ARM chipmakers just ain't it.
Your statement also requires that TSMC is giving all parties a good deal, rather than using their position to extract higher margins from the entities competing with each other (since in your scenario TSMC doesn’t really have competition itself). So long as TSMC has a node advantage, they don’t have a lot of incentive to pass along the benefits of their ability to scale to customers. Meanwhile, a competitor with their own fab doesn’t have to play the same game. Samsung for example. But Samsung has the flexibility to use the bigger margins how they see fit: invest in fab tech, cut prices, etc. They have more options than they would if they were also a TSMC customer.
Because TSMC is pure-play, they have a symbiotic relationship with their customers. Price-gouging them would be a terrible idea- it would end up harming TSMC itself. They are just as dependent on their customers as the inverse.
I don’t think Intel has as much choice looking at the future. Mobile is where the market is today, desktops continue to shrink, and AMD is making inroads on laptops and servers. Intel’s health in the long term is effectively dependent on either shedding the fabs (which is one of their biggest assets currently), or leveraging the fabs to keep them fed and continue to invest in them.

While it won’t be easy, I think the best tack Intel can make is to invest in their fabs and compete with TSMC, even if it means losing the grip on the x86 market they’ve enjoyed. Just from the perspective of what would it take to keep Intel “healthy” into the next 2 decades.
Intel may not have a choice. But they aren't guaranteed any success just for recognizing reality.
 

pshufd

macrumors G4
Oct 24, 2013
10,151
14,574
New Hampshire
You don't need a literal monopoly but you need a big moat. And being one of many ARM chipmakers just ain't it.

Because TSMC is pure-play, they have a symbiotic relationship with their customers. Price-gouging them would be a terrible idea- it would end up harming TSMC itself. They are just as dependent on their customers as the inverse.

Intel may not have a choice. But they aren't guaranteed any success just for recognizing reality.

Mobile is where it is because of the cloud. So maybe cloud is also where it's at.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
I don’t think Intel has as much choice looking at the future. Mobile is where the market is today, desktops continue to shrink, and AMD is making inroads on laptops and servers. Intel’s health in the long term is effectively dependent on either shedding the fabs (which is one of their biggest assets currently), or leveraging the fabs to keep them fed and continue to invest in them.
Why did Intel try to buy SiFive? Could Intel/AMD transition to RISC-V instead of ARM?
 

huge_apple_fangirl

macrumors 6502a
Aug 1, 2019
769
1,301
Why did Intel try to buy SiFive? Could Intel/AMD transition to RISC-V instead of ARM?
RISC-V has the same problem (for Intel) that ARM does: it is open. Intel needs to ship large volumes of chips to justify the financial commitments in bleeding edge fabs. They could never achieve the necessary market share for that in a competitive market. For AMD, it doesn’t matter- they are fabless. ARM, RISC-V, x86 or what have you- irrelevant to the business model. Lisa Su said they would be happy to design ARM chips if their customers were interested.
 
  • Like
Reactions: psychicist

thenewperson

macrumors 6502a
Mar 27, 2011
992
912
Since the rumours are pointing more and more to the bigger iMac and Mac mini coming after June... is there any chance that the M1 Pro and Max will be based on the A15 cores and not the A14, since those will be so old six months from now?!
Doubtful. I think M1 pretty much means Firestorm + Icestorm + G13 cores (dunno what they call the Neural Engine cores but + those too). Any new cores will probably get new numbers. iMacs (and Mac Pros) launching so late would probably be down to supply chain issues.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
I’m sure they’d be more interested in the engineers than in the SiFive product. They bought StrongARM and got rid of it, after all.

There are also non-mainline projects I’m sure Intel could use RV processors for. Intel does ship products with ARM cores and as more than just microcontrollers (e.g. I think the CPU cores on Intel DPUs are ARM-based)

Edit: yup, including 16 Arm Neoverse N1 cores on Mount Evans
 
Last edited:

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
Intel does ship products with ARM cores and as more than just microcontrollers
It seems Intel IoT uses x64 CPUs. I thought ARM- and RISC-V-based SoCs were better for embedded systems because they are more power-efficient. Why does Intel not use ARM- or RISC-V-based SoCs for Intel IoT?
 

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
It seems Intel IoT uses x64 CPUs. I thought ARM- and RISC-V-based SoCs were better for embedded systems because they are more power-efficient. Why does Intel not use ARM- or RISC-V-based SoCs for Intel IoT?

Dunno. Maybe there’s a company mandate to try to push their own solutions so that they can claim it’s just as good. But oftentimes people use what they have most readily available, are most familiar with, or what has compatibility with something else they want to keep using.

I mean Apple has all sorts of x86 based controllers on its ARM-based products. Somewhat different I’ll grant you since while switching to ARM or RV might result in significant power savings for that controller, that controller is such a minuscule portion of total system power that it doesn’t really matter and it’s better to go with a known familiar solution that works. In contrast, for the IoT device, its main SoC obviously has a big impact on power usage. I just bring it up because often these decisions are made for “reasons” that, when examined closely, might not make sense from an optimality perspective but do from a human production one.

Someone else might have more concrete insights into why.
 

Andropov

macrumors 6502a
May 3, 2012
746
990
Spain
There is decades of culture in using shared hardware. The fundamental assumption is that the user is responsible. They are expected to know what they are doing, and they should take other users' needs into account.

If a job can launch multiple threads, the user is expected to know that. And to work defensively by setting the maximum number of threads using environment variables or command line options. If the multithreading is enabled automatically by the compiler, the user is expected to know that as well. They can't just start running random binaries of unknown origin.

Multi-user systems often have various safety mechanisms. Newer systems may be based on virtual machines that only expose the resources the job is allowed to use. On older systems, a supervisor process may kill jobs that try to use more resources than they requested. But even when such mechanisms are present, it's the user's responsibility to ensure that the job only uses the resources it requested.
Some years ago I took a course on handling large datasets in astronomy applications. The instructor mentioned that one of her students tried to use the department's server to median-filter 10TB worth of data from a radiotelescope array (I believe it was Murchison Widefield Array) by loading all the images in memory and sorting them directly. Server went down.

While that's an extreme example, I've heard of similar things happening at my university's cluster due to causes way easier to predict than the compiler sneaking multithreading in. People who use it aren't necessarily expected to know the vicissitudes of the compiler they're using. Arguably they should, but it's unrealistic in practice as they have other priorities.

Anyway, it looks like ICC only does auto-threading on Windows builds; Linux/macOS require an extra flag to enable it, so it's probably not a problem in practice.
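As an aside on the 10TB median-filter story, the usual fix is to stream the data instead of loading it all: process one spatial tile across the whole image stack at a time. A rough sketch of the idea (the (n_images, H, W) memory-mapped file layout here is a made-up stand-in for the real data format):

```python
import numpy as np

n_images, H, W = 1000, 4096, 4096
# Hypothetical on-disk stack of images stored as one big float32 array.
stack = np.memmap("stack.dat", dtype=np.float32, mode="r",
                  shape=(n_images, H, W))

median = np.empty((H, W), dtype=np.float32)
tile = 64  # rows per chunk: ~1 GiB of the stack in RAM at a time
for r in range(0, H, tile):
    chunk = np.asarray(stack[:, r:r + tile, :])  # load only this tile
    median[r:r + tile] = np.median(chunk, axis=0)
```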
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
Some years ago I took a course on handling large datasets in astronomy applications. The instructor mentioned that one of her students tried to use the department's server to median-filter 10TB worth of data from a radiotelescope array (I believe it was Murchison Widefield Array) by loading all the images in memory and sorting them directly. Server went down.

While that's an extreme example, I've heard of similar things happening at my university's cluster due to causes way easier to predict than the compiler sneaking multithreading in. People who use it aren't necessarily expected to know the vicissitudes of the compiler they're using. Arguably they should, but it's unrealistic in practice as they have other priorities.

Anyway, looks like ICC only does autothreading on Windows builds, Linux/macOS require an extra flag to enable it so probably not a problem in practice.

The funny thing is that the average cluster user is often a very poor programmer… these folks are scientists, not developers. That’s also why much of the tooling that targets this demographic (CUDA, ICC) is focused on quick and dirty code rather than on high-quality code.
 
  • Like
Reactions: ahurst and Andropov

pshufd

macrumors G4
Oct 24, 2013
10,151
14,574
New Hampshire
The funny thing is that the average cluster user is often a very poor programmer… these folks are scientists, not developers. That’s also why much of the tooling that targets this demographic (CUDA, ICC) is focused on quick and dirty code rather than on high-quality code.

I looked at the code for a Genomics pipeline once and I was really surprised by how inefficient it was. It made multiple passes over large datasets which could have been done in one pass. The solution was correct; just not optimized. Not specifically a problem when you have a big hardware budget but a waste if you have a CS background.
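To make the multiple-passes point concrete, here's a toy single-pass version of the kind of thing those pipelines often do in three or four separate sweeps (the file name and column index are made up for the example):

```python
import csv
import math

count, total = 0, 0.0
lo, hi = math.inf, -math.inf

with open("reads.tsv", newline="") as f:
    for row in csv.reader(f, delimiter="\t"):
        x = float(row[2])  # hypothetical per-read quality score column
        count += 1         # pass 1: count
        total += x         # pass 2: sum for the mean
        lo, hi = min(lo, x), max(hi, x)  # passes 3-4: extremes

print(count, total / max(count, 1), lo, hi)
```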
 
  • Like
Reactions: ahurst

leman

macrumors Core
Original poster
Oct 14, 2008
19,522
19,679
I looked at the code for a Genomics pipeline once and I was really surprised by how inefficient it was. It made multiple passes over large datasets which could have been done in one pass. The solution was correct; just not optimized. Not specifically a problem when you have a big hardware budget but a waste if you have a CS background.

Or look at R… now there are some really sharp people working on the main team, but they have to work with a legacy code base that does things in the most horrible way possible. No wonder I see a speed-up of 3-4x on my M1 compared to an Intel i9; better branch prediction and caches do wonders on crappy code like that.
 
  • Like
Reactions: ahurst

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,867
I mean Apple has all sorts of x86 based controllers on its ARM-based products. Somewhat different I’ll grant you since while switching to ARM or RV might result in significant power savings for that controller, that controller is such a minuscule portion of total system power that it doesn’t really matter and it’s better to go with a known familiar solution that works.
Citation very much needed. Essentially nobody but Intel uses x86 for deeply embedded microcontrollers (and not even Intel always does it, they've used other architectures too).

"Known familiar solution that works" doesn't solve the real problems here. There's major patent issues, and unlike ARM or RV, there's no good standards document to refer to when implementing x86 - the whole x86 ecosystem is very ad hoc.

And really, who inside Apple would find x86 so compellingly familiar that they'd want to use it in Apple Silicon despite all the downsides? The low-friction low-cost high-familiarity option for Apple would be a Cortex-M0, or an in-house equivalent.
 
  • Like
Reactions: BigPotatoLobbyist

crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
Citation very much needed. Essentially nobody but Intel uses x86 for deeply embedded microcontrollers (and not even Intel always does it, they've used other architectures too).

"Known familiar solution that works" doesn't solve the real problems here. There's major patent issues, and unlike ARM or RV, there's no good standards document to refer to when implementing x86 - the whole x86 ecosystem is very ad hoc.

And really, who inside Apple would find x86 so compellingly familiar that they'd want to use it in Apple Silicon despite all the downsides? The low-friction low-cost high-familiarity option for Apple would be a Cortex-M0, or an in-house equivalent.

EDIT: Found it!



So one microcontroller, but not all sorts. The rest are indeed off-the-shelf/custom ARM as far as I can tell. But my larger point still stands: sometimes these idiosyncratic decisions are made for “reasons” that you just have to guess at. It made sense to someone: cost, familiarity, some piece of code that only works with this chip - hard to know sometimes

======

Original:

So I’ll admit I’m having trouble finding the exact post I was thinking of - plenty of M3/Chinook coprocessors. I was able to confirm Intel TB repeaters, which is not what I was thinking of, but overall it’s possible I saw Intel and replaced it with x86 in my head. I’ll look again later to see if I can dig anything up.

It was less that it was so compelling and rather that “we used it on our previous machines so we just use it here for the same task”. But so far I’m coming up with less than I thought. Which is really bugging me because I can almost visualize the post(s) and I can’t find any of what I remember - even peripheral information.
 
Last edited: