
johnnymcc

macrumors regular
Original poster
Jul 30, 2019
131
36
I swapped out my 16-core for a 28-core Xeon. Things like Geekbench and Cinebench are faster of course, but exporting 105 RAW photos from Lightroom takes longer? Also, exporting the same 4K CLOG timeline from DaVinci Resolve is taking longer as well. Any ideas? Perhaps most apps aren't optimized for a 28-core Xeon?
 
16 core Xeon = 3.2 GHz
28 core Xeon = 2.5 GHz

If an app can’t utilize 28 freaking cores, the faster processor wins.
So, scrubbing through 4K CLOG timelines in DaVinci (no proxies) is definitely snappier and better... Yeah... I guess it depends on what I'm doing... I was just hoping for better export times!
 
Because both photo export and video export are limited by single-thread CPU performance in your case.

So the slower the CPU clock, the longer the export time.

Going from 16 cores to 28 cores, most likely only a few workloads (within a single application) will see an improvement. Apart from video work, scientific computation, and 3D rendering, software is rarely optimised to use more than 16 cores.

Photo work is mainly limited by single-thread CPU performance. For filters that can use multiple cores, GPU compute is usually faster. So a high-core-count CPU usually has no benefit in photo editing, especially when your starting point is already 16 cores.
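
To put rough numbers on the single-thread-limit argument above, here is a minimal Amdahl-style sketch. It assumes the CPU runs at the base clocks quoted earlier in the thread (3.2 GHz and 2.5 GHz) and ignores turbo entirely; the serial fractions and the function names are hypothetical, not measurements from Lightroom or Resolve.

```python
# Toy Amdahl-style model. The clocks are the base clocks quoted in the
# thread; treating the clock as fixed is an assumption (turbo is ignored).

def relative_export_time(serial_fraction, cores, clock_ghz):
    """Relative run time for a job whose serial_fraction cannot be parallelised."""
    parallel_fraction = 1.0 - serial_fraction
    # The serial part runs on one core, the rest is split evenly across all cores.
    work = serial_fraction + parallel_fraction / cores
    return work / clock_ghz


def compare(serial_fraction):
    """Return (time on the 16-core, time on the 28-core) for one serial fraction."""
    t16 = relative_export_time(serial_fraction, cores=16, clock_ghz=3.2)
    t28 = relative_export_time(serial_fraction, cores=28, clock_ghz=2.5)
    return t16, t28


if __name__ == "__main__":
    for s in (0.0, 0.05, 0.2, 0.5):  # hypothetical serial fractions
        t16, t28 = compare(s)
        winner = "28-core" if t28 < t16 else "16-core"
        print(f"serial={s:.2f}  16c={t16:.4f}  28c={t28:.4f}  faster: {winner}")
```

Under these assumptions the 28-core only comes out ahead when the serial part of the job is small (below roughly 6% in this model); once one thread dominates, the higher clock wins.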

For video work, if you export using GPU hardware encoding, a slower CPU may also mean a slower export, because the GPU driver itself is limited by single-thread CPU performance (or the CPU can only feed the GPU's media engine at a slower rate, etc.).

But if you use software encoding, you should be able to see an improvement in export time (compared to software encoding on 16 cores). However, even with 28 cores, the export time may still not be as good as with hardware encoding. It really depends on the parameters you use for the video encoding. You may try different settings and see if that makes any difference. (N.B. Your 7,1 will most likely consume more power if you switch from hardware encoding to software encoding.)

Also, if you export LR photos and videos at the same time, you may see some improvement. More CPU cores definitely help if you do a few things at the same time. If you want to improve your workflow efficiency, maybe you need to re-plan your workflow to make more use of the CPU's multitasking ability.
 
As @h9826790 said, Lightroom doesn't take advantage of all the cores.

I've had faster results from my overclocked quad-core PC compared to some of my high-core-count systems.

Programs that can actually take advantage of those cores are much faster, as you've experienced.
 

The 28-core 2.5 GHz Xeon posts faster single-threaded results on Geekbench vs the “faster” 16-core Xeon.
 
I was just going to note that. I did some benchmarks and the single-core scores are indeed better than the 16-core.
 
Assuming there is no thermal / power throttling issue, then according to this table, the theory is...
Xeon.png

1) For the benchmark, there is really only one thread of workload, or close to one. The CPU may rotate that thread across cores (e.g. for the first 0.01s use core 1, then at 0.02s use core 2, and at 0.03s switch to core 3, etc.; this is still single-thread compute, but the CPU shares the workload to balance the heat output). Unless it spreads the thread across more than 16 cores, the W-3275 should have the same speed as the W-3245. Your better result is most likely due to other factors, or may just be within the normal margin of error. I have never used the 7,1, but I believe it most likely needs fewer than 4 cores to run a single-thread benchmark, and I don't think it will use more than 16 cores to do that on the W-3275.

So, if you are lucky enough that on the day the W-3275 only uses 2 cores to share that workload, then the W-3275 will run at 4.4GHz, giving you the best single-thread benchmark result.

But if a W-3245 shares it across 4 cores (or maybe some background task activates 2 more cores), then the actual CPU clock speed will be 4.2GHz, which gives you a worse single-thread benchmark score compared to the W-3275 above.

However, it won't change the fact that they should have identical max single thread performance.

2) For real-world work, e.g. there are now 10 threads of work to do, but one main thread is the bottleneck. On the W-3245, this workload may be spread across all 16 cores (same reason as in point 1), so the CPU actually runs at 3.9 GHz.

But on the W-3275, the workload may now be spread across 20 cores, or even all 28 cores. In this case, the clock speed will drop to 3.6GHz, or even 3.2GHz. Therefore, the same job is slower on the W-3275.
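
The theory in points 1 and 2 can be sketched as a simple lookup. The (cores, GHz) pairs below are only the figures quoted in this post; applying one shared table to both CPUs, and filling the gaps between the quoted points with the next listed value, are assumptions for illustration and not Intel's published turbo bins. The names `QUOTED_TURBO_STEPS`, `turbo_ghz` and `main_thread_slowdown` are hypothetical.

```python
# Step-function sketch of "clock speed depends on how many cores are active".
# Each entry reads: up to this many active cores -> this clock in GHz.
# Only the quoted points come from the post; everything in between is assumed.
QUOTED_TURBO_STEPS = [
    (2, 4.4),
    (4, 4.2),
    (16, 3.9),
    (20, 3.6),
    (28, 3.2),
]


def turbo_ghz(active_cores):
    """Clock speed (GHz) for a given number of active cores, per the table above."""
    if active_cores < 1:
        raise ValueError("at least one core must be active")
    for max_cores, ghz in QUOTED_TURBO_STEPS:
        if active_cores <= max_cores:
            return ghz
    raise ValueError("more active cores than the table covers")


def main_thread_slowdown(active_on_16_core, active_on_28_core):
    """Ratio of bottleneck-thread run time: 28-core machine vs 16-core machine.

    Assumes the bottleneck thread's run time scales only with clock speed.
    """
    return turbo_ghz(active_on_16_core) / turbo_ghz(active_on_28_core)


if __name__ == "__main__":
    # Point 2: the same job wakes all 16 cores on one CPU and all 28 on the other.
    print(main_thread_slowdown(16, 28))
```

With the quoted figures, a job that wakes all 28 cores leaves its bottleneck thread running at 3.2GHz instead of 3.9GHz, so in this model that thread takes about 22% longer.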
 


Wow, great info! Thanks for posting!

Since the turbo frequency is the same on the 16- and 28-core, I expected the same Geekbench results. Just guessing here, but maybe the Geekbench benchmark is smart about the cores it uses, keeping the turbo clock up and thermal throttling at bay.

So could it be in johnnymcc's case, the apps are sloppy in core usage and forcing a slower Turbo clock rate?

Bernuli
 

Do you have any timings between the two machines?

Maybe try disabling Spotlight on the folder you are exporting to, so that mdworker won't grab cores and possibly lower the clock rate, as h9826790's chart shows can happen.

What does the CPU Window in Activity Monitor show when you are exporting? How many threads are maxed out at a time?
 