Geekbench 6 takes it too far in the other direction, because it assumes that the computer is doing only one thing at a time. That may be adequate for consumer devices, but it doesn't really reflect the way higher-end computers are often used.
A better benchmark would run a few copies of the multi-core benchmark in parallel, and perhaps also run the subtasks in a different order in each copy. Then it would report the sum of the per-copy multi-core scores as the true multi-core score.
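To make the idea concrete, here is a rough Python sketch of such a harness. It is only an illustration, not how Geekbench works. The three subtasks are made-up stand-ins, every name in it (`SUBTASKS`, `shuffled_order`, `run_one_copy`, `combined_score`, `run_copies`) is hypothetical, and each copy is a single process, where a real harness would launch the actual multi-threaded benchmark.

```python
import multiprocessing as mp
import random
import time


# Hypothetical stand-in subtasks; a real suite would run its own workloads.
def _integer_work(n):
    return sum(i * i for i in range(n))


def _string_work(n):
    return len("".join(str(i) for i in range(n)))


def _sort_work(n):
    rng = random.Random(0)
    return sorted(rng.random() for _ in range(n))[0]


SUBTASKS = [_integer_work, _string_work, _sort_work]


def shuffled_order(copy_index, seed=0):
    """Return this copy's subtask order (deterministic for a given copy and seed)."""
    order = list(range(len(SUBTASKS)))
    random.Random(seed + copy_index).shuffle(order)
    return order


def run_one_copy(args):
    """Run every subtask once in this copy's order; score = work units per second."""
    copy_index, size, seed = args
    start = time.perf_counter()
    for i in shuffled_order(copy_index, seed):
        SUBTASKS[i](size)
    elapsed = time.perf_counter() - start
    return len(SUBTASKS) * size / elapsed


def combined_score(scores):
    """The proposal above: report the sum of the per-copy scores."""
    return sum(scores)


def run_copies(copies, size=200_000, seed=0):
    """Run `copies` independent copies at the same time, one process each."""
    with mp.Pool(processes=copies) as pool:
        scores = pool.map(run_one_copy, [(i, size, seed) for i in range(copies)])
    return scores, combined_score(scores)
```

To try it, call `run_copies(4)` from under an `if __name__ == "__main__":` guard. Comparing the combined score at 1, 2, 4, ... copies would show how well a machine holds up once the copies start competing for cores, cache and memory bandwidth.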
If we think about it, GB6 is designed for multiple threads to co-operate and finish a common task. If a CPU architecture does well in GB6, does it follow that the same architecture will do just as well at completing multiple unrelated tasks?
For a massively parallel workload made up of unrelated tasks, my understanding is that performance is mainly a function of how fast the CPU can fetch and process data from memory, how many useful threads can be in flight at once, and how good the CPU's IPC is. The main advantage of Apple's SoCs is a wider pipe to main memory, and the higher you go (M, M Pro, M Max, M Ultra), the wider that pipe gets.
The main disadvantage is that Apple's CPU cores run at much lower frequencies than the CPUs they're being compared against. The positive is that single-thread GB6 results from the M3 show that, even running at a 2 GHz deficit, it trades blows with those CPUs.