Much has been written over the years about Moore's law, and incredibly it held up for almost 50 years. Every year or two, the number of transistors on a chip doubled, and hence the speed effectively doubled.
There is no "and hence" in Moore's law. Doubling transistors doesn't necessarily increase clock speed.
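For scale, a quick bit of arithmetic on what the law actually covers (the 50-year span and two-year doubling period are just the rough figures quoted above, not a precise dataset):

```python
# Back-of-envelope: what a "doubling every ~2 years" transistor budget
# compounds to. The inputs are illustrative, not a historical dataset.
doubling_period_years = 2
span_years = 50

doublings = span_years / doubling_period_years   # 25 doublings
growth_factor = 2 ** doublings                   # ~33.5 million x

print(f"{doublings:.0f} doublings -> ~{growth_factor:,.0f}x more transistors")
# Nothing in that arithmetic says anything about clock speed -- the law is
# about the transistor budget, which is the point being made above.
```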
The issue is becoming what to do with those additional transistors.
1. Use the same number of transistors (just now smaller) and sell cheaper CPU packages.
For users who have hit a plateau in workload demands (a now-limited set of data and computational demands), this is one approach.
2. Add more stuff to the CPU package; the holy grail is a whole System on a Chip (SoC).
This is already happening. Mainstream Intel packages have x86+GPU+'old Northbridge' on a single die. Some of the newer Atom-class parts have the x86+GPU+'old Northbridge'+'old Southbridge' components all on the same die.
The huge, obsolete reasoning flaw is assuming that all the transistors in a "CPU" are devoted to instruction processing. They aren't. What is commonly referred to as a CPU is more a collection of what were discrete components in the old legacy systems you're trying to use as a baseline.
3. "Copy and paste" more cores ( and immediate cache levels ). 6, 10, 12, 14 cores all on a single die. The "core count war" .
Two issues with this. First, you can get to higher core counts by using simpler cores (e.g., 'GPU' cores). If you have just one user who wants faster results from the same program and data set, then more 'simpler' cores are generally more space efficient.
Second, a higher number of complex cores (x86) tends to work better when you have multiple users and/or applications running at the same time. But that also brings higher L3 cache demands (a broader spectrum of data being pulled in at the same time).
None of those is a push to higher x86 core clock speed.
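To put rough numbers on that simple-versus-complex-core trade-off, here is a back-of-envelope Amdahl's-law sketch. The core counts, the 4x per-core speed ratio, and the parallel fractions are made-up assumptions for illustration, not measurements of any real part:

```python
# Rough Amdahl's-law sketch of "many simple cores vs. few complex cores".
# Core counts, per-core performance ratio, and parallel fractions are
# invented illustrative numbers, not real chip measurements.

def speedup(parallel_fraction, cores, per_core_perf=1.0):
    """Amdahl's law, scaled by relative single-thread performance."""
    serial = 1.0 - parallel_fraction
    return per_core_perf / (serial + parallel_fraction / cores)

for parallel_fraction in (0.95, 0.99):
    # Hypothetical 8 "complex" cores vs. 64 "simple" cores, where each
    # simple core is assumed to run serial code at 1/4 the speed.
    complex_part = speedup(parallel_fraction, cores=8,  per_core_perf=1.0)
    simple_part  = speedup(parallel_fraction, cores=64, per_core_perf=0.25)
    print(f"{parallel_fraction:.0%} parallel: "
          f"8 complex cores = {complex_part:.1f}x, "
          f"64 simple cores = {simple_part:.1f}x")
# The nearly embarrassingly parallel case favors the wide/simple part; once
# there is meaningful serial (or mixed multi-user) work, the complex cores
# pull ahead -- and neither path pushes clock speed.
```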
There was/is some coupling between smaller process designs, tighter tolerances, and voltages that allowed clock rates to generally trend upwards. A lot of the slop in designs has been squeezed out over time, and the instruction execution paths have been optimized for decades now. But that really isn't what Moore's law covered.
It was more about what could be done with a bigger transistor budget. An ever-more-complicated individual x86 core hits the point of diminishing returns after a while. The bigger "bang for the buck" returns now are in narrower computations (specialized ones like crypto or single-instruction-multiple-data AVX).
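A crude way to see where the transistor budget pays off: peak throughput is roughly cores × clock × SIMD width × ops per cycle, so widening the vector unit moves that number far more than any realistic clock bump. The core count, clock, and 2-ops-per-cycle figure below are assumptions for illustration, not the spec sheet of any particular part:

```python
# Back-of-envelope peak single-precision throughput:
#   GFLOPS ~= cores * clock_GHz * (SIMD floats per op) * (ops per cycle)
# All inputs are illustrative assumptions, not a real CPU's spec sheet.

def peak_gflops(cores, clock_ghz, simd_width, ops_per_cycle):
    return cores * clock_ghz * simd_width * ops_per_cycle

scalar  = peak_gflops(cores=6, clock_ghz=3.5, simd_width=1, ops_per_cycle=2)
avx_256 = peak_gflops(cores=6, clock_ghz=3.5, simd_width=8, ops_per_cycle=2)

print(f"scalar only                   : {scalar:.0f} GFLOPS")
print(f"with 256-bit AVX (8 floats/op): {avx_256:.0f} GFLOPS")
# An 8x-wider vector unit buys far more theoretical throughput than any
# plausible clock increase -- which is where the transistor budget is going.
```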
Over the last couple of years, however, much has been written in the news about the 'death of Moore's law'. It is incredible that the CPU clock speeds have not changed all that much in the last 8 years or so.
But CPU speeds have changed, especially on the Intel side. Again, viewed through a legacy, outdated lens, the clock speed has only stalled if you are talking about the 'base rate' speed. None of the modern Intel Xeon E5s run at just one constant speed on most workloads.
The change that has happened over the last 4 years is a trend toward dynamically adjusted clocks: cranked up when there is relatively little parallel work to do and set at a base rate when there is lots of parallel work to do. Single-user workstations are not in just one of those modes most of the time when being actively used.
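A toy model of that behavior (the turbo-bin table below is invented for illustration; real Xeon E5 turbo tables vary by SKU and by power/thermal headroom):

```python
# Toy model of active-core-count turbo binning. The bin table is invented
# for illustration only; real turbo tables differ per SKU.

TURBO_BINS_GHZ = {1: 3.9, 2: 3.8, 4: 3.6, 8: 3.3, 12: 3.0}  # base ~= 3.0 GHz

def clock_for_active_cores(active_cores):
    """Return the highest bin whose active-core threshold covers the load."""
    eligible = [ghz for cores, ghz in TURBO_BINS_GHZ.items()
                if active_cores <= cores]
    return max(eligible) if eligible else min(TURBO_BINS_GHZ.values())

for load in (1, 3, 6, 12):
    print(f"{load:2d} active cores -> ~{clock_for_active_cores(load):.1f} GHz")
# Lightly threaded work runs well above the advertised base clock; only a
# fully parallel load pins the chip near "base". Quoting the base rate alone
# understates how much effective clock speed has actually moved.
```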
I think that now is the time to be designing hardware that can be incrementally upgraded rather than having to be replaced wholesale. I have a MP 5,1 with dual 3.46 GHz processors, an XP941 drive, and 48 GB memory, up from the 12 GB it started with. It is upgraded with a 7970 video card. With the exception of TB, I can still pretty much upgrade it to whatever connectivity I want.
That is just as much due to the plateauing of your workload as to the hardware.
As pointed out above, CPU packages are becoming more integrated over time. Sticking with an older CPU package means setting in stone much more than just the x86 cores. Your memory speeds and bandwidth are stuck (since the controller is integrated). Your PCIe speeds and bandwidth are stuck (since integrated). Your link to any Southbridge (I/O Hub) chipset is also stuck, which means you are also stuck with the chipset.
PCIe isn't a cure-all. PCIe v2 lanes can't do PCIe v3 levels of bandwidth, and can't do v4.
The two x4 PCIe slots of the older Mac Pros are stuck either at v1 levels or with bandwidth split from a single x4 source (i.e., behind a switch). Fire up an x4 PCIe SSD card and an x2 USB 3.0/SATA card at the same time and they will start competing for resources.
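Rough numbers on that: per-lane PCIe throughput after encoding overhead is roughly 250 MB/s for v1, 500 MB/s for v2, and ~985 MB/s for v3, and two cards sharing one x4 v2 uplink split that budget. A quick sketch of the arithmetic:

```python
# Rough PCIe bandwidth arithmetic. Per-lane figures are the usual
# post-encoding-overhead approximations (8b/10b for v1/v2, 128b/130b for v3).
MB_PER_LANE = {"v1": 250, "v2": 500, "v3": 985}

def link_mb_s(gen, lanes):
    return MB_PER_LANE[gen] * lanes

print("x4 v1 :", link_mb_s("v1", 4), "MB/s")   # 1000 MB/s
print("x4 v2 :", link_mb_s("v2", 4), "MB/s")   # 2000 MB/s
print("x4 v3 :", link_mb_s("v3", 4), "MB/s")   # ~3940 MB/s

# Two cards behind a switch that shares one x4 v2 uplink: an x4 PCIe SSD
# and an x2 USB 3.0/SATA card can ask for more than the uplink carries,
# so they end up splitting ~2000 MB/s between them when both are busy.
ssd_demand = link_mb_s("v2", 4)   # what the SSD's own slot could do
usb_demand = link_mb_s("v2", 2)   # what the USB/SATA card could do
uplink     = link_mb_s("v2", 4)
print("combined demand:", ssd_demand + usb_demand, "MB/s vs uplink:", uplink)
```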
I know this is a hackneyed idea on this forum, but in the context of Moore's law grinding to a halt, the idea of incrementally upgraded computers becomes even more attractive.
Moore's law isn't the root-cause "problem"; stagnating demand is. Computers are "fast enough". While "640K ought to be enough for everybody" was a cautionary tale, it is getting closer to being mainstream at current common specs. You are stopping at 48 GB, which isn't even halfway to the 2011-2012 Mac Pro's limit (128 GB with OS X 10.9+). That is a sign the workload is at least in part undershooting the limits of what the machine can do. There is no need for a new Mac Pro because you can't even fill the workload capacity of the old Mac Pro with your base, hard-core requirements.
The market for "top fuel" drag-racing CPUs (max clock at all costs) is relatively small and getting smaller. That is why clock speed isn't the be-all and end-all of the CPU market anymore.