AMD64 (what both Intel and AMD use) is an archaic ISA that has to be brute-forced to run at any speed with modern CPU design. Long pipelines, complex branch prediction and a lot of power are what hide this from the casual user.
The biggest problem of x86 is non-trivial instruction decoding, but it's a problem that can be solved in constant time. It just costs extra die space (probably not even much) and power. Long pipelines are an implementation detail, as x86 vendors want to go as fast as possible without going too wide, with power consumption being an afterthought at best. As for complex branch prediction and other complicated performance-enhancing shenanigans, there is barely a CPU maker on the market that has more of them than Apple.
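To make the decode point concrete, here is a toy model in C. The encoding is hypothetical (nothing like real x86 prefixes and ModRM/SIB bytes, which are far messier), but the structural problem is the same: with variable-length instructions you can't know where instruction N starts until you've sized instructions 0 through N-1, unless you spend area sizing a would-be instruction at every byte offset of the fetch window in parallel and discarding the results that land mid-instruction.

```c
/* Toy model of variable-length instruction decode.  The encoding here is
 * hypothetical (top two bits of the first byte give a length of 1..4 bytes);
 * real x86 length decoding, with prefixes and ModRM/SIB bytes, is far
 * messier, but the structural problem is the same. */
#include <stdio.h>

#define WINDOW 16  /* bytes fetched per cycle, like a decode window */

/* length of the instruction starting at byte b (hypothetical encoding) */
static int insn_len(unsigned char b) { return (b >> 6) + 1; }

int main(void) {
    unsigned char fetch[WINDOW] = {
        0x05, 0x41, 0x41, 0xC7, 0xC7, 0xC7, 0xC7, 0x00,
        0x81, 0x81, 0x40, 0x40, 0xC2, 0xC2, 0xC2, 0x01
    };

    /* A fixed-width decoder just slices the window every 4 bytes.  With
     * variable lengths, instruction N's start depends on the lengths of
     * instructions 0..N-1, so a naive scan is serial: */
    printf("serial scan, instruction start offsets:");
    for (int off = 0; off < WINDOW; off += insn_len(fetch[off]))
        printf(" %d", off);
    printf("\n");

    /* The brute-force fix: size a would-be instruction at EVERY byte offset
     * in parallel (extra logic per byte, most results get thrown away), then
     * let lookahead logic pick the valid chain in a fixed number of stages. */
    int spec_len[WINDOW];
    for (int off = 0; off < WINDOW; off++)   /* conceptually all at once */
        spec_len[off] = insn_len(fetch[off]);

    printf("speculative length at every offset:    ");
    for (int off = 0; off < WINDOW; off++)
        printf(" %d", spec_len[off]);
    printf("\n");
    return 0;
}
```

The takeaway matches the point above: the window decodes in a fixed number of stages regardless of its content, and the price is paid in extra logic and power rather than latency.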
All other things being the same, a proper RISC ISA will outperform both AMD and Intel by a margin.
ARM and RISC-V will eventually run into the same problem, but at a higher absolute level of performance.
What are you basing these claims on? There is no RISC CPU that outperforms x86 CPUs on a per-core basis. The "higher absolute level of performance" of RISC is a conjecture at best. I mean, RISC has many benefits, but it is not immediately clear to me why RISC CPUs should offer higher absolute performance. It's something people like to mention (at least since Apple Silicon came out and now that RISC-V is gaining traction), but it has never been practically demonstrated or even elaborated.
As of now, the most obvious demonstrable advantage of ARMv8 is that it can be used to build CPUs with the same peak performance but much lower power consumption. It is still unclear how much of the latter is due to the ARM ISA itself and how much to Apple's secret sauce or reliance on advanced technologies with no costs spared (designing an ultra-wide CPU like Apple did is no easy or cheap enterprise). I mean, even the newest official ARM cores are still significantly behind what x86 offers, despite using the latest and greatest ARM ISA. Intel's Gracemont cores, for example, seem to have performance comparable to the ARM Cortex-X1 (in a Snapdragon 888) at similar power consumption, despite being x86 cores - both do around 1100 in GB5 at roughly 5 watts.
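For what it's worth, the efficiency arithmetic on those rough numbers is trivial; both figures below are the approximate ones quoted above, not my own measurements.

```c
/* Rough perf-per-watt from the ballpark figures quoted above:
 * ~1100 Geekbench 5 single-core points at ~5 W for both cores. */
#include <stdio.h>

int main(void) {
    double x1_gb5 = 1100.0,        x1_watts = 5.0;        /* ARM Cortex-X1 (Snapdragon 888) */
    double gracemont_gb5 = 1100.0, gracemont_watts = 5.0; /* Intel Gracemont */

    printf("Cortex-X1: %.0f GB5 points per watt\n", x1_gb5 / x1_watts);
    printf("Gracemont: %.0f GB5 points per watt\n", gracemont_gb5 / gracemont_watts);
    return 0;
}
```

Roughly 220 GB5 points per watt either way, which is the point: at that design point the x86 core doesn't look like it is paying an obvious ISA tax.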
Think about what an M1 could do if Apple were to push it past 5GHz alone.
Why didn't Apple push it past 5GHz then? Surely if it were possible they would do it, at least on their desktop machines, to have a commanding lead over x86 offerings? Instead they only boosted the clock by a meagre 50MHz or so. I think this is where we come to design tradeoffs. Apple goes ultra-wide, which (along with their other secret sauce) gives them top-notch performance at excellent power efficiency, but their design likely limits the maximum achievable frequency. It's way better than the "let's go fast and very, very hot" general approach of x86 makers, but it is not at all clear that Apple's approach can achieve higher per-core throughput.
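A back-of-the-envelope way to see that tradeoff: single-thread throughput is roughly IPC times clock frequency, so a wide, moderately clocked core and a narrower, high-clocked core can land at a similar peak. The IPC and frequency values below are made-up round numbers for illustration only, not measurements of any real core.

```c
/* Width-vs-clock tradeoff sketch: throughput ~ IPC * frequency.
 * All numbers are hypothetical round figures for illustration. */
#include <stdio.h>

int main(void) {
    double wide_ipc = 5.0,   wide_ghz = 3.2;   /* "ultra-wide, moderate clock" design point */
    double narrow_ipc = 3.0, narrow_ghz = 5.5; /* "narrower, high clock" design point */

    printf("wide core:   ~%.1f G instr/s\n", wide_ipc * wide_ghz);     /* 16.0 */
    printf("narrow core: ~%.1f G instr/s\n", narrow_ipc * narrow_ghz); /* 16.5 */
    return 0;
}
```

Similar peak throughput reached by very different routes, which is why simply clocking the wide design past 5GHz isn't obviously on the table.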