I think it is fair to say that when different hardware is benchmarked with the same software, each platform should get a similar level of optimization.
But I don't think it is fair to equate optimizing for documented instruction sets or generic architectures, using compiler directives, and making reasonably sized code tweaks with redesigning or rewriting the benchmark more or less from scratch: a completely different, incompatible toolset, no cross-platform support, only proprietary components, no open standards, and so on. At that point you are comparing completely re-engineered software.
It is also a value proposition. A huge effort to optimize for a proprietary platform produces tweaks and optimizations that are completely useless outside that locked-in system. So evidence that the gains would be big enough to justify the effort becomes crucial, and the bar is much higher than for more open, developer-friendly systems.
I think the main takeaway from this long thread is that no one was able to provide any tangible indication that the performance increase would come even close to being worth the effort of tweaking chess code for M1 Macs, since they are so far behind to begin with when running basic cross-platform ARM and x86 code, or even plain C code compiled with native directives.