How do dynamic linkers work exactly that slows things down like this? Seems like once apps are compiled it shouldn't be an issue. It does seem insane to me though with 5000Mbps NVME SSDs and insanely fast cpus with integrated memory that word doc and spreadsheet apps can take 7-8s to load.
To be fair, Word and Excel are probably some of the most complicated apps on the market today, in part due to their age and compatibility requirements.
A compiled app still needs to be dynamically linked at runtime. The link step at build time only ensures that the dynamic linker (dyld) has what it needs at runtime. Static linking doesn't require extra steps at runtime, but it also means you can't substitute one library for another without recompiling. This last bit is important, because it is why you dynamically link in the first place.
Consider the simple case of an app that references AppKit. Every symbol (which can be a function, class information symbol, etc.) the app uses from AppKit is recorded as a reference in the binary as part of the linker step when compiling. This makes it possible for any compatible version of AppKit to be loaded and linked using the symbol tables contained in both binaries. That lets apps pick up OS-level bugfixes without rebuilding, and means they don't have to be recompiled for every OS release. But to enable that, dyld is responsible for crawling the symbol tables and making that final link between the binaries themselves. IIRC, because of how Objective-C constants work, things like constant NSStrings exported by AppKit get their own symbols as well, and so you have to link those at runtime too.
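To make the symbol-table idea concrete, here is a toy model in Python. It is only a sketch of the concept, not how dyld is actually implemented: the `Image` class, the `bind` function, the addresses, and `_NSNewThing` are all made up for illustration. The point is that the app records names rather than addresses, so the same app binds cleanly against two different builds of the library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Image:
    """Toy stand-in for a binary: the symbols it exports and the ones it imports."""
    name: str
    exports: Dict[str, int] = field(default_factory=dict)         # symbol -> address
    imports: List[Tuple[str, str]] = field(default_factory=list)  # (library, symbol)


def bind(image: Image, loaded: Dict[str, Image]) -> Dict[str, int]:
    """Resolve each import of `image` against the export table of its library."""
    bound = {}
    for library, symbol in image.imports:
        exports = loaded[library].exports
        if symbol not in exports:
            raise LookupError(f"{image.name}: {symbol} not found in {library}")
        bound[symbol] = exports[symbol]  # one lookup per imported symbol
    return bound


# Two hypothetical AppKit builds: same exported names, different (invented) addresses.
appkit_old = Image("AppKit", exports={"_NSApplicationMain": 0x1000, "_NSApp": 0x2000})
appkit_new = Image(
    "AppKit",
    exports={"_NSApplicationMain": 0x5000, "_NSApp": 0x6000, "_NSNewThing": 0x7000},
)

# The app only records which names it needs and which library they come from.
app = Image("MyApp", imports=[("AppKit", "_NSApplicationMain"), ("AppKit", "_NSApp")])

print(bind(app, {"AppKit": appkit_old}))
print(bind(app, {"AppKit": appkit_new}))
```

Note that the loop does one lookup per imported symbol, per binary. That is the part that adds up when an app ships dozens of frameworks that each import heavily from the system libraries.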
If an app references a lot of binaries and/or imports an excessive number of symbols, dyld gets very slow, especially if a developer has a lot of libraries that all depend on system libraries (you get a sort of amplification effect of having to link the same system library to multiple frameworks, and then your app). You can improve things by trying to reduce the set of things you need to link at app start, but this is one area where Windows does have an edge: Windows supports delay loading DLLs, but macOS doesn't support the same sort of functionality last I checked. Instead I'd have to load the binary and look up its symbols manually, like I would for a plugin binary. It's hard to reuse C++ types this way, for example.
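For a feel of what the manual, plugin-style route looks like, here is a minimal sketch using Python's `ctypes`, which wraps the POSIX `dlopen`/`dlsym` calls. The helper names `load_math_library` and `lazy_cos` are hypothetical, and the C math library is just a convenient stand-in for a library you'd rather not load at startup. It assumes a POSIX system such as macOS or Linux.

```python
import ctypes
import ctypes.util
from functools import lru_cache


@lru_cache(maxsize=None)
def load_math_library() -> ctypes.CDLL:
    """Load the C math library on first use instead of at process start."""
    # find_library may return None; on POSIX, CDLL(None) then falls back to
    # the symbols already loaded into the current process.
    path = ctypes.util.find_library("m")
    return ctypes.CDLL(path)  # this is the dlopen step


def lazy_cos(x: float) -> float:
    lib = load_math_library()  # the load cost is paid here, and only once
    cos = lib.cos              # symbol looked up by name (the dlsym step)
    # The loader hands back an untyped address, so the signature has to be
    # declared by hand.
    cos.restype = ctypes.c_double
    cos.argtypes = [ctypes.c_double]
    return cos(x)


print(lazy_cos(0.0))
```

The hand-declared signature is the catch: all you get back is a name-to-address lookup, which is fine for a plain C function but is why sharing C++ classes across that boundary gets awkward.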
When iOS added support for building frameworks (and thus you could dynamically link to non-OS libraries for the first time), there were some rather ugly performance issues with the functionality (see https://github.com/stepanhruda/dyld-image-loading-performance). Prior to iOS 8, you could only statically link on iOS, which bypasses a lot of the startup pain that dyld induces. That said, I know Apple has taken multiple passes at improving dyld's performance, because while it works well for the sort of single-purpose apps that Apple likes to build on iOS/macOS, it has issues with larger apps like Xcode, Office, Photoshop, etc.
The truth is that every app is different, and its startup performance issues will be different as well. But to give you an idea: Outlook on the Mac has 71 frameworks and dylibs in its bundle. Word has 65. Microsoft Word's app bundle is 2.2GB, and over 700MB of that is in the Frameworks folder. Multiple Mach-O binaries are 20+ MB in size, with a couple creeping up on 70MB. That's a lot of stuff to link together, even accounting for the doubling up of size from them being Universal binaries.