A lot of folks commenting in this thread seem to be having difficulties trying to use "scaled" display resolutions. Let me take a shot at explaining what's going on.
The process can be abstractly understood by dividing it into three parts: (1) the macOS UI rendering engine; (2) the frame-buffer downscaling engine; and (3) the video datastream output engine.
Part 1, the UI rendering engine, only works at two pixel densities: the "normal" density that has been the same in macOS for decades, and a "HiDPI" density, which was created when the first "Retina" displays came out. The "HiDPI" mode basically assumes that each display pixel is half as wide, and half as tall, as a "normal" display pixel, and therefore renders everything across twice as many pixels in each dimension. (In other words, in terms of the number of pixels used, it draws everything twice as large.) Now, you might think that if you have an intermediate display resolution selected, macOS actually renders the screen image directly at that pixel density, using a non-integer scaling factor, but it doesn't; there are only "normal" (1x) and "HiDPI" (2x) rendering modes.
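To put the two densities in concrete terms, here's a tiny illustrative sketch (my own example, not Apple's API). The point is that the renderer's scale factor is only ever 1 or 2; there is no 1.5x rendering path:

```python
# Illustrative only: a UI element sized in "points" maps to pixels
# at exactly one of two integer scale factors.

def pixels(points, scale):
    """Pixel extent of a UI element; the renderer only uses scale 1 or 2."""
    assert scale in (1, 2), "macOS has no intermediate rendering density"
    return points * scale

print(pixels(100, 1))  # 100 px wide in "normal" mode
print(pixels(100, 2))  # 200 px wide in "HiDPI" mode
```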
Helpful background info: In ancient times, when displays were made with cathode-ray tubes, the images on screen were produced by a sweeping beam of electrons impacting the back side of the display glass, which had a phosphorescent coating that would emit light when excited by the passing beam. The beam itself was an analog physical phenomenon, rather than a discrete digital one. If you ran a CRT at a resolution of 1024x768, for example, the beam would actually trace 768 discrete horizontal lines across the screen. If you switched to a resolution of 1280x960, then the display would trace 960 discrete lines. The phosphors would simply light up wherever the beam hit them; there was no fixed "native" resolution of the screen.
Modern LCD/LED displays are completely different: each has a fixed grid of physically discrete display pixels, the size and arrangement of which can't be changed. Therefore, if a display receives a digital video signal with a resolution different from its own, it must mathematically interpolate the provided signal into its native resolution in order to display it. Interpolating from a lower (coarser) resolution signal to a higher (finer) pixel grid is always going to result in some blurriness or jaggedness in the visible result. Interpolating from a higher-resolution signal to a lower-resolution pixel grid fares much better, but is dependent on the quality of the algorithm used, and the computational resources available for the task. (As far as I'm aware, displays are not normally designed to accept incoming video signals that use a higher resolution than the display's own.)
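You can see the interpolation problem in miniature with a toy nearest-neighbor scaler (my own simplified sketch; real display scalers use fancier filtering, but the geometry issue is the same):

```python
# A minimal nearest-neighbor scaler: every destination pixel is filled
# from whichever source pixel is nearest. When the ratio is non-integer,
# some source pixels get duplicated and others don't, which is what
# reads as blur or jaggedness on a real panel.

def scale_nearest(src, dst_w, dst_h):
    """Map each destination pixel back to the nearest source pixel."""
    src_h, src_w = len(src), len(src[0])
    return [
        [src[y * src_h // dst_h][x * src_w // dst_w] for x in range(dst_w)]
        for y in range(dst_h)
    ]

# A 2x2 checkerboard "signal" stretched onto a 3x3 "panel": the extra
# row and column must be filled by duplicating neighbors.
src = [[0, 1],
       [1, 0]]
for row in scale_nearest(src, 3, 3):
    print(row)
```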
So, let's suppose you have a 4K 16:9 display, with a 3840x2160 grid of physical pixels, and a diagonal screen size of 27". Let's further suppose that if you tell macOS to render to a matching-size 3840x2160 target frame buffer, using the "normal" pixel density, the result makes everything on the screen look too small to you; whereas if you have macOS use "HiDPI" mode (everything double the size in pixels), everything looks too big. Suppose, at that screen size, your most preferred visual size would result from rendering at a pixel density somewhere between "normal" and "HiDPI", but as we know, the renderer can't do that on its own. What then?
Well, one option is to tell macOS to render at "normal" (small-looking) pixel density, but to a frame buffer that's smaller than the actual screen dimensions; then send that signal to the display and let the display "upscale" each video frame it receives to its native resolution. (This is what's described in the "Displays" system settings as "low-resolution modes".) As explained above, this will allow the image to be displayed, but the upscaling will unavoidably give a somewhat fuzzy, blurry appearance to the visible image. In other words, it works, but the result isn't as sharp and pretty as it could be.
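With the hypothetical 4K display from above, the numbers for this "low-resolution mode" path look like this (a sketch with made-up but representative values):

```python
# "Low-resolution mode" path: render small at 1x, let the panel stretch.

NATIVE = (3840, 2160)   # the panel's physical pixel grid
RENDER = (2560, 1440)   # the 1x frame buffer macOS actually draws into

# The signal is sent at 2560x1440; the display's own scaler stretches
# it to fill the panel. A non-integer factor means pixel boundaries
# land between physical pixels, hence the fuzziness.
stretch = NATIVE[0] / RENDER[0]
print(f"panel upscales the incoming {RENDER} signal by {stretch}x")
```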
The other thing you can do is (1) tell macOS to render at "HiDPI" (big-looking) pixel density, but to a frame buffer that's larger than the actual screen dimensions; then (2) separately interpolate that higher-resolution rendering down to the actual native resolution of the display; and finally (3) send that interpolated video frame to the display in its own native resolution.
This latter method gives outstanding results, but the key thing to understand about it is that it places a much higher workload on the video subsystem of the computer, in two ways: (1) it requires a lot more memory to accommodate the extra-large video frame buffer; and (2) you're adding the whole new computationally demanding task of interpolating/downscaling the larger frame buffer into the target resolution. Only after that process is completed can the result be sent to the display device.
So, in theory, it really shouldn't matter what interface protocol is being used to convey the video signal from the computer to the display, as the video frames being sent are at the native resolution of the display, even when "scaling" is being used. Rather, the performance bottleneck lies in the capacity of the computer's video subsystem to access enough RAM to hold both the extra-large frame buffer and the downsampled conversion, and also to have access to the necessary computational resources to actually perform the conversion fast enough to keep up with the selected video refresh rate.
The most resource-intensive "scale" setting is always going to be the largest one that's still smaller than the display's native resolution. In that case, the RAM needed to hold the video data will likely be around four times what would be required for an ordinary native-size render, plus all the downscaling computation (which is completely unnecessary for a native-size render).
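Here's the back-of-the-envelope arithmetic behind that "around four times" figure, assuming 4 bytes per pixel and a single buffer per stage (real drivers double- or triple-buffer, so actual usage is higher):

```python
# Rough frame-buffer memory for the largest common scaled mode on a
# 4K panel ("looks like 3200x1800", rendered at 6400x3600) versus a
# plain native-resolution render.

BYTES_PER_PIXEL = 4
MB = 1024 * 1024

def buffer_mb(w, h):
    """Size of one uncompressed frame buffer, in mebibytes."""
    return w * h * BYTES_PER_PIXEL / MB

native = buffer_mb(3840, 2160)   # native-size render only
scaled = buffer_mb(6400, 3600)   # oversized 2x render buffer
total = scaled + native          # plus the downscaled output buffer

print(f"native-only: {native:.1f} MB")
print(f"scaled mode: {total:.1f} MB ({total / native:.1f}x the native case)")
```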
When Apple says that the M1 Mac mini "supports two 4K displays", it means at those displays' native resolution. Trying to run two displays at an intermediate "scaled" resolution can be as taxing on the computer's internal video subsystem as driving four or five such displays without scaling.
I hope this helps to explain the difficulties some of you have been experiencing.
~ Justin
________
~ Note: I've made several rounds of corrections & refinements to the text above in the five hours since first posting it. Hope you enjoy. ~