Can you think of any examples of what goes into hand-optimization? I’m curious.
I’m not sure exactly what you’re asking. Do you mean which kinds of circuits are hand-optimized, or how it is done?
On every chip I’ve ever worked on, certain things are optimized transistor by transistor: things like PLLs (used to generate and synchronize clocks), off-chip I/O drivers and receivers, and RAM structures. You start out with a schematic that you draw, transistor by transistor, showing the connections and the W/L (width-to-length) ratio of each transistor, and you simulate it in SPICE, which is a dynamic circuit simulator. When you get it the way you want, you start laying out the structure (drawing the polygons), extract the exact transistor parameters and parasitics (resistances, capacitances) from the layout, feed those back into your simulations, and see how you are doing. You keep repeating that cycle until you’re happy.
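To make that loop concrete, here is a rough Python sketch of its shape. It is a toy, not how any real flow works: the "simulation" is a first-order RC delay estimate standing in for SPICE, the "extraction" is a made-up linear capacitance model, and every constant and function name (R_UNIT, C_LOAD, C_PAR_PER_WL, simulate_delay, extract_parasitics, size_driver) is an assumption chosen only for illustration.

```python
# Toy model of the schematic -> simulate -> layout -> extract -> re-simulate loop.
# All constants and formulas are illustrative assumptions, not real process data.

R_UNIT = 10_000.0        # ohms for a W/L = 1 device (hypothetical)
C_LOAD = 20e-15          # farads of fixed load being driven (hypothetical)
C_PAR_PER_WL = 0.5e-15   # farads of layout parasitic per unit of W/L (hypothetical)


def simulate_delay(w_over_l, c_parasitic=0.0):
    """Stand-in for a SPICE run: first-order RC delay in seconds."""
    r_drive = R_UNIT / w_over_l
    return 0.69 * r_drive * (C_LOAD + c_parasitic)


def extract_parasitics(w_over_l):
    """Stand-in for layout extraction: wider devices add more capacitance."""
    return C_PAR_PER_WL * w_over_l


def size_driver(target_delay, w_over_l=1.0, step=1.25, max_iters=50):
    """Repeat the cycle, widening the device until post-layout delay meets the target."""
    for iteration in range(1, max_iters + 1):
        c_par = extract_parasitics(w_over_l)      # "lay out" the device and extract
        delay = simulate_delay(w_over_l, c_par)   # feed parasitics back into simulation
        if delay <= target_delay:
            return w_over_l, delay, iteration
        w_over_l *= step                          # tweak the schematic and go again
    raise RuntimeError("target not reachable with this model")


if __name__ == "__main__":
    w, d, n = size_driver(target_delay=50e-12)
    print(f"W/L = {w:.2f}, delay = {d * 1e12:.1f} ps after {n} iterations")
```

The thing the toy does capture is why you have to iterate: making the transistor wider lowers its drive resistance, but the layout parasitics grow with it, so the pre-layout number is always optimistic and you only find out by how much after extraction. In this model the parasitics also put a floor under the delay, which is why an overly aggressive target never converges.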
At DEC, they did this for almost everything - ALUs, etc. At the places I worked, we did it for various other structures as well, but not for things like the ALUs (multipliers, adders, shifters) as far as I can recall.
Of course, the standard cells are also hand-optimized, but once you have a 2-input NAND gate with a certain drive strength, you use it wherever it is needed - you don’t create a new one each time, unless there is some special need, in which case you add the variation to the library. It does happen, but it’s rare.
For things that are not on the critical timing paths, like random control logic, you almost never bother hand-optimizing them.