I've thought about this also. One thing to consider as they widen these cores further is that the parallelism available in any given instruction stream might not always be enough to fully utilize the architecture. If the code is structured in such a way that it cannot keep an 8-wide pipeline full, SMT can utilize those resources where one thread could not.
Another way to look at it is that SMT is a way to reduce the resource waste your architecture is already suffering from. If SMT gives you 40% higher performance in a multithreaded scenario, that roughly means about 30% of your execution resources are wasted in a single-threaded scenario.
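For anyone wondering where the roughly-30% figure comes from, here is a back-of-the-envelope sketch. It assumes (my simplification, not a measured fact) that throughput scales linearly with how busy the execution resources are, and that the core is fully saturated with SMT on. The function name `implied_idle_fraction` is made up for illustration.

```python
def implied_idle_fraction(smt_speedup: float) -> float:
    """Estimate the share of execution resources idle in single-thread mode.

    Assumptions (illustrative only):
      * throughput is proportional to execution-resource utilization
      * with SMT enabled, the core is 100% utilized
    Under those assumptions, single-thread utilization is 1 / smt_speedup,
    so the idle share is 1 - 1 / smt_speedup.
    """
    if smt_speedup < 1.0:
        raise ValueError("smt_speedup must be >= 1.0 (1.4 means 40% faster)")
    return 1.0 - 1.0 / smt_speedup


if __name__ == "__main__":
    # 1.4 corresponds to the "40% higher performance" case from the post.
    for speedup in (1.2, 1.4, 2.0):
        idle = implied_idle_fraction(speedup)
        print(f"SMT speedup {speedup:.1f}x -> about {idle:.0%} idle single-threaded")
```

With a 1.4x speedup this gives 1 - 1/1.4, or about 29%, which is where "30%" comes from. If SMT does not actually saturate the core, the real single-threaded waste is higher, so treat the number as a lower bound.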
SMT is a cheap way to get some of it back, but now that the race for highest IPC is back on, designers will try to improve resource efficiency in the core itself (pun). It's not easy, but it can be done, as Apple very clearly illustrates with their 3.6 GHz CPUs reaching the same performance as Intel at 5 GHz. And if your core is already very good at utilising its resources, SMT becomes redundant.
Newer processors seem to be going crazy on execution resources in the hopes of being able to utilize them. The goal of course is to find any achievable parallelism and to exploit it, and SMT can help to fill the cracks where this isn't achievable.
All of that being said, I don't really miss SMT much. I wouldn't complain too much if Apple added it back in, but Apple's strategy has worked very well without it. Keeps things much simpler (I don't have to worry whether audio applications are going to complain if their threads aren't assigned ideally by the scheduler).