A bad test is better than no test at all. This isn't rocket science. Throw identical light, medium and heavy loads at the devices for extended periods of time and see which one dies last. You can get a decent idea of what the battery life is going to be like.
That should be true in principle, but it often isn’t in practice. Long-lasting iPhones, for example, can be obliterated by extremely heavy usage patterns. The 12.9-inch iPad Pro can be made to look terrible with heavy use at full brightness because of its screen drain, which other iPads don’t suffer from as much; use that same iPad lightly and it does a lot better. On iPhones, I’ve seen some 11 Pro Max tests from back on iOS 13 with abysmal results: they’d drain it in 5 hours, and the tester implied that “it still isn’t amazing, didn’t last me the whole day, but I recognize that this test was heavy”. I completely disagree: outside of that kind of use, it is enough for everyone but extreme users (that is, people who play games at high brightness on LTE for a fairly long time).
So I’d ask: where’s the relevance of those tests for the vast majority of users? Practically nobody matches those usage patterns, and the only thing those tests show is “look, I can drain a 3000 mAh battery on an A13 Bionic on the original release of iOS 13 just as fast as I could drain a half-sized battery on the A7 iPhone 5s”. Almost no regular user will get only 5 hours out of an 11 Pro Max on iOS 13. I’d be confident enough to say that the test is irrelevant for its assumed purpose, which is to show the device’s battery life. Because the screen and the tested apps draw so much more power, if I took that test as my only measurement, the data would lead me to conclude that battery life on the iPhone 5s and the 11 Pro Max isn’t all that different. The conclusion follows from the data, but for obvious reasons it is flawed.
Even the difference, when there is one, is misleading, because devices often land a lot closer to each other on extreme drain tests than they are in everyday use: processor efficiency and battery size advantages diminish when the device is pushed to its limits. Moderate users would see a better, more representative result than the one extreme tests produce.
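To put rough numbers on that, here is a minimal sketch of the arithmetic. Every figure in it is invented for illustration: the capacities and power draws are not measurements of any real iPhone, and `runtime_hours` is a hypothetical helper that assumes a constant power draw until the battery is empty.

```python
def runtime_hours(capacity_wh: float, chip_w: float, screen_w: float) -> float:
    """Hours until empty, assuming a constant total power draw (a simplification)."""
    return capacity_wh / (chip_w + screen_w)


# All numbers below are made up for illustration, not measured values.
# "old" = small battery, small screen; "new" = big battery, big bright screen.

# Light use: the newer chip sips power, so the bigger battery pays off fully.
old_light = runtime_hours(6.0, chip_w=0.5, screen_w=0.5)
new_light = runtime_hours(15.0, chip_w=0.25, screen_w=0.75)

# Heavy use at full brightness: the bigger screen and the faster chip both
# draw far more, which eats most of the capacity advantage.
old_heavy = runtime_hours(6.0, chip_w=2.0, screen_w=1.0)
new_heavy = runtime_hours(15.0, chip_w=3.0, screen_w=2.0)

light_ratio = new_light / old_light  # how many times longer the new device lasts
heavy_ratio = new_heavy / old_heavy

print(f"light use: {old_light:.1f} h vs {new_light:.1f} h (x{light_ratio:.1f})")
print(f"heavy use: {old_heavy:.1f} h vs {new_heavy:.1f} h (x{heavy_ratio:.1f})")
```

With these assumed inputs, the newer device lasts 2.5 times as long under light use but only 1.5 times as long under the heavy load, so a heavy-only test would understate the gap a moderate user actually sees.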