Just analysing a device driver for hardware that's a few months old... as usual, it's as bloated as a week-old dead pig overall, but then I come across pieces like this:
    call SomeFunc
    mov edx, eax
    test edx, edx
    mov [esi+48], edx
    jnz loc1
    ...
  loc1:
    mov edi, edx
    ...
Instead of just moving the return value in eax to edi, where it eventually gets used in one of the stupid sequences in OP, the compiler somehow decides to thread it through edx, then realises that edx is needed for something else and moves the value into edi.
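For comparison, here's a rough sketch of what I'd have expected. It assumes edi is free on the fall-through path and that nothing later reads the value back out of edx, and I can't confirm either from the excerpt alone:

    call SomeFunc
    mov edi, eax          ; assumption: edi is not live on the fall-through path
    test eax, eax         ; sets ZF from the return value
    mov [esi+48], eax     ; mov leaves the flags alone, so jnz still sees the test
    jnz loc1
    ...
  loc1:
    ...                   ; edi already holds the value, no extra mov needed

That is one mov fewer, and edx is never tied up. If edi does turn out to be live on the fall-through path, leaving the value in eax and doing the mov edi, eax at loc1 would still avoid the detour through edx.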
This would be unacceptable even at O0, but this is released code, presumably compiled at O2 or better, and in active use on millions of machines or more worldwide. WTF.