In the case of C, the "input -> tokens" stage would be especially convoluted due to the bad language design, requiring feedback from the "tokens -> AST" stage.
Really? I've written a roughly 80% complete C compiler, and C's tokens are straightforward as fuck. There are only 30 or so of them. It is true that some tokens mean different things depending on context, but that's something a properly written parser can solve pretty easily.
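
To make the context point concrete, here is a rough sketch of the classic case: in "T * x;" the statement is a pointer declaration if T is a typedef name, and a multiplication otherwise. The lexer can hand over a plain identifier either way, and the parser decides by checking a table of typedef names it has collected so far. Every name here (tok_kind, declare_typedef, classify_identifier, starts_declaration) is made up for illustration and is not taken from any real compiler. Scoping is ignored entirely, so treat this as a sketch of the idea and not a working front end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds for this sketch; a real compiler has many more. */
enum tok_kind { TOK_IDENT, TOK_TYPE_NAME };

/* Toy typedef table that the parser fills in as it parses declarations.
 * Assumption: a flat, fixed-size table with no block scoping. */
#define MAX_TYPEDEFS 64
static const char *typedef_names[MAX_TYPEDEFS];
static size_t typedef_count;

/* Called by the parser once it has parsed "typedef ... name;".
 * Only the pointer is stored, so the string must outlive the table.
 * Returns false if the table is full. */
static bool declare_typedef(const char *name)
{
    assert(name != NULL);
    if (typedef_count == MAX_TYPEDEFS)
        return false;
    typedef_names[typedef_count++] = name;
    return true;
}

/* The lexer always emits a plain identifier; the parser calls this
 * to find out whether that identifier currently names a type. */
static enum tok_kind classify_identifier(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < typedef_count; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPE_NAME;
    return TOK_IDENT;
}

/* For a statement shaped like "a * b;": true means parse it as a
 * declaration (a is a type), false means parse it as an expression. */
static bool starts_declaration(const char *first_ident)
{
    return classify_identifier(first_ident) == TOK_TYPE_NAME;
}
```

Before any typedef is registered, starts_declaration("T") is false, so "T * x;" parses as an expression. After declare_typedef("T") it is true and the same tokens parse as a declaration. The token stream itself never changes; only the parser's lookup does.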