I am not familiar with ASTMatchers, but I am with RecursiveASTVisitor, and the issue seems to be what some call the “data recursion” optimization, which is specific to TraverseStmt. While e.g. TraverseDecl(Decl *) and TraverseType(QualType) take a single argument, for statements the default signature is TraverseStmt(Stmt *, DataRecursionQueue *Queue); this queue argument is used to avoid the excessive stack depth that apparently occurs with some very large expressions. The way it works, I believe, is by enqueuing nested statements instead of processing them right away, in exactly the manner you describe.
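To make the difference in ordering concrete, here is a toy model in plain C++. It is only a sketch of the idea, not Clang's actual code: Node, traverseRecursive, visitAndEnqueue, traverseQueued and runDemo are all hypothetical names, and the real DataRecursionQueue does more bookkeeping than this.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a statement node; this is NOT a Clang type.
struct Node {
  std::string name;
  std::vector<Node *> children;
};

using Log = std::vector<std::string>;

// Plain recursion (the TraverseDecl/TraverseType style): every child is
// fully traversed before the parent's call returns.
void traverseRecursive(Node *n, Log &log) {
  assert(n != nullptr);
  log.push_back("enter " + n->name);
  for (Node *c : n->children)
    traverseRecursive(c, log);
  log.push_back("leave " + n->name);
}

// Queue-based step: the children are only enqueued, so this function
// returns before any of them has been looked at.
void visitAndEnqueue(Node *n, std::vector<Node *> &queue, Log &log) {
  assert(n != nullptr);
  log.push_back("enter " + n->name);
  // Push in reverse so the first child ends up on top and is handled first.
  for (auto it = n->children.rbegin(); it != n->children.rend(); ++it)
    queue.push_back(*it);
  log.push_back("leave " + n->name);
}

// Driver loop: pops pending nodes until none are left, so the stack depth
// stays constant no matter how deep the tree is.
void traverseQueued(Node *root, Log &log) {
  std::vector<Node *> queue{root};
  while (!queue.empty()) {
    Node *n = queue.back();
    queue.pop_back();
    visitAndEnqueue(n, queue, log);
  }
}

// Builds the tree A -> {B -> {D}, C} and returns the log of one traversal.
Log runDemo(bool queued) {
  Node d{"D", {}};
  Node c{"C", {}};
  Node b{"B", {&d}};
  Node a{"A", {&b, &c}};
  Log log;
  if (queued)
    traverseQueued(&a, log);
  else
    traverseRecursive(&a, log);
  return log;
}
```

With plain recursion, "leave A" is the last entry in the log; with the queue, "leave A" comes right after "enter A", before B, D or C are touched. Any code that runs after the parent's traversal call and assumes the children are done will see that second ordering.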
Fortunately, it’s simple to turn off this optimization: just define a TraverseStmt(Stmt *S) in your Derived class (I guess in this case, that means adding the definition to whatever component of ASTMatchers inherits from RecursiveASTVisitor<…>).
To avoid this confusion in the future (the same problem vexed me a while back in a different context), it might be wise to alter the default TraverseStmt implementation so that it keeps track of the stack depth and only employs the data recursion queue once the depth exceeds some threshold, and otherwise performs nested traversals in the same manner as TraverseDecl and TraverseType, if that makes sense.
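A rough sketch of what I have in mind, again as a toy model rather than a patch against RecursiveASTVisitor. Node, HybridTraversal, maxRecursionDepth and makeChain are hypothetical names, and the threshold value is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a statement node; this is NOT a Clang type.
struct Node {
  int id = 0;
  std::vector<const Node *> children;
};

struct HybridTraversal {
  std::size_t maxRecursionDepth;        // switch to the worklist at this depth
  std::size_t deepestRecursion = 0;     // deepest level reached recursively
  std::vector<int> order;               // node ids in pre-order

  explicit HybridTraversal(std::size_t limit) : maxRecursionDepth(limit) {}

  // Recurse normally while shallow; hand the subtree to an explicit
  // worklist once the depth limit is reached.
  void traverse(const Node *n, std::size_t depth = 0) {
    assert(n != nullptr);
    if (depth >= maxRecursionDepth) {
      traverseWithWorklist(n);
      return;
    }
    if (depth > deepestRecursion)
      deepestRecursion = depth;
    order.push_back(n->id);
    for (const Node *c : n->children)
      traverse(c, depth + 1);
  }

  // Iterative pre-order over a subtree; uses heap memory instead of stack.
  void traverseWithWorklist(const Node *root) {
    std::vector<const Node *> worklist{root};
    while (!worklist.empty()) {
      const Node *n = worklist.back();
      worklist.pop_back();
      order.push_back(n->id);
      // Push in reverse so children are visited in their original order.
      for (auto it = n->children.rbegin(); it != n->children.rend(); ++it)
        worklist.push_back(*it);
    }
  }
};

// Builds a degenerate tree 0 -> 1 -> ... -> length-1, the shape that would
// overflow the stack under pure recursion when length is large.
std::vector<Node> makeChain(std::size_t length) {
  std::vector<Node> nodes(length);
  for (std::size_t i = 0; i < length; ++i) {
    nodes[i].id = static_cast<int>(i);
    if (i + 1 < length)
      nodes[i].children.push_back(&nodes[i + 1]);
  }
  return nodes;
}
```

For example, with a limit of 64 and a chain of 100000 nodes, the first 64 levels are handled by ordinary recursion and the rest by the worklist, and the visit order is the same pre-order either way. This toy only preserves the pre-order; anything that has to run after a node's children would need extra bookkeeping in the worklist part.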