Background: Anna recently had to revert r189090, “Refactor conditional expression evaluating code”, which was added to handle temporary destructors that fire conditionally. The motivation for that commit was expressions like “(A() && B()) || (C() && D())”, where the destructors for B, C, and D fire only under certain conditions. To make sure these destructors fire only at the right times, Pavel structured the CFG so that the various terminator expressions (“A() && …”, “(A() && B()) || …”, etc.) would be used twice: first during normal expression evaluation, and then again during cleanup. This all falls out of a long discussion on PR11645.
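To make the motivating case concrete, here is a hand-written sketch (the class names A–D are just placeholders matching the expression above) showing that which temporaries get destroyed depends on the path taken:

```cpp
#include <string>

// Placeholder types whose destructors append to a log, so we can see
// which temporaries were actually constructed and destroyed.
static std::string dtor_log;
struct A { ~A() { dtor_log += "~A "; } operator bool() const { return false; } };
struct B { ~B() { dtor_log += "~B "; } operator bool() const { return true; } };
struct C { ~C() { dtor_log += "~C "; } operator bool() const { return true; } };
struct D { ~D() { dtor_log += "~D "; } operator bool() const { return true; } };

bool eval() {
  // A() converts to false, so B() is never constructed and ~B never
  // runs; evaluation falls through to the right-hand side of ||.
  return (A() && B()) || (C() && D());
}
```

Calling eval() leaves dtor_log as "~D ~C ~A " — temporaries are destroyed in reverse construction order at the end of the full-expression, and ~B never appears. That path-dependence is exactly what the CFG has to model.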
The problem was that before his commits, the analyzer computed the logical operators && and || by walking back up the CFG to see how we got to this point. By the time we’re deciding whether to run destructors, the CFG blocks where the && or || values were computed are long gone. Pavel’s approach (which both Ted and I reviewed) was to (a) explicitly record whether an expression evaluated to true or false when used as a branch, and (b) keep the leaf expressions of && and || expression trees alive, so that the branch information could be recovered later on.
Unfortunately, two problems fell out of this: first, that in a loop it was possible to accidentally reuse a still-live expr value from the previous iteration, and second, that the GNU binary ?: operator uses the left-hand side as both a condition and a result. (Coincidentally, both of these bugs were reported to us at the same time based on one internal team getting a new Clang build.)
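For the second problem, the wrinkle is that in the GNU binary conditional “x ?: y”, the operand x is evaluated once but consumed twice: as the branch condition and as the result. A minimal sketch (the function names here are made up for illustration):

```cpp
// Counts how many times next() runs, to show single evaluation.
static int calls = 0;
int next() { ++calls; return 42; }

int demo() {
  // GNU extension: "next() ?: 0" evaluates next() exactly once; its
  // value serves both as the condition tested and, being nonzero, as
  // the result returned. Roughly: tmp = next(); tmp ? tmp : 0
  return next() ?: 0;
}
```

demo() returns 42 with calls == 1; an analyzer that treats the condition and the result as separate uses of the expression gets this case wrong.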
In discussing the first problem, Ted and Anna realized that the liveness information for nested && and || expressions was still incorrect: the outermost expression may potentially depend on all of the inner expressions, which means those sub-expressions are always considered live. (Expressions are marked dead, in a reverse analysis, at the point of evaluation, and in a complex logical expression there are paths that never evaluate a particular sub-expression.) This wasn’t an issue before, because && and || didn’t actually use those expressions to compute their values, but it’s still incorrect.
Ted remarked that if we were modeling execution more precisely, we’d just keep an extra bit on the side that gets set to 1 when we need to run a particular temporary’s destructor. We could do something similar in the CFG by inserting fake boolean variables and using them to decide later whether or not to run destructors; we’re not bound by the AST’s limits on where variables can be inserted, and we already synthesize fake DeclStmts elsewhere (to break up a DeclStmt with multiple decls in it). I will admit we’ve had a handful of problems related to these fake decls in the past, but that doesn’t automatically invalidate the plan. We just haven’t explored it much, and I thought it was worth mentioning.
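A rough sketch of that idea, with the synthesized flags written out as ordinary locals (everything here is illustrative hand-desugaring, not actual CFG output), modeling the earlier expression with B() and D() assumed to yield true:

```cpp
// Counters stand in for the destructor bodies of B and D.
static int b_dtors = 0, d_dtors = 0;

// Models "(A() && B()) || (C() && D())" where a and c are the values
// of A() and C(). The *_alive locals play the role of the fake boolean
// variables the CFG would insert.
bool eval_with_flags(bool a, bool c) {
  bool b_alive = false, d_alive = false;  // synthesized flags
  bool result;
  if (a) {             // "A() && B()": B() is constructed only here
    b_alive = true;
    result = true;     // assume B() yields true
  } else if (c) {      // "... || (C() && D())": D() constructed only here
    d_alive = true;
    result = true;     // assume D() yields true
  } else {
    result = false;
  }
  // Cleanup block: run each destructor only if its flag was set, with
  // no need to rediscover which branch we came from.
  if (b_alive) ++b_dtors;
  if (d_alive) ++d_dtors;
  return result;
}
```

The point is that the cleanup code consults only the flags, so the long-dead CFG blocks that computed the && and || values never need to be revisited.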
Anyway, clearly we’d like to properly evaluate temporary destructors, but just as clearly we don’t want to accept regressions in other parts of the analyzer, especially in pure C code or C++ code with cfg-temporary-dtors still turned off. So that’s where we are today.
Still interested in trying to solve these problems?