[analyzer] Crash using `clang_analyzer_explain()` in the `debug.ExprInspection` checker (gh-57270)

The thing is, due to how invalidation works, we bind a fresh conjured symbol as a new value, but that won’t have a statement where it came from. [2]

I investigated this problem, and my current conclusion is that the CStringChecker already passes the original statement when invalidating buffers:

etc.

If I comment out the memory allocation in the snippet from the GitHub ticket, I can see that S::a has the correct statement attached to it:

/Users/georgiy.lebedev/Work/llvm-project/build-debug/bin/clang -cc1 -analyze -analyzer-checker=debug.ExprInspection ../../clang/debug.cpp
../../clang/debug.cpp:22:5: warning: derived_$4{conj_$1{int, LC1, S1092, #1},a} [debug.ExprInspection]
22 | clang_analyzer_dump(S::a);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
../../clang/debug.cpp:23:5: warning: value derived from (symbol of type 'int' conjured at statement 'memset(&x, 1, sizeof (x))') for global variable 'S::a' [debug.ExprInspection]
23 | clang_analyzer_explain(S::a);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

Unfortunately, I get truncated output on Godbolt: Compiler Explorer

I guess it has something to do with the PrintingPolicy. Could you please help me out here: how do I invoke clang++ on Godbolt to get verbose debug checker output? (Invoking the clang frontend from clang++ doesn’t work, AFAICT.)

To sum up, AFAICT, there is no issue with the CStringChecker.

I also looked into why the conjured symbol for S::a doesn’t get a statement. It seems to be conjured by the conservative evaluation of a destructor call, which has no statement, and that is expected:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000115949e0c clang`clang::ento::SValBuilder::conjureSymbolVal(this=0x00007f7c33b05050, symbolTag=0x00007f7c35835300, expr=0x0000000000000000, LCtx=0x00006000014b17c0, type=QualType @ 0x00007ff7b0fb53d0, count=1) at SValBuilder.cpp:181:7
frame #1: 0x000000011591d88c clang`(anonymous namespace)::InvalidateRegionsWorker::VisitCluster(this=0x00007ff7b0fb6050, baseR=0x00007f7c35835300, C=0x00007f7c35836490)::BindingKey, clang::ento::SVal, llvm::ImutKeyValueInfo<(anonymous namespace)::BindingKey, clang::ento::SVal>> const*) at RegionStore.cpp:1130:19
frame #2: 0x000000011591c624 clang`(anonymous namespace)::ClusterAnalysis<(this=0x00007ff7b0fb6050)::InvalidateRegionsWorker>::RunWorkList() at RegionStore.cpp:765:36
frame #3: 0x00000001158f820d clang`(anonymous namespace)::RegionStoreManager::invalidateRegions(this=0x00007f7c33b06490, store=0x00007f7c35836458, Values=ArrayRef<clang::ento::SVal> @ 0x00007ff7b0fb6038, Ex=0x0000000000000000, Count=1, LCtx=0x00006000014b17c0, Call=0x00007f7c34810b10, IS=0x00007ff7b0fb6320, ITraits=0x00007ff7b0fb65e0, TopLevelRegions=0x00007ff7b0fb63d0, Invalidated=0x00007ff7b0fb6380) at RegionStore.cpp:1331:5
frame #4: 0x0000000115895de0 clang`clang::ento::ProgramState::invalidateRegionsImpl(this=0x00007f7c35836e38, Values=ValueList @ 0x00007ff7b0fb6368, E=0x0000000000000000, Count=1, LCtx=0x00006000014b17c0, CausedByPointerEscape=true, IS=0x00007ff7b0fb6320, ITraits=0x00007ff7b0fb65e0, Call=0x00007f7c34810b10) const at ProgramState.cpp:201:19
frame #5: 0x000000011589608b clang`clang::ento::ProgramState::invalidateRegions(this=0x00007f7c35836e38, Values=ValueList @ 0x00007ff7b0fb64b8, E=0x0000000000000000, Count=1, LCtx=0x00006000014b17c0, CausedByPointerEscape=true, IS=0x0000000000000000, Call=0x00007f7c34810b10, ITraits=0x00007ff7b0fb65e0) const at ProgramState.cpp:175:10
frame #6: 0x0000000115769f7f clang`clang::ento::CallEvent::invalidateRegions(this=0x00007f7c34810b10, BlockCount=1, Orig=clang::ento::ProgramStateRef @ 0x00007ff7b0fb67a8) const at CallEvent.cpp:282:18
frame #7: 0x00000001158534a2 clang`clang::ento::ExprEngine::conservativeEvalCall(this=0x00007ff7b0fb7988, Call=0x00007f7c34810b10, Bldr=0x00007ff7b0fb6a48, Pred=0x00007f7c35836f18, State=clang::ento::ProgramStateRef @ 0x00007ff7b0fb68a8) at ExprEngineCallAndReturn.cpp:833:16
frame #8: 0x0000000115856d8d clang`clang::ento::ExprEngine::defaultEvalCall(this=0x00007ff7b0fb7988, Bldr=0x00007ff7b0fb6a48, Pred=0x00007f7c35836f18, CallTemplate=0x00007f7c34810b10, CallOpts=0x00007ff7b0fb6f38) at ExprEngineCallAndReturn.cpp:1266:3
frame #9: 0x0000000115849310 clang`clang::ento::ExprEngine::VisitCXXDestructor(this=0x00007ff7b0fb7988, ObjectType=QualType @ 0x00007ff7b0fb6d50, Dest=0x00007f7c35835d90, S=0x00007f7c36009608, IsBaseDtor=false, Pred=0x00007f7c35836f18, Dst=0x00007ff7b0fb7168, CallOpts=0x00007ff7b0fb6f38) at ExprEngineCXX.cpp:917:5
frame #10: 0x00000001157f7e82 clang`clang::ento::ExprEngine::ProcessDeleteDtor(this=0x00007ff7b0fb7988, Dtor=const clang::CFGDeleteDtor @ 0x00007ff7b0fb70c8, Pred=0x00007f7c35836f18, Dst=0x00007ff7b0fb7168) at ExprEngine.cpp:1473:3

frame #11: 0x00000001157f0869 clang`clang::ento::ExprEngine::ProcessImplicitDtor(this=0x00007ff7b0fb7988, D=const clang::CFGImplicitDtor @ 0x00007ff7b0fb71b8, Pred=0x00007f7c35836ec0) at ExprEngine.cpp:1298:5

Also, if I replace the `memset`-ed variable used as an array size with a global variable, the analyzer crashes the same way: Compiler Explorer

We probably shouldn’t aim for conjured symbols with null statement as a permanent solution, because that’s not a correct solution; we need a way to properly discriminate between these symbols though, we can’t be agglutinating them when they are coming from different sources each of which is a null statement.

At the very least, we should try replacing statement pointer with CFGElementRef; that’s a generalized notion of a statement and it comes in handy because it always exists. (The discourse thread I linked above had much better approaches, but it sounds like it’s stuck.) [3]

Should I try what you suggest, or should I simply fix SValExplainer so that it can handle conjured symbols with no statement?

  1. Discord

  2. Discord

  3. [analyzer] Crash using `clang_analyzer_explain()` in the `debug.ExprInspection` checker · Issue #57270 · llvm/llvm-project · GitHub

@NoQ @steakhal

Here is a minimal repro:

void clang_analyzer_explain(int);
void clang_analyzer_dump(int);

int g;
struct NonTrivial { ~NonTrivial(); };
void top() {
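  NonTrivial{}; // dtor is immediately called on the destruction of the temporary
  // Assumed continuation: inspect the global invalidated by the destructor call.
  clang_analyzer_dump(g);
  clang_analyzer_explain(g);
}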

The CStringChecker has nothing to do with how we end up having the conjured symbol with no statements.
We don’t have statements for implicitly calling the destructor, because they are not present in the AST, thus we have nothing to refer to.
When we invalidate regions, we conjure a fresh symbol to represent the value (possibly written by the “unknown function call”), and we pack the statement into the identity of that fresh symbol. Because we didn’t have any statement there, it ends up being null in our case.

As @NoQ suggested at the GH issue:

At the very least, we should try replacing statement pointer with CFGElementRef; that’s a generalized notion of a statement and it comes in handy because it always exists. (The discourse thread I linked above had much better approaches, but it sounds like it’s stuck.)

We could possibly change the Stmt to a CFGElementRef, which we should always have.
After this is done, we can assert that all conjured symbols should have a non-null element they refer to.
Destructors are the only thing that doesn’t fit into the Stmt model.