Uninitialized Variables Analysis crashing

Hi,

I have implemented a new Stmt in clang which given

__speculate {
// user code
}

generates LLVM IR equivalent to as if the following code was given:

jmp_buf __setjmp_buf;
int __setjmp_status = setjmp(__setjmp_buf);
unsigned int __spec_mode = __spec_begin(__setjmp_buf, __setjmp_status);
if (__spec_mode == SW) {
// unmodified user code
} else
// unmodified user code
}

However, if UninitializedVariablesAnalysis is enabled, clang crashes at runOnBlock function (lib/Analysis/UninitializedValues.cpp). If I disable it via -Wno-uninitialized, the code generated runs flawlessly and works as expected.

By adding some “->dumps()” calls I was able to discover that the crash happens while analyzing:

unsigned int __spec_mode = __spec_begin(__setjmp_buf, __setjmp_status);

The UninitializedVariables Analysis generates the following classifications before crashing:

  • DeclRefExpr 0x22a1940 ‘unsigned int (jmp_buf *, int)’ Function 0x22a17d0 ‘__spec_begin’ ‘unsigned int (jmp_buf *, int)’ as "ClassifyRefs::Ignore"
  • DeclRefExpr 0x22a1680 ‘jmp_buf’:‘struct __jmp_buf_tag [1]’ Var 0x22a1248 ‘__setjmp_buf’ ‘jmp_buf’:‘struct __jmp_buf_tag [1]’ as "ClassifyRefs::Ignore"
  • DeclRefExpr 0x22a16c8 ‘int’ Var 0x22a12a8 ‘__setjmp_ret’ ‘int’ as "ClassifyRefs::Use"

So it seems that the analysis successfully classifies the call and both of its arguments, but crashes while analyzing the __spec_mode variable.

Any guesses on what I am doing wrong?

Hi João Paulo,

Hi,

I have implemented a new Stmt in clang which given

__speculate {
// user code
}

generates LLVM IR equivalent to as if the following code was given:

...

However, if UninitializedVariablesAnalysis is enabled, clang crashes at
runOnBlock function (lib/Analysis/UninitializedValues.cpp). If I disable it
via -Wno-uninitialized, the code generated runs flawlessly and works as
expected.

"crash" is a meaningless term. Does it generate an access violation,
an assertion failure, or something else?
What was the error message? Is there a stack trace?

Csaba

You are right, I should have been more specific. In my case crashing means segmentation fault inside runOnBlock function (lib/Analysis/UninitializedValues.cpp).

Please find the stack trace attached.

stack-trace.txt (5.94 KB)

Is the DeclRefExpr for __spec_mode pointing to a valid VarDecl? Would this VarDecl be encountered previously during analysis in order to populate TransferFunctions::vals()? Dunno if that makes sense, i didn’t really look at this analysis.

I guess i want to look at your full -ast-dump. It seems that you are not only adding a new Stmt kind, but also you’re getting into business of auto-generating AST made of some regular nodes, probably as its children. This is often hard because it’s very easy to break invariants that the normal AST obeys and there’s no way to verify those invariants other than seeing how everything crashes and debugging.

Artem,

First of all, thank you for your reply.

Indeed, I am auto-generating AST made of regular nodes. More specifically, my SpecStmt node has four sub-nodes. A DeclStmt that I use to hang the variables a need, a Expr to compare __spec_mode and decide which path to take (SlowPath or FastPath) and two CompoundStmts. So SpecStmt looks like an IfStmt with both then and else statements.

Bellow is the part of the AST which triggers the sigsegv (Please find the AST for the whole FunctionDecl attached):

-SpecStmt 0x15c1ca8 <./spec.h:127:32>

-DeclStmt 0x15c09e8 col:32

-VarDecl 0x15c0368 col:32 col:32 __setjmp_buf ‘jmp_buf’:‘struct __jmp_buf_tag [1]’ auto
-VarDecl 0x15c03c8 col:32 col:32 __setjmp_ret ‘int’ auto cinit
`-CallExpr 0x15c0630 col:32 ‘int’

-ImplicitCastExpr 0x15c0618 col:32 ‘int (*)(jmp_buf *)’
-DeclRefExpr 0x15c05e0 <col:32> 'int (jmp_buf *)' Function 0x15c04d8 '**setjmp**' 'int (jmp_buf *)' -ImplicitCastExpr 0x15c0450 col:32 ‘jmp_buf *’
-DeclRefExpr 0x15c0428 <col:32> 'jmp_buf':'struct __jmp_buf_tag [1]' Var 0x15c0368 '**__setjmp_buf**' 'jmp_buf':'struct __jmp_buf_tag [1]' -VarDecl 0x15c0660 col:32 col:32 __spec_mode ‘unsigned int’ auto cinit
-CallExpr 0x15c0990 <col:32> 'unsigned int' -ImplicitCastExpr 0x15c0978 <col:32> 'unsigned int (*)(jmp_buf *, int)' <FunctionToPointerDecay> -DeclRefExpr 0x15c0940 col:32 ‘unsigned int (jmp_buf *, int)’ Function 0x15c07d0 ‘__spec_begin’ ‘unsigned int (jmp_buf *, int)’
-ImplicitCastExpr 0x15c0730 col:32 ‘jmp_buf *’
-UnaryOperator 0x15c06e8 <col:32> 'jmp_buf *' prefix '&' -DeclRefExpr 0x15c06c0 col:32 ‘jmp_buf’:‘struct __jmp_buf_tag [1]’ Var 0x15c0368 '__setjmp_buf’ ‘jmp_buf’:‘struct __jmp_buf_tag [1]’
-ImplicitCastExpr 0x15c0748 <col:32> 'int' <LValueToRValue> -DeclRefExpr 0x15c0708 col:32 ‘int’ Var 0x15c03c8 ‘__setjmp_ret’ ‘int’
-BinaryOperator 0x15c0a60 col:32 ‘bool’ lvalue ‘==’
-ImplicitCastExpr 0x15c0a28 col:32 ‘unsigned int’
-DeclRefExpr 0x15c0a00 <col:32> 'unsigned int' lvalue Var 0x15c0660 '**__spec_mode'** 'unsigned int' -IntegerLiteral 0x15c0a40 col:32 ‘int’ 1
-CompoundStmt 0x15c1c38 col:32

Thanks in advance for any help!

Respectfully,

ast-dump.txt.tgz (25.5 KB)

Remote debugging is hard, but here’s something i notice: i don’t think you can declare multiple variables of different types in the same DeclStmt. Eg., you can do int x = 1, *y = 0;, but you can’t do int x = 1, unsigned y = 0;. Could you try to produce a separate DeclStmt for every VarDecl and see if it helps?

If it doesn’t, can i also have a look at the CFG? Just do CFG.dump().

Yes, I understand and really appreciate your help, Artem.

Last night I had the same insight, but splitting the VerDecls to different DeclStmts had no effect.

Bellow is the dump of the source-level CFG for the block in question. One thing I have noticed from the CFG is that all my calls were tagged “template”. Any idea why?
Maybe that has something to do with why UninitializedVariables Analysis is sigseg-faulting.

[B181]
1: jmp_buf __setjmp_buf;
2: template setjmp
3: [B181.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(struct __jmp_buf_tag *))
4: __setjmp_buf
5: [B181.4] (ImplicitCastExpr, ArrayToPointerDecay, struct __jmp_buf_tag )
6: B181.3
7: int __setjmp_ret = template setjmp(__setjmp_buf);
8: template __spec_begin
9: [B181.8] (ImplicitCastExpr, FunctionToPointerDecay, unsigned int (
)(jmp_buf , int))
10: __setjmp_buf
11: &[B181.10]
12: __setjmp_ret
13: [B181.12] (ImplicitCastExpr, LValueToRValue, int)
14: [B181.9]([B181.11], [B181.13])
15: unsigned int __spec_mode = template __spec_begin(&__setjmp_buf, __setjmp_ret);
16: __spec_mode
17: [B181.16] (ImplicitCastExpr, LValueToRValue, unsigned int)
18: [B181.17] == 1
19: template __begin_spec_slow_path
20: [B181.19] (ImplicitCastExpr, FunctionToPointerDecay, void (
)(void))
21: B181.20
22: global_maxNumVertices
23: [B181.22] (ImplicitCastExpr, LValueToRValue, ULONGINT_T)
24: (long)[B181.23] (CStyleCastExpr, IntegralCast, long)
25: long __tmp_maxNumVertices = (long)global_maxNumVertices;
26: __tmp_maxNumVertices
27: typeof (tmp_maxNumVertices) ___a = (__tmp_maxNumVertices);
28: maxNumVertices
29: [B181.28] (ImplicitCastExpr, LValueToRValue, ULONGINT_T)
30: 1 (ImplicitCastExpr, IntegralCast, unsigned long)
31: [B181.29] + [B181.30]
32: typeof (maxNumVertices + 1) ___b = (maxNumVertices + 1);
33: ___a
34: [B181.33] (ImplicitCastExpr, LValueToRValue, typeof (tmp_maxNumVertices))
35: [B181.34] (ImplicitCastExpr, IntegralCast, unsigned long)
36: ___b
37: [B181.36] (ImplicitCastExpr, LValueToRValue, typeof (maxNumVertices + 1))
38: [B181.35] > [B181.37]
T: ([B181.38]) ? … : …
Preds (1): B185
Succs (2): B179 B180

Hmm, do we have a backtrace from a debug+assertions build? Like, what are we specifically crashing at? It kinda makes sense that we should have the variable in the vals list, because we went through its respective DeclStmt, right? So i dunno where we are crashing.

I also dumped the AST for the pseudo-code that you provided and i noticed more inconsistencies with your AST:

  • Integral cast from int to unsigned is missing within operator ==.
  • The lvalue-to-rvalue around operator & in the __spec_begin call is unnecessary because operator & already returns an rvalue.
  • On the other hand, DeclRefExprs for both __setjmp_buf and __setjmp_ret must be lvalues.

It’s also a bit weird that there’s no control flow in your CFG. I would have expected that a new block is going to start after operator ==, but i guess if the user code is unmodified anyway, there probably isn’t much control flow anyway. But then what’s the point of comparing? I’m confused.

One thing I have noticed from the CFG is that all my calls were tagged “template”.

CFG just calls Stmt::printPretty() in this case. A quick look at Stmt pretty-printing suggests that you have created the DeclRefExpr for the function with a non-empty template keyword source location, i.e. this AST pretends to be constructed from code

unsigned int __spec_mode = template __spec_begin(…);

Also your variables have type ‘auto’. And, as far as i understand, we’re dealing with C code.

So just to be sure - are you sure you want to auto-generate AST? Did you consider implementing your feature as a macro instead, probably with the help of a custom compiler builtin for __spec_begin() if it does something that a normal function cannot do? I.e., implement all special features in the builtin, and instead of auto-generating the surrounding AST, just add a macro into the header that expands to whatever you want? It’s not as omnipotent, but it’s much easier. Like, i mean, this AST compiles, but all clang tools are screwed unless the AST makes perfect sense; this compiler warning is just the first slightly-advanced tool you encountered.

Artem, please find my comments bellow.

Hmm, do we have a backtrace from a debug+assertions build? Like, what are we specifically crashing at? It kinda makes sense that we should have the variable in the vals list, because we went through its respective DeclStmt, right? So i dunno where we are crashing.

Its breaking at “return scratch[idx.getValue()];” (lib/Analysis/UninitializedValues.cpp at line 213). After some “errs() <<” debugging I noticed that my VarDecls are note being mapped by DeclToIndex::computeMap (lib/Analysis/UninitializedValues.cpp at line 78).

Probably I am using the wrong DeclContext then? (I am retrieving it from SemaRef.CurContext)

I also dumped the AST for the pseudo-code that you provided and i noticed more inconsistencies with your AST:

  • Integral cast from int to unsigned is missing within operator ==.
  • The lvalue-to-rvalue around operator & in the __spec_begin call is unnecessary because operator & already returns an rvalue.
  • On the other hand, DeclRefExprs for both __setjmp_buf and __setjmp_ret must be lvalues.

Yes, I had noticed this as well. Its fixed now! I am not doing the Integral cast, as of now, just to see if it triggers a warning.

It’s also a bit weird that there’s no control flow in your CFG. I would have expected that a new block is going to start after operator ==, but i guess if the user code is unmodified anyway, there probably isn’t much control flow anyway. But then what’s the point of comparing? I’m confused.

You are right. However, I am emitting at CGStmt a BranchOnBoolExpr and both blocks as intended. In the future a intend to reflect the control change in the AST as well (enabling thus static analysis on my construct).

One thing I have noticed from the CFG is that all my calls were tagged “template”.
CFG just calls Stmt::printPretty() in this case. A quick look at Stmt pretty-printing suggests that you have created the DeclRefExpr for the function with a non-empty template keyword source location, i.e. this AST pretends to be constructed from code

Fixed this as well.

unsigned int __spec_mode = template __spec_begin(…);
Also your variables have type ‘auto’. And, as far as i understand, we’re dealing with C code.

Also fixed.

So just to be sure - are you sure you want to auto-generate AST? Did you consider implementing your feature as a macro instead, probably with the help of a custom compiler builtin for __spec_begin() if it does something that a normal function cannot do? I.e., implement all special features in the builtin, and instead of auto-generating the surrounding AST, just add a macro into the header that expands to whatever you want? It’s not as omnipotent, but it’s much easier. Like, i mean, this AST compiles, but all clang tools are screwed unless the AST makes perfect sense; this compiler warning is just the first slightly-advanced tool you encountered.

I think it makes sense to implement this at the AST level because, in the future, I intend to add more functionality. For instance, allow the user to inform which variables do not need to be speculated. Also, once in the AST I will be able to perform static analysis. But I might be wrong and appreciate your take on this.

The correct DeclContext for these variables is the definition of the function (i.e., the re-declaration that has a body).

Your DeclContext looks correct because we know that the given DeclRefExpr has been classified as Use, which can only happen when isTrackedVar() return true - cf. findVar().

But for the same reason they should not be skipped in computeMap(). Weird.

I was thinking one giving up on this, but then I remembered that I need finish this to actually move on with my PhD. =<

Just in case someone makes the same mistakes I made, I am replying to register how i solved it.

The actual bug was caused by the DeclVars inside the CompoundStmt I was transforming through TreeTransform. I have figured out that my “->dump()” debugging was misleading. As DeclVars were not been added to the current context via SemaRef.CurContex->addDecl(Decl*), the analysis was trying ti access uninitialized memory and thus triggering a segfault. Once I added the addDecl calls the analysis could be left enabled (default) and the compilation proceeds gracefully.

I thank you Artem once again for your time and help.

Hopefully in the future I might be able to give back by helping somebody else.