How can I tell whether a "CFGElement" is inside a for or while loop?


I have built the CFG and am analyzing its CFGBlocks.

I notice that in Clang, a for loop is divided into several basic blocks. For example:

for (int a = 0, i = 0; i < 5; i++)
    a++;

Then we get:
B1: int a = 0, i = 0;
B2: i < 5;
B3: i++;
B4: a++;

and so on, by running clang -cc1 -analyze -analyzer-checker=debug.DumpCFG xxx.c

My question is: when I get the first CFGElement of a CFGBlock, how can I know whether this CFGElement is in the for loop?
(That is, whether the block I am currently analyzing is B2 or B3.)


Hi, Shuai. Unfortunately, there's not a great answer for this. Part of the philosophical reason for that is "goto" (or even "switch"), where you can jump "into" a loop, and then possibly leave it again before even evaluating the loop condition.
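
To make the "goto" case concrete, here is a minimal sketch in plain C. The function name and the counter are hypothetical, made up for illustration. Control enters the loop body through a label and leaves through "break", so the loop condition is never evaluated even though the body runs:

```c
#include <stdio.h>

/* Jumps into the middle of a for loop, skipping the init and the
   condition, then leaves via break. Returns how many times the loop
   condition was evaluated. */
int condition_checks_when_jumping_in(void) {
    int checks = 0;
    int i = 0;

    goto inside;                          /* enter the loop body directly */
    for (i = 0; (++checks, i < 5); i++) {
inside:
        printf("in loop body, i = %d\n", i);
        break;                            /* leave before the condition runs */
    }
    return checks;
}
```

A block like the one holding the printf call is lexically inside the loop, yet on this path it behaves like straight-line code, which is why "is this block in a loop?" has no single answer at the CFG level.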

The analyzer diagnostics occasionally find it useful to know this, but they're just using the ParentMap of the AnalysisDeclContext for the function to walk up from the statement to see if there's an enclosing loop.
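
The idea of walking up from a statement can be sketched with a toy model. This is not Clang's API: "struct node", "enum node_kind" and "enclosing_loop" are hypothetical stand-ins, and in Clang itself the parent lookup would go through ParentMap::getParent on real Stmt pointers. The sketch only shows the shape of the walk:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an AST node; not Clang's Stmt. */
enum node_kind { NODE_FUNCTION, NODE_FOR, NODE_WHILE, NODE_IF, NODE_EXPR };

struct node {
    enum node_kind kind;
    const struct node *parent;   /* NULL at the root */
};

/* Walk parent links upward from n; return the nearest enclosing
   for/while node, or NULL if there is none. */
const struct node *enclosing_loop(const struct node *n) {
    const struct node *p = (n != NULL) ? n->parent : NULL;
    for (; p != NULL; p = p->parent) {
        if (p->kind == NODE_FOR || p->kind == NODE_WHILE)
            return p;
    }
    return NULL;
}
```

Note that this answers a syntactic question (is the statement written inside a loop?), not a control-flow one, so it is still subject to the goto and switch caveats above.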

All of that said, what do you need this for? It's possible there's a better way.

I don’t think there’s a good way to do this in general from the CFG. Here’s another example for you:

if (condition())
    doSomething();

The “doSomething()” call is in its own basic block, but you can’t just add another statement there, because there are no braces. I suppose that could be against your style guidelines.
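
A small sketch of what goes wrong, assuming the inserted statement is a call to a hypothetical "probe()" function (the counters are also made up, just to make the effect observable):

```c
/* Hypothetical counters standing in for real side effects. */
static int probe_hits = 0;
static int work_done = 0;

static void probe(void)       { probe_hits++; }
static void doSomething(void) { work_done++; }

/* Original: doSomething() runs only when cond is true. */
void original(int cond) {
    if (cond)
        doSomething();
}

/* Naive textual insertion in front of doSomething(), without braces:
   probe() becomes the body of the if, and doSomething() is now
   executed unconditionally. */
void naive_instrumented(int cond) {
    if (cond)
        probe();
    doSomething();
}

/* A correct rewrite has to introduce braces as well. */
void braced_instrumented(int cond) {
    if (cond) {
        probe();
        doSomething();
    }
}
```

So a rewriter working from basic blocks would need to know about the surrounding syntax anyway, which the CFG does not give you.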

Another problem is that the CFG contains implicit statements inserted by the compiler, which may not have valid source locations.

If you don’t strictly need this on every basic block, you could instead do an AST walk, and insert this at the beginning of all brace statements. That’s not quite the same thing (most importantly it doesn’t check the “join” block after an if-else), but it could get you pretty far.

Another question: does this have to be done as a source-to-source transform, or could you accomplish this with an instrumentation pass at the LLVM IR level?


There are some crazy examples, like hand-rolled coroutines:
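
As an illustration of the pattern (a sketch of the general switch-based trick, not necessarily the exact code meant here; "next_value" and its state variables are hypothetical), a generator can resume in the middle of a loop body by placing a case label inside the loop:

```c
/* Hypothetical generator: successive calls return 0, 1, 2, then -1,
   after which it starts over. On re-entry the switch jumps straight
   to "case 1", which sits inside the for loop. */
int next_value(void) {
    static int state = 0;   /* where to resume on the next call */
    static int i;           /* loop variable must survive between calls */

    switch (state) {
    case 0:
        for (i = 0; i < 3; i++) {
            state = 1;
            return i;       /* "yield" the current value */
    case 1:;                /* resume here, in the middle of the loop body */
        }
    }
    state = 0;              /* finished: reset for the next round */
    return -1;
}
```

Every call after the first enters the loop without running its initializer or its first condition check.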

Not too many common cases, but possibly the sort of thing you’d find in a library. In any case, the original question was about performing a source-to-source transformation based on basic blocks, and that’s just not something worth doing.

Anna thought of another example of why this would be problematic:

if (condition()) {
} else if (anotherCondition()) {

Although “else if” looks like a single construct, it’s really “else” followed by a single un-braced statement, "if". Trying to insert something at the start of the else-block (“anotherCondition()” being the first thing evaluated) is probably just as bad as doing it in the condition of a while loop.
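
To spell out the equivalence, here is a minimal sketch. The one-argument "condition" and "anotherCondition" helpers and the return values are hypothetical, chosen only so the two forms can be compared:

```c
/* Hypothetical stand-ins for the conditions in the example above. */
static int condition(int x)        { return x < 0; }
static int anotherCondition(int x) { return x == 0; }

/* The usual "else if" spelling. */
int chained(int x) {
    if (condition(x)) {
        return -1;
    } else if (anotherCondition(x)) {
        return 0;
    } else {
        return 1;
    }
}

/* The same control flow with the else branch's single statement
   wrapped in braces. A statement inserted at the start of the else
   branch would have to go where the comment is, which means the
   rewriter must add the braces itself. */
int nested(int x) {
    if (condition(x)) {
        return -1;
    } else {
        /* inserted statement would go here, before anotherCondition() */
        if (anotherCondition(x)) {
            return 0;
        } else {
            return 1;
        }
    }
}
```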