Statements appear twice in CFG

Hello all,

I’m writing a tool that uses the CFG with different functions.

I have a problem that when iterating over the elements of a CFGBlock, some of the statements appear twice.

It is most prevalent in function calls but I saw it in some other cases as well.

For example, the following code:

double i = pow(2.0, 3);

will create two elements in the CFGBlock:


1: pow(2., 3)

2: double i = pow(2., 3);

And when I try to reproduce the code from this block (by iterating over the elements and pretty print them), I get two function calls:

pow(2., 3);

double i = pow(2., 3);

I saw the same behavior also in binary operator when the opcode is a comma and in conditional operator.

Here are some examples:

int i, k = 10;

for (i = 0; i < 10; i++, k–){

// some code


The block that corresponds to the step of the for loop looks like that:


1: i++

2: k–

3: [B2.1], [B2.2]

Which creates the code:



i++, k–;

return a > 0? a : -a;

Creates this block:

1: [B4.2] ? [B2.1] : [B3.1]

2: return [B1.1];

Which pretty prints the following code:

a > 0 ? a : -a;

return a > 0 ? a : -a;

I suspect that the reason for this behavior is that the CFG is built recursively and some of the statements are visited and appended twice to the block.

In the case of the binary operator with comma, I solved this problem by removing the call to appendStmt in the code below (Which is in CFGBuilder::VisitBinaryOperator() in CFG.cpp file)

if (B->getOpcode() == BO_Comma) { // ,


//appendStmt(Block, B);


return addStmt(B->getLHS());


I tried to solve the problem also by changing VisitCallExpr() and VisitConditioalOperator() but couldn’t solve it there.

I would appreciate any help and insights about this issue.



Hi Rachel,

The CFG encodes control flow between both statements and (important) sub-expressions. It’s been a while since I’ve touched the CFG, but IIRC we don’t do a complete linearization of all expressions — but we do for many of them.

Let’s take your example:

   1: pow(2., 3)
   2: double i = pow(2., 3);

This is actually an expression, B1.1, and its containing statement, B1.2. The CFG encodes that “pow(2., 3)” happens before the initialization of ‘i’. This is NOT the case of the same thing being added twice. The idea is that an analysis will know the call happens first. By the time a statement appears in the CFG, it is a precondition that any sub-expressions that appear in the CFG will dominate the statement appearing in the CFG. Again, we don’t do a complete linearization (because that would be unnecessary) but we do for things like calls, binary expressions, and anything that has control flow like ‘&&’, etc.

The way this SHOULD be printed up is something like:

   1: i++
   2: k--
   3: [B2.1], [B2.2]

In this case we see that the sub-expression ‘i++’ gets evaluated first, followed by ‘k—‘, and then the comma expression gets evaluated as a whole after both of these.

Does that help? It is not the case that the same exact thing is added twice.