# Extra join points in CFG?

I am having an issue with joint points in the clang CFG. I would
expect the CFG for

if (f() && g()) foo();

To look something like the following (I've simplified the syntax somewhat):

[B1]
1: f();
T: if [B1.1] then B2 else B4

[B2]
1: g();
T: if [B2.1] then B3 else B4

[B3]
1: foo();
T: goto B4

[B4]

That is to say, if the evaluation of f() yields false, it should
short-circuit immediately to the point after the "if" statement.
Instead, I get a CFG which looks like:

[B1]
1: f();
T: if [B1.1] then B2 else B3

[B2]
1: g();
T: goto B3

[B3]
1: [B1.1] && [B2.1]
T: if [B3.1] then B4 else B5

[B4]
1: foo();
goto B5

[B5]

In this case, there is an extra join point in the CFG. The case where
f() yields false, and the case where f() yields true, both join at
[B3], which comes before the body of the branch. This is a problem
for the analysis that I am doing, because my algorithm merges state at
each join point, so having extra join points yields a false positive.
In other words, I need to know when looking at foo() that f() yielded
true, and I can't see that in the current CFG. Moreover, I am not
convinced the extra join point is even valid; it seems odd that [B3.1]
refers to [B2.1], even though [B2] does not dominate [B3].

Would it be possible to update the CFG code so that it outputs the
first case, rather than the second?

-DeLesley

Hi Delesley,

This is a known issue. The CFG could (and should) certainly be refined here. It's been on my queue for a while. I'll try and take a look at it today, as the change will percolate to a bunch of analyses and the static analyzer.

Ted

It's good to hear that a fix is planned. BTW, I'm not in a huge hurry
here; I just wanted to know if it was intended behavior.

-DeLesley

I've got a prototype patch for the CFG changes, but as expected there's some fallout I'll need to address in the static analyzer and -Wuninitialized. I'll try and have something in tree next week.

Brief update: I have looked at this, and it turns out to be a bit more complicated then I anticipated. I discovered through investigation that the way we currently model '&&' and '||' in the CFG does not properly model C++ destructors at all, and fixing that may require changing some fairly pervasive invariants. I think we need to do this, but it will take a bit longer than I originally anticipated...

Incidentally, this kind of conditional evaluation comes up in four places
in the language:
- the ternary operator ?:, in both its variants,
- the binary operator &&,
- the binary operator ||, and
- 'new' expressions that call a operator new declared throw()/noexcept,
where the initializer is not evaluated if the result of the call is null.

Grepping CodeGen for ConditionalEvaluation is a good way to find
these places, albeit with some redundancy between the different
code generation patterns.

John.

Thanks John. I discovered some of the issues by looking at the generated IR; looking at CodeGen was my next planned step.

Cheers,
Ted

Thanks John. I discovered some of the issues by looking at the generated IR; looking at CodeGen was my next planned step.

This again makes me wish that we could begin working to actually implement CodeGen in terms of the CFG so that we both share some of the burden of implementing this logic, and more importantly can share in the testing of the logic. Anyways, that’s still more invasive, so I just wanted to keep people thinking about it…

I think there is strong merit in doing this, and this has been discussed (offline) before.

Historically, the design choice of having CodeGen and the CFG be independent came when the CFG was being used exclusively for the static analyzer. Thus the CFG wasn’t needed for the compiler at all, and thus it was slightly faster to not build the CFG.

Only later, when we had a bunch of compiler warnings based on top of the CFG, did that base assumption here change. We still don’t build CFGs for static inline functions in system headers (because we would suppress those warnings), but we don’t do codegen for those functions all the time either.

Changing CodeGen to be built on top of the CFG would provide some great self-consistency, but there are tradeoffs (potential performance related) with the approach, not to mention that this would be a significant engineering project. If someone wants to play around with doing an alternate implementation of CodeGen based on the CFG, I would support such an effort. We can only evaluate that approach if we actually do it.

However, from a pragmatic/realistic standpoint, doing this would “probably” be a significant slowdown for compile times, particularly exception handling situations. However, since the early decision was made, we’ve gone to building the CFG for each function anyway for flow sensitive warnings. Perhaps a new implementation may not be as costly and might be worth it.

It is clearly a big project though. The first necessary step is to improve the CFG to handle everything that codegen does. Fortunately, that is a really valuable and worthwhile project regardless of whether codegen switches.

-Chris

Hi Ted,

What do you mean by "does not properly model C++ destructors at all"?
Could you give me a test case? Thanks.

Certainly. Consider:

\$ cat /tmp/test.cpp
int foo();
int bar();

class A {
public:
A(int x);
~A();
operator bool();
};

class B {
public:
B(int x);
~B();
operator bool();
};

class C {
public:
C(int x);
~C();
operator bool();
};

int test(int x, int y) {
if (C b = A(x) || B(y))
return foo();
return bar();
}

Let’s look at the CFG without destructors:

\$ clang -cc1 -analyze -analyzer-checker=debug.DumpCFG /tmp/test.cpp
[B6 (ENTRY)]
Succs (1): B4

[B1]
1: bar
2: [B1.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: B1.2
4: return [B1.3];
Preds (1): B3
Succs (1): B0

[B2]
1: foo
2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: B2.2
4: return [B2.3];
Preds (1): B3
Succs (1): B0

[B3]
1: [B4.8] || [B5.8]
2: [B3.1] (ImplicitCastExpr, IntegralCast, int)
3: [B3.2] (CXXConstructExpr, class C)
4: [B3.3] (ImplicitCastExpr, ConstructorConversion, class C)
5: [B3.4] (BindTemporary)
6: [B3.5] (ImplicitCastExpr, NoOp, const class C)
7: [B3.6]
8: C b = A(x).operator _Bool() || B(y).operator _Bool();
9: b
10: [B3.9].operator _Bool
11: B3.10
12: [B3.11] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: if [B3.12]
Preds (2): B5 B4
Succs (2): B2 B1

[B4]
1: x
2: [B4.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B4.2] (CXXConstructExpr, class A)
4: [B4.3] (BindTemporary)
5: A([B4.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
6: [B4.5].operator _Bool
7: B4.6
8: [B4.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: [B4.8] || …
Preds (1): B6
Succs (2): B3 B5

[B5]
1: y
2: [B5.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B5.2] (CXXConstructExpr, class B)
4: [B5.3] (BindTemporary)
5: B([B5.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
6: [B5.5].operator _Bool
7: B5.6
8: [B5.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
Preds (1): B4
Succs (1): B3

[B0 (EXIT)]
Preds (2): B1 B2

This CFG behaves as expected. We have a single block with the ‘||’ operation as a terminator (block 4), and a single block that merges the values from the LHS and RHS expressions (block B3) which is then terminated with the ‘if’ branch terminator. My motivation for looking at this CFG was to further optimize this so that block B5 was seen as dominating block B2. This would improve the quality of some flow-sensitive (but path-insensitive) analyses in the frontend and also simplify the CFG.

The CFG gets significantly more complicated (and wrong) when we add implicit destructors:

\$ clang -cc1 -analyze -cfg-add-implicit-dtors -analyzer-checker=debug.DumpCFG /tmp/test.cpp

[B8 (ENTRY)]
Succs (1): B6

[B1]
1: [B3.2].~C() (Implicit destructor)
2: bar
3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
4: B1.3
5: return [B1.4];
Preds (1): B3
Succs (1): B0

[B2]
1: foo
2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: B2.2
4: return [B2.3];
5: [B3.2].~C() (Implicit destructor)
Preds (1): B3
Succs (1): B0

[B3]
1: ~A() (Temporary object destructor)
2: C b = A(x).operator _Bool() || B(y).operator _Bool();
3: b
4: [B3.3].operator _Bool
5: B3.4
6: [B3.5] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: if [B3.6]
Preds (2): B4 B5
Succs (2): B2 B1

[B4]
1: ~B() (Temporary object destructor)
Preds (1): B5
Succs (1): B3

[B5]
1: [B6.8] || [B7.8]
2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
3: [B5.2] (CXXConstructExpr, class C)
4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
5: [B5.4] (BindTemporary)
6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
7: [B5.6]
8: ~C() (Temporary object destructor)
T: [B6.8] || …
Preds (2): B7 B6
Succs (2): B3 B4

[B6]
1: x
2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B6.2] (CXXConstructExpr, class A)
4: [B6.3] (BindTemporary)
5: A([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
6: [B6.5].operator _Bool
7: B6.6
8: [B6.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: [B6.8] || …
Preds (1): B8
Succs (2): B5 B7

[B7]
1: y
2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B7.2] (CXXConstructExpr, class B)
4: [B7.3] (BindTemporary)
5: B([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
6: [B7.5].operator _Bool
7: B7.6
8: [B7.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
Preds (1): B6
Succs (1): B5

[B0 (EXIT)]
Preds (2): B1 B2

Notice that blocks B6 and B5 have the same terminator, which itself is suspect. The rationale is that the temporary for ‘B(y)’ should only be destructed when we evaluated the RHS of the logical operation. This is good enough for the static analyzer, but the CFG itself doesn’t explicitly capture this dependency. This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called. While the static analyzer will likely get this right and recover that information (since it does enough value flow analysis to understand control-dependent branches), this unnecessarily drops a bunch of control-dependencies from the CFG which will degrade the quality of CFG-based analyses in the frontend.

The second, and more glaring problem, is that ~C is called twice along a path, both as an implicit destructor (in B1 and B2 respectively) and a destructor for a temporary (in B5). I may be missing something here, but this doesn’t show up in the LLVM IR (which clearly shows a single destructor call to ~C).

Also, look at where the destructors are called. In block B1, ~C is correctly called before the call to bar(), but in block B2, the destructor appears after the return statement. Instead, it should appear immediately after the call to foo().

I've filed this as PR 11645.

Hi Ted,

What do you mean by "does not properly model C++ destructors at all"?
Could you give me a test case? Thanks.

Certainly. Consider:

\$ cat /tmp/test.cpp
int foo();
int bar();

class A {
public:
A(int x);
~A();
operator bool();
};

class B {
public:
B(int x);
~B();
operator bool();
};

class C {
public:
C(int x);
~C();
operator bool();
};

int test(int x, int y) {
if (C b = A(x) || B(y))
return foo();
return bar();
}

Let's look at the CFG without destructors:

\$ clang -cc1 -analyze -analyzer-checker=debug.DumpCFG /tmp/test.cpp
[B6 (ENTRY)]
Succs (1): B4

[B1]
1: bar
2: [B1.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: [B1.2]()
4: return [B1.3];
Preds (1): B3
Succs (1): B0

[B2]
1: foo
2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: [B2.2]()
4: return [B2.3];
Preds (1): B3
Succs (1): B0

[B3]
1: [B4.8] || [B5.8]
2: [B3.1] (ImplicitCastExpr, IntegralCast, int)
3: [B3.2] (CXXConstructExpr, class C)
4: [B3.3] (ImplicitCastExpr, ConstructorConversion, class C)
5: [B3.4] (BindTemporary)
6: [B3.5] (ImplicitCastExpr, NoOp, const class C)
7: [B3.6]
8: C b = A(x).operator _Bool() || B(y).operator _Bool();
9: b
10: [B3.9].operator _Bool
11: [B3.10]()
12: [B3.11] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: if [B3.12]
Preds (2): B5 B4
Succs (2): B2 B1

[B4]
1: x
2: [B4.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B4.2] (CXXConstructExpr, class A)
4: [B4.3] (BindTemporary)
5: A([B4.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
6: [B4.5].operator _Bool
7: [B4.6]()
8: [B4.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: [B4.8] || ...
Preds (1): B6
Succs (2): B3 B5

[B5]
1: y
2: [B5.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B5.2] (CXXConstructExpr, class B)
4: [B5.3] (BindTemporary)
5: B([B5.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
6: [B5.5].operator _Bool
7: [B5.6]()
8: [B5.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
Preds (1): B4
Succs (1): B3

[B0 (EXIT)]
Preds (2): B1 B2

This CFG behaves as expected. We have a single block with the '||'
operation as a terminator (block 4), and a single block that merges the
values from the LHS and RHS expressions (block B3) which is then
terminated with the 'if' branch terminator. My motivation for looking at
this CFG was to further optimize this so that block B5 was seen as
dominating block B2. This would improve the quality of some
flow-sensitive (but path-insensitive) analyses in the frontend and also
simplify the CFG.

The CFG gets significantly more complicated (and wrong) when we add
implicit destructors:

-analyzer-checker=debug.DumpCFG /tmp/test.cpp

[B8 (ENTRY)]
Succs (1): B6

[B1]
1: [B3.2].~C() (Implicit destructor)
2: bar
3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
4: [B1.3]()
5: return [B1.4];
Preds (1): B3
Succs (1): B0

[B2]
1: foo
2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
3: [B2.2]()
4: return [B2.3];
5: [B3.2].~C() (Implicit destructor)
Preds (1): B3
Succs (1): B0

[B3]
1: ~A() (Temporary object destructor)
2: C b = A(x).operator _Bool() || B(y).operator _Bool();
3: b
4: [B3.3].operator _Bool
5: [B3.4]()
6: [B3.5] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: if [B3.6]
Preds (2): B4 B5
Succs (2): B2 B1

[B4]
1: ~B() (Temporary object destructor)
Preds (1): B5
Succs (1): B3

[B5]
1: [B6.8] || [B7.8]
2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
3: [B5.2] (CXXConstructExpr, class C)
4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
5: [B5.4] (BindTemporary)
6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
7: [B5.6]
8: ~C() (Temporary object destructor)
T: [B6.8] || ...
Preds (2): B7 B6
Succs (2): B3 B4

[B6]
1: x
2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B6.2] (CXXConstructExpr, class A)
4: [B6.3] (BindTemporary)
5: A([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
6: [B6.5].operator _Bool
7: [B6.6]()
8: [B6.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
T: [B6.8] || ...
Preds (1): B8
Succs (2): B5 B7

[B7]
1: y
2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
3: [B7.2] (CXXConstructExpr, class B)
4: [B7.3] (BindTemporary)
5: B([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
6: [B7.5].operator _Bool
7: [B7.6]()
8: [B7.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
Preds (1): B6
Succs (1): B5

[B0 (EXIT)]
Preds (2): B1 B2

Notice that blocks B6 and B5 have the same terminator, which itself is
suspect. The rationale is that the temporary for 'B(y)' should only be
destructed when we evaluated the RHS of the logical operation. This is
good enough for the static analyzer, but the CFG itself doesn't
explicitly capture this dependency. This can be seen by observing that
there is a path in the CFG from the entry block to the exit block where
destructor ~B is called but the constructor B() is NOT called. While the
static analyzer will likely get this right and recover that information
(since it does enough value flow analysis to understand
control-dependent branches), this unnecessarily drops a bunch of
control-dependencies from the CFG which will degrade the quality of
CFG-based analyses in the frontend.

Well, B6 and B5 have the same conditions, and the CFG is correct in that respect. So the flow is that if B7 is executed, B4 will also always be executed (and the other way around). So it boils down to: is it allowed for one block to access the condition calculated in a preceeding block?

The second, and more glaring problem, is that ~C is called twice along a
path, both as an implicit destructor (in B1 and B2 respectively) and a
destructor for a temporary (in B5). I may be missing something here, but
this doesn't show up in the LLVM IR (which clearly shows a single
destructor call to ~C).

Also, look at where the destructors are called. In block B1, ~C is
correctly called before the call to bar(), but in block B2, the
destructor appears after the return statement. Instead, it should appear
immediately after the call to foo().

It gets worse when you write:

int test(int x, int y) {
if (C b = A(x) || B(y))
return foo();
else
return bar();
}

Not only will both destructor calls appear after the return statement, but an extra block will be generated:

[B1]
1: [B4.2].~C() (Implicit destructor)
Succs (1): B0

Also, should the destructors A and/or B be called after the condition has been calculated, or only after calling ~C?

-- Erik.