-Wunreachable-code and templates

I'm just wondering if anyone's already working on addressing the
current shortcomings around -Wunreachable-code in regards to
templates.

Take the following simple example:

$ cat unreachable.cpp
int func1();
int func2();
template<bool b>
int func() {
  return !b ? func1() : func2();
}

int main() {
  return func<true>() + func<false>();
}
$ clang++ -Wunreachable-code unreachable.cpp
unreachable.cpp:5:15: warning: will never be executed [-Wunreachable-code]
  return !b ? func1() : func2();
              ^~~~~
unreachable.cpp:9:10: note: in instantiation of function template
specialization 'func<true>' requested here
  return func<true>() + func<false>();
         ^
unreachable.cpp:5:25: warning: will never be executed [-Wunreachable-code]
  return !b ? func1() : func2();
                        ^~~~~
unreachable.cpp:9:25: note: in instantiation of function template
specialization 'func<false>' requested here
  return func<true>() + func<false>();
                        ^
2 warnings generated.

I have at least two (possibly naive) ideas about how to handle this:

1) Visit templates & skip instantiations. This is even consistent with
some existing code (see CFG.cpp:441-ish, where the Expr *S is checked
for type/value dependence & the condition is not evaluated if that is
the case - in fact that's never the case because
AnalysisBasedWarnings.cpp:821 skips analyzing any dependent context.
Instead we could remove the check in AnalysisBasedWarnings.cpp and use
the check in CFG.cpp (& add another check to ignore the actual
template instantiations)). This, I believe, should remove the false
positives with a minimum of fuss. It will also remove a bunch of
correct positive reports from specific template instantiations (for
example if we only instantiated func<true> above, the false case is
unreachable but would not produce a warning)

2) A more interesting option would be to visit the instantiations, but
instead of reporting on the instantiation, build a CFG of the original
template (this would require effectively inter-function analysis -
across all the instantiations of a function template). Once all the
instantiations have been visited we could look at any blocks still
marked unreachable in the original template & produce warnings for
those. My only query here would be where to store the CFG for these
templates & when to visit them to ensure all the instantiations had
already been visited (probably a map as a member of Sema and visit
them at the end of sema - though these may have significant perf
(mostly space - the time is already being spent walking each
instantiation) impact)

So - anyone else already working on this? Existing PRs I should look
at? Other design ideas people have already had?

Thanks,
- David

[sorry, wrong list - moving to cfe-dev]

I'm just wondering if anyone's already working on addressing the
current shortcomings around -Wunreachable-code in regards to
templates.

Take the following simple example:

$ cat unreachable.cpp
int func1();
int func2();
template<bool b>
int func() {
return !b ? func1() : func2();
}

int main() {
return func<true>() + func<false>();
}
$ clang++ -Wunreachable-code unreachable.cpp
unreachable.cpp:5:15: warning: will never be executed [-Wunreachable-code]
return !b ? func1() : func2();
^~~~~
unreachable.cpp:9:10: note: in instantiation of function template
specialization 'func<true>' requested here
return func<true>() + func<false>();
^
unreachable.cpp:5:25: warning: will never be executed [-Wunreachable-code]
return !b ? func1() : func2();
^~~~~
unreachable.cpp:9:25: note: in instantiation of function template
specialization 'func<false>' requested here
return func<true>() + func<false>();
^
2 warnings generated.

I have at least two (possibly naive) ideas about how to handle this:

1) Visit templates & skip instantiations. This is even consistent with
some existing code (see CFG.cpp:441-ish, where the Expr *S is checked
for type/value dependence & the condition is not evaluated if that is
the case - in fact that's never the case because
AnalysisBasedWarnings.cpp:821 skips analyzing any dependent context.
Instead we could remove the check in AnalysisBasedWarnings.cpp and use
the check in CFG.cpp (& add another check to ignore the actual
template instantiations)). This, I believe, should remove the false
positives with a minimum of fuss. It will also remove a bunch of
correct positive reports from specific template instantiations (for
example if we only instantiated func<true> above, the false case is
unreachable but would not produce a warning)

2) A more interesting option would be to visit the instantiations, but
instead of reporting on the instantiation, build a CFG of the original
template (this would require effectively inter-function analysis -
across all the instantiations of a function template). Once all the
instantiations have been visited we could look at any blocks still
marked unreachable in the original template & produce warnings for
those. My only query here would be where to store the CFG for these
templates & when to visit them to ensure all the instantiations had
already been visited (probably a map as a member of Sema and visit
them at the end of sema - though these may have significant perf
(mostly space - the time is already being spent walking each
instantiation) impact)

So - anyone else already working on this? Existing PRs I should look
at? Other design ideas people have already had?

Thanks,
- David

Hi David,

Thanks for bringing up this topic.

My concern about approach #1 is that not all analyses are savvy to analyze templates directly, and the CFG itself will be incomplete. The CFG contains destructor calls and possibly other entities whose specifics are only known when the template is actually instantiated (e.g., a template parameter 'T' may expand to something with a destructor, or it may expand to something like 'int' where no destructor is involved). The result is that two instantiations may have very different CFGs. Further, I don't think most analyses would be savvy to reason about such "abstract definitions" of functions. Uninstantiated templates are really more syntax than semantics, and analyses rely on precise semantic information.

My concern about #2 is twofold. First, it is potentially very costly. If you analyze instantiated templates as they stream by, you potentially have to retain all the analysis information in order to fuse them up later to reach the final conclusions. With a naive implementation, the space overhead of the analysis could grow linearly with the number of instantiations, while a good implementation would incrementally merge the new analysis results for each instantiation into the old one, and then once we finished parsing the translation unit reach the final decisions. This sound potentially really complex, but I suppose it could be tractable since only few analyses would possibly care about taking this extra step.

My second concern about #2 is that I don't think it is technically correct. Even if you analyze all N instantiations of a function template, do the dataflow analysis, and conclude that from those N instantiations that a given piece of code was dead, that doesn't mean that if you did a N+1'th instantiation that was different from the others that the block would be guaranteed to be dead. Basically, proving that a block is dead for the N instantiations in the way you propose doesn't prove that the block is dead in general.

The idea I had to tackle this was a bit simpler, although it could certainly be flawed. Looking at the results of -Wunreachable-code that you mentioned, the reason we report this code as unreachable is because the edges for the branch conditions in the CFG are intentionally pruned as being trivially satisfiable/unsatisfiable. Instead of just chopping off the edges completely, we can retain all edges, but mark them as being unreachable for various reasons, e.g. value dependent because of template instantiations or expanded from a macro.

If we retain this information, what you are left with is a CFG that contains both pruned edges (the default view for most clients) but also the "hidden" edges for those clients interested in the complete control-flow as implied by the AST. There are already some clients that care about getting both the "unoptimized" and "optimized" CFG; this enhancement would collapse the two into the same data structure. -Wunreachable-code for example could enhance its walk of the CFG to take special attention to these hidden edges and handle them accordingly. The nice thing about this approach is that it helps marginalize out overly optimistic conclusions on reachable code that were made based on the specific template instantiations that you saw.

The other nice thing about this approach is that it is very simple. All CFGBlocks currently record a vector of predecessors and successors (each represented as a CFGBlock*), with iterators over them. We could change those vectors to contain bitmasked pointers representing real edges and pruned edges, with special bits to indicate why an edge was pruned. We could then add a new successor/predecessor iterator that just skips the hidden edges (the default view for most clients) but then provide an alternate iterator that viewed the hidden edges as well (and means to query them).

What do you think?

Ted

I really like this idea for the reasons you mention.

Also, Richard Smith and I were discussing based on David’s initial queries in IRC how we might do this. Our specific idea was to retain just enough information in the instantiated AST to figure out the dependence of the templated expressions leading to that instantiation. He had a clever idea that would re-use the bits already in the Expr node to indicate different kinds of dependence to also indicate being instantiated from different kinds of dependence. I’ll let him fill in the details, but essentially this (possibly combined with a side-table of more detailed information) seems like it would allow the AST to indicate exactly what template construct it came from, much the way source locations provide this information for macros.

Hi Ted - thanks for the reply (given that you mentioned this
previously, I thought you were likely to have an informed opinion on
the subject & wanted to get that before wading out too deep into the
code)

My concern about approach #1 is that not all analyses are savvy to analyze templates directly, and the CFG itself will be incomplete. The CFG contains destructor calls and possibly other entities whose specifics are only known when the template is actually instantiated (e.g., a template parameter 'T' may expand to something with a destructor, or it may expand to something like 'int' where no destructor is involved). The result is that two instantiations may have very different CFGs. Further, I don't think most analyses would be savvy to reason about such "abstract definitions" of functions. Uninstantiated templates are really more syntax than semantics, and analyses rely on precise semantic information.

Makes sense - I had that concern at a very vague level, not knowing
all the particulars about how Clang represents template patterns, so
thanks for clarifying/making that more concrete.

My concern about #2 is twofold. First, it is potentially very costly. If you analyze instantiated templates as they stream by, you potentially have to retain all the analysis information in order to fuse them up later to reach the final conclusions. With a naive implementation, the space overhead of the analysis could grow linearly with the number of instantiations, while a good implementation would incrementally merge the new analysis results for each instantiation into the old one, and then once we finished parsing the translation unit reach the final decisions. This sound potentially really complex, but I suppose it could be tractable since only few analyses would possibly care about taking this extra step.

For what it's worth, the idea I had was to do it incrementally -
essentially having to walk the template pattern side-by-side with the
instantiation (which, given (1) would be more work than I anticipated,
certainly - having to account for tolerable differences in the CFG
between pattern and instantiation) & only keep the template pattern's
CFG across multiple function definitions. So memory cost linear on the
number of templates (because you'd never know if there might be
another instantiation coming up later)

My second concern about #2 is that I don't think it is technically correct. Even if you analyze all N instantiations of a function template, do the dataflow analysis, and conclude that from those N instantiations that a given piece of code was dead, that doesn't mean that if you did a N+1'th instantiation that was different from the others that the block would be guaranteed to be dead. Basically, proving that a block is dead for the N instantiations in the way you propose doesn't prove that the block is dead in general.

My naive thinking was that if we've seen all the instantiations &
something still isn't covered, then it is dead code - but you're
right; Perhaps it's a library that a given translation unit only uses
one or two instantiations of and another TU reaches the apparently
'dead' code. Scratch that, then.

The idea I had to tackle this was a bit simpler, although it could certainly be flawed. Looking at the results of -Wunreachable-code that you mentioned, the reason we report this code as unreachable is because the edges for the branch conditions in the CFG are intentionally pruned as being trivially satisfiable/unsatisfiable.

Right, this hinges on the
(anon)::CFGBuilder::tryEvaluate/tryEvaluateBool functions (in turn on
Expr::EvaluateAsRValue/EvaluateAsBooleanCondition).

Instead of just chopping off the edges completely, we can retain all edges, but mark them as being unreachable for various reasons, e.g. value dependent because of template instantiations or expanded from a macro.

So you're suggesting that those try/Evaluate functions would return
this extra information as part of their tree walk evaluation? "true,
but it's also value dependent", "true, but depends on a macro
expansion"

This was the first puzzle I hit upon when I was looking through this
code - the type/value dependence checks in tryEvaluate/tryEvaluateBool
weren't 'working' (they never fired, even with an assert) because
templates weren't visited, only instantiations - so I was wondering if
I could just simply check the Expr to see if it was an instantiation
of a type/value dependent expression, but that information wasn't
available (which is why I got talking to Chandler & Richard on IRC
about whether it could be retrieved, etc). But if we're going to
evaluate the expression anyway (for other clients) then it shouldn't
be hard to record the extra fact that while doing so we had to rely on
template or macro expansions (at least the template case has a
specific expression node type - I haven't looked at how I'd detect the
macro case yet).

If we retain this information, what you are left with is a CFG that contains both pruned edges (the default view for most clients) but also the "hidden" edges for those clients interested in the complete control-flow as implied by the AST. There are already some clients that care about getting both the "unoptimized" and "optimized" CFG; this enhancement would collapse the two into the same data structure. -Wunreachable-code for example could enhance its walk of the CFG to take special attention to these hidden edges and handle them accordingly. The nice thing about this approach is that it helps marginalize out overly optimistic conclusions on reachable code that were made based on the specific template instantiations that you saw.

The other nice thing about this approach is that it is very simple. All CFGBlocks currently record a vector of predecessors and successors (each represented as a CFGBlock*), with iterators over them. We could change those vectors to contain bitmasked pointers representing real edges and pruned edges, with special bits to indicate why an edge was pruned.

What kind of granularity did you have in mind for this bitmask? Are
there reasons we would need to store the reason for a particular
branch being trivially unsatisfiable (macro, type dependence, value
dependence)? Or just that it is. (& then another flag for the
optimized/unoptimized case you mentioned already exists)

We could then add a new successor/predecessor iterator that just skips the hidden edges (the default view for most clients) but then provide an alternate iterator that viewed the hidden edges as well (and means to query them).

Would it make sense, possibly, to just have the iteration itself
accept a query in the form a bitmask of the type of edges to visit, or
would that be too heavyweight? (it could be a template parameter to
the iterator rather than a runtime one, I suppose - the default value
of the argument being the "NoHidden" flag to get the old/current use
case)

What do you think?

Makes sense so far (we'll see what happens when I try to implement it
- I might have some more questions).

- David

The idea I had to tackle this was a bit simpler, although it could certainly be flawed.

One issue here based on the basic idea you've outlined: we still would
be reporting on template instantiations, not the template itself. If
there is unreachable code that's just unreachable in all
instantiations of the template then we would actually report it once
for each instantiation rather than just once (& a lesser issue: we
wouldn't report it at all if there were no instantiations). Though
this is less of a problem than the issue of false positives.

$ cat unreachable2.cpp
int func1();
int func2();
template<bool b>
int func() {
  if (false) {
    return func1();
  }
  return func2();
}

int main() {
  return func<true>() + func<false>();
}
$ clang++ -Wunreachable-code unreachable2.cpp
unreachable2.cpp:6:12: warning: will never be executed [-Wunreachable-code]
    return func1();
           ^~~~~
unreachable2.cpp:12:10: note: in instantiation of function template
specialization 'func<true>' requested here
  return func<true>() + func<false>();
         ^
unreachable2.cpp:6:12: warning: will never be executed [-Wunreachable-code]
    return func1();
           ^~~~~
unreachable2.cpp:12:25: note: in instantiation of function template
specialization 'func<false>' requested here
  return func<true>() + func<false>();
                        ^
2 warnings generated.

From a time perspective, this also means we're analyzing reachable

code for every template instantiation - should we ever find different
unreachable code for different instantiations (you mentioned some
differences between template patterns and template instantiations but
it's not immediately obvious to me if/when these differences could
produce different reachable code that we would want to diagnose on a
per-instantiation basis (unlike the original issue where we expressly
don't want to look per instantiation, basically))?

- David

I thought about this issue more since I wrote my email late last night. Your absolutely right that fundamentally we are still analyzing the instantiations. Concerning the performance issues of analyzing multiple instantiations, or duplicate diagnostics, I think there are plenty of tricks we can play here to reduce that burden (e.g., unique diagnostics for this particular warning, which would be fairly cheap).

My concern, however, is that this approach is still fundamentally flawed. By analyzing the instantiations, we are still impacted by the control-flow of that particular instantiation. For example, suppose we have:

template void foo() {
{
T x;
}
printf(“is this dead?”);
}

If the instantiated type for ‘T’ has a ‘noreturn’ destructor, then the printf() call is unreachable. If it doesn’t, it is reachable.

I still think my suggestion for having a CFG that retains some of the pruned edges is a good one, but the subtle dependencies from template instantiation on the semantics and the resulting control-flow is inherently complex and possibly too complicated to represent directly in the CFG.

Analyzing the template itself is also a non-starter. Without instantiating a template, we really don’t know what its control-flow semantics are.

Realistically, unless we really want to be super clever here, the most immediate solution to making -Wunreachable-code useful in the presence of templates is to simply have it skip templates entirely. That will definitely lead to false negatives, but all of the false positives would be eliminated, which I think is far more important if we want this warning to be more actively used by users. My concern about doing deadcode analysis on templates it is that it is very difficult for us to prove in the general case that a given template always has a snippet of code that is dead. Certainly there are restricted cases where this is easy, but I think the common cases will not be trivial. Deadcode analysis means that must prove, for all template instantiations, that the code is dead. This is very different from analyses such as -Wuninitialized, which only need to look for a single example where a value is used uninitialized.

A simpler example:

template void foo() {
T::bar();
printf(“is this dead?”);
}

Same issue as before. ‘bar()’ could be marked ‘noreturn’, but it depends on the instantiation. Thus control-flow could vary significantly between instantiations.

In the spirit of improving the utility of the warning, I have gone and done this in r145520. If we come up with a better solution, we can certainly change this. I believe the general solution, if one exists, will be very complex.

It seems like we're no longer moving in this direction, but for posterity the
idea was to use the InstantiationDependent bit to indicate (as it currently
does) whether an Expr node is dependent, and if that bit is 0, use the
ValueDependent/TypeDependent bits to indicate if the node was instantiated
from a dependent node.

Injecting the right values into the latter bits is somewhat tricky to do
non-invasively. The best approach which Chandler and I found was to track in
TreeTransform whether we're visiting a dependent Expr node, and if any new
Expr nodes are created during such a visit, to set their Dependent bits to the
corresponding values.

Richard

Thanks Ted - I tend to agree once we got to those examples of
conditionally unreachable code that it appears to be a hard problem. I
did check gcc's behavior, but it seems they gave up on this warning
entirely around 4.3-4.4 (because their implementation was based on
their optimizations & fraught with many more problems than just
templates, by the sounds of it) so we're not missing any GCC parity
here by not checking templates.

Slightly related/minor tidyup - should the isType/ValueDependent
checks in tryEvaluate/tryEvaluateBool be removed (and possibly
replaced with assertions) as they don't seem to do anything - we
weren't visiting dependent contexts previously nor are we now.

- David

My concern, however, is that this approach is still fundamentally flawed.
By analyzing the instantiations, we are still impacted by the control-flow
of that particular instantiation. For example, suppose we have:
template <typename T> void foo() {
{
T x;
}
printf("is this dead?");
}
If the instantiated type for 'T' has a 'noreturn' destructor, then the
printf() call is unreachable. If it doesn't, it is reachable.

A simpler example:
template <typename T> void foo() {
T::bar();
printf("is this dead?");
}
Same issue as before. 'bar()' could be marked 'noreturn', but it depends on
the instantiation. Thus control-flow could vary significantly between
instantiations.

So the next issue I've encountered in LLVM's code is:

  if (sizeof(long) == sizeof(int))

Do we need to suppress any constant evaluation of conditions that
involve sizeof/alignof/offsetof/etc expressions? (perhaps this could
be implemented using the technique you mentioned earlier - flagging
certain edges as hidden for certain reasons so that the reachable code
analysis could examine them without interfering with other analyses -
at least at first glance it doesn't seem like these suffer from the
same complexities as templates). So far as I can see this would
involve plumbing through the logic in ExprConstant.cpp with an extra
output parameter specifying whether the evaluation of the expression
included any of these special expressions, yes?

While chatting about this on IRC - Chandler brought up the other case
you had briefly touched on: macros. These seem to suffer from all the
same issues as templates, but perhaps even worse - a single macro in
any function could exhibit the same problems as the template case:

[[noreturn]] void foo();
void bar();
#ifdef X
#define CALL foo();
#else
#define CALL bar();
#endif

int main() {
  CALL
  baz(); // unreachable?
}

Does this mean we can't analyze any function involving macros? That
would seem overly limiting. Should we try to record more detail about
the graph again, such as where a branch (or lack of one) came from?
How practical is that?

- David

This is all a bit of art, but it is about hitting the sweet spot between false positives and false negatives, complexity of implementation (and cleverness) versus payoff. This isn’t about soundness; -Wunreachable-code was correctly warning in template instantiations that some code is unreachable. The problem is that there are cases where we don’t care, because it doesn’t indicate a problem in the original code (e.g., the original template body). The goal of the warning is to find bugs, so we should concentrate on stylistically what kind of deadcode doesn’t look intentional.

For templates, my concern is that the template instantiation can routinely impact control-flow in bizarre ways. We could itemize the common ways that unreachable code occurs in template instantiations, and see if we can do something clever. I think some of this could be done using the CFG approach I described (e.g., recording that specific pruned branches were “instantiation-dependent”). That might be acceptable for a large number of cases, but the analysis isn’t trivial. I’m very skeptical that the payoff would be worth it versus the complexity of the implementation.

Macros I think are more nuanced. Consider the following two cases:

if (MACRO_INSTANTIATION)
unreachable

and

MACRO_INSTANTIATION_THAT_CONTAINS_UNREACHABLE_CODE

For the latter, my gut is to never issue a -Wunreachable-code warning. It’s basically the same as templates; the instantiated code could be anything (and it frequently does). Many people use macro instantiations understanding that the macro will generate a ton of goop, only to get optimized away from the compiler. I think warning about unreachable code within macros is going to have marginal utility. However, if we saw something like this:

MACRO_INSTANTIATION_THAT_BLOCKS_FURTHER_EXECUTION
deadcode

should we warn about ‘deadcode’? Probably not. It’s probably no different than using llvm_unreachable.

Then there is also the case of:

if (COMPILE_TIME_CONSTANT)
unreachable

There, I think it depends on how the compile-time constant was computed. If the constant involved macros, e.g.:

if (MACRO_A + 1)

then I’d say this is the same as:

if (MACRO_INSTANTIATION)

If it was something like:

if (some_constant_dependent_on_langoptions_or_triple)

then I’d also treat it like “if (MACRO_INSTANTIATION)”. This is basically 'sizeof(long) == sizeof(int)". It’s basically meta-programming of a sort, and we don’t know that for a different program “instantiation” (e.g., on a different architecture) that the code would have behaved differently.

If, however, it looks something like:

if (2 * 0)
unreachable

I think we should definitely warn. There is nothing here that relies on “meta-programming”. It’s all concrete, and clearly the developer made a mistake.

So the bottom line is that I think it comes down to a grab bag of heuristics. With the CFG approach I described, I’m basically saying we retain “ghost” edges for the edges we would have pruned. Heck, we don’t even need to record why they were pruned, just that they were pruned. -Wunreachable-code, if it cares, could go an infer the specific cases of why an edge was pruned (or what was involved, e.g. a macro instantiation in the condition).

One nice thing of leaving the heuristics up to -Wunreachable-code is that it doesn’t complicate CFG construction. CFG construction just records the ghost edges, and it is up to specific analyses to decide what to do with them (if anything).

I’d be happy with that if we just had CFGBuilder abort up front the construction on raw (uninstantiated) templates. This defensive code is there so that other clients of CFGBuilder (besides AnalysisBasedWarnings) don’t get into trouble.

This may be generally useful, but I’m dubious that analyzing templates for unreachable code is really worth the payoff of the complexity it will incur. I painted a picture earlier of how instantiated code could have wildly different control-flow, but how much should we care about those issues in practice? At the very least, however, we may get different CFGs for different template instantiations that have different basic blocks, different quantities of basic blocks, etc. We’d need to correlate those results together to get reasonable results, and then that doesn’t necessarily prove that the original template contains dead code I’m skeptical that doing all the extra work to get this to dance for template instantiations will really be worth it compared to handling the common edge cases that happen in everyday code.

> Slightly related/minor tidyup - should the isType/ValueDependent
> checks in tryEvaluate/tryEvaluateBool be removed (and possibly
> replaced with assertions) as they don't seem to do anything - we
> weren't visiting dependent contexts previously nor are we now.

I'd be happy with that if we just had CFGBuilder abort up front the
construction on raw (uninstantiated) templates. This defensive code is
there so that other clients of CFGBuilder (besides AnalysisBasedWarnings)
don't get into trouble.

Before doing that tidyup I decided to have a go at the "simple" thing
- doing unreachable code on templates instead of their instantiations.
No doubt it's not the tidiest implementation (suppressing all the
other analysis based warnings when 'dependent' is true is a bit
intrusive/repetitive) but it seems to show the concept fairly well.

I don't think we'd discussed in detail what problems might arise from
this approach - certainly we know cases that we can't prove (such as a
dependent function call which may or may not be noreturn) but
non-dependent calls to noreturn functions, for example, seem to work
correctly (without regressing your test cases that cover the dependent
noreturn case).

Are there other cases I've missed where this approach could be problematic?

(I've also made some changes to the array-bounds tests here - an
indirect discovery while experimenting with this change: we emit
warnings that are contingent on reachability for every instantiation
even when the expression is not dependent. It'd be nice if we didn't
do that, but it's not too important I suppose - since the warning is
legitimate & it'll all versions will go away when it's fixed. The
minor advantage of warning on uninstantiated templates is probably
marginal (& perhaps debatable - it's not really reachable if the
template was never instantiated, but we're not doing interprocedural
analysis anyway - so a real (non-template) function has the same issue
anyway)).

Thoughts?
- David

unreachable_templates.diff (4.4 KB)

Hi David,

Thank you for being persistent. :slight_smile:

I think this approach is worth pursuing further only if there is an algorithmic justification that it is doing the right thing. Since each instantiation of a template can have slightly different control-flow (mainly due to noreturn, but there may be other cases we haven’t considered), we’d need to be able to say that the “control-flow” of the uninstantiated template represents some kind of superset of the possible control-flow when the template is instantiated (or at least a superset of the statements that are reachable).

If ‘noreturn’ is the only issue, then this may actually work. The uninstantiated template may have ‘noreturn’ calls, but by definition those will appear in the instantiated template as well. Ab instantiation may have additional ‘noreturn’ calls, but my understanding is that this will only cause more code to become unreachable in the particular instantiation. Thus as far as ‘noreturn’ is concerned, the uninstantiated template represents a correct under-approximation of the unreachable code in all instantiation. Is this logic sound?

If it is, my main concern then comes down to the following:

(1) Are there are other issues that we are not considering where the uninstantiated template doesn’t faithfully provide an under approximation of the reachable code?

(2) If we decide this approach is viable, which uninstantiated templates do we analyze? For performance reasons, I don’t think we should analyze all the uninstantiated templates, as we may find ourselves repeatedly analyzing a huge portion of the Boost and the STL, etc. My suggestion is that when we encounter an instantiation, we go and check if we have already analyzed the uninstantiated template for this warning, and if not, do the analysis.

(3) Somewhat tangential, we would want to analyze explicit template instantiations an their own, since they represent “concrete” functions that were provided, not instantiated ones.

(4) I don’t know the if the CFGBuilder is going to work all that well on uninstantiated templates. There may be bugs here, or other fundamental issues (which i don’t know about).

(5) We wouldn’t want to do this approach wholesale for all analyses. I think it mainly makes sense for the reachable code warnings, e.g. -Wunreachable-code or warning if a function doesn’t return. Analyses like -Wuninitialized, etc., would want to keep working on instantiated templates since (a) they need the more precise semantics of the instantiated templates to work correctly and (b) it is okay if we flag a warning in one instantiation that doesn’t appear in an other (since that is a genuine issue).

What do you think?

Ted

Before doing that tidyup I decided to have a go at the "simple" thing
- doing unreachable code on templates instead of their instantiations.
No doubt it's not the tidiest implementation (suppressing all the
other analysis based warnings when 'dependent' is true is a bit
intrusive/repetitive) but it seems to show the concept fairly well.

I don't think we'd discussed in detail what problems might arise from
this approach - certainly we know cases that we can't prove (such as a
dependent function call which may or may not be noreturn) but
non-dependent calls to noreturn functions, for example, seem to work
correctly (without regressing your test cases that cover the dependent
noreturn case).

Are there other cases I've missed where this approach could be problematic?

Hi David,

Thank you for being persistent. :slight_smile:

I think this approach is worth pursuing further only if there is an
algorithmic justification that it is doing the right thing. Since each
instantiation of a template can have slightly different control-flow (mainly
due to noreturn, but there may be other cases we haven't considered), we'd
need to be able to say that the "control-flow" of the uninstantiated
template represents some kind of superset of the possible control-flow when
the template is instantiated (or at least a superset of the statements that
are reachable).

If 'noreturn' is the only issue, then this may actually work. The
uninstantiated template may have 'noreturn' calls, but by definition those
will appear in the instantiated template as well. Ab instantiation may have
additional 'noreturn' calls, but my understanding is that this will only
cause more code to become unreachable in the particular instantiation. Thus
as far as 'noreturn' is concerned, the uninstantiated template represents a
correct under-approximation of the unreachable code in all instantiation.
Is this logic sound?

I think the generalization goes something like this:

To get a false positive with this approach it would be necessary for
edges to be added to the CFG during the transformation from template
pattern to template instantiation.

So far as I can think this shouldn't be the case - my only concern
would be exceptions. But it seems catch blocks are considered
reachable even when the corresponding try has no code at all, so, at
least for now, these aren't an issue. (but if we started adding in
edges to catch blocks only when they were reachable, we'd have to err
on the side of caution for any dependent expressions that might throw
- adding edges to all catch blocks in such a case, rather than adding
no edges initially & only adding the required edges at instantiation
time)

If it is, my main concern then comes down to the following:

(1) Are there are other issues that we are not considering where the
uninstantiated template doesn't faithfully provide an under approximation of
the reachable code?

I've tested the exception case above & that seems to be always
reachable anyway. I'll try my hand at looking through the template
instnantiation code to see if there's any cases where we might end up
adding edges rather than removing them.

(2) If we decide this approach is viable, which uninstantiated templates do
we analyze? For performance reasons, I don't think we should analyze all
the uninstantiated templates, as we may find ourselves repeatedly analyzing
a huge portion of the Boost and the STL, etc. My suggestion is that when we
encounter an instantiation, we go and check if we have already analyzed the
uninstantiated template for this warning, and if not, do the analysis.

By repeatedly, you mean in each translation unit, right? Because we
shouldn't have to visit them more than once per TU (we won't be
visiting them per insntantiation). But, yes, the idea of analyzing
every boost template someone includes when they're only using a subset
does sound a little costly - I might be interested to compare times,
though.

Only analyzing instantiated templates (& keeping a list of those
templates we've already seen/analyzed) seems OK to me, assuming that
didn't represent a prohibitive memory cost to maintain the set of
visited templates (can't see why it should, but it is a small
time/space tradeoff all the same).

(3) Somewhat tangential, we would want to analyze explicit template
instantiations an their own, since they represent "concrete" functions that
were provided, not instantiated ones.

I'll double check this, but I assume my prototype changes already
account for this, since they didn't regress your test cases.

(4) I don't know the if the CFGBuilder is going to work all that well on
uninstantiated templates. There may be bugs here, or other fundamental
issues (which i don't know about).

I'm planning to keep plodding along compiling LLVM & Clang with
-Weverything (with a few things disabled) so I'll see what comes of
it. With my prototype I didn't have any regressions across the Clang
test suite, for what that's worth.

(5) We wouldn't want to do this approach wholesale for all analyses. I
think it mainly makes sense for the reachable code warnings, e.g.
-Wunreachable-code or warning if a function doesn't return. Analyses like
-Wuninitialized, etc., would want to keep working on instantiated templates
since (a) they need the more precise semantics of the instantiated templates
to work correctly and (b) it is okay if we flag a warning in one
instantiation that doesn't appear in an other (since that is a genuine
issue).

The prototype patch attached to my previous email already accounted
for this (I think) - just not in a very nice way.

To formalize this requirement, though: Warnings that we only want to
emit in reachable code should use a pessimistic definition of
reachable code (err on the side of assuming code is unreachable) to
avoid false positives. Warnings about unreachable code should use a
pessimistic definition of unreachable code (err on the side of
assuming code is reachable) to avoid false positives.

So, yes, unreachable warnings can be done on templates but array
bounds checking, return paths, etc, should be done on instantiations.
Though, ideally, we could find some way to emit
instantiation-independent problems once, rather than once for every
instantiation (see the array bounds test case modifications in my
previous patch for an example that exposes this problem) - either by
analyzing the template first & trying to avoid
instantiation-independent cases when we analyze the instantiation, or
simply by filtering out the duplicates in some way whenever we visit
an instantiation.

Thanks again for your help,
- David

[bumping with the latest sync'd patch]

Before doing that tidyup I decided to have a go at the "simple" thing
- doing unreachable code on templates instead of their instantiations.
No doubt it's not the tidiest implementation (suppressing all the
other analysis based warnings when 'dependent' is true is a bit
intrusive/repetitive) but it seems to show the concept fairly well.

I don't think we'd discussed in detail what problems might arise from
this approach - certainly we know cases that we can't prove (such as a
dependent function call which may or may not be noreturn) but
non-dependent calls to noreturn functions, for example, seem to work
correctly (without regressing your test cases that cover the dependent
noreturn case).

Are there other cases I've missed where this approach could be problematic?

Hi David,

Thank you for being persistent. :slight_smile:

I think this approach is worth pursuing further only if there is an
algorithmic justification that it is doing the right thing. Since each
instantiation of a template can have slightly different control-flow (mainly
due to noreturn, but there may be other cases we haven't considered), we'd
need to be able to say that the "control-flow" of the uninstantiated
template represents some kind of superset of the possible control-flow when
the template is instantiated (or at least a superset of the statements that
are reachable).

If 'noreturn' is the only issue, then this may actually work. The
uninstantiated template may have 'noreturn' calls, but by definition those
will appear in the instantiated template as well. Ab instantiation may have
additional 'noreturn' calls, but my understanding is that this will only
cause more code to become unreachable in the particular instantiation. Thus
as far as 'noreturn' is concerned, the uninstantiated template represents a
correct under-approximation of the unreachable code in all instantiation.
Is this logic sound?

I think the generalization goes something like this:

To get a false positive with this approach it would be necessary for
edges to be added to the CFG during the transformation from template
pattern to template instantiation.

So far as I can think this shouldn't be the case - my only concern
would be exceptions. But it seems catch blocks are considered
reachable even when the corresponding try has no code at all, so, at
least for now, these aren't an issue. (but if we started adding in
edges to catch blocks only when they were reachable, we'd have to err
on the side of caution for any dependent expressions that might throw
- adding edges to all catch blocks in such a case, rather than adding
no edges initially & only adding the required edges at instantiation
time)

If it is, my main concern then comes down to the following:

(1) Are there are other issues that we are not considering where the
uninstantiated template doesn't faithfully provide an under approximation of
the reachable code?

I've tested the exception case above & that seems to be always
reachable anyway. I'll try my hand at looking through the template
instnantiation code to see if there's any cases where we might end up
adding edges rather than removing them.

I haven't really delved into this further - is there a particular
form/approach you'd like to see to prove this property? (that new
edges are never added when instantiating a template (except when
splitting an existing block in the CFG (eg: by a dependent call
becoming noreturn - splitting the block at the call site)))

(2) If we decide this approach is viable, which uninstantiated templates do
we analyze? For performance reasons, I don't think we should analyze all
the uninstantiated templates, as we may find ourselves repeatedly analyzing
a huge portion of the Boost and the STL, etc. My suggestion is that when we
encounter an instantiation, we go and check if we have already analyzed the
uninstantiated template for this warning, and if not, do the analysis.

By repeatedly, you mean in each translation unit, right? Because we
shouldn't have to visit them more than once per TU (we won't be
visiting them per insntantiation). But, yes, the idea of analyzing
every boost template someone includes when they're only using a subset
does sound a little costly - I might be interested to compare times,
though.

Only analyzing instantiated templates (& keeping a list of those
templates we've already seen/analyzed) seems OK to me, assuming that
didn't represent a prohibitive memory cost to maintain the set of
visited templates (can't see why it should, but it is a small
time/space tradeoff all the same).

Is there a particular technique/suite/process I should use to try to
measure the possible performance regression of performing reachable
code analysis on all templates? (of course I may need to come up with
my own test cases using the STL and/or boost to really grind things a
bit)

(3) Somewhat tangential, we would want to analyze explicit template
instantiations an their own, since they represent "concrete" functions that
were provided, not instantiated ones.

I'll double check this, but I assume my prototype changes already
account for this, since they didn't regress your test cases.

(4) I don't know the if the CFGBuilder is going to work all that well on
uninstantiated templates. There may be bugs here, or other fundamental
issues (which i don't know about).

I'm planning to keep plodding along compiling LLVM & Clang with
-Weverything (with a few things disabled) so I'll see what comes of
it. With my prototype I didn't have any regressions across the Clang
test suite, for what that's worth.

(5) We wouldn't want to do this approach wholesale for all analyses. I
think it mainly makes sense for the reachable code warnings, e.g.
-Wunreachable-code or warning if a function doesn't return. Analyses like
-Wuninitialized, etc., would want to keep working on instantiated templates
since (a) they need the more precise semantics of the instantiated templates
to work correctly and (b) it is okay if we flag a warning in one
instantiation that doesn't appear in an other (since that is a genuine
issue).

The prototype patch attached to my previous email already accounted
for this (I think) - just not in a very nice way.

To formalize this requirement, though: Warnings that we only want to
emit in reachable code should use a pessimistic definition of
reachable code (err on the side of assuming code is unreachable) to
avoid false positives. Warnings about unreachable code should use a
pessimistic definition of unreachable code (err on the side of
assuming code is reachable) to avoid false positives.

So, yes, unreachable warnings can be done on templates but array
bounds checking, return paths, etc, should be done on instantiations.
Though, ideally, we could find some way to emit
instantiation-independent problems once, rather than once for every
instantiation (see the array bounds test case modifications in my
previous patch for an example that exposes this problem) - either by
analyzing the template first & trying to avoid
instantiation-independent cases when we analyze the instantiation, or
simply by filtering out the duplicates in some way whenever we visit
an instantiation.

Thanks again,
- David

template_unreachable.diff (4.4 KB)

I’ve thought about this more, and I’m pretty sure that analyzing the uninstantiated template will be a conservative superset of the reachable control-flow of any instantiated template.