incorrect DSCallGraph for simple indirect call with vtable nearby

In an earlier message (http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-August/042298.html), Andrew Lenharth suggested that EQTDDataStructures (from the poolalloc project) may only try to resolve indirect function calls. However, I am now finding that the generated DSCallGraph over-approximates the callees in a very simple indirect call. Some over-approximation is unavoidable, but this case seems simple enough that any reasonable analysis ought to have been able to give the exact, correct answer.

My example C++ source file, compiled to bitcode using clang from the current llvm trunk, is as follows:

  struct Base
  {
    virtual void virt();
  };

  void Base::virt()
  {
  }

  void red();
  void blue();

  extern volatile int unknown;
  extern Base *base;

  void test()
  {
    base->virt();
    (unknown ? red : blue)();
  }

Observe that we have a class with a virtual method, and two indirect calls: one through a vtable to that virtual method, and another which is a simple non-deterministic choice between two possible function pointers. I can understand an inexact (over-approximating) set of callees for the vtable call, as that is rather complex at the bitcode level. However, I am very surprised to see an over-approximation at the second indirect call, which is merely picking between two possible values:

  %11 = phi void ()* [ @red(), %8 ], [ @blue(), %9 ] # demangled
  call void %11()

Yet my DSCallGraph tester code reports that this instruction might call either red(), blue(), or Base::virt(). I am at a loss to explain why.

Equally strange: if I comment out the "base->virt();" call, then DSCallGraph *does* give the expected answer that "(unknown ? red : blue)()" could call either red() or blue() but not Base::virt(). Somehow having that vtable-based call present forces the other call to be unexpectedly conservative in its over-approximation.

Source code for my DSCallGraph tester is attached below. I'd be most grateful for any clues you good people can provide.

Thanks,
Ben

ShowCallGraph.cpp (1.35 KB)

It's hard to say without looking at the bytecode. One possible
explanation involves the extern property of the functions. base is
external, so you don't know what virt may point to, except it may at
least point to any function whose address is taken and that address
escapes or the address is externally visible. (external code may set
base->virt = red (either directly, DSA doesn't know about vtables, or
by subclassing). DSA (again, I haven't looked at the specific code in
a while), tries to see through casts in a call instruction to treat
apparent indirect calls in the IR as direct (look at your previous
post to see if the ir involved a cast of the function pointer), but it
doesn't do the kind of flow-sensitive analysis you are looking for
here.

So, the call to %11 is indirect. Thus we pull the alias set for it.
That set includes red and blue obviously. However, since they are
both externally visible, they may, through base, alias virt. DSA
pulls the alias set, not the minimal flow-based set of possible
callees.

Andrew

Andrew Lenharth wrote:

It's hard to say without looking at the bytecode.

Thanks for the speedy reply, Andrew. I'm happy to provide that bytecode if it would help. Binary form or LLVM assembly source code form? On this list or off-list directly to you?

That set includes red and blue obviously. However, since they are
both externally visible, they may, through base, alias virt. DSA
pulls the alias set, not the minimal flow-based set of possible
callees.

Oh, I see. I was assuming something inclusion-based, as in Andersen's analysis. But if DSA is fundamentally unification-based (as in Steensgaard's analysis), that could explain why red() and blue() and Base::virt() all end up lumped together.

If that is true, then I'll definitely need a different approach to resolving indirect calls. The code I intend to analyze is vtable-rich, so unification through those vtables is going to infect just about everything, making my analysis too conservative to be useful in practice.

Does anyone know of an inclusion-based alternative, either in poolalloc or core LLVM or any other openly-available LLVM component? Surely there must be a decent LLVM implementation of an inclusion-based alias analysis out there somewhere...?

-- Ben

Dear Ben,

Just a few comments on DSA:

  1. I’ll try out your example C++ code below and see if I can get the same results that you do. However, I’m at a conference right now (Usenix Security), so I don’t know exactly when I’ll get to it.

  2. DSA can get pessimistic results when dealing with external code (as Andrew described). It is designed for whole program analysis, meaning that the entire program should be available (e.g., no variables defined in other compilation units). Can you:

a) Modify your example so that all variables are internally defined. You may need to use volatile keywords or printf() to ensure that dead code isn’t eliminated.

b) Make sure that you run the -internalize pass before running your analysis to make sure that all variables are marked with internal linkage.

  1. As an FYI, we just ported DSA from LLVM 2.7 to LLVM mainline last week and haven’t had time to shake it down and see how well it is working. We’ll be shaking down mainline DSA as we integrate it into an optional libLTO replacement for use with the SAFECode memory safety compiler. DSA for LLVM 2.7 is still available in the release_27 branch of the poolalloc project.

– John T.

John Criswell wrote:

1) I'll try out your example C++ code below and see if I can get the
same results that you do. However, I'm at a conference right now (Usenix
Security), so I don't know exactly when I'll get to it.

Excellent. Thanks, John!

2) DSA can get pessimistic results when dealing with external code (as
Andrew described). It is designed for whole program analysis, meaning
that the entire program should be available (e.g., no variables defined
in other compilation units). Can you: [...]

I have made the recommended changes. My test input is now a complete, self-contained program with a proper main. I use "-internalize" on the "opt" command line to run llvm::InternalizePass before my ShowCallGraph pass. (Sadly, llvm::InternalizePass::ID is not exposed through any headers, making it impossible to compile this pass-ordering requirement directly into my ShowCallGraph sources.)

The modified test input is attached below. I'm happy to provide compiled bitcode, LLVM assembly source, or whatever else you need to reproduce the problem. The ShowCallGraph pass is the same as in my earlier message at <http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-August/042312.html>. When run on the bitcode for my updated test input, ShowCallGraph reports:

   call void %6(%struct.Base* %2)
  red()
  blue()
  Base::virt() const
  Derived::virt() const
   call void %12()
  red()
  blue()
  Base::virt() const
  Derived::virt() const

The first of those two calls is a vtable dispatch; the ideal answer would be Base::virt() const and Derived::virt() const, without red() and blue(). Still, vtable lookups are complex, so I could imagine an over-approximation here.

The second of those two calls is just a non-deterministic choice between two functions. I'd really hoped that DSA would give the ideal answer here: red() or blue(), but not Base::virt() const or Derived::virt() const.

-- Ben

test.cpp (387 Bytes)

Hi Ben!

This is actually the expected behavior for EQTD :).

In short, EQTD (and CBU) are useful for program-transforming passes
like pool allocation, but are _not_ good for alias analysis queries.
If you switch to TD you'll get better alias-analysis information, and
in this example the correct result. I changed both instances of
EQTDDataStructures to TDDataStructures in your example code, and got
the desired result (and confirmed that I get the results you report
when using EQTD).

Give that change a shot and let us know if you have any further
questions/issues.

FWIW at the moment DSA doesn't give good results for vtable-heavy
code, marking all such callsites as incomplete and cannot resolve
them. Offhand, I don't remember if DSCallGraph will correctly report
a pessimistic callee set or if we expect you to look at the Incomplete
flag yourself. Anyway, this is a fundamental limitation of DSA that
probably won't be fixed anytime soon, just a heads-up.

Hope this helps! :slight_smile:

~Will

Will Dietz wrote:

This is actually the expected behavior for EQTD :).

Expected by you, maybe. :smiley:

If you switch to TD you'll get better alias-analysis information, and
in this example the correct result.

OK, I have switched to TDDataStructures as well, and I am also seeing much better (for my purposes) results in simple tests. I'll keep poking at this some more and let you folks know if I hit any other surprises.

There is no mention of TDDataStructures in "poolalloc/dsa-manual/manual.tex", which is why I did not consider using it earlier. Can you give me a high-level idea of how TDDataStructures and EQTDDataStructures differ? Clearly they differ in the "gives the answers Ben expected" dimension, but that's not a very useful description more generally. What should I know about the key differences between these two?

FWIW at the moment DSA doesn't give good results for vtable-heavy
code, marking all such callsites as incomplete and cannot resolve
them.

Understood. My plan for vtable-based calls is to pattern-match on the specific code sequences emitted by clang or whatever front end we are using. Likewise I'll be exploiting C++ domain knowledge to recognize global variables that are vtables. Once the clang type-system-rewrite work is released, that will also help me a lot as I'll have better information about the static types at each vtable call site. So I think I'm in reasonable shape for vtables. [I was prepared to be pleasantly surprised if DSA figured these out for me, but I certainly was not expecting or assuming that.]

Thanks again, Will, John, and Andrew. You've been most helpful. It's great to see that the high quality of LLVM's code is matched by the high quality of support found on this mailing list.

Regards,
Ben

I wrote:

I'll keep poking at this some more and let you folks know if I hit any other surprises.

Well, that didn't take long. :slight_smile: I have found two new surprises in DSCallGraph as built by TDDataStructures. Consider the following program, which is complete and self-contained and which has one simple indirect call site:

  volatile int unknown;

  static void red() { }
  static void blue() { }

  int main()
  {
    (unknown ? red : blue)();
    return 0;
  }

If I save this as "test.c", compile it with clang, and run my ShowCallGraph testing pass, DSCallGraph lists *zero* callees and claims that its results for the indirect call site are incomplete. Why is it unable to identify the two (seemingly-obvious) callees in this case? The bitcode generated for this call looks like:

     %7 = phi void (...)* [ bitcast (void ()* @red to void (...)*), %4 ],
                          [ bitcast (void ()* @blue to void (...)*), %5 ]
     call void (...)* %7()

If I save this same program as "test.cpp", compile it with clang, and run my ShowCallGraph testing pass, DSCallGraph correctly lists both red() and blue() as possible callees. However, it still claims that its results are incomplete. Why doesn't it know that its callee set is complete here? The bitcode generated for this call is slightly different than for "test.c". Here it looks like:

     %7 = phi void ()* [ @_Z3redv, %4 ], [ @_Z4bluev, %5 ]
     call void %7()

I would have expected any analysis to give the same results for either version of the bitcode: callee set is {red, blue} and this answer is complete. What's going wrong here?

Attached below is an updated copy of my "ShowCallGraph.cpp" pass. It now uses TDDataStructures (instead of EQTDDataStructures) and prints an additional line of output identifying incomplete callee sets.

Thanks for any hints,
Ben

ShowCallGraph.cpp (1.5 KB)

   volatile int unknown;

   static void red\(\) \{ \}
   static void blue\(\) \{ \}

   int main\(\)
   \{
     \(unknown ? red : blue\)\(\);
     return 0;
   \}

If I save this as "test.c", compile it with clang, and run my ShowCallGraph
testing pass, DSCallGraph lists *zero* callees and claims that its results
for the indirect call site are incomplete.

Reproduced.

If I save this same program as "test.cpp", compile it with clang, and run my
ShowCallGraph testing pass, DSCallGraph correctly lists both red() and
blue() as possible callees. However, it still claims that its results are
incomplete. Why doesn't it know that its callee set is complete here? The
bitcode generated for this call is slightly different than for "test.c".
Here it looks like:

%7 = phi void ()* [ @_Z3redv, %4 ], [ @_Z4bluev, %5 ]
call void %7()

I would have expected any analysis to give the same results for either
version of the bitcode: callee set is {red, blue} and this answer is
complete. What's going wrong here?

In C the red() and blue() declarations are var-args functions, in C++
they're void. This difference is behind the IR you posted, and the
function pointer cast required in the C version.

Since DSA often has conservative results, one way to keep down
explosion of the size of the analysis information (while doing
interprocedural inlining) is to drop "illlegal" callgraph edges. This
can be very useful in many cases, particularly at helping DSA scale on
large codes that are hard to analyze. Anyway, one such arguably
illegal pairing is a varargs/nonvarargs mismatch between callsite and
callee, and filtering on this is what's causing the results you're
seeing.

To be maximally clear, DSA sees you calling a var-args method from a
non-varargs callsite and says "nope, that's illegal, that can't be
right" ("and if you really are doing so, your code is illegal, so oh
well"), and drops those as valid callees.

Luckily(-ish), the types of filtering used are controlled by flags,
and the flag for this option is "-dsa-no-filter-vararg". Adding that
to the opt invocation (after -show-call-graph) causes DSA to give the
expected results on your example.

Interestingly we actually have a test case for this exact behavior,
that is presently failing (and is fixed by changing this option to not
filter in this way). I admin I'm personally a little unclear on if
this is supposed to actually be illegal or not (and if it is, we might
want to special-case the no-argument ambiguity demonstrated in your
example _anyway_, since that's fairly common in C...). Regardless of
what our defaults are, setting that option should get things going for
you, and I apologize for the trouble.

Thanks for your detailed reports, and happy callgraph building :slight_smile:

~Will

Will Dietz wrote:

In C the red() and blue() declarations are var-args functions, in C++
they're void. This difference is behind the IR you posted, and the
function pointer cast required in the C version.

Ah, right. Subtle!

Anyway, one such arguably
illegal pairing is a varargs/nonvarargs mismatch between callsite and
callee, and filtering on this is what's causing the results you're
seeing.

In that case, in the example that I gave, it's hard to imagine what
possible call site would constitute a *legal* call to a function
declared as "void red()". I do understand that "()" as an argument list
in C means that the number and types of arguments are unspecified. But
shouldn't that mean that *any* call to such a function is legal? It
seems here that there is no possible way to call such a function that
would not be treated as illegal and therefore be omitted from DSA's
callee results.

Luckily(-ish), the types of filtering used are controlled by flags,
and the flag for this option is "-dsa-no-filter-vararg".

Yay! Thanks for pointing that out.

Meanwhile, what about the fact that even the C++ version, which does not
have this varargs issue, is marked as incomplete? Where does DSA get
the idea that something other than red() or blue() could be called at
"(unknown ? red : blue)()"? I'm quite surprised that the callee set for
this call site is not marked as complete. Can you help me understand?

Thanks for your detailed reports, and happy callgraph building :slight_smile:

Thank *you* for your detailed explanations of DSA's behavior and helpful
suggestions for how to tweak it to better meet my needs!

Regards,
Ben