Deleting unused C++ code

Can llvm generate warnings for unused C++ code using global
analysis?

If I could use llvm to figure out what code I can delete in a 20-year-old
app with millions of lines of code, that alone would justify spending
time on making the app build with llvm, even if we never actually run
the generated code...

You can try the Clang Static Analyzer (http://clang-analyzer.llvm.org/), but I'm not sure it has such an analysis.

An easier way would be to use a coverage tool like gcov to see what's
actually *used* when the app is run normally. Then you can ask the
question, what percentage of all lines of code are dead?

A static analysis will not be able to see through things like virtual
method calls.
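To make that concrete, here is a tiny sketch (with made-up names) of why a virtual call blocks static dead-code detection: which override runs depends on a value that only exists at run time.

```cpp
#include <memory>
#include <string>

struct Task {
    virtual std::string run() const { return "base"; }
    virtual ~Task() = default;
};

struct Plugin : Task {
    std::string run() const override { return "plugin"; }
};

// The callee of task->run() depends on `use_plugin`, which could come
// from a config file or user input, so no static analysis can prove
// Plugin::run dead in general.
std::unique_ptr<Task> make_task(bool use_plugin) {
    if (use_plugin) return std::make_unique<Plugin>();
    return std::make_unique<Task>();
}
```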

Reid

An easier way would be to use a coverage tool like gcov to see what's
actually *used* when the app is run normally. Then you can ask the
question, what percentage of all lines of code are dead?

We need something that can do this using static analysis... Otherwise
we can just use Eclipse, search for references as a first approximation,
and try a rebuild. A tedious process. gcov is out of the question; we'd
have to execute the entire program, which is a non-starter.

That said, it would be interesting to add gcov to the testsuite to get a measure
of what percentage of the application we're exercising....

A static analysis will not be able to see through things like virtual
method calls.

Not even in theory?

It's provably undecidable to do so in general, because aliasing is
statically undecidable.
http://academic.research.microsoft.com/Publication/857191/the-undecidability-of-aliasing
(and the more complicated paper it cites by Landi).

This is not to say it's impossible to resolve some of them; it means
that in general you will not be able to.
There are a number of static analyses, such as class hierarchy
analysis, that may be able to prove which implementations some virtual
calls can reach.
That does generally require you to assert to the compiler that you will
not change the class hierarchy later (through dynamic loading/etc), or
else you need to redo the analysis at that point. Such analyses are also
meant for compiler optimization, not code coverage. Their purpose is to
get good-enough results fast, so that even if one can't resolve a call,
it may let you eliminate other possibilities and make guarded direct
calls.
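As a rough sketch of what such a "guarded direct call" looks like, here is the transformation written out by hand in source form (the class names are made up; a compiler would use a cheaper vtable-pointer check rather than typeid):

```cpp
#include <typeinfo>

struct Base {
    virtual int f() { return 1; }
    virtual ~Base() = default;
};

struct Derived : Base {
    int f() override { return 2; }
};

// Speculative devirtualization: if analysis suggests *p is almost always
// a Derived, test the dynamic type once and make a direct (inlinable)
// call, keeping ordinary virtual dispatch as the fallback.
int call_f(Base* p) {
    if (typeid(*p) == typeid(Derived))
        return static_cast<Derived*>(p)->Derived::f();  // direct call
    return p->f();                                      // virtual fallback
}
```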

Basically, what you are trying to do (statically decide code coverage)
is impossible in general, and one can probably prove you can't even
estimate it probabilistically to within some reasonable factor.

I think what you're looking for is something like Doxygen or CodeViz
that can generate (imperfect) call graphs for you. It sounds like you
just need a rough list of functions that might be unused to start
with.

-Scott

I'll have a peek at CodeViz. Even a first approximation of unused code
would be so much better than nothing.

I don't have the link right here, but I saw someone was working on a project
to list unused #include files. That too would be a big boon in untangling
large 20-year-old code.

I'll settle for good guesses when ironclad guarantees are provably
impossible :-)

I don't have the link right here, but I saw someone was working on a project
to list unused #include files. That too would be a big boon in untangling
large 20-year-old code.

Hi,
you probably mean this:
http://code.google.com/p/include-what-you-use/

regards,
Cédric