Finding redundant #includes

Hi,

Can clang's analysis features help me find #includes which are no longer
required in a C source file? I'm working on cleaning up some old crufty
code and it would be good to have this functionality.

Andrew

Hi Andrew,

I don't believe we currently have such a feature (though it's an interesting idea).

Implementing this wouldn't be too difficult; however, it certainly isn't a "quick hack".

snaroff

Sounds promising. I'm in no rush, so if anybody would like me to test
patches for this, feel free to send them my way.

Thanks,

Andrew

Doing it correctly wouldn't be too hard from the AST perspective, but would be tricky when considering preprocessor logic. Any macro defined in a header and later used outside that header creates a dependency. Moreover, if a file can be compiled under different contexts, e.g., on Mac OS X one can compile for i386, x86_64, etc., then the "liveness" of a #include can change between translations.
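
To make the macro point concrete, here is a minimal sketch of a purely textual check for macros that a header defines and the including file mentions. It is an illustration only, not anything clang does: the helper names and the `util.h` / `BUF_SIZE` example are hypothetical, and the check ignores comments, string literals, conditional blocks and #undef.

```python
import re

# Matches "#define NAME" and "#define NAME(args)" at the start of a line.
DEFINE_RE = re.compile(r'^\s*#\s*define\s+([A-Za-z_]\w*)', re.MULTILINE)
IDENT_RE = re.compile(r'[A-Za-z_]\w*')

def macros_defined(header_text):
    """Names of all macros the header text defines (include guards too)."""
    return set(DEFINE_RE.findall(header_text))

def macro_dependencies(header_text, source_text):
    """Macros defined in the header whose names also appear in the source."""
    return macros_defined(header_text) & set(IDENT_RE.findall(source_text))

# Hypothetical header and source file, used only for illustration.
header = (
    "#ifndef UTIL_H\n"
    "#define UTIL_H\n"
    "#define BUF_SIZE 256\n"
    "#define SQUARE(x) ((x) * (x))\n"
    "#endif\n"
)
source = '#include "util.h"\nchar buf[BUF_SIZE];\n'

print(macro_dependencies(header, source))
```

Here no declaration from the header is referenced, so an AST-only analysis would see nothing tying the source to it, yet removing the #include would break the build because of `BUF_SIZE`.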

Good points. I think the diagnostics would need some interpretation (by a human).

If we wanted to get "fancy", we could devise some way to determine if a declaration occurred within a #ifdef clause (special-casing the #ifndef guard that is commonly at the beginning/end of every C header). Unfortunately, any feature involving the preprocessor is usually more complex than it should be, given the ability to arbitrarily mutate headers in different contexts...including the same compilation unit :-(

Fortunately, ObjC headers and user-defined (i.e. non-system) headers usually aren't as gross...

snaroff

Just a thought: I wonder if for top-level #includes, one possibility is to remove the #include and see if it produces the same translation afterwards in the main source file. That would avoid the issues with explicitly reasoning about preprocessor logic, and would (I believe) handle the simplest case that people care about.
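
A rough sketch of that remove-and-compare idea, under stated assumptions: it uses the bytes of the compiled object file as a stand-in for "the same translation" (which is not an AST comparison), it assumes a compiler is reachable as `cc`, and every function name here is made up for illustration.

```python
import re
import subprocess
import tempfile
from pathlib import Path

INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')

def top_level_includes(source):
    """Return (line_index, header_name) for each #include in the main file."""
    return [(i, m.group(1))
            for i, line in enumerate(source.splitlines())
            if (m := INCLUDE_RE.match(line))]

def without_line(source, index):
    """Blank out one line so the remaining line numbers do not shift."""
    lines = source.splitlines()
    lines[index] = ""
    return "\n".join(lines) + "\n"

def compile_to_object(source, cc="cc", include_dir="."):
    """Compile C text; return the object file bytes, or None on failure.

    Assumption: `cc` is on PATH and quoted includes resolve via include_dir.
    """
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "unit.c").write_text(source)
        result = subprocess.run(
            [cc, "-c", "unit.c", "-o", "unit.o",
             "-I", str(Path(include_dir).resolve())],
            cwd=tmp, capture_output=True)
        if result.returncode != 0:
            return None
        return (Path(tmp) / "unit.o").read_bytes()

def removable_includes(source, translate=compile_to_object):
    """Headers whose removal, one at a time, leaves the translation unchanged."""
    baseline = translate(source)
    if baseline is None:
        raise ValueError("the unmodified file does not compile")
    return [header for index, header in top_level_includes(source)
            if translate(without_line(source, index)) == baseline]
```

Each #include is tested on its own, so two headers that are individually removable are not necessarily removable together. Because liveness can differ between targets, one would presumably run this once per configuration and keep only the headers flagged in all of them.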

For reference, we do this and have found it to be effective, especially
combined with a distributed compilation framework which can do even
the preprocessing phase
(http://google-opensource.blogspot.com/2008/08/distccs-pump-mode-new-design-for.html).
I'd be interested if there are less expensive ways to achieve these
results though.

I have opened a bug to track this feature request:

http://llvm.org/bugs/show_bug.cgi?id=5782

Andrew

Yes, I think this would be a sane and pretty strong approach. It
requires a good way to compare ASTs, which we don't have.

There is a lot of utility in removing includes in non-main files, as
well, of course.

- Daniel