[RFC] Add include-what-you-use tool to clang-tools-extra

Hi all,

This is a proposal to integrate include-what-you-use [1] into clang-tools-extra.

# Background
The include-what-you-use tool analyzes #includes in C and C++ files and
recommends how to improve them. The goal is to capture symbol dependencies in
code and produce the minimal set of #includes to satisfy these symbol
dependencies. For more information you can check the project site [2], the docs,
and the presentation from the 2010 LLVM Developers' Meeting [3].

# Benefits
Migration to clang-tools-extra doesn't come without a cost, so I want to list
some of the benefits this move yields.

## For Clang community
* Ability to reuse some of IWYU's analysis for other purposes. Currently IWYU is
  distributed as a CLI tool and has no API but it is possible to split out a
  separate library. I think there could be some integration potential with the
  budding refactoring tools, for example.
* More community input in deciding further IWYU direction to help it to be more
  useful for various parties.

## For include-what-you-use users
* Easier distribution and use. For users already using other Clang tools, IWYU
  should become as easy to use as any other Clang tool. I also expect it to make
  life easier for people packaging include-what-you-use for various *nix
  distributions.
* Moving the tool towards consistency with other Clang tools.

## For include-what-you-use project
* Exposure to more users.
* Easier release process. Instead of releasing IWYU separately, it could be
  bundled with LLVM+Clang releases. It shouldn't incur more work for LLVM+Clang
  releases as the main complexity comes from tracking different branches and
  building binaries for different platforms.
* More resiliency as the project becomes more community-owned instead of
  personally-owned.

# Potential downsides
When new Clang sub-projects are proposed, one of the most common concerns is the
maintenance burden. Dumping the code and walking away to let the community
support the code is unacceptable. The longevity of the project demonstrates
commitment to maintaining the project. The history on GitHub shows that
include-what-you-use is not a passing fancy that will be discarded and forgotten
in a week or two.

What is your opinion? Is there value in having include-what-you-use in
clang-tools-extra?

Thanks for any input,
- Kim

[1] https://github.com/include-what-you-use/include-what-you-use
[2] https://include-what-you-use.org/
[3] http://llvm.org/devmtg/2010-11/

As another include-what-you-use maintainer I support the proposal. I am doing so solely in my personal capacity and am not representing any third parties.

Regards,
Volodymyr

I’m not fundamentally opposed to the idea, but I’d expect it to be rather painful - speaking from experience of having upstreamed a non-trivially sized chunk of google-style code into clang, which was still significantly smaller than iwyu.

Generally, we need patches to be:

  1. LLVM style
  2. incremental, with design ideas discussed / vetted in review

(1) is just a big chunk of work, but (2) can quickly lead code into a very different direction from where it is now; that said, I do believe that it would make the code, especially for iwyu, a lot better - but it would also make it an incredible amount of work, knowing how much time went into creating iwyu in the first place.

Additionally, with C++ modules, a new idea of “use” is emerging (I don’t know whether standardization is far enough for that to be reliable), and I think we could / should and will build an “iwyu” implementation on top of that.
That doesn’t make a non-modules iwyu useless, but I’d argue that if we want iwyu to live within clang-tools-extra, we want that to be aligned with the semantics of “modules-use”, which I believe to be somewhat different; again, nothing is a show-stopper here, but I’d expect a significant amount of work.

Finally, given the current rate of tooling contributions across clang-tidy / clang-format / tooling / refactoring, and the number of upstream reviewers we have, I’d additionally expect the process to be rather slow; for example, the refactoring contributions have a much higher priority currently, and even those often take (imo) too long to be reviewed, which is partially my fault :)

With all that said, I don’t want to discourage you from trying, but I want to set clear expectations - it might feel / look like a rewrite of iwyu from scratch.

Cheers,
/Manuel

Hi!

Maybe I am missing something but could you summarize what is the difference between the iwyu tool and the already existing include-fixer in the repository?

Would iwyu deprecate include-fixer? Or does it make more sense to improve include-fixer and deprecate iwyu? Or are they completely separate?

Thanks,

Gábor

include-fixer (sorry for the name) solves the problem of adding missing includes. IWYU solves the problem of removing unused includes and resolving transitive includes. There is almost zero overlap between those two tools.

include-fixer works on broken code and adds missing includes (thus, “fixer”).

IWYU works on correct code, and makes sure it directly includes what is used; thus, it’s deleting includes that are not directly used (unlike include-fixer) and adding transitive includes that are directly used (unlike include-fixer, which would not do anything if an include is already provided transitively).
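
To make the distinction concrete, here is a toy model in Python. It is only a sketch and not how either tool is implemented: the function names `iwyu_suggestions` and `include_fixer_suggestions`, the header names, and the idea of a ready-made symbol-to-header map are all assumptions made for illustration.

```python
def iwyu_suggestions(direct_includes, used_symbols, declared_in):
    """Toy IWYU: the file already compiles; align direct includes with direct uses.

    direct_includes: headers the file includes itself
    used_symbols:    symbols the file uses
    declared_in:     hypothetical map from symbol to the header declaring it
    """
    needed = {declared_in[sym] for sym in used_symbols}
    to_add = sorted(needed - set(direct_includes))
    to_remove = sorted(set(direct_includes) - needed)
    return to_add, to_remove


def include_fixer_suggestions(visible_headers, used_symbols, declared_in):
    """Toy include-fixer: only add headers for symbols that do not resolve at all."""
    visible = set(visible_headers)
    return sorted({declared_in[sym] for sym in used_symbols
                   if declared_in[sym] not in visible})


# Hypothetical file: it includes "a.h" and "unused.h", and "a.h" includes "b.h".
declared_in = {"A": "a.h", "B": "b.h"}
direct = ["a.h", "unused.h"]
visible = ["a.h", "b.h", "unused.h"]  # direct plus transitive includes
used = ["A", "B"]

print(iwyu_suggestions(direct, used, declared_in))
print(include_fixer_suggestions(visible, used, declared_in))
```

In this model the IWYU-style function asks for "b.h" to be included directly and "unused.h" to be dropped, while the include-fixer-style function has nothing to do because every used symbol is already visible.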

In principle, I’d support having an IWYU tool in clang-tools-extra - it’s something that the team I work with and I would have found useful in past work.

I did a quick prototype several months ago of an IWYU-style tool, based on the clang libraries (particularly the AST_MATCHER, PPCallbacks and other related code), and without looking closely at the existing implementation. I was able to get it to do the basic stuff fairly quickly (only a few days’ work), although it certainly wasn’t as advanced as the actual IWYU. It was an interesting experiment, however.

Without looking into all the details of how the existing IWYU is implemented, and based on my own experiences and what Manuel said earlier, I wonder whether it might be worth building up an IWYU tool that is inspired by, but not necessarily identical to, the existing one?

Regards,

James

Yea. One question is how much of the “use” part of iwyu that’s already encoded in clang as part of modules we could re-use. The whole thing might be a lot simpler if we put in an interface to Sema to get at the full information while parsing, and implemented iwyu as “as-if” warnings.
Ben did explore this for a bit, but I don’t think it went anywhere.

Reimplementing the tool using the tooling available in the LLVM project would seem more appropriate to an LLVM-project tool. :)

I am not really familiar with the original IWYU tool, but one thing I remember from James’ work is that it would be fairly easy to implement different policies. For example, minimizing the number of #includes, versus always directly including the header that declares everything actually used in the source. That kind of flexibility is great.
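
As a rough illustration of what "different policies" could mean (this is not James' prototype, which isn't shown in this thread), here is a small Python sketch. The function `headers_to_include`, the policy names "direct" and "fewer", and the header names are all hypothetical, and the "fewer" policy is a simple heuristic that assumes an acyclic include graph.

```python
def reachable(header, includes):
    """Headers reachable from `header` through one or more #include steps."""
    seen = set()
    stack = list(includes.get(header, ()))
    while stack:
        h = stack.pop()
        if h not in seen:
            seen.add(h)
            stack.extend(includes.get(h, ()))
    return seen


def headers_to_include(used_symbols, declared_in, includes, policy):
    """Pick the headers a file should include directly under a given policy.

    declared_in: hypothetical map from symbol to the header declaring it
    includes:    hypothetical map from header to the headers it includes
    policy:      "direct" includes every declaring header;
                 "fewer" drops headers already pulled in by another needed header
    """
    needed = {declared_in[sym] for sym in used_symbols}
    if policy == "direct":
        return sorted(needed)
    if policy == "fewer":
        return sorted(h for h in needed
                      if not any(h in reachable(other, includes)
                                 for other in needed if other != h))
    raise ValueError("unknown policy: " + policy)


# Hypothetical headers: "widget.h" itself includes "string_util.h".
includes = {"widget.h": {"string_util.h"}, "string_util.h": set()}
declared_in = {"Widget": "widget.h", "Trim": "string_util.h"}
used = ["Widget", "Trim"]

print(headers_to_include(used, declared_in, includes, "direct"))
print(headers_to_include(used, declared_in, includes, "fewer"))
```

Under the "direct" policy both headers are listed; under the "fewer" policy only "widget.h" remains, because it already brings in "string_util.h". Keeping the policy as a parameter is what makes this kind of choice cheap to swap.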

--paulr

Hi Manuel,

Thanks for your thoughtful reply.

I'm not so worried about LLVM style, I think we can get there without too much
effort with a combination of common sense, sed and clang-format.

I'm a strong proponent of incremental patch-by-patch development, and would
absolutely prefer further IWYU development in that style. But I'm not sure I see
the point of re-implementing IWYU for the purpose of small, incremental
patches. I agree that many design choices would be different if built from
scratch, but it seems like a waste to throw away all the hard work that has
already gone into IWYU.

It would basically stop the project dead in its tracks; bug fixes, improvements
and contributions would need to be put on hold until a rewrite is
completed.

My preference would be to make incremental changes on top of the existing
functional code base, out-of-tree or in-tree, to make it more in line with LLVM
standards.

Can we go back to first principles and talk about the value gained by small
increments? Maybe there's a less disruptive way to reach the same goal.

Thanks,
- Kim

Towards the end of his tenure with IWYU, csilvers mentioned an idea
like this, in passing, and I think it makes a lot of sense.

I'm not familiar/comfortable enough with Sema to make it happen, but
if a foundation and a few representative examples come up, I'd be more
than happy to help work on it.

- Kim

I think exploring a new IWYU would be interesting and rewarding. It
would be nice if such an initiative could build on the IWYU test suite
in some form, I suspect that the easy cases are easy to get right, and
that some of the complexity in the current IWYU comes from the edge
cases.

That said, some/much(?) of the IWYU complexity is probably incidental,
and it would be nice to clean that up.

I'm not sure the most productive way to do that is to start from
scratch, though.

- Kim

I fully agree. My main point is: I don’t think putting it into clang-tools-extra in its current form is the right approach. I don’t know of any better way to get it into the form it would need for clang-tools-extra than through incremental patches to clang-tools-extra. Many folks just read the mailing list for patches, and if we try to go around the usual approach (for example by doing reviews in the current location), important feedback / objections might only drop in when the full thing goes in at the end.

That said, I perhaps also don’t find it super important for iwyu to be in clang-tools-extra - it’d most certainly be nice, and make me happy on a principled basis, but given the current state of the world, I’d rather wait a bit to see how things play out than try to force it.

Perhaps I’m missing something?

Cheers,
/Manuel

Currently include-what-you-use is considered to be competing with Clang tools, existing and potential. It would be great to resolve this competition issue. One way to achieve that is to make IWYU one of the Clang tools. I’ll be glad to learn about other options; so far I don’t know of any.

Thanks,
Volodymyr

Currently include-what-you-use is considered to be competing with Clang tools, existing and potential.

By whom? Why?

I don't have an official legal evaluation of the situation, but employers tend
to disapprove of employees working on competing products. In my limited
experience, restrictions can be pretty broad, prohibiting work on
products or services in the same area as, or competing with, your employer's
current or reasonably anticipated products or services. I believe it is
reasonable to anticipate a tool like include-what-you-use among other Clang
tools. So for anybody working on Clang as a part of their job, contributing
to IWYU might be problematic.

What is the experience of others? Is it better to leave resolving such
matters to individual contributors or to make a project easier to
contribute to?

Thanks,
Volodymyr

I would understand this if clang (or clang tools) were a product of a company. It is an open source project, though, so I’m not sure how that applies. That said, if anybody is in that kind of situation and thus feels like they can’t contribute to IWYU, I’d be curious to learn about the specific case.

Right. The original proposal listed a number of pros and cons, but of
course I'm looking at this mostly from IWYU's perspective.

It would be a great help to us to be closer to Clang, especially when
it comes to packaging and releases -- we spend a lot of our time just
adjusting build systems to keep both in-tree and out-of-tree builds
running smoothly.

We also have a (very informal) API wishlist that's hard to discuss in
a principled way when IWYU is just hanging by the sidelines. I'm
hoping it would be easier with closer proximity, especially if we
could put in some work to help build it out. That would benefit other
Clang tools as well.

Our testing infrastructure is pretty good, but let's be frank: it's not lit.

And the list goes on -- IWYU is built and developed as if part of
Clang, but it's hard to keep up! We're hoping it would be easier for
us, and a net benefit for all, if we just joined the streams.

- Kim