[RFC] IRBuilder polymorphism: Templates/virtual

Hi,

The IRBuilder is currently templated over a constant folder and an instruction inserter. https://reviews.llvm.org/D73835 proposes to switch both to virtual dispatch instead. As this is a larger design change, I would like to get some feedback on it.

The current templated design of IRBuilder has several problems:

  1. It’s not possible to share code between use-sites that use different IRBuilder folders/inserters (short of templating the code and moving it into headers).
  2. Methods currently defined on IRBuilderBase (which is not templated) do not use the custom inserter, resulting in subtle bugs (e.g. incorrect InstCombine worklist management). It would be possible to move those into the templated IRBuilder, but…
  3. The vast majority of the IRBuilder implementation has to live in the header, because it depends on the template arguments.
  4. We have many unnecessary dependencies on IRBuilder.h, because it is not easy to forward-declare. (Significant parts of the backend depend on it via TargetLowering.h, for example.)
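To make problems 1 and 3 concrete, here is a minimal toy sketch. It does not use real LLVM types: ToyBuilder, FoldingPolicy, NonFoldingPolicy and emitIncrement are hypothetical names, and values are modelled as plain strings. It only shows the shape of the issue: a helper that should work with builders of different folder types has to be a template itself, so its body has to be visible in a header.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: illustrative stand-ins, not LLVM classes.
struct FoldingPolicy {
  // Folds "a + b" into a constant.
  static std::string add(int a, int b) { return std::to_string(a + b); }
};

struct NonFoldingPolicy {
  // Emits an explicit add instruction instead of folding.
  static std::string add(int a, int b) {
    return "add " + std::to_string(a) + ", " + std::to_string(b);
  }
};

// The builder type depends on its folder, as with the templated IRBuilder.
template <typename FolderT>
class ToyBuilder {
public:
  std::string createAdd(int a, int b) {
    std::string value = FolderT::add(a, b);
    emitted.push_back(value);
    return value;
  }
  std::vector<std::string> emitted;
};

// ToyBuilder<FoldingPolicy> and ToyBuilder<NonFoldingPolicy> are unrelated
// types, so a shared helper must be templated over the builder type.
template <typename BuilderT>
std::string emitIncrement(BuilderT &builder, int x) {
  return builder.createAdd(x, 1);
}

std::string demoTemplatedHelper() {
  ToyBuilder<FoldingPolicy> folding;
  ToyBuilder<NonFoldingPolicy> plain;
  std::string a = emitIncrement(folding, 41);
  std::string b = emitIncrement(plain, 41);
  assert(folding.emitted.size() == 1 && plain.emitted.size() == 1);
  return a + " | " + b;
}
```

Each distinct builder type instantiates its own copy of emitIncrement, which is the code-sharing and header-dependency cost described in the list.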

The referenced patch makes the constant folder and instruction inserter use virtual dispatch instead. IRBuilder remains templated, but is only a thin wrapper around IRBuilderBase, which contains all the logic and creation methods. Functions using the IR builder only need to accept IRBuilderBase&, and headers can forward-declare that type.
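The design described in this paragraph can be sketched with a toy model. This is not the code from D73835: every name below (ToyFolder, ToyInserter, ToyBuilderBase, ToyBuilder and so on) is hypothetical, and values are modelled as strings. The sketch assumes the wrapper owns the concrete folder and inserter and hands references to a non-templated base that does all the work through virtual calls.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not the classes from the patch.
class ToyFolder {
public:
  virtual ~ToyFolder() = default;
  virtual std::string createAdd(int a, int b) const = 0;
};

class ToyInserter {
public:
  virtual ~ToyInserter() = default;
  virtual void insert(const std::string &inst) = 0;
};

// Non-templated base: holds the creation logic and reaches the folder and
// inserter only through virtual calls.
class ToyBuilderBase {
public:
  ToyBuilderBase(const ToyFolder &folder, ToyInserter &inserter)
      : folder_(folder), inserter_(inserter) {}
  std::string createAdd(int a, int b) {
    std::string value = folder_.createAdd(a, b);
    inserter_.insert(value);
    return value;
  }

private:
  const ToyFolder &folder_;
  ToyInserter &inserter_;
};

// Storage lives in a base class listed first, so it is constructed before
// ToyBuilderBase receives references to it.
template <typename FolderT, typename InserterT>
struct ToyBuilderStorage {
  FolderT ownedFolder;
  InserterT ownedInserter;
};

// Thin templated wrapper: owns the concrete objects, adds no logic.
template <typename FolderT, typename InserterT>
class ToyBuilder : private ToyBuilderStorage<FolderT, InserterT>,
                   public ToyBuilderBase {
public:
  ToyBuilder() : ToyBuilderBase(this->ownedFolder, this->ownedInserter) {}
  InserterT &getInserter() { return this->ownedInserter; }
};

struct ToyConstantFolder final : ToyFolder {
  std::string createAdd(int a, int b) const override {
    return std::to_string(a + b);
  }
};

struct ToyNoFolder final : ToyFolder {
  std::string createAdd(int a, int b) const override {
    return "add " + std::to_string(a) + ", " + std::to_string(b);
  }
};

struct ToyRecordingInserter final : ToyInserter {
  std::vector<std::string> log;
  void insert(const std::string &inst) override { log.push_back(inst); }
};

// A plain function: it only needs ToyBuilderBase&, so a header could
// forward-declare that class and keep this body in a .cpp file.
std::string emitIncrement(ToyBuilderBase &builder, int x) {
  return builder.createAdd(x, 1);
}

std::string demoVirtualDispatch() {
  ToyBuilder<ToyConstantFolder, ToyRecordingInserter> folding;
  ToyBuilder<ToyNoFolder, ToyRecordingInserter> plain;
  std::string a = emitIncrement(folding, 41);
  std::string b = emitIncrement(plain, 41);
  // The custom inserter is used even though the helper sees only the base.
  assert(folding.getInserter().log.size() == 1);
  return a + " | " + b;
}
```

In this sketch a single non-templated emitIncrement serves both builder configurations, and the recording inserter still observes every instruction, which is the behaviour problem 2 asks for.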

The disadvantage of the change is that additional virtual dispatch may make the IRBuilder a bit more expensive. Moving the implementation of IRBuilder methods from the header into the cpp file (not a direct part of the proposed change, but a natural follow-up it enables) would further limit inlining opportunities.

What are your thoughts on this?

Regards,

Nikita

I am in favor of this, for all the reasons you mention. In fact, I was
tempted to do this myself in the past. This is particularly an issue
for external tools that want to leverage LLVM without forking the
upstream, which is a use case that LLVM should support better.

Generally, I think LLVM errs too much on the side of monomorphizing
everything, which sounds nice in theory, but the runtime performance
implications are often questionable, and the compile time certainly
suffers. I think we should take some time to think through the details
of this change, but we should make it.

Cheers,
Nicolai

So long as we don't see any measurable (negative) performance impact, I agree that we should do it.

-Hal

IMHO, all the reasons you mention are valid ones.

If you have time, a "before/after" benchmark would be useful: a pass
that uses IRBuilder extensively, run on code that triggers it a lot (I
don't have anything specific in mind right now).

+1

Nikita posted some performance comparisons on the Phabricator review
https://reviews.llvm.org/D73835, results here:
https://gist.github.com/nikic/9a87083358d98f34e8e64851a84ff864

It seems like the difference in compile time is in the noise.

Cheers,
Nicolai

Agreed. This would also be beneficial to projects like Clang and (maybe?) the frontend library that wish to expose interfaces for emitting code without necessarily exposing their internal function-builder types (e.g. Clang’s CodeGenFunction). Currently this is only possible by either (1) defining everything in a header or (2) hard-coding a constant folder and insertion callback.

John.