RFC: Enzyme, Automatic Differentiation for LLVM as an LLVM Incubator Project

Hi all,

Automatic differentiation (AD) is a key component in algorithms used in machine learning, scientific computing, and elsewhere.

For the last year-and-a-half, the Enzyme group have been looking at the practical possibility of doing automatic differentiation as part of the LLVM optimization pipeline. Performing automatic differentiation in LLVM is quite beneficial as it allows all of the languages that lower to LLVM to incorporate automatic differentiation without much additional work. It also allows for automatic differentiation across languages, which is similarly beneficial.

One unexpected benefit we found of doing AD at the LLVM-level is that there is a significant performance benefit (4.2x in our tests) to be gained by performing AD after LLVM’s optimization passes [1].
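As an illustration of why ordering could matter, here is a minimal sketch in Python rather than LLVM IR. It assumes a vector normalization whose magnitude has already been hoisted out of the loop, as loop-invariant code motion would do. The function names are hypothetical and the gradient is written by hand; this is not Enzyme output. Because `mag` is evaluated once in the forward code, the reverse sweep also passes through it only once. Differentiating an unhoisted loop would repeat that work on every iteration.

```python
import math


def mag(xs):
    # Euclidean norm of the input vector.
    return math.sqrt(sum(x * x for x in xs))


def normalize_hoisted(xs):
    # mag(xs) is computed once, as it would be after hoisting out of the loop.
    m = mag(xs)
    return [x / m for x in xs]


def grad_normalize_hoisted(xs, douts):
    # Hand-written reverse pass for normalize_hoisted.
    # douts[i] is the adjoint of the i-th output.
    m = mag(xs)
    # Direct contribution of out[i] = x[i] / m to x[i].
    dxs = [dout / m for dout in douts]
    # Adjoint of the shared magnitude, accumulated over all outputs.
    dm = -sum(dout * x for dout, x in zip(douts, xs)) / (m * m)
    # Single reverse sweep through mag: d(m)/d(x[j]) = x[j] / m.
    return [dx + dm * x / m for dx, x in zip(dxs, xs)]
```

For example, `grad_normalize_hoisted([3.0, 4.0], [1.0, 0.0])` gives the gradient of the first normalized component with respect to both inputs.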

After several months of testing with various users including the Rust [4, 5], C/C++, Julia [6], Fortran, and machine learning communities, we’d like to share LLVM-based automatic differentiation more widely and ask to be considered as an LLVM incubator project.

Our code is available here (https://github.com/wsmoses/Enzyme/tree/master/enzyme) as a plugin for LLVM versions 7 through master. We’ve had weekly meetings for the past several months with folks from MIT, Argonne, Princeton, Google, NVIDIA, and Facebook and welcome anyone who wants to join. Documentation and install instructions for Enzyme are available here: https://enzyme.mit.edu. We have our charter available here:
https://docs.google.com/document/d/10IK2EgZa-4WF0lOSlkND1_cX3IQLAxEVSOWqbQzNpcs/edit#

Performing automatic differentiation inside of LLVM presents several interesting technical questions, which we’ve explored with the community in a poster and SRC talk at the 2020 US LLVM Dev Meeting [2, 3].

The Enzyme team

[1] https://proceedings.neurips.cc/paper/2020/file/9332c513ef44b682e9347822c2e457ac-Paper.pdf

[2] https://c.wsmoses.com/posters/Enzyme-llvmdev.pdf

[3] https://www.youtube.com/watch?v=auQNFDlaXdM, https://c.wsmoses.com/presentations/enzyme-llvmdev-reduced.pdf

[4] https://github.com/tiberiusferreira/oxide-enzyme, https://github.com/bytesnake/oxide-enzyme

[5] https://internals.rust-lang.org/t/automatic-differentiation-differential-programming-via-llvm/13188

[6] https://github.com/wsmoses/Enzyme.jl

Hi,

Since the project has already been going on for some time, what is the current level of maturity?
You’re proposing it as an incubator project; how far is it from the point where it could be integrated into the monorepo? How do you see the roadmap on this?

Thanks,

Integration into the monorepo is an interesting logistical question.

Right now with the absence of AD for prior LLVM versions, Enzyme aims to support LLVM 7 onwards for users that require a specific LLVM (e.g. Julia currently requires LLVM 9).

Obviously integration into the monorepo itself would fix this for subsequent versions, so it’s somewhat of a chicken-and-egg issue. I’d love to hear any thoughts from folks on how something like this might eventually be handled. I’d also like to see upstream users (for example to differentiate MLIR, ideally with nice integration for reductions; see the comment below regarding parallelism).

Outside of logistics, development velocity is somewhat high right now as we explore efficient extensions to Enzyme for parallelism (CPU, GPU, MPI, etc). In essence the additional complexity from parallelism stems from the fact that a benign read race in the forward pass becomes a write race in the reverse pass. Ideally this is handled with efficient reductions for performance (we currently support a specific subset of parallel codes and fall back to using atomics).
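To make the read-race/write-race point concrete, here is a small sketch in plain Python with hypothetical names, not Enzyme's implementation. In the forward loop every iteration only reads the shared scalar `a`. In the reverse loop every iteration accumulates into the shared adjoint `da`, so running the iterations in parallel would need either an atomic update or a reduction over per-iteration partials.

```python
def forward(a, xs):
    # Every iteration only READS the shared scalar `a`: harmless in parallel.
    return [a * x for x in xs]


def reverse_serial(a, xs, dys):
    # Reverse pass written as a serial loop.
    da = 0.0
    dxs = [0.0] * len(xs)
    for i, (x, dy) in enumerate(zip(xs, dys)):
        dxs[i] = a * dy  # private per-iteration write: no conflict
        da += x * dy     # shared write: a race if iterations ran in parallel
    return da, dxs


def reverse_reduction(a, xs, dys):
    # Same result, restructured so each iteration produces a private partial
    # that is combined afterwards, which is the shape a parallel reduction takes.
    partials = [x * dy for x, dy in zip(xs, dys)]
    dxs = [a * dy for dy in dys]
    return sum(partials), dxs
```

Both reverse functions compute the same adjoints; the second simply separates the conflicting accumulation from the per-iteration work.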

My hope would be to graduate to the monorepo or similar after settling these questions and having more folks battle-test the system.

Hi,

This would be very useful for the field of high energy physics and data science in general.

Having Enzyme as part of LLVM would reduce the amount of custom code we have in our autodiff tool, Clad. I can see it becoming common ground for a bunch of other tools across languages that generate LLVM IR.

Thanks for working on making this upstream!

Best, Vassil

Hi William,

I think this is a really cool project and worthy of being in LLVM. For now it’s a plugin pass, which goes well with incubator projects, but it could very well be a standard IR pass that is enabled by flags, etc. We have similar examples for OpenCL, OpenMP, etc. which need integration on both Clang and LLVM. Shouldn’t be too messy.

Hi all,

Since it seems like all of the feedback here is positive, what would be the next steps (migrate Enzyme mailing list to LLVM, create discord/discourse, etc)?

@Renato

The split support sounds like a good solution to me.

Regarding Enzyme/MLIR, the idea there isn’t necessarily to use Enzyme to differentiate MLIR directly, but lowering MLIR to LLVM then running Enzyme could be an interesting (though not necessarily ideal) way to provide differentiable programming in MLIR. We’re also considering extending Enzyme to work directly on MLIR as well and while indeed many parts of the analysis are specific to LLVM instructions, other differentiation specific analyses likely will have components which can be shared (e.g. Activity Analysis which determines whether there exists a path through the program that enables differentiable information to flow from input to output).
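As a rough sketch of the path-existence idea described above (and only that idea), activity can be modelled as reachability on a value-dependency graph: a value is treated as active if it is reachable from a differentiable input and can itself reach a differentiable output. The graph encoding and the function names below are assumptions made for illustration; a real analysis over LLVM IR or MLIR would have to handle far more than this.

```python
from collections import defaultdict


def reachable(edges, starts):
    # Depth-first search: all nodes reachable from `starts` along `edges`.
    seen = set(starts)
    stack = list(starts)
    while stack:
        node = stack.pop()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def active_values(uses, inputs, outputs):
    # `uses` maps each value to the values computed from it (hypothetical encoding).
    forward = defaultdict(list)
    backward = defaultdict(list)
    for src, dsts in uses.items():
        for dst in dsts:
            forward[src].append(dst)
            backward[dst].append(src)
    from_inputs = reachable(forward, inputs)    # values influenced by an input
    to_outputs = reachable(backward, outputs)   # values that influence an output
    # Active values lie on some path from an input to an output.
    return from_inputs & to_outputs
```

For instance, with `uses = {"x": ["t1", "dbg"], "c": ["t1"], "t1": ["y"]}`, input `x`, and output `y`, the values `x`, `t1`, and `y` are active, while `c` (not derived from the input) and `dbg` (never reaching the output) are not.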

Cheers,
Billy

Since it seems like all of the feedback here is positive, what would be the next steps (migrate Enzyme mailing list to LLVM, create discord/discourse, etc)?

I wouldn’t worry about merging the lists too soon (very high traffic). MLIR has a separate channel and that seems to be working for them, you could follow their path at least for now.

I couldn’t find the code’s license, but since you’re working with MIT, I imagine it’s compatible (and convertible) to the LLVM license. Everything else checks for me, including the migration plan towards the monorepo.

Mehdi, does that answer your questions?

Should we look for more people to have a look and comment? Alex, perhaps mentioning on the weekly again next week to see if we get more people to look at it?

Regarding Enzyme/MLIR, the idea there isn’t necessarily to use Enzyme to differentiate MLIR directly, but lowering MLIR to LLVM then running Enzyme could be an interesting (though not necessarily ideal) way to provide differentiable programming in MLIR. We’re also considering extending Enzyme to work directly on MLIR as well and while indeed many parts of the analysis are specific to LLVM instructions, other differentiation specific analyses likely will have components which can be shared (e.g. Activity Analysis which determines whether there exists a path through the program that enables differentiable information to flow from input to output).

MLIR doesn’t always go to LLVM IR, that’s why I suggested it.

Mehdi, does that answer your questions?

Yes, sorry I didn’t follow up, but William’s answer was perfectly fine with me.
The first step will be to get a repo on GitHub in the LLVM project; we can set up a Phabricator project to track it if needed (other incubator projects are using pull requests).
After that, adding a subsection in the incubator category alongside the others at https://llvm.discourse.group, and something similar on Discord, is fairly straightforward.

Should we look for more people to have a look and comment? Alex, perhaps mentioning on the weekly again next week to see if we get more people to look at it?

Yes that’d be great to have more people chime in and express some support on this!

I wouldn't worry about merging the lists too soon (very high traffic).
MLIR has a separate channel and that seems to be working for them, you
could follow their path at least for now.

I think the idea was `enzyme-dev@lists.llvm.org`, not merging the lists ;)

Should we look for more people to have a look and comment? Alex, perhaps
mentioning on the weekly again next week to see if we get more people to
look at it?

Yes that'd be great to have more people chime in and express some support
on this!

+1

Ah! That works, too. :)

Yeah, we’ve explicitly marked our code as being under the LLVM license, but I’ll check back in with our Technology Licensing Office to make sure everything is square.

And the more people who have thoughts, the better!

Indeed, simply moving the current enzyme-dev Google Group to be a proper llvm list was the idea.