[RFC] BOLT: A Framework for Binary Analysis, Transformation, and Optimization

Hi Maksim,

Any updates on adding BOLT to LLVM?

If you need any help / support, feel free to ask. The World is waiting
for BOLT! :)

Yours,
Andrey

One more thing (to clarify my interest): my team is working on Golang
support in BOLT, and we're keen to open-source our developments
(pending approvals from the higher-ups). We would much prefer to
contribute our code to the LLVM project.

Hi Andrey,

We appreciate your interest and we look forward to collaborating. We are currently rebasing BOLT on top of LLVM trunk. Since it’s been a while since the last rebase, this is a bit of an involved task and we need to work through a rather lengthy list of conflicts. After we finish this and make sure BOLT works on the new repo, we plan to publish the list of commits and the merging diff so the community can evaluate a project merge proposal that works.

Regarding the project organization, remember that BOLT was created before the LLVM monorepo. To address this, we are currently going for an approach similar to the one used by flang, re-editing all of our history on top of a new folder structure (root repo /bolt, similar to /flang), but trying to keep old commits mostly intact so we preserve project history – I’m happy to change this to whatever makes more sense to the community. The least intrusive way to do this that I found was the flang merge approach. Now, because the project is not so small, we need a starting point that works in LLVM trunk, with everything self-contained in /bolt and as few diffs as possible in /llvm, and then from there possibly work on evolving the project towards another suggested organization (such as breaking BOLT up into a lib under llvm/lib). But first we wanted to start with the rebase, which we knew would take some time.

That’s the gist of the current direction, thanks for pinging!

-Rafael

Hi Rafael,

Thanks for the update!

I understand that preparing a big project for inclusion into LLVM
properly is a ton of work. Again, if you need any help / support,
please let me know.

Yours,
Andrey

Hi,

We finished rebasing BOLT on top of the LLVM monorepo and we verified that the new BOLT is performing as expected. To make BOLT work, we have a few changes to LLVM libs, which we will submit for review (first changes are already up: D97531, D97830, D97899, D97898, D97891, D97830).

The plan for the initial BOLT commit is to include all its parts under a single directory, either /bolt or /llvm/tools/llvm-bolt. Once complete, this approach will allow people to directly contribute to the project and start using BOLT as part of LLVM. After this phase, we would like to start working with the community to break BOLT into separate components that will make it easier to build new tools based on the BOLT technology. As suggested by Propeller folks, we will split the disassembler component from the rest and make it possible to perform optimizations on low-level binary IR, which will likely have a serializable form.

The proper location of BOLT in the monorepo is still unclear, though. In our rebased branch, we are currently in a /bolt top-level folder in the monorepo, but are also considering /llvm/tools/llvm-bolt.

We are trying to work out the pros and cons of living in these locations and would appreciate community input. From our understanding, living under the /bolt top-level folder would give BOLT the following advantages:

  • More independence to build a test infrastructure for BOLT. We could make check-bolt depend on LLD, for instance, if we need to build binaries on the fly to test BOLT features. Generating test inputs is a big problem for us, since we can’t add real-world test binaries into the LLVM repo (which are awkward to track in the repo and also use a lot of space).
  • We would share a similarity with other large projects such as flang and lld in location: these projects have their own top-level folder too.
  • It would make more sense to live in a top-level folder because we intend to support building multiple tools (llvm-bolt, llvm-boltdiff, perf2bolt, merge-fdata). Living under llvm/tools is typically reserved for simpler single-binary projects.

Living in /llvm/tools/llvm-bolt, on the other hand, is perhaps more aligned with a longer-term goal of migrating BOLT to live as a lib under /llvm/lib and has the following advantages:

  • Piggybacking on the LLVM release process, BOLT is released along with other llvm tools
  • Piggybacking on buildbots being configured to build llvm tools, the project is more robust and well tested
  • BOLT was originally developed to live under tools, and the project was named llvm-bolt to reflect that
  • Being closer to LLVM will allow BOLT to migrate functionality more easily to llvm/lib

Any thoughts on this?

I’m probably not the most relevant opinion here, so take any of this with a grain of salt: Generally I’d err towards inclusion in the llvm subproject, as you say, for easy movement of code into reusable libraries, etc - though I guess if you’re a sibling like clang or lld that’s still possible - sinking code down into the common llvm infrastructure as desired.
How much code is bolt? If it’s in llvm, how much more CPU time does it add to build and test?

As for testing - is llvm-mc’d assembly sufficient for testing? That might be a tipping point in deciding whether it should live separately (so that folks can opt out of it)
“real-world test binaries” are probably not a thing that should be part of the usual testing, if by that you mean existing/production binaries, as opposed to small targeted binaries of only a few instructions (enough to demonstrate some specific feature of bolt). In the same way, lld’s test suite doesn’t have “real world” object files being linked into full production binaries, but small targeted/hand-crafted examples.

Hi David,

With respect to the amount of code, what we would add is pretty much the code that is in https://github.com/facebookincubator/BOLT but rebased to use updated LLVM interfaces. These files/folders in the root folder of the facebook github repo would then live in llvm/tools/llvm-bolt (that’s how we do it but in an older fork of llvm).

Regarding CPU time used for the build of llvm, the burden we add is about 80 new C++ files and 2 binaries to be linked (llvm-bolt and merge-fdata – other tools are just a symlink to llvm-bolt). I did a quick check here and my machine built llvm+clang+lld in 6m5s (user time 273m) and llvm+clang+lld+bolt in 6m20s (user time 284m). Testing in LIT is minimal at 20 tests (running in a few seconds), but we would like to expand it and support it better. Internally we have more LIT tests, but unfortunately they rely a lot on real binaries (not necessarily large, but think of bzip2, for example, which is large enough that it does not make sense to put it into the repo, because it doesn’t isolate a single feature of bolt that needs testing).
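To put those timings in relative terms, here is a minimal sketch that computes the overhead implied by the numbers quoted above. The helper names are hypothetical and only the four figures from this message are used; this is a single quick measurement on one machine, not a benchmark.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


def overhead_percent(baseline, with_bolt):
    """Relative increase of with_bolt over baseline, in percent."""
    return (with_bolt - baseline) / baseline * 100.0


# Figures quoted in the message above.
wall_base = to_seconds(6, 5)    # llvm+clang+lld, wall clock
wall_bolt = to_seconds(6, 20)   # llvm+clang+lld+bolt, wall clock
user_base = to_seconds(273)     # llvm+clang+lld, user time
user_bolt = to_seconds(284)     # llvm+clang+lld+bolt, user time

wall_overhead = overhead_percent(wall_base, wall_bolt)
user_overhead = overhead_percent(user_base, user_bolt)

print(f"wall-clock overhead: {wall_overhead:.1f}%")  # prints 4.1%
print(f"user-time overhead:  {user_overhead:.1f}%")  # prints 4.0%
```

Both measures come out at roughly 4%, which is consistent with the build being well parallelized on that machine.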

The smaller set of 20 tests we currently have consists of targeted hand-crafted inputs written in assembly, which are nice to read and understand, but the problem is that they need to go through a linker before BOLT can consume them. If we can’t use a linker, I guess we could check the binaries directly into the repo if they are minimal, even though people wouldn’t be able to easily read the contents. We could make BOLT read .o files directly for testing purposes (straight out of llvm-mc), but that feature needs to be developed.

In general llvm/tools are supposed to be entry points that exercise the LLVM libraries. I’d be concerned about adding a tool/bolt that contains more than that (i.e. the entire implementation of the framework, instead of having it live in libraries). But it seems like you intend this as a step towards this? Is there a well-defined plan to get there?

Is it difficult / overly involved to split things like the disassembler and other components in libraries that can live in llvm/lib/... and use them from tools/bolt/? Can this be done ahead of time and upstream these libraries first ahead of bolt itself?

Thanks,

Hi Rafael, Thanks for the update on the plan.

I have a question about upstreaming phase ordering. Is there a strong reason to proceed with the order as proposed? It seems more natural to me to do it the other way around: 1) refactor the bolt code; 2) check in utility libraries in LLVM; and then 3) push the BOLT main implementation. There are many advantages to doing that:

  1. It makes patches smaller and easier to review;
  2. It enables finer-grained testing of the supporting libraries;
  3. It gives us a better idea where to drop the primary BOLT product (the two choices proposed).

Thoughts?

thanks,

David

Hi Mehdi and David,

Indeed, we share similar concerns. We do intend to move functionality of BOLT to live as a library, but the timeline is unclear. In fact, most of BOLT could live in a library already, it’s just a matter of moving some files into separate components. Instead of the files living in tools/llvm-bolt, most could just be moved under lib/something, and we already have a llvm-bolt.cpp file that instantiates the driver that coordinates the binary rewriting process, which is the entry point of BOLT as a library. People could already leverage this to use BOLT in different ways (for example, I wrote some time ago a different utility that runs the driver for two different binaries and compares the two – this was named boltdiff later).

My main reason for committing the project as a whole first, as a project merged into the monorepo in the same way flang did, is that BOLT has been open source for a while: it is a 6-year-old project with about 800 commits and 50K lines of code, and we know we have people who forked the project and would like to contribute to it. If I commit into LLVM a different BOLT (not just rebased), then I (a) break or make it hard for any work on top of it from other contributors, and (b) lose the original history or make it harder to preserve it. That’s why I was going for a smoother transition. I, as a developer, put value in the ability to blame and to understand why things were built a certain way, and not bringing BOLT’s history (in the same way as flang did) would mean we and the community lose a lot of context on the decisions of the project. And I guess that’s also the rationale for a monorepo, to have multiple projects merged together.

Because of that, I initially put bolt under /bolt, following flang’s model of merging the history so every developer has the right context. But the original location was under llvm/tools.

That makes sense, but something unclear to me is that refactoring it into separate libraries in-tree right after merging it will also “break any work on top of it” from people who forked it, wouldn’t it? How would this be managed after Bolt gets in-tree?

I guess a first step could be to produce a “snapshot” of the monorepo after you rebase, so that folks can look at the actual proposal, the code structure, and discuss the actual modifications that would be required pre-merge and agree on the plan post-merge. How does that sound to you?

Best,

As with others, I’m not very aware of the internal architecture of bolt, so take this with a grain of salt:

From what I understand, I have a slight preference for starting this out as a /bolt top level “subproject”, because the code currently sounds monolithic. As the implementation logic is refactored into more reusable units, those libraries can be cleanly moved within the monorepo, e.g. under the llvm-project/llvm directory if appropriate.

The advantage of doing this is that nothing in the llvm-project/llvm repo can come to depend on the bolt code unless and until it gets refactored. This is also how things like LLDB started out (and it would be great for more of the reusable libraries in LLDB to be merged into LLVM over time).

Does anyone have any concerns about this approach?

Unrelatedly, I’d also love to see the llvm repository exploded a bit into more top level repos, e.g. splitting support/adt out to their own thing. It is also worth considering splitting the MC layer out to its own thing as well, LLVM IR and the mid-level optimizer into its own thing, and CodeGen and the targets into its own thing.

The major constraint is that we want the dependences between top-level subprojects to form a strong DAG, both now and defensibly into the future, and we don’t want minor evolution of the codebase to cause libraries to have to be moved around. The benefits of splitting it up are easier enforcement of layering, encouraging LLVM developers to work across subprojects a bit more, and making it easier for subprojects to depend on slices of “the big llvm directory”.
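As an illustration of the DAG constraint, here is a minimal sketch of checking that a set of subproject dependencies is acyclic, using Python's standard `graphlib` module. The subproject names and every edge below are hypothetical and chosen only for the example; they are not a description of LLVM's actual or proposed layering.

```python
from graphlib import CycleError, TopologicalSorter


def layering_order(deps):
    """Return a valid bottom-up order if deps form a DAG, else None.

    deps maps each subproject to the subprojects it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError:
        return None


# Hypothetical split, loosely following the slices mentioned above.
deps = {
    "support": [],
    "mc": ["support"],
    "ir": ["support"],
    "codegen": ["ir", "mc", "support"],
    "bolt": ["mc", "support"],
}

order = layering_order(deps)
print(order)  # dependencies always appear before their dependents

# A back edge from a low layer to a high one breaks the layering.
broken = dict(deps, support=["bolt"])
print(layering_order(broken))  # prints None
```

A check of this kind could run in CI so that a change introducing a back edge between subprojects is caught at review time rather than after libraries have to be moved around.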

-Chris

Dropping Bolt to the top level directory sounds reasonable, but perhaps a hybrid approach similar to what is mentioned by Mehdi can be applied. Basically, Bolt first goes through a round of refactoring in its upstream github repo, with a design that is close to the future structure in LLVM, and then drops in as a monolithic piece initially. This will make future restructuring much easier. There are other benefits: 1) it is a good opportunity to clean up Bolt’s internal APIs; 2) it is time to beef up unittests; 3) it makes code review easier.

David

Chris, the approach of living under /bolt sounds reasonable to me.

Mehdi and David, the difference of doing things in-tree vs out-of-tree is that, currently, BOLT out-of-tree has

(1) different legal requirements for accepting contributions (external contributions require devs to sign a CLA). So I agree with Mehdi that the same forks will get broken as we refactor code, but once BOLT is in the llvm monorepo, at least they will have the chance to upstream it with different legal requirements. If they don’t want to upstream it, that’s fine too, but I would like to give them a chance.
(2) a different development workflow that is less open than LLVM’s. Because we want the input of the community on a refactoring that reflects how they want to use the libraries too, it would be more natural for this to happen inside in-tree LLVM.

David, if we try to coordinate this refactoring happening in both repos (library part in LLVM while the client part in our separate repo), that will be challenging to do because we wouldn’t be able to easily test the LLVM diffs – a problem we are already facing with upstreaming our changes to LLVM without BOLT being there to easily show devs how our changes are actually used and tested. Moreover, other contributors who don’t have easy access to our github repo will have a hard time working with us on the refactor, as they wouldn’t be able to do work on the tool (just the open library).

Mehdi, your suggestion looks good, I intend to show everyone the monorepo snapshot. We are making sure it is ready to be published and that’s why I’ve been referring to our snapshot as “imagine our github repo contents are under /bolt” because that is pretty much it, but I will present it soon.

Mehdi, here is the snapshot of the LLVM monorepo with bolt living in /bolt:

https://github.com/facebookincubator/BOLT/tree/rebased

The last 6 commits will probably be rewritten or removed as we upstream changes to LLVM that need to land before the commits that change exclusively files in /bolt are pushed.

Hi Rafael, I am not actually proposing an intermediate state where parts of BOLT live in LLVM while the client lives in a separate repo. What I meant is a restructuring step within BOLT before dropping it into LLVM. For instance, in bolt’s top directory, there are lots of different things – different driver programs, profile reader/writers, debug info handling, exception handling code, BOLT IR/core data structures (BB, Loop, Function), pass managers, etc. The Pass directory is also pretty flat. Some preliminary reorganization with more tests added can reduce a lot of churn in the future. WDYT?

thanks,

David

Let me add my modest +1 vote to committing BOLT as it is, and *then*
restructuring it as a part of LLVM development process -- with proper
reviews, etc.

This is how flang and OpenMP runtime had been added to LLVM project.
This is a sure way to start things going; otherwise we may end up with
a project preparing for inclusion into LLVM ad infinitum.

Yours,
Andrey

I think one thing we can all agree upon is that the community wants a good balance between velocity and quality (ensured by proper reviews). I believe doing some preliminary restructuring and cleanups can help not only quality but velocity as well. A good structure serves the purpose of ‘self-documentation’ and will greatly help code reviewers (to be more effective).

thanks,

David

> Let me add my modest +1 vote to committing BOLT as it is, and then
> restructuring it as a part of LLVM development process – with proper
> reviews, etc.
>
> This is how flang and OpenMP runtime had been added to LLVM project.

Actually if I remember correctly flang went through multiple months of preparatory upgrade that were asked for by some people in the community, and they did so out-of-tree before getting ready to land in a single merge.

> This is a sure way to start things going; otherwise we may end up with
> a project preparing for inclusion into LLVM ad infinitum.

We just have to make the expectations very clear and avoid a “moving goalposts” situation, and it should work fine. Is there any particular reason that would put us in an “ad infinitum” situation?

> > Let me add my modest +1 vote to committing BOLT as it is, and then
> > restructuring it as a part of LLVM development process – with proper
> > reviews, etc.
> >
> > This is how flang and OpenMP runtime had been added to LLVM project.
>
> Actually if I remember correctly flang went through multiple months of preparatory upgrade that were asked for by some people in the community, and they did so out-of-tree before getting ready to land in a single merge.

As the person who requested the most changes for flang I concur here. There was some negotiation as to what was reasonable to expect before and what was easier to add after. I think we should get a proposal and a change that shows what we’re looking at as far as inclusion and we can make our evaluations at this point.

Thanks!

-eric