Contributing Bazel BUILD files similar to gn

gcmn · October 28, 2020, 11:18pm

Hi all,

tl;dr: We’d like to contribute Bazel BUILD files for LLVM and MLIR in a side-directory in the monorepo, similar to the gn build.

Some of us have been working on open-source Bazel BUILD files for the LLVM Project. You may have seen us hanging out in the #build-systems discord channel. As you may know, Google uses Bazel internally and has maintained a Bazel BUILD of LLVM for years. Especially with the introduction of MLIR, we’ve got more and more OSS projects with a Bazel BUILD depending on LLVM (e.g. IREE and TensorFlow). We’re also not the only ones using Bazel: e.g. PlaidML also has a Bazel BUILD of LLVM that they’ve borrowed from TF. Each of these projects has to jump through some weird hoops to keep their version of the Bazel BUILD files in sync with the code, which requires some fragile combination of scripts and human intervention. Instead, we’d like to move general-purpose Bazel BUILD files into the LLVM Project monorepo. We expect to follow the model of the GN build where these will be maintained by interested contributors rather than expecting the general community to maintain them.

To facilitate and test this, we’ve been developing a standalone repository that just has the Bazel BUILD files. It symlinks together the directory trees on top of a submodule, as we would need to do in the monorepo to avoid in-tree BUILD files. The configuration is at https://github.com/google/llvm-bazel. We now have those in a good place and think they would be useful upstream.
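
As a rough illustration of the symlinking approach described above, here is a minimal Python sketch. It is not the script used in llvm-bazel; the function name `overlay_symlinks`, the directory names, and the "overlay wins on conflict" rule are assumptions made for the example, and it assumes a platform where unprivileged symlinks work (e.g. Linux or macOS).

```python
import os
import tempfile


def overlay_symlinks(src_root, overlay_root, out_root):
    """Mirror src_root into out_root via symlinks, then layer overlay_root on top.

    Hypothetical helper: files from overlay_root replace same-named files
    from src_root, so BUILD files can live outside the source tree.
    """
    for root in (src_root, overlay_root):  # overlay is processed last, so it wins
        for dirpath, _dirnames, filenames in os.walk(root):
            rel = os.path.relpath(dirpath, root)
            target_dir = os.path.normpath(os.path.join(out_root, rel))
            os.makedirs(target_dir, exist_ok=True)
            for name in filenames:
                link = os.path.join(target_dir, name)
                if os.path.lexists(link):
                    os.remove(link)
                os.symlink(os.path.abspath(os.path.join(dirpath, name)), link)


def demo():
    """Build a tiny fake source tree plus overlay and return the merged file list."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "llvm-project")  # stands in for the submodule
        overlay = os.path.join(tmp, "overlay")   # stands in for the BUILD-file tree
        out = os.path.join(tmp, "merged")
        os.makedirs(os.path.join(src, "llvm", "lib"))
        os.makedirs(os.path.join(overlay, "llvm"))
        open(os.path.join(src, "llvm", "lib", "Support.cpp"), "w").close()
        open(os.path.join(overlay, "llvm", "BUILD"), "w").close()
        overlay_symlinks(src, overlay, out)
        merged = []
        for dirpath, _dirs, files in os.walk(out):
            for f in files:
                rel = os.path.relpath(os.path.join(dirpath, f), out)
                merged.append(rel.replace(os.sep, "/"))
        return sorted(merged)
```

Calling `demo()` yields a merged view in which the source file and the BUILD file sit side by side, while the real source tree itself contains no BUILD files.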

Details

What

Bazel BUILD files for the LLVM, MLIR, and Clang (PR out for review) subprojects, potentially expanding to others, as needed. Basically everything currently at https://github.com/google/llvm-bazel.

Where

In https://github.com/google/llvm-bazel the BUILD files live in a single directory tree matching the structure of the overall llvm-project directory. For users, @llvm-project is a single Bazel repository that includes both LLVM and MLIR subprojects. To maintain this structure, we would probably want to put a bazel directory in the monorepo’s utils directory, which currently only contains a directory for arcanist. This is different from gn, which is under the LLVM subproject’s utils directory. We could similarly put the Bazel BUILD files under llvm/utils/bazel but have them be for the entire llvm project (the subsets that are supported). This seems like an odd structure to me, but I know that the CMake build for LLVM also builds the other subprojects, so maybe this would be preferable.

Alternatively we could split each subproject into a separate Bazel repository and put the Bazel build files under each subproject. I think this fragments the configuration of the BUILD without much benefit.

Configurations

We currently have configurations for Linux GCC and Clang, macOS GCC and Clang, and Windows MSVC. Support for other configurations can be added as desired, but supporting all possible LLVM build configurations is not the goal.

Support

Support would be similar to the gn build. Contributors could optionally update the Bazel BUILD files as part of their patches, but would be under no obligation to do so.

Preserving History

I don’t think the history of llvm-bazel is interesting enough to try to merge it into the monorepo and I was planning to submit this as a single patch, but please let me know if you disagree.

Benefits to the community

Projects that depend on LLVM and use the Bazel build system can avoid duplicating fragile effort. We’ll spend more time contributing to LLVM instead.
Bazel is stricter than CMake in many ways (e.g. it requires that even header dependencies be declared) and can catch layering issues very easily. There’s even an optional layering_check feature we could turn on if its use would benefit the community (though currently the existing problematic layering makes it a burden to maintain on our own). Even without that additional check, as I’ve been keeping the Bazel build green, I’ve found and fixed a number of layering issues in the past couple of weeks (e.g. https://reviews.llvm.org/rGb49787df9a and https://reviews.llvm.org/rGc17ae2916c).

Here’s a patch adding the Bazel build system. It’s basically just cp -r llvm-bazel/llvm-bazel llvm-project/utils/bazel.

tstellar · October 29, 2020, 3:22pm

Can you explain some of the benefits to using Bazel instead of CMake?

I'm a little concerned about having two 'unsupported' buildsystems living in tree, and I'm not sure what would stop us from continuing to add more. I would feel better if we had a set of guidelines to define the criteria for adding a new buildsystem and also criteria for when we can remove them.

Would you be able to amend this proposal to include some general guidelines for adding/removing new buildsystems, so that we can discuss that too?

Thanks,
Tom

Renato_Golin2 · October 29, 2020, 4:06pm

I have used Bazel and it doesn’t seem to map well to CMake. It seems to be in between CMake and Ninja with a lot of hard-coded dependencies that are cumbersome to keep updating. I’m by no means an expert, and I could very well be wrong, but supporting more than one build system is not trivial (remember the autoconf days?).

For example, implementing the same logic in both will not be trivial. So, whenever we want to add some functionality or improve how we build LLVM with one system, we’ll have to do so in multiple build systems that do not easily match each other. If we don’t try to match functionality, we’ll segregate the community, because people will be able to do X on build system A but not B, and the similar features cluster together and then we have essentially two projects built from the same source code.

Testing this, or worse, trying to fix a buildbot that is built with Bazel (and having to install the Java JDK and all its dependencies), potentially on hardware that you do not have access to, will be a nightmare to debug. The nature of post-commit testing, revert and review of LLVM will not make that simpler. Unless we treat the Bazel build as “not our problem” (which defeats the point of having it?).

To make matters worse, our CMake files are not simple, and do not do all of the things we want them to do in a way we completely understand. There is a lot of kludge that we carry, and it comes in two categories: the things that we hate and would love to fix, and the things that are fixes that we have no idea are there. The former are the reasons why people want to start a new build system, the latter is why they soon realise that was a mistake (insert XKCD joke here).

If the Bazel files can be completely ignored, then it’s just more clutter. But if other projects start to use more different build systems and we start packing them all in LLVM, then we’ll have a hard time knowing what we build how. I can’t really see this scaling.

Two-cents worth.
–renato

stefan.teleman · October 29, 2020, 4:54pm

I can, and I will be very brief: None.

dblaikie · October 29, 2020, 7:16pm

I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there’s no burden on developers to update gn build files (& consequently bazel build files)).

So far the gn build inclusion seems to have gone OK, I think? So maybe the Bazel thing would be similar?

Sounds like the work is being done out of tree anyway, but having it in-tree makes it a bit easier to coordinate interested parties, while not adversely affecting unrelated parties, I think?

Though I’m not sure what the tradeoff/cost of this is compared to having a separate project holding the build files, with LLVM as a git submodule. Not knowing a lot about it, that /sounds/ like it gets most of the benefits/not sure what the costs are?

Dave

dblaikie · October 29, 2020, 7:17pm

This is a fairly unhelpful email - clearly folks using Bazel derive some benefit/have chosen some tradeoff compared to CMake. Doesn’t have to be the thing you want, but it’s pretty unhelpful to dismiss/diminish the needs of others like this.

Dave

stefan.teleman · October 29, 2020, 7:40pm

This is a fairly unhelpful email - clearly folks using Bazel derive some benefit/have chosen some tradeoff compared to CMake. Doesn't have to be the thing you want, but it's pretty unhelpful to dismiss/diminish the needs of others like this.

I did not see a rationale for the Bazel proposal, outlining its
benefits over CMake.

Speaking with direct experience with Bazel - Tensorflow - I cannot
think of a single reason why it would/should be considered "better"
over the current CMake.

Everyone has their own favorite build system. That is nice, but it is
not enough of a reason to propose adding it.

I would also like to become informed as to what particular
needs/shortcomings/defects are addressed by Bazel, that are lacking in
/ cannot be addressed by CMake.

Thanks.

Renato_Golin2 · October 29, 2020, 7:49pm

Perhaps my initial concerns weren’t well articulated.

I get that those files would be “additional” and other developers won’t need to care much about them.

But what happens when people join the project with experience in Bazel and, instead of building pure LLVM with CMake, they start using Bazel for everything, just because they’re used to it?

Bazel is big enough (at least inside Google) that the probability of that happening is not trivial.

What if they create sub-projects that can only build with Bazel? Do we refuse inclusion? But don’t we have Bazel files already?

One big example is Android. They used to build LLVM in a very different way, and the inclusion of run-time library files was completely different. So different it was not possible to merge some changes they had (128 bit maths IIRC) because of the amount of work required.

My point is that adding another build system will not necessarily improve the chances of external people contributing to LLVM if they use those build systems. It may very well reduce those chances.

Once we get to the point where Bazel support is “complete” enough, and enough other projects that use LLVM use Bazel (I assume many internal Google projects), the problem I describe above is bound to happen sooner or later.

Personally, I’m happy to ignore Bazel and continue using CMake. But I just wanted to make clear that in the past, using a different build system did not increase the chances of contribution, so that’s not a given in this case either.

cheers,
–renato

Renato_Golin2 · October 29, 2020, 7:56pm

I don’t think they’re proposing adding Bazel as a new core build system nor replacing CMake.

This is just about adding Bazel files to the project so external projects (like Tensorflow) can build LLVM more easily. Also, so that all Bazel-based builds that use LLVM can share the same files without having to reproduce them in every sub-project.

It is a worthy goal in itself, but I think this speaks very loudly to how weird it is to build LLVM. We had to add some horrible hacks on our project because exporting CMake and TD files to our project (in order to use MLIR) was super weird.

So, perhaps there’s an underlying goal there to finally fix the LLVM build “once and for all”, and export libraries, headers, and meta-files in an orderly fashion, so that wrapping projects don’t need to care what build system LLVM is in.

But I’m not a build-system specialist, so I can’t even begin to fathom how that would work. I’m not even sure that’s possible, so… lots of salt.

In the meantime, having those files wouldn’t be the end of the world. But I fear that once we add them, they’ll stay there forever, and will lead to people ignoring CMake and segregating the project.

cheers,
–renato

stefan.teleman · October 29, 2020, 8:02pm

Yes that is my main concern as well.

Build systems for complex projects are ... messy. I believe that what
we have right now - with CMake - works quite well. And I am perfectly
aware of the insane amount of work that has gotten into making LLVM
quite easy to build.

mehdi_amini · October 29, 2020, 8:04pm

(following up on the discussion on Discord)

Can you explain some of the benefits to using Bazel instead of CMake?

The question can be approached from multiple angles:

The benefit for the LLVM project of switching from CMake to Bazel: it isn’t even clear that it would be practical/feasible or serve all the users, and this is not the proposal here. Nothing changes for CMake.
The benefit for an LLVM developer of using Bazel instead of CMake: I think it is minor, unless maybe you have access to a remote build farm in your environment.
The benefit for a project that wants to use LLVM of using Bazel instead of CMake: this is a tricky topic because there are many variables. The Bazel website is a good starting point I think: https://bazel.build
Some of the practical things I personally find interesting in Bazel compared to CMake:

Strict checking of the dependencies: it seems strange at first that a missing library dependency won’t allow you to include a header which is present on the filesystem, but this is actually very powerful: this is what allows Bazel to have correctness with incremental builds and what enables caching at all. This is also quite fundamental to the “distributed build” mode in Bazel: you can farm out the build to a large distributed cluster with remote caching.
Declarative approach: this enables static analysis of the BUILD configuration/graphs, and transformations as well. For example, because of the above, the tooling can figure out unused library dependencies, or add missing dependencies as well.
These two aspects are impossible to achieve in a principled way in CMake. Note that this does not imply that I would (or wouldn’t) pick Bazel for my next open-source project, or that I would recommend LLVM to adopt it as a primary build system!
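
To make the two points above concrete, here is a toy Python model of strict dependency checking over a declarative target graph. It is only a sketch and not how Bazel is implemented: the `Target` class, the `check_target` function, and the target and header names are all made up for illustration, and the model only looks at direct dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class Target:
    """Hypothetical stand-in for a declared library target."""
    name: str
    hdrs: set = field(default_factory=set)      # headers this target exports
    includes: set = field(default_factory=set)  # headers its sources include
    deps: set = field(default_factory=set)      # names of declared direct deps


def check_target(target, all_targets):
    """Return (undeclared_includes, unused_deps) for one target."""
    visible = set(target.hdrs)
    for dep in target.deps:
        visible |= all_targets[dep].hdrs
    # An include is an error if no declared direct dep exports that header,
    # even when the header exists somewhere on the filesystem.
    undeclared = target.includes - visible
    # A dep is unused if none of its exported headers is ever included.
    unused = {d for d in target.deps
              if not (all_targets[d].hdrs & target.includes)}
    return undeclared, unused


# Illustrative graph: the names are assumptions, not real BUILD targets.
TARGETS = {
    "support": Target("support", hdrs={"support/debug.h"}),
    "ir": Target("ir", hdrs={"ir/module.h"},
                 includes={"support/debug.h"}, deps={"support"}),
    "analysis": Target("analysis",
                       includes={"ir/module.h", "support/debug.h"},
                       deps={"ir"}),
    "tool": Target("tool", includes={"ir/module.h"}, deps={"ir", "support"}),
}
```

Here `check_target(TARGETS["analysis"], TARGETS)` flags `support/debug.h` as an undeclared include because `analysis` only declares `ir`, and `check_target(TARGETS["tool"], TARGETS)` reports `support` as an unused dependency. Because the graph is plain data, both checks are simple set operations, which is the kind of static analysis a declarative build description allows.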

I may also be missing some of the reasons why other projects are using Bazel, but they apparently got >400 attendees at the Bazel conference last year: https://blog.bazel.build/2019/12/20/bazelcon-2019.html
Ultimately I don’t judge why other projects are picking Bazel; we’re just proposing to make their life easier if they become interested in including some of LLVM in their project.
In particular, putting on my MLIR ecosystem hat, it’d be great if other projects out there that use Bazel and may have a use for MLIR and LLVM could have an easier integration path: right now in practice they may use the Bazel configurations that are shipped inside TensorFlow, try to adjust them to build LLVM, and send us patches to integrate in TensorFlow. This is difficult because they may want to improve support for the Bazel config on platforms that TensorFlow does not support: it is just not the right place for people interested in building with Bazel to collaborate.

I’m a little concerned about having two ‘unsupported’ buildsystems
living in tree, and I’m not sure what would stop us from continuing to
add more.

Following up on some concerns from Tom (and others) explained on Discord here:

a) Commit mailing list traffic for updating these build files. This is a valid point with the GN bot today. Annoyingly, it also shows up when I run git log llvm/ in the monorepo.
The proposal would be to have the Bazel and gn files in a separate folder at the top level of the monorepo: that way no commit emails would be sent to any mailing list, and no update would show up in git log llvm/.
b) CI systems picking up more commits to build when not needed: as above, isolating these in a separate part of the tree will exclude them from bots tracking the llvm/ or clang/ paths.
c) Investing in developing Bazel support means less investment in CMake. It is true that engineers fixing Bazel configs are spending time there instead of in CMake; however, the situation is that downstream projects that picked Bazel (for their own reasons, I don’t judge) and start using LLVM are already spending the time to maintain these Bazel files out-of-tree. We’re not making the situation worse by allowing the maintainers of these projects (who are also frequently upstream contributors) to just collaborate on their set of patches in a more coordinated way upstream. Also, no public LLVM bot builds with an unsupported build system; any feature is expected to build with CMake on every supported platform, I believe. It seems like this worked out well with gn in practice?
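
The path-based isolation in points a) and b) can be sketched in a few lines of Python. This is only an illustration of the idea, not how the LLVM mailing-list hooks or bots are actually configured; the function name `should_notify`, the watched prefixes, and the example file paths are assumptions.

```python
def should_notify(changed_paths, watched_prefixes=("llvm/", "clang/")):
    """Return True if a commit touches any path a list or bot is watching.

    Hypothetical filter: a commit that only changes files outside the
    watched subproject directories triggers no email and no build.
    """
    return any(path.startswith(watched_prefixes) for path in changed_paths)


# A build-file-only commit under a top-level utils/ directory is ignored.
bazel_only = ["utils/bazel/llvm/BUILD"]
# With build files under the llvm/ subproject, the same kind of commit
# is still picked up by anything tracking llvm/.
gn_today = ["llvm/utils/gn/BUILD.gn"]
# A normal source change is picked up as before.
source_change = ["llvm/lib/IR/Module.cpp", "utils/bazel/llvm/BUILD"]
```

With these inputs, `should_notify(bazel_only)` is False, while `should_notify(gn_today)` and `should_notify(source_change)` are both True, which is the difference between keeping the build files at the top level and keeping them under llvm/.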

I would feel better if we had a set of guidelines to define
the criteria for adding a new buildsystem and also criteria for when we
can remove them.

Would you be able to amend this proposal to include some general
guidelines for adding/removing new buildsystems, so that we can discuss
that too?

That’s an excellent point as well! Could we take inspiration from the experimental (or non-experimental) backends? For example if gn does not build anymore, send an email to LLVM-dev@ proposing to remove it and see if any maintainer steps up? Without a community to maintain it we should remove these easily and quickly?

mehdi_amini · October 29, 2020, 8:46pm

I /believe/ the idea is that, like gn, there are folks maintaining these build systems out of tree anyway - and having them in tree makes it easier to coordinate that effort, with the express intent of not burdening the general community with their upkeep (like gn currently - the idea is that there’s no burden on developers to update gn build files (& consequently bazel build files)).

Perhaps my initial concerns weren’t well articulated.

I get that those files would be “additional” and other developers won’t need to care much about them.

But what happens when people join the project with experience in Bazel and, instead of building pure LLVM with CMake, they start using Bazel for everything, just because they’re used to it?

I would propose to have the files in a separate tree from llvm/, mlir/, clang/ ; labelling these clearly as unsupported (either in the path to these files or in the README, or both), and not provide any public documentation on llvm.org that would invite users to work with these. The readme would explain how to use them to include LLVM as a dependency to an existing Bazel project and document the intent as such.

Bazel is big enough (at least inside Google) that the probability of that happening is not trivial.

What if they create sub-projects that can only build with Bazel? Do we refuse inclusion? But don’t we have Bazel files already?

This is a fair concern: can we defend against this with a clear policy?
Also: having no public bot with Bazel or any build system other than CMake should help, right?

One big example is Android. They used to build LLVM in a very different way, and the inclusion of run-time library files was completely different. So different it was not possible to merge some changes they had (128 bit maths IIRC) because of the amount of work required.

My point is that adding another build system will not necessarily improve the chances of external people contributing to LLVM if they use those build systems. It may very well reduce those chances.

My intuition was that by having the files upstream, we would instead encourage such users to track the HEAD of our main branch more closely and so provide them an easier path for upstream work. The fact that they can get upstream working with their build environment may provide an incentive to upstream along the way, even if they have to do the CMake integration first.
Ultimately, while this may make it easier for people to go in one direction or another, I suspect it would just reinforce their natural tendency: people interested in working more upstream will have a better path, and people who have less of this tendency may also have an easier path of integration.

Renato_Golin2 · October 29, 2020, 9:15pm

I would propose to have the files in a separate tree from llvm/, mlir/, clang/ ; labelling these clearly as unsupported (either in the path to these files or in the README, or both), and not provide any public documentation on llvm.org that would invite users to work with these. The readme would explain how to use them to include LLVM as a dependency to an existing Bazel project and document the intent as such.

Sounds good, with the addendum below:

This is a fair concern: can we defend against this with a clear policy?

Also: having no public bot with Bazel or any build system other than CMake should help, right?

Right, as you said later, no emails from broken bots, i.e. people that use non-CMake build systems are expected to fix their own builds, even if the breakage came from a third-party change.

Of course, we expect other contributors to help with what they can, but it’s not their responsibility. Such is the cost of having a different build system.

My intuition was that by having the files upstream, we would instead encourage such users to track the HEAD of our main branch more closely and so provide them an easier path for upstream work. The fact that they can get upstream working with their build environment may provide an incentive to upstream along the way, even if they have to do the CMake integration first.

The benefit of keeping the files in LLVM is clear to all Bazel users out there. I don’t think that’s a problem for the rest of LLVM.

My point was that the reason why Arm’s patch (128-bit) was “uncontributable” was because their build system was so different, it was impossible to keep both versions on their merged tree. This is a big problem to the Bazel users, not CMake users.

If we keep the policy of “not my problem”, I don’t see a single problem to CMake users. But that’s slightly unfriendly to people that came later to LLVM and “didn’t know better” before creating a whole project in Bazel, etc.

I’m strictly not thinking about myself here.

–renato

keith · October 29, 2020, 10:17pm

I want to jump in with some general support for this addition as a non-googler + frequent bazel user. I find moving between projects that use bazel much more palatable than moving between bazel + cmake projects.

I also think there’s a huge benefit in having strict dependencies. In my experience this is especially true for tests. This way you know you’ve always correctly rebuilt the necessary inputs to a single test, vs using lit directly and having to know / remember which set of binaries are required to be rebuilt based on your current changes.

Zachary_Turner2 · October 29, 2020, 11:11pm

Didn’t the community already go through this exact discussion when gn was added? Let me ask a different question. If gn support was permitted, on what grounds should we refuse a different parallel build system? Either we should allow people to contribute build systems upstream that they wish to maintain, or we should keep every build system other than CMake out of the tree.

dblaikie · October 29, 2020, 11:15pm

This is a fairly unhelpful email - clearly folks using Bazel derive some benefit/have chosen some tradeoff compared to CMake. Doesn’t have to be the thing you want, but it’s pretty unhelpful to dismiss/diminish the needs of others like this.

I did not see a rationale for the Bazel proposal, outlining its
benefits over CMake.

Speaking with direct experience with Bazel - Tensorflow - I cannot
think of a single reason why it would/should be considered “better”
over the current CMake.

Everyone has their own favorite build system. That is nice, but it is
not enough of a reason to propose adding it.

I would also like to become informed as to what particular
needs/shortcomings/defects are addressed by Bazel, that are lacking in
/ cannot be addressed by CMake.

I expect most of it is probably a statement free of value judgments: Some other projects chose to use it/some folks have to use it for other reasons, clearly there’s enough use that it’s motivated folks to have/maintain Bazel builds for LLVM for years. Rather than judging their choices as bad/lesser/wrong - might be useful to accept that some folks had their reasons and they’re trying to make the most of the situation. I don’t think anyone’s making an argument that LLVM should switch to Bazel/that that would be better than the CMake we’re using, and I think it’s helpful to return the favor and not suggest that other projects would be better off switching to CMake over Bazel - they no doubt have their reasons.

Dave

stefan.teleman · October 29, 2020, 11:48pm

Please do not manufacture statements that I did not make. I never
suggested, or stated, anywhere, that some other imaginary project
using Bazel should switch to CMake.

I did state that I do not find Bazel to be a better alternative to
CMake. My statement is based on direct experience with both.

If the intent behind Bazel is not to present it as a better
alternative to CMake, then what is the intent? Instead of maintaining
this impenetrable mystery as to why a Bazel build system should be
included in LLVM, please take the time to advocate for Bazel with
technical facts, rather than "someone at Google really likes it".

Just because someone likes and maintains an alternative build system
for LLVM, somewhere, that does not automatically mean, or imply that
it should be upstreamed.

For all I know, someone might be building their fork of LLVM with
autoconf. I am sure they have their own very good reasons for doing
so. Should we, therefore, bring back autoconf?

Thanks.

kparzysz-quic · October 29, 2020, 11:56pm

On the grounds that it was a bad idea after all.

Any commits going into the LLVM repository should not break any part of it, at least not without a consideration for a fix. There is an exception to it—experimental targets. They can be broken, but they are there with the explicit intent of becoming officially supported.

Same thing applies to the cmake files. If they get broken, they need to be fixed, but the same doesn’t apply to the extraneous build systems. They can be broken and never fixed. There is no commitment from the community as a whole to keep them working. IMO, this isn’t right, and files like that should not be a part of the official repository.

Whether GN or Bazel have superior features is irrelevant. Unless their configuration files are a part of a longer-term transition process, they don’t belong in the repo.

gcmn · October 29, 2020, 11:58pm

This seems to have fragmented into a few separate threads, so apologies if I’m missing someone’s response. I don’t think I’m going to be able to effectively reply inline.

The intention here is not to propose Bazel be another community-maintained build system or that it replace CMake. In my initial message I deliberately didn’t focus on the specific advantages to Bazel because I’m not really trying to convince anyone to use it. Personally I prefer it to CMake, but it’s got some annoying parts I don’t like as well (my project uses both). The goal here is that if you don’t care about Bazel, you should not be impacted. But some projects (and people) do use Bazel and depend on LLVM and right now that means copying around different versions of these files. We’ve been working on at least consolidating these in https://github.com/google/llvm-bazel. I think Mehdi already summarized the reasons we think it would make more sense for these to be in-tree than in a separate repo: it’s a more natural collaboration point.

Tom raised some specific concerns about the grounds for adding or removing a build system and when we know we’ve got bit rot, pointing out that it can be harder to tell with a build system if it’s not one you actually use! In this case, we’ve got a build bot (lowercase, I’m using BuildKite) that builds against head every 15 minutes or so (there are currently some delays caused by pulling in new changes from the monorepo). Does a functioning bot with a clearly visible (though not noisy!) status indicator seem like a reasonable requirement? If this bot remains broken for a long time and/or no one has been updating the build files, then that would be an indication of bit rot. Someone would send a message to the list proposing the build files be deleted and doing so should be relatively easy.

I think some of the other general concerns about people using Bazel instead of CMake and assuming support or breaking the CMake build should be solvable by Bazel being in a side-directory (as proposed) with a clear readme that explains the level of support. I’ll draft such a readme to include with the patch (I was waiting to see some responses to the RFC before doing so).

A side point regarding extra commit traffic. I proposed putting these in the top-level utils/ directory. Would it also make sense to move gn there to similarly remove it from commit mailing list traffic?

smeenai · October 30, 2020, 1:04am

The main benefit I see is the ease of integrating into an existing Bazel build system.

At Facebook, we use Buck (which is inspired by Blaze, as is Bazel). Our main development repository uses Buck (and you get a variety of benefits from such build systems when you have the required infrastructure integration: remote caching, distributed builds, hermeticity, etc.), and the Buck build sets up particular flags, uses a specific sysroot, etc. We have people who want to develop LLVM-based tooling (that uses the LLVM and Clang libraries) in this repository, which means the LLVM and Clang libraries we build with CMake also need to be built with the same flags, sysroot, etc., which entails a bunch of duplication (and keeping up with any changes to the Buck build system). It's much more convenient to be able to build the LLVM libraries directly with the build system you're using for the rest of your build, so that they automatically get the right build settings.

We build our libraries internally with CMake today, but we've considered moving them to Buck for this reason. Having Bazel files in-tree would be mildly more convenient for us (Buck and Bazel are similar enough that we think we could machine-translate the Bazel files for our use), although we're also fine grabbing them from some other public repository.
