RFC: Do "something" with the subproject tarballs in the release page


A topic that creates confusion and a lot of questions is the availability of subproject-specific tarballs (e.g. cfe, lld, runtimes etc.). They are confusing because you will most likely need to download most of them and “recreate” the monorepo locally in order to build. Standalone builds are not well supported, and we keep getting bug reports about them.


And previous discussion on discourse:

Reading all these topics, the following things count as negatives of the stand-alone packages:

  • Stand-alone builds are not maintained, some subprojects are “more” maintained than others.
  • Building from the subproject tarballs in a release requires a lot of trial and error, and you basically need to stitch the monorepo together again.
  • Having all these tarballs on the release page is confusing and makes it hard for beginners to understand what they need to do. Most people just need llvm-project-17.0.5.tar.xz instead of llvm-17.0.5.tar.xz.
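
To make the monorepo-versus-subproject distinction concrete, here is a minimal sketch that sorts release file names into the two kinds and reports the directory a subproject tarball would have to be unpacked into when stitching the monorepo back together. The naming pattern (`<project>-<version>.tar.xz`, optionally with a `.src` infix) and the helper names are assumptions for illustration, not something the release scripts guarantee.

```python
import re

# Assumed naming scheme: "<project>-<version>.tar.xz", optionally ".src.tar.xz".
_TARBALL_RE = re.compile(
    r"^(?P<project>[a-z][a-z0-9-]*?)-(?P<version>\d+\.\d+\.\d+)(?:\.src)?\.tar\.xz$"
)


def classify(name):
    """Return (project, version, kind) for a tarball name, or None if it does not match."""
    match = _TARBALL_RE.match(name)
    if match is None:
        return None
    project = match.group("project")
    kind = "monorepo" if project == "llvm-project" else "subproject"
    return project, match.group("version"), kind


def stitch_layout(names):
    """Map each subproject tarball to the monorepo directory it would be unpacked into.

    Hypothetical helper: it assumes the directory is simply the project name
    (llvm/, clang/, cmake/, ...), as in the monorepo checkout.
    """
    layout = {}
    for name in names:
        info = classify(name)
        if info is not None and info[2] == "subproject":
            layout[name] = info[0]
    return layout


if __name__ == "__main__":
    files = ["llvm-project-17.0.5.tar.xz", "llvm-17.0.5.tar.xz", "cmake-17.0.5.tar.xz"]
    for f in files:
        print(f, "->", classify(f))
    print(stitch_layout(files))
```

Someone who only wants to build would pick the single file classified as `monorepo`; everything classified as `subproject` has to be combined with its siblings first.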

On the positive side:

  • Less to download if you only want to build LLVM or Clang.
  • Distributions often use these tarballs to craft the different subpackages of LLVM and would be hurt by removing them.

In order to solve the confusion and documentation problem, my suggestion is that we still provide the subproject tarballs, but do not upload them in the same place as the release artifacts, i.e. not on the release page on GitHub. This would remove a lot of the confusion around these packages, so that beginners don’t get lost when selecting which artifact to download, and it would take away a lot of the clutter on this page. The packages would still be available for distributions to rely on in their more “advanced” use cases.

The big question is of course where to put these subproject archives. We could use releases.llvm.org for this, create two releases on GitHub (which might be confusing again, though), or create a whole new repo on GitHub just to store these artifacts.

I am open to suggestions, but I think this suggestion is at least moving in the right direction from where we are right now.

Thoughts: @mgorny @tstellar @ldionne @petrhosek @DimitryAndric


I support moving the per-project tarballs elsewhere, thanks for putting this RFC together!

I also want to clarify something:

  • Stand-alone builds are not maintained, some subprojects are “more” maintained than others.

In libc++/libc++abi/libunwind, we purposefully don’t support it anymore. We used to support it (and a bunch of different ways to build), and since then we’ve removed all ways to build except the Bootstrapping build and the normal runtimes build to greatly simplify everything. So basically it’s not that it’s unmaintained, it’s that supporting standalone builds is a non-goal.
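
For readers who have not used them, the two remaining configurations can be sketched roughly as below. This is an illustration only: the function names and the `build` directory are placeholders, and the CMake options shown (`LLVM_ENABLE_RUNTIMES`, `LLVM_ENABLE_PROJECTS`) should be checked against the libc++ build documentation for the release in question.

```python
import shlex

# The three runtimes discussed above.
RUNTIMES = ["libcxx", "libcxxabi", "libunwind"]


def runtimes_build_cmd(build_dir="build"):
    """Configure command for the normal runtimes build, rooted at runtimes/ in the monorepo."""
    return [
        "cmake", "-G", "Ninja", "-S", "runtimes", "-B", build_dir,
        "-DLLVM_ENABLE_RUNTIMES=" + ";".join(RUNTIMES),
    ]


def bootstrapping_build_cmd(build_dir="build"):
    """Configure command for the bootstrapping build: clang is built first, then the runtimes with it."""
    return [
        "cmake", "-G", "Ninja", "-S", "llvm", "-B", build_dir,
        "-DLLVM_ENABLE_PROJECTS=clang",
        "-DLLVM_ENABLE_RUNTIMES=" + ";".join(RUNTIMES),
    ]


if __name__ == "__main__":
    # Print shell-quoted command lines; both are meant to be run from the monorepo root.
    print(shlex.join(runtimes_build_cmd()))
    print(shlex.join(bootstrapping_build_cmd()))
```

Both commands start from a directory of the full monorepo (`runtimes/` or `llvm/`), which is why a lone libcxx tarball is not enough on its own.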


I would be fine moving the subproject tarballs to another location too. We (Fedora and RHEL) still use them and think they have value, so I don’t want them to go away completely, but I think moving them somewhere else is a good compromise.

Any idea where we could put them?

I think releases.llvm.org makes sense.

I’ll only repeat here what I wrote under my tickets.

To be honest, I still do not understand the purpose of creating per-subproject LLVM dist tarballs and keeping them in an unusable state for quite a long time 😞

Currently the usability of those archives is limited because there are lots of inter-subproject dependencies, such as:

  • the loop in build dependencies between libcxx and libcxxabi (discussed here)
  • the installed llvm provides its own additional cmake modules, which need to be combined with cmake/ to be able to build other modules (why the llvm/ cmake modules are not merged into cmake/ could be another question)
  • clang-tools-extra needs to be built together with clang
  • the main bolt CMakeLists.txt is not finished (it does not even have a project() call)
  • the cmake files of all other subprojects need to be patched, and even Fedora does not have proper fixes for all those issues.

On top of that, since at least 13.0.x nothing has actually been done to address all those issues, and it looks like there is not much intention to even review what has already been submitted (which is odd).
On the one hand, slicing into subprojects should allow, and actually does allow now (after applying some patches), not wasting hours rebuilding the whole LLVM stack in case of any build/testing issue in the middle.
I can submit a batch of my fixes … but many of these fixes would be worthless if the cmake files installed by llvm/ were correctly integrated into the cmake/ submodule or, better, if cmake/ were integrated under llvm/.

In other words … currently, except for cmake/, NONE of the per-subproject dist tarballs are usable without patching or without being combined with other dist tarballs.

I’m not expecting to have all those (and a few other minor) issues solved by yesterday; however, it would be really good to have from the LLVM core maintainers some sketch of a plan for how all those issues are going to be addressed.

Then, with that plan, it would be possible to apply a proper set of consistent solutions, in which I’m 100% sure maintainers of different distros would be able to actively participate, reusing at least partially the patches they already use.

I fully understand as well that all possible fixes need to preserve the monorepo build as the highest priority.

I’d say the lack of manpower that at least some projects experience is what makes this unsustainable; the same goes for long review queues.

The situation is more and more like an old communist-era Polish joke.
“A guy walking down the street came across a very strange situation at a roadside construction site. The construction workers were running in a loop with empty barrels between the buildings under construction and piles of bricks.
He asked one of the workers, who was resting for a few minutes, “What are you doing here?”
The worker answered that they didn’t know, but they were so busy that they had no time to load and unload the bricks.”

IIRC a year ago the number of open LLVM issue tickets was ~16k. Now that number is approaching 20k. The number of issue ticket labels has grown as well.

IMO dividing the whole llvm/llvm repo into smaller pieces, delegating dedicated maintainers to each repo, and creating at least a second layer in the chain of command for solving problems etc. would help:

  • to get a better view of which parts of LLVM have most of the issues
  • to prepare common tooling to build and test each of the subprojects
  • to make decisions about using or sharing some bits of that tooling with other subprojects [1]
    The above would allow solving at least a few other major problems.

Other things which IMO would be good to do to save precious time:

  • abandon the use of our own copy of gtest
  • abandon lit and use ONLY the standard ctest framework [1]

[1] Currently, per-subproject testing does not work either.

I knew when all the LLVM projects were put in a monorepo that it was inevitable that interdependencies would come into existence. (Actually that is most often the whole idea behind a monorepo, but I digress.)

Therefore, I’m of the opinion that we should only release atomic snapshots of the whole monorepo, and relegate every other packaging into a very clearly marked UNSUPPORTED directory. :slight_smile:

What is the point of such an approach? :thinking:

I’m in favor of this RFC. There is definitely a confusion we need to address.