LLVM Releases: Upstream vs. Downstream / Distros

> In my talks with a number of these projects, they pretty much don't care
> what anyone else does, and plan to stick to their own import/etc schedules
> no matter what LLVM does with stable releases :slight_smile:

Is there anything we can do to make them care?

What I heard from them is that the upstream process wasn't clear
enough with regards to fixes, API stability and process (which were
pretty much echoed in this thread).

Maybe, if we fix most of those problems, they would care more?

> (For reference, Google *ships* the equivalent of about 13-16 linux
> distributions in products, uses about 5-6x that internally, and we have a
> single monolithic source repository for the most part. I have the joy of
> owning the third party software policies/etc for it, and so end up
> responsible for trying to deal with maintaining single versions of llvm for
> tens to hundreds of packages).

You sound like the perfect guy to describe a better upstream policy to
please more users.

But I don't want to volunteer you for it. :slight_smile:

--renato

>> Errr, Stephen has spoken up here, but my folks are in contact with
>> android folks pretty much every week, and I don't think what you are
>> stating is correct on a lot of fronts.

> I obviously don't speak for Android and have already apologised to
> Steve about my choice of words.

>> So if android is your particular concern here, I can pretty much state
>> that android LLVM is on a release process close to the rest of Google,
>> which is 'follow TOT very closely'.

> Isn't this what I said?

But your position seems to be "this is a bad thing for folks", and the
position we take is that it's explicitly a good thing.

> Following ToT very closely is only good for groups that have high
> involvement in LLVM, like Google and Android.
>
> And for that reason (and others), Android doesn't use the upstream
> releases. I was wondering if there's anything we could do so that they
> would.
>
> The major benefit wouldn't be, as I explained, specifically for
> Google/Android, but for Android users, Linux users, Linux distros,
> LLVM library users (including Renderscript), etc.

There is a strong implicit assumption here that the current model they use
is better for users than the model LLVM uses, and that aligning these
models in *that* direction ends up better for users than aligning models
in the other direction.

IE make ToT more appealing to follow, have folks follow that.
Maybe that's true, maybe it's not, but it needs a lot more evidence :slight_smile:

The evidence I see so far is that they spend time trying to get disparate
projects to use a single version of LLVM, but I also have seen no evidence
that any of the projects using stable releases would ever align their
policies *anyway*, so they still have that problem no matter what you do to
stable releases.

If that is the real concern, I think the entire discussion is misplaced,
because that problem is solely one of API compatibility between releases.

If there are other concerns, it'd be good to catalogue them :slight_smile:

> But your position seems to be "this is a bad thing for folks", and the
> position we take is that it's explicitly a good thing.

Then I apologise again! :slight_smile:

My point was that following ToT is perfect for developer teams working
*on* LLVM. Everyone should be doing that, and most people are. Check.

But for some people, including library users, LTS distributions and
some downstream releases (citation needed), having an up-to-date and
stable release *may* (citation needed) be the only practical way to
move to newer LLVM technology.

> IE make ToT more appealing to follow, have folks follow that.
> Maybe that's true, maybe it's not, but it needs a lot more evidence :slight_smile:

There were responses on this thread saying it's possible and
desirable to test ToT better, rather than only validating releases, and
I think this is great. Mostly because this will ultimately benefit
the releases anyway.

Maybe the solution to the always-too-old-release problem is to get a
better trunk and give up on releases altogether, like Arch Linux's
rolling releases (which I use, so I'm OK with that, too).

As long as we make it a clear and simple process, so upstream users
can benefit too, whatever works. :slight_smile:

cheers,
--renato

>> In my talks with a number of these projects, they pretty much don't care
>> what anyone else does, and plan to stick to their own import/etc schedules
>> no matter what LLVM does with stable releases :slight_smile:

> Is there anything we can do to make them care?

For all of them? Unequivocally: no.
You can define a subset of external customers you care about, and who want
to work with you, and do something for them.

Whether you can get a subset that is large enough to reduce costs/burden
for you/distro folks is an unknown :slight_smile:

> What I heard from them is that the upstream process wasn't clear
> enough with regards to fixes, API stability and process (which were
> pretty much echoed in this thread).
>
> Maybe, if we fix most of those problems, they would care more?

Maybe. In any case, LLVM (as a community) has to define who the customers
are that it wants to prioritize, and know what they care about, before you
can start solving their problems. :slight_smile:

IE you need an actual ordered hierarchy of who llvm, as a community, cares
about supporting.
Without this, you can't possibly define the effort you should go to in
order to actually support them, and what problems you should and should
not solve.

And the answer can't be "everyone", because you have sets of customers
whose priorities and desires are often at odds with others.

(For example, "people trying to get high performance at all costs from
LLVM" may have a diametrically opposed set of desires from "people trying
to ship production LLVM based compilers")

>> (For reference, Google *ships* the equivalent of about 13-16 linux
>> distributions in products, uses about 5-6x that internally, and we have a
>> single monolithic source repository for the most part. I have the joy of
>> owning the third party software policies/etc for it, and so end up
>> responsible for trying to deal with maintaining single versions of llvm
>> for tens to hundreds of packages).

> You sound like the perfect guy to describe a better upstream policy to
> please more users.

The problem is you may not like my policies :wink:
At some point, at scale, you have to assign who bears the burden of various
support-like things.

We already do this a little bit in the community, telling people they need
to update tests for what their patches break, etc.

This is not unlike that, just at a larger scale. So, for example, saying
who bears the cost of API compatibility, and to what degree.

> For all of them? Unequivocally: no.
> You can define a subset of external customers you care about, and who
> want to work with you, and do something for them.

That's what I did. :slight_smile:

I copied everyone who has expressed concerns about our release or
back-porting process in some way on this email.

I'm glad Stephen, Bero, Paul, Kristof, Antoine and David replied, as
well as you and Hans, so that we could have a better picture of who's
interested in participating more in the release process, and also how
much our process really hurts them.

Seems I was wrong about many things, but not everything. For me, being
shown that I *don't* have a problem is a big win, so thanks everyone!
:slight_smile:

> Maybe. In any case, LLVM (as a community) has to define who the customers
> are that it wants to prioritize, and know what they care about, before you
> can start solving their problems. :slight_smile:

I'm advocating for them to help us solve their own problems, which
nicely solves the "who do we care about more" problem. :slight_smile:

> We already do this a little bit in the community, telling people they need
> to update tests for what their patches break, etc.

Yup. And I'm glad folks have now explicitly said they could help the
releases with some extra testing (building packages with the
pre-release). That, for me, is already a major win.

If that leads to them helping more later, or to us being more
pro-active with their requests than we have been in some cases
(specifically the abi_tag case), that's a bonus (and slightly
selfish).

> This is not unlike that, just at a larger scale. So, for example, saying who
> bears the cost of API compatibility, and to what degree.

The API question is slightly harder to solve that way. Most people who
use our APIs are not big companies or projects, and we want to be nice
to them, too, even if they can't help as much as Google.

Same for some distros, where the packagers are responsible for *a lot*
of packages and can't spend a whole month on a single one.

I don't have a solution to that, and this email was a request to solve
that problem (as well as the distros'). I also don't know how to reach
them in any way other than through this email.

If anyone has better ideas, please feel free to do what you can.

cheers,
--renato

This is a long email :slight_smile: I’ve made some comments inline, but I’ll
summarize my thoughts here:

  • I like to think that the major releases have been shipped on a
    pretty reliable six-month schedule lately. So we have that going for
    us :slight_smile:

  • It seems hard to align our upstream schedule to various downstream
    preferences. One way would be to release much more often, but I don’t
    know if that’s really desirable.

I’d like to also point out that it’s not only about the schedule, but also a lot about the release strategy. For example, we branch a very long time before we release externally. At the beginning of a new release qualification, we will cherry-pick most of what’s happening on trunk to the release branch. At that time it is still good to take new optimizations, or refactorings that will make working on the branch later easier. And then we slow down, but we still take miscompile fixes and common crashers for something like 4 months after the branch point. This is very different from what is happening upstream, and I’m not sure it would be good for the community releases to try to unify with this.

That being said, it makes things a bit easier for us when we branch close to an open-source release, and it would be nice to be able to plan on doing this.

The result of our qualification work is of course contributed to trunk; it’s left to the discretion of individual contributors whether they want to nominate fixes for the current open-source release branch. I’m sure we could be better at this, and I’ll try to message this internally.

Fred

Renato,

Thank you for bringing up this thread. I think that managing downstream releases is a challenge that a lot of our community members are dealing with, and it is great to have an open discussion about it.

One thing that has come to mind for me lately is that as a community we might benefit from encouraging maintainers of downstream distributions to manage their releases (at least partially) in LLVM public trees. As Fred stated, at Apple we tend to fork off from trunk way before we ship our releases.

Today you can see some of this process in the Swift project on GitHub. Swift 3.0 branched LLVM back in January (https://github.com/apple/swift-llvm/tree/swift-3.0-branch).

It might be useful for our release managers (and other release managers across the community) to work collectively. For example I could see release managers creating branches on LLVM.org, and instead of a process where interesting patches are emailed to Hans we could have a mailing list comprised of release managers from around the community.

Having that process in place would simplify communication about critical bug fixes.

One thing I also wanted to bring up from Renato’s first email: I absolutely think we should encourage package maintainers to have their packaging scripts in our repositories. I made a big push over the past year to get most of Apple’s packaging logic represented directly in the public tree’s CMake, and I think we should encourage all package maintainers to move toward having as much logic as possible in-tree so that bots and users around the community can reproduce the builds provided by downstream distributions.

As an example, the documentation here (http://llvm.org/docs/AdvancedBuilds.html#apple-clang-builds-a-more-complex-bootstrap) details a process that gets really close to the Apple Clang distribution. The only real differences are some goop that glues LLVM’s CMake into Apple’s internal build process, a few extra CMake options to set the Clang version, and a few other minor tweaks. I’d love to put all that extra goop into the open source repository too, but we just don’t have a convention for where to store packaging scripts.
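
As a rough illustration of that flow, here is a minimal sketch of driving such a two-stage build from a script. The paths are placeholders, clang is assumed to live under tools/, and the cache file and target names are taken from that documentation and should be verified against it:

```python
#!/usr/bin/env python3
"""Sketch: drive the documented two-stage (bootstrap) clang build.

The source/build paths are placeholders, clang is assumed to be checked
out in tools/clang, and the Apple-stage1.cmake cache file and the
stage2-distribution target should be double-checked against the
AdvancedBuilds documentation referenced above."""
import subprocess

LLVM_SRC = "/path/to/llvm"              # placeholder: LLVM source checkout
CLANG_SRC = f"{LLVM_SRC}/tools/clang"   # assumption: clang lives in tools/
BUILD_DIR = "/path/to/build"            # placeholder: empty build directory

# Stage 1: configure with the in-tree cache file, which enables the
# bootstrap machinery and records the options for the second stage.
subprocess.check_call(
    ["cmake", "-G", "Ninja",
     "-C", f"{CLANG_SRC}/cmake/caches/Apple-stage1.cmake",
     LLVM_SRC],
    cwd=BUILD_DIR)

# Stage 2: build the stage 1 compiler, then use it to build and package
# the stage 2 distribution.
subprocess.check_call(["ninja", "stage2-distribution"], cwd=BUILD_DIR)
```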

-Chris

FWIW, for our ARM Compiler product, we follow top-of-trunk, not the releases.
Besides picking up new functionality more quickly, it also allows us to detect
regressions in LLVM against our in-house testing quickly, not 6 months later.
We find that when we catch a regression within 24 to 48 hours of the commit
introducing it, it’s much cheaper to get it fixed.

In my opinion, if testing resources are so scarce that a choice has to be
made between testing top-of-trunk and testing a release branch, it would be
better overall for the LLVM project if top-of-trunk is tested as much as possible.

+1 to everything Kristof said.

I will also add to the discussion that I believe it does not make sense to try to align releases with the open-source ones in general, because different people have different goals. For instance, when we (the open source community) branch for a release, we try to make the branch as stable as possible and only pull in fixes.
Other people may have a different goal; for instance, it is possible a new feature was expected for that release and needs to be pulled in. And we are back to the problem we were discussing in the other thread: helping release managers with general commits.

Cheers,
-Quentin

> Shall we sync our upstream release with the bulk of other downstream
> ones as well as OS distributions?

I suspect that'll, in general, be hard.

The downstream consumers are all going to have their own needs & schedules and as a result I doubt you'll find any upstream release schedule & cadence that is acceptable to enough downstream consumers to make this viable.

With that in mind, I'm a proponent of timing based schedules with as much predictability as can be made. That allows the downstream consumers to make informed decisions based on likely release dates.

> This work involves a *lot* of premises that are not encoded yet, so
> we'll need a lot of work from all of us. But from the recent problems
> with GCC abi_tag and the arduous job of downstream release managers to
> know which patches to pick, I think there has been a lot of wasted
> effort by everyone, and that generates stress, conflicts, etc.

Ideally as we continue to open more lines of communication we won't run into anything as problematical as the abi_tag stuff. While it's important, I wouldn't make it the primary driver for where you're trying to go.

WRT downstream consumers. The more the downstream consumer is wired into the development community, the more risk (via patches) the downstream consumer can reasonably take. A downstream consumer without intimate knowledge of the issues probably shouldn't be taking incomplete/unapproved patches and applying them to their tree.

> 1. Timing

> Many downstream release managers, as well as distro maintainers, have
> complained about the timing of our releases, how unreliable they are,
> and how that makes it hard for them to plan their own branches,
> cherry-picks and merges. If we release too early, they miss out on
> important optimisations; if we release too late, they'll have to branch
> "just before" and risk having to back-port late fixes to their own
> modified trees.

And this just gets bigger as the project gets more downstream consumers. Thus I think you pick a time-based release schedule, whatever it may be, and the downstream consumers can then adjust.

Note that this can have the effect of encouraging them to engage more upstream to ensure issues of concern to them are addressed in a timely manner.

> 2. Process

> Our release process is *very* lean, and that's what makes it
> quasi-chaotic. In the beginning, not many people / companies wanted to
> help or cared about the releases, so the process was simply whatever the
> person doing the release did. The major release process is now better
> defined, but the minor releases evolved in the same ad-hoc way.
>
> For example, we have no defined dates to start or to end. We have no
> assigned people to do the official releases or to test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.

Can't argue with getting a bit more structured, but watch out for going too far. I'd really like to squish down the release phase on the GCC side, but it's damn hard at this point.

> A few (random) ideas:
>
> * We should have predictable release times, both for starting and for
> finishing them. There will be complications, but we should treat them as
> the exception, not the rule.

Yes.

> * We should have appointed members of the community that would be
> responsible for those releases, in the same way we have code owners
> (volunteers, but no less responsible), so that we can guarantee a
> consistent validation across all relevant targets. This goes beyond
> x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
> etc.

Good luck :wink: Don't take this wrong, but wrangling volunteers into release work is hard. It's sometimes hard to find a way to motivate them to focus on issues important for the release when there's new development work they want to be doing.

Mark Mitchell found one good tool for that in his years as the GCC release manager -- namely tightening what was allowed on the trunk as the desired release date got closer. ie, there's a free-for-all period, then just bugfixes, then just regression fixes, then just doc fixes. Developers then had a clear vested interest in moving the release forward -- they couldn't commit their new development work until the release manager opened the trunk for new development.

> * OS distribution managers should test on their builds, too. I know
> FreeBSD and Mandriva build by default with Clang. I know that Debian
> has an experimental build. I know that RedHat and Ubuntu have LLVM
> packages that they care about. All that has to be tested *at least*
> every major release, but hopefully on all releases. (Those who already
> do that, thank you!)

LLVM's usage on the Fedora side is still small, and smaller still within Red Hat. But we do have an interest in this stuff "just working". Given current staffing levels I would expect Fedora to follow the upstream releases closely with minimal changes.

> * Every *new* bug found in any of those downstream tests should be
> reported in Bugzilla with the appropriate category (critical / major /
> minor). All major bugs have to be closed for the release to be out,
> etc. (the specific process will have to be agreed and documented).

Yes, GCC has a similar policy and it has worked reasonably well. In fact it’s a good lever for the release manager if you’ve got a locked trunk. If the developers don’t address the issues, then the release doesn’t branch and the trunk doesn’t open for development. It aligns the release manager’s goals with those of a good chunk of the development team.

jeff

I think that this would require some fairly significant changes to workflow for downstream users and I don’t see it being possible for us. In FreeBSD, we ship Clang in two ways:

1) As the system compiler
2) As a third-party package

In the former case, we import all of the sources into our svn tree and integrate them with our own build system (in a horrible way). All of the CMake-generated definitions are committed. We then need to support this branch for a long time, so we periodically back-port fixes, though in good releases we just pull in the minor releases.

In the latter case, the source for building is already the LLVM upstream tarballs. The ports tree has infrastructure for applying patches, so we put these in, but it doesn’t have infrastructure for fetching an arbitrary release from svn. If LLVM moved to GitHub then this might be possible, as GitHub provides a mechanism for grabbing an arbitrary git hash as a tarball (which quite a lot of ports use). To be honest, this isn’t much easier than maintaining a small set of patches (most of which currently are to work around the fact that clang ships with a load of C headers that conflict with definitions in FreeBSD libc).
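
To illustrate the GitHub mechanism mentioned above, here is a minimal sketch of fetching an arbitrary revision as a tarball. The llvm-mirror/llvm repository name is an assumption; the /archive/&lt;ref&gt;.tar.gz URL scheme is GitHub's generic one:

```python
#!/usr/bin/env python3
"""Sketch: fetch an arbitrary LLVM revision from GitHub as a tarball.

The llvm-mirror/llvm repository name is an assumption; GitHub's
/archive/<ref>.tar.gz scheme accepts branch names, tags, or full
commit hashes."""
import urllib.request

REPO = "llvm-mirror/llvm"   # assumed read-only mirror
REV = "master"              # or any git commit hash

url = f"https://github.com/{REPO}/archive/{REV}.tar.gz"
out = f"llvm-{REV}.tar.gz"

# Download the archive; real ports infrastructure would also verify the
# result against a recorded checksum.
with urllib.request.urlopen(url) as resp, open(out, "wb") as f:
    f.write(resp.read())
print(f"fetched {url} -> {out}")
```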

David

Folks,

First, thanks to everyone who replied; it made many things much clearer
to many people (especially me!).

But it would be good to do a recap, since some discussions ended up
list-only, and too many people said too many things to keep track of. I
read the whole thread again, and this is my summary of what people
do/want the most.

TL;DR version:

* Upstream needs to be more effective at communicating and
formalising the process.
* Downstream / Distros need to be respectively more vocal / involved
about testing ToT and releases.
* Upstream may need to change some process (bugzilla meta, feature
freeze, more back-ports, git branches) to facilitate downstream help.
* Downstream needs to be more pushy with their back-port suggestions,
so we do it upstream more often.
* We all need to come up with a better way of tracking patches for
back-port (separate thread ongoing)

Now, the (slightly) longer version:

By far, the most important things are cadence and transparency. We
already do a good job at the former, not so much at the latter.

The proposals were:
- Formalise the dates/duration on a webpage, clarify volunteered
roles, channels (release list, etc).
- Formalise how we deal with proposals (llvm-commits to release list
/ owner); some suggested using git branches (I like this, but we're on svn).
- Formalise how we deal with bugs (bugzilla meta, make it green);
this could also be used as a back-port proposal.
- Formalise how we deal with back-ports, how long we do them,
overlapping minor releases, etc.

The other important factor is top-of-tree validation as a way to get
more stable releases. We have lots of buildbots and the downstream
release folks are already validating ToT enough. We need the distros
in as well.
Same goes for releases, but that process isn't clear; it needs to be.

The proposals were:
- Have distros build all packages with ToT as often as possible, report bugs.
- Do the same process on stable branches, at least once (RC1).
- Downstream releases also need to acknowledge when a validation came
back green (instead of just reporting bugs).
- All parties need to coordinate the process in the same place, for
example, a meta bug in bugzilla, and give their *ack* when things are
good for them.

The third big point was following changes on long running downstream
stable branches. Many releases/distros keep stable local branches for
many years, and have to back-port on their own or keep local patches
for that long.

The proposals were:
- Better tracking of upstream patch streams, fixes for old stable
release bugs. Making a bug depend on an old release meta may help.
- Distros and releases report keeping lots of local patches (some
already on trunk). Back-porting them to minor releases would help
everyone, keeping old releases for longer, too.
- Create patch bundles and publish them somewhere (changelog?
bugzilla?) so that downstream knows all patches to back-port without
duplicating efforts.

Other interesting points were:
- Moving versions for small projects / API users is not an easy task.
Having release numbers that make sense and multiple releases at a time
may help avoid introducing API changes into old releases.
- The volume of changes is too great. Having a slow-down (staged freeze)
may help developers invest more in stability just before the branch.
We sort of have that; it needs formalisation.
- If we have major bugs after the branch, we stop the whole process,
which makes it slower and more unpredictable. A feature freeze may help
with that, but can affect the branch date.
- Volunteering is still required, but we could document who does what,
and update the docs as that changes.
- Upstreaming package maintenance scripts got mixed views, but I still
encourage those who want to, to try.

cheers,
--renato

[I'm the FreeBSD LLVM package maintainer]

I'm not really worried about patches. Where I need them, I've got
infrastructure in place to fetch them and add them to the build system.
I tend to track minor releases, and GCing patches during that process
isn't much work. If there were a more regular stream of changes, it
would be easy enough to follow. Mostly I apply patches only when someone
actually hits a regression, so people don't have to rebuild/reinstall for
patches that often don't affect our users. This is particularly
important for 3.8+, where shared builds have been broken for quite some
time, so the full package takes up >1GB of disk space.

For llvm-devel I have a script to grab the current git checksums from
the github API. It works well and is easy to use. For actual releases,
I consider the tarballs to be the source of truth.
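
As an illustration of the kind of script meant above, here is a minimal sketch that asks the GitHub API for the current commit of each sub-project. The llvm-mirror organization, the repository list and the branch name are assumptions; the real script may well look different:

```python
#!/usr/bin/env python3
"""Sketch: query the GitHub API for the current commit of each LLVM piece,
e.g. to pin an llvm-devel package to matching revisions.

The llvm-mirror organization, repository list and branch are assumptions;
the endpoint is GitHub's public "get a commit" API."""
import json
import urllib.request

ORG = "llvm-mirror"                        # assumed read-only mirrors
REPOS = ["llvm", "clang", "compiler-rt"]   # adjust to the pieces you package
BRANCH = "master"

for repo in REPOS:
    url = f"https://api.github.com/repos/{ORG}/{repo}/commits/{BRANCH}"
    with urllib.request.urlopen(url) as resp:
        commit = json.loads(resp.read().decode("utf-8"))
    # The top-level "sha" field is the full 40-character commit hash.
    print(f"{repo}: {commit['sha']}")
```
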

-- Brooks