Last Known Good Builds?

Hi,

Several times over the past three months I have tried to retrieve the trunk version of LLVM/Clang/Compiler-RT from Subversion and build it on Windows, without success. I also occasionally get emails from people asking me why LLVM does not build on Windows (they find my name in the mailing list archive).

This has made me ponder the issue and I came up with the following scheme to ensure that there is a high probability that we Windows users can actually build the most recent version of LLVM:

  1. Every night, an automated build builds LLVM & Co. on all the supported host platforms and runs the test suite to completion.
  2. IF the automated build succeeds on ALL supported host platforms AND the test suite succeeds, the build is marked as “Last Known Good” and automatically published to the website so that people can download it without using Subversion. There should always be exactly one Last Known Good (LKG) build, which is roughly equivalent to a “daily tarball”, except that it is known to build properly on all platforms.
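
The two steps above boil down to a simple selection rule. Here is a minimal Python sketch of it, assuming each nightly run is summarized as a revision number plus a per-platform pass/fail record. The `NightlyRun` and `PlatformResult` types, their field names, and the platform names are hypothetical and not part of any existing LLVM infrastructure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class PlatformResult:
    build_ok: bool   # did LLVM & Co. build on this platform?
    tests_ok: bool   # did the test suite pass on this platform?


@dataclass
class NightlyRun:
    revision: int                        # Subversion revision that was built
    results: Dict[str, PlatformResult]   # keyed by (hypothetical) platform name


def is_good(run: NightlyRun, platforms: Iterable[str]) -> bool:
    """A run is good only if EVERY supported platform built and passed its tests."""
    for name in platforms:
        result = run.results.get(name)
        # A platform with no recorded result counts as a failure.
        if result is None or not (result.build_ok and result.tests_ok):
            return False
    return True


def last_known_good(runs: Iterable[NightlyRun],
                    platforms: Iterable[str]) -> Optional[int]:
    """Return the highest good revision, or None if no run qualifies."""
    platforms = list(platforms)
    good = [run.revision for run in runs if is_good(run, platforms)]
    return max(good) if good else None
```

Publishing would then amount to replacing the single LKG download whenever the value returned by `last_known_good` changes.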

I believe step 1 is already in place, but perhaps the project’s build master, or I, can put together some Python scripts to automate the process of making an LKG build.

Why is the LKG build so important? Because in software development in general, you want to make the “granularity” of the process as fine as possible, approximating a curve rather than a series of steps. If people adopt LLVM v3.1 and then have to wait one year for another update (v3.2) in which LOTS and LOTS of things have changed, they are in for a tough ride when they want to upgrade to the new version. If, on the other hand, they can get a twice-weekly LKG build (assuming the entire build process on all platforms only succeeds twice a week), they can adapt their code in very small steps.

Cheers and have a great day!

Mikael Lyngvig

Do you have any information about how people are setting up their
projects? I ask because there are different ways to set up a project
to depend on LLVM.

Probably the most robust is how the Rust language
<https://github.com/mozilla/rust> does it, which is to set up a git
submodule which is pinned on a particular git commit; the revision
that it is pinned on is actually for a fork
<https://github.com/brson/llvm> of LLVM's main development which the
Rust developers have made specifically for this purpose. I don't
really track Rust, but from a quick browse, it looks like what they
are doing is to periodically sync with upstream or cherry-pick
specific bugfixes; they also put in some little fixes of their own. My
impression is that this is really robust and flexible, since it allows
them to sync with upstream at any granularity they want.
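
For readers who have not used submodules, the pinning step might look roughly like the sketch below. It only prints the git commands unless `dry_run` is turned off. The destination path and the commit value passed in are hypothetical, and none of this is taken from Rust's actual build setup.

```python
import shlex
import subprocess
from typing import List


def pin_commands(repo_url: str, path: str, commit: str) -> List[List[str]]:
    """Git commands that add LLVM as a submodule pinned to one commit."""
    return [
        ["git", "submodule", "add", repo_url, path],  # register the fork
        ["git", "-C", path, "checkout", commit],      # move it to the pinned commit
        ["git", "add", path],                         # record that commit in the parent repo
        ["git", "commit", "-m", f"Pin LLVM submodule to {commit}"],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it too when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
```

Syncing with upstream at a later point is then a matter of checking out a newer commit inside the submodule and committing the updated pointer.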

Another approach that a lot of projects take is to just rely on the
"installed" version of LLVM, and just say 3.0+ or 3.1+. As you
mentioned, this really is not very good: since LLVM has no backwards
compatibility policy, they are in for a really rough ride when the
next version comes out.

One benefit of the Rust-style setup is that it allows you to easily
set up a job on your own servers that automatically pulls the last
green LLVM revision (the buildbots have JSON APIs which should allow
extracting this), merges it in, and then tries to build your project
with that. That way, you can stay up to date in a relatively automated
fashion and can ping the list for help with migrating to new APIs near
the time that the APIs are changed and not months later. Personally I
think this is a better alternative than an LKG tarball.
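
Such a job could be structured along these lines. This is only a rough sketch: `build_project` stands for whatever hypothetical callable builds and tests your project against a given LLVM revision, and how the last green revision is obtained is left out.

```python
from typing import Callable, Tuple


def try_upgrade(pinned: int, last_green: int,
                build_project: Callable[[int], bool]) -> Tuple[int, str]:
    """Return the revision to stay pinned on, plus a short status message."""
    if last_green <= pinned:
        return pinned, "already up to date"
    if build_project(last_green):
        return last_green, f"upgraded from r{pinned} to r{last_green}"
    # Stay on the old pin; this is the moment to ask the list for migration help.
    return pinned, f"build broke between r{pinned} and r{last_green}"
```

Run nightly, this keeps the window between "last worked" and "first broke" small, which is what makes asking for migration help practical.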

Overall, what you are running up against is just a product of how LLVM
is developed (very fast moving, straight-line SVN, highly volatile
trunk, with no backwards compatibility). Even though the code is
nicely modularized and loosely coupled, so that in theory it is easy
to use as a library, the reality is that there is a very strong
*developer coupling*, in that you have to really be aware of what is
happening in the LLVM codebase or else your project will bitrot
extremely quickly. Since LLVM doesn't maintain any kind of "migration
guide" for its changes, you basically have to be an LLVM developer to
fight the bitrot, or else have some kind of automated job which will
allow you to squawk on the mailing list for migration help while the
change is still fresh.

-- Sean Silva

No, I don’t have any information about how people set up their projects. All I know is that it is quite difficult to find a working version on the trunk. I think this especially hits those who are new to LLVM and yet want to play around with the bleeding-edge version.

I think you are right about extracting the last green build using the JSON API instead of an LKG tarball. I didn’t know it existed and hardly know what it means. Is JSON some Java technology? Is there any documentation of this anywhere?

Perhaps somebody will volunteer to write up a few pages on how to extract the last green build like you say? That would be most helpful.

2012/11/5 Sean Silva <silvas@purdue.edu>

JSON is a data serialization format like XML, although it is a lot
more lightweight than XML. It is typically used in web applications as
the format for returning structured data from an API. Google around to
find more info about it.

You can find information about the JSON API at
<http://lab.llvm.org:8011/json/help>. A slightly modified version of
this example from that page is probably all you need:

/json/builders/llvm-x86_64-linux/builds?select=-1/source_stamp/changes&select=-2/source_stamp/changes
     - Changes of the two last builds on 'llvm-x86_64-linux' builder.
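
As a starting point, here is a rough Python sketch of that idea. The URL shape follows the help-page example above. The field names `results` (with 0 assumed to mean success) and `sourceStamp` / `revision` are assumptions about what the buildbot returns, so check them against real output before relying on this.

```python
import json
from typing import Any, Dict, Optional
from urllib.request import urlopen

BASE = "http://lab.llvm.org:8011"


def builds_url(builder: str, count: int) -> str:
    """URL selecting the last `count` builds (-1 is the most recent)."""
    selects = "&".join(f"select={-i}" for i in range(1, count + 1))
    return f"{BASE}/json/builders/{builder}/builds?{selects}"


def last_green_revision(builds: Dict[str, Any]) -> Optional[str]:
    """Scan builds from newest to oldest and return the first green revision."""
    # Keys are assumed to be the select values "-1", "-2", ...; -1 is newest.
    for key in sorted(builds, key=int, reverse=True):
        build = builds[key]
        if build.get("results") == 0:  # assumed: 0 means success
            return build.get("sourceStamp", {}).get("revision")
    return None


def fetch(builder: str, count: int = 10) -> Dict[str, Any]:
    """Download and decode the JSON for the last `count` builds."""
    with urlopen(builds_url(builder, count)) as response:
        return json.load(response)
```

With network access, `last_green_revision(fetch("llvm-x86_64-linux"))` would be the whole script. Keeping the parsing separate from the download makes it easy to try the logic on a saved response first.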

If you aren't familiar with a scripting language that can pull and
process this info easily, then you should learn one. I would use
node.js or Python for this particular task; both work perfectly on
Windows. The actual task that this script needs to do is extremely
trivial; the only problem that you are going to run into is wading
through the APIs for long enough to find where they give the
information you want.

I think we have an MSVC buildbot somewhere, but I forget the URL; if
not, then you may want to host one (see
<http://llvm.org/docs/HowToAddABuilder.html>).

-- Sean Silva