Plans for the Apple supported Darwin buildbot cluster

There has been a bit of talk about zorg recently, so I thought this would be a good time to share our plans for the Darwin buildbots that live here:

http://lab.llvm.org:8013/builders

To be clear, those are not these buildbots:

http://lab.llvm.org:8011/builders

We are not talking about changing the main buildbots.

We are working on a redesign of the Darwin build cluster that will include some hardware upgrades.

The most notable change will be a switch from Buildbot to Jenkins. We will be replacing the current phased system with a Jenkins analog and reproducing the current builds. Jenkins will execute scripts for each “project”, and those scripts will be checked in to zorg.

Hi Chris,

I'm very interested in this. Can you share some of the decisions on
moving to Jenkins?

I ask that because we have our GCC testing on Jenkins and LLVM testing
on Buildbots, and to be honest, buildbots are more stable and
reliable, not to mention having more readable web pages, error
messages and emailing committers, etc. Can Jenkins (at least in your
setup) do the same?

I really don't care much how I run the tests, as long as they keep me
informed and don't hassle me, which has been an almost free ride with
the buildbots so far.

If you guys add Jenkins support I may also have a go here internally,
and if you could also share the Jenkins configuration, I could
update our own copy with a decent setup. :smiley:

cheers,
--renato

Would it be possible to drop using random high ports and just hook them
all up under http://lab.llvm.org directly?

Joerg

I think that would involve setting up a reverse proxy inside the cluster that dispatches requests to the different systems. We use a similar proxy internally here.

I think Galina knows best if that would be reasonable?
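
To make the reverse-proxy idea concrete, here is a minimal sketch of the dispatch step only: mapping a request path on a single host to one of the existing services. The path prefixes `/darwin/` and `/main/` are hypothetical and were not proposed in this thread; only the two port numbers come from the messages above. A real deployment would more likely use an off-the-shelf proxy than hand-written code.

```python
from typing import Optional

# Hypothetical path prefixes mapped to the two services named in this thread.
BACKENDS = {
    "/darwin/": "http://lab.llvm.org:8013/",
    "/main/": "http://lab.llvm.org:8011/",
}


def dispatch(path: str) -> Optional[str]:
    """Return the backend URL a request path would be forwarded to, or None."""
    for prefix, backend in BACKENDS.items():
        if path.startswith(prefix):
            # Strip the public prefix and append the rest to the backend root.
            return backend + path[len(prefix):]
    return None
```

With this mapping, a request for `/darwin/builders` would be forwarded to the service on port 8013, and unknown prefixes are rejected.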

It was mostly pragmatic concerns that made us want to switch.

We are using both Jenkins and Buildbot internally. We found more people had Jenkins experience than Buildbot. We wrote custom code for a lot of features buildbot was missing, that seem to come standard in Jenkins. Our extensions to buildbot internally were getting out of control, and sucking up a lot of maintenance time.

As a project, Jenkins seems to be moving faster. Jenkins also makes it very easy to set up new jobs, no checking out zorg repos!

We have a selection of plugins which make some of the problems you mentioned go away. We are using the failure cause management plugin to extract errors from logs, and the email-ext plugin to make nicer emails. One thing we have been working towards is isolating infrastructure failures (svn, disk space etc) from build and test failures.

We have not found one to be ‘more’ reliable than the other.
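
As an illustration of separating infrastructure failures from build and test failures, here is a minimal sketch that classifies a log by regular expression. It is not how the failure cause management plugin is implemented or configured; the patterns below are hypothetical examples and a real setup would tune them to its own logs.

```python
import re

# Hypothetical patterns, checked in priority order: an infrastructure problem
# usually causes secondary build or test errors, so it is matched first.
CATEGORIES = [
    ("infrastructure", [re.compile(r"No space left on device"),
                        re.compile(r"svn: E\d+")]),
    ("test", [re.compile(r"^FAIL: ")]),
    ("build", [re.compile(r"\berror: ")]),
]


def classify(log: str) -> str:
    """Return the first category with a pattern matching any line of the log."""
    lines = log.splitlines()
    for label, patterns in CATEGORIES:
        if any(p.search(line) for line in lines for p in patterns):
            return label
    return "unknown"
```

Checking infrastructure patterns first means a full disk is reported as such, even when it also produces compiler or linker errors in the same log.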

Sounds good. I’m sick of buildbot. I’d be happy to see it replaced with something better. We’ll see how it goes and can evaluate what we want to do with the lab.llvm.org:8011 bots. I also wish we weren’t using random high ports for this stuff…

Sorry to troll this thread, but jenkins sucks compared to bamboo and they probably would give you an open source license.

Specifically, Bamboo's web interface is cleaner and more intuitive, and its reporting and configuration are cleaner and more professional overall.

We're currently running LLVM test builds in Jenkins and I have patched lit to produce JUnit XML output that Jenkins can consume (it's not pretty, but it does at least let you produce the graphs of tests / failures for each build). If there's general interest in this feature, I'd be happy to upstream it. The current implementation writes the JUnit XML to a file specified on the command line, so you also get the normal output. It could be improved, but it's probably a sensible starting point for anyone wanting to work on this.

David
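
For readers unfamiliar with the format, here is a minimal sketch of writing test results as JUnit-style XML to a separate file, so that the normal console output is unaffected. This is not the lit patch discussed in this thread; the function names and the `(name, passed, output)` result tuples are assumptions made for illustration, and only a small subset of the JUnit schema is emitted.

```python
import xml.etree.ElementTree as ET


def junit_xml(suite_name, results):
    """Build a JUnit-style XML string from (test_name, passed, output) tuples."""
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for name, passed, output in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=name)
        if not passed:
            # Failing tests carry their captured output inside <failure>.
            failure = ET.SubElement(case, "failure", message="test failed")
            failure.text = output
    return ET.tostring(suite, encoding="unicode")


def write_report(path, suite_name, results):
    """Write the report to a file; console output is left to the caller."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(junit_xml(suite_name, results))
```

Jenkins can then be pointed at the path passed to `write_report`, which is why the file location has to be agreed between the test command and the Jenkins job.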

I think that'd be interesting.

--renato

Yes! That would be really great!

I've put the diff in Phabricator here:

http://reviews.llvm.org/D4901

If you think it looks vaguely sensible then let me know and I'll commit it.

David

Hi David,

Is there an option for "make check-all"? Or do you have to call lit in
a special way inside Jenkins?

--renato

Our Jenkins config just runs llvm-lit manually after building (we use Ninja not make in Jenkins because it's vastly faster for incremental builds and we have several LLVM forks / versions that we test). I agree that having a make / ninja target for the check would be useful, but the lit option was the minimum work we needed to get the pretty graphs out of Jenkins (and, more importantly, the regression emails).

Note that Jenkins doesn't have a default location for the XML file either, so we end up specifying this in the llvm-lit command and in the Jenkins config. It might be nice for the check-all-xml (or whatever) target to put it in a fixed location so people configuring Jenkins just need to tell it where to look in one place.

David

> Our Jenkins config just runs llvm-lit manually after building (we use Ninja not make in Jenkins because it's vastly faster for incremental builds and we have several LLVM forks / versions that we test). I agree that having a make / ninja target for the check would be useful, but the lit option was the minimum work we needed to get the pretty graphs out of Jenkins (and, more importantly, the regression emails).

I use ninja, too. And ccache. And shared libs for debug builds. And gold. :slight_smile:

> Note that Jenkins doesn't have a default location for the XML file either, so we end up specifying this in the llvm-lit command and in the Jenkins config. It might be nice for the check-all-xml (or whatever) target to put it in a fixed location so people configuring Jenkins just need to tell it where to look in one place.

Can't it be the output of a verbose build, like the textual version?

--renato

Sure, but where do you put it? The XML file that you produce is:

- Really, really big
- Not human readable (for most values of 'human')

So you don't want to send it to the standard output. The things that want to parse it expect a file in the filesystem. So if you want the verbose build to be sticking it in a well-known file then I don't have strong objections, but it seems to be conflating two things. We run lit in -q mode in Jenkins, because we don't want big console logs duplicating the information in the XML.

David

I could be wrong, but I thought that buildbots already stored the
standard output into a file for parsing, so Jenkins would possibly do
the same? It seems not.

Since you're requesting an XML output (same as when requesting verbose
textual output), you will get a lot of output, so redirecting stdout
to a file on the integration level seems like the most sensible
solution on the tool level.

cheers,
--renato

I had the impression that xunit output should be an extra file generated. Just like lit’s current dump command. So the tests run as they did before, but an extra report file is produced.

CCing Daniel

I'd be interested in seeing what you have planned upstreamed. I've used Jenkins on several projects.

Will Zorg be a vehicle for transition, where buildbot slaves become Jenkins nodes defined in the project files?

Do you see a majority of this as a Jenkins matrix project, or some other organization?

Rick

> > So you don't want to send it to the standard output. The things that want to parse it expect a file in the filesystem. So if you want the verbose build to be sticking it in a well-known file then I don't have strong objections, but it seems to be conflating two things. We run lit in -q mode in Jenkins, because we don't want big console logs duplicating the information in the XML.

> I could be wrong, but I thought that buildbots already stored the
> standard output into a file for parsing, so Jenkins would possibly do
> the same? It seems not.

I've observed that Jenkins chokes horribly on test-suites that print a lot to standard output (without redirecting to a file). When it's egregiously bad, you can even crash Jenkins doing it.

It's best for them not to print so much, both to avoid this issue and to make the logs easier for humans to read.

Cheers,
Jon