Using CMake/Ninja on buildbots

Hi all,

There was a discussion on llvm-commits about the extra time buildbots spend cleaning and rebuilding objects that didn’t need to be rebuilt.

Since they just update the repository, builds could be a lot faster if we left the objects in place. Even faster if we used Ninja with CMake. Is there a crucial reason why we’re still using autoconf for all builds?

Some of us acknowledge that sometimes the build gets stuck because of a CMake file problem or change that the build can’t get around automatically. It’s probably because of those issues that a “make clean” is done on almost all buildbots.

The point is that we’re paying the full price for the exception on every single build. Some of us proposed, then, that we drop the clean phase on buildbots in the first place and deal with the exceptions when they happen. Does that sound terrible to anyone?

I also think we could create a “clean” button on the bot web interface, so that anyone can clean the build if the bot gets stuck. We could also perform a routine make clean, say, every Sunday, so that we’re sure we’re not wrongly passing tests that should be failing.
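
To make the proposal concrete, here is a minimal sketch of that clean policy in plain Python. It is only an illustration under stated assumptions: `should_clean`, `most_recent_sunday` and the `clean_requested` flag (standing in for the “clean” button) are hypothetical names, not part of Buildbot.

```python
from datetime import date, timedelta
from typing import Optional


def most_recent_sunday(day: date) -> date:
    """Return the latest Sunday that is on or before `day`."""
    # date.weekday() is Monday=0 ... Sunday=6, so (weekday + 1) % 7
    # is the number of days elapsed since Sunday.
    return day - timedelta(days=(day.weekday() + 1) % 7)


def should_clean(today: date,
                 last_clean: Optional[date],
                 clean_requested: bool = False) -> bool:
    """Decide whether this build should start from a clean tree.

    clean_requested: hypothetical flag set by a "clean" button.
    last_clean: date of the last clean build, or None if there was none.
    """
    if clean_requested or last_clean is None:
        return True
    # Routine clean: the first build on or after each Sunday is a clean one.
    return last_clean < most_recent_sunday(today)
```

With this policy every other build stays incremental, and the cost of the clean phase is paid roughly once a week or when someone asks for it.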

Thoughts?

cheers,
–renato

Some of us acknowledge that sometimes the build gets stuck because of a
CMake file problem or change that the build can't get around
automatically. It's probably because of those issues that a "make clean"
is done on almost all buildbots.

I am running the Polly buildbot, which seems to be the only one that uses cmake. I have never seen a build get stuck. The cmake failures reported by the Polly bot are normally due to cmake config files that have not been properly updated.

The point is that, for the exception, we're paying it in full, every
single build. Some of us proposed, then, that we could not have the
clean phase on buildbots in the first place, and deal with the
exceptions when they happen. Does that sound terrible to anyone?

I believe the clean phase is mainly there to catch build system problems that only arise in a clean build. I agree that in a phased build the very first build could run without the clean phase to get fast results. For builds that are not that time-critical, including the clean phase is probably OK.

Tobi

Hi Renato,

ninja does not bring any advantage with a “clean” build, but once you have tried it with a “warm” build, you cannot do without it any longer :)

I think there are different aspects here:

  • the ninja vs make aspect is more a matter of saving time and energy, as they should give the same results

  • autoconf vs cmake: I do not know how many bots are using cmake today, probably a few at least for windows, but this would definitely help to maintain consistency between cmake and autoconf, if we take into consideration the point below

  • build options coverage: for example with or without shared libraries. I have not checked recently, but cmake+shared_libs used to fail in “make (or ninja) check-llvm” for bugpoint.

On my side, I have been using both options (clean build or warm build). The problem is that they do not catch the same rough edges / corner cases: picking up a stale file or a problem with automatically generated files come to mind as examples. The clean build is simpler from a lazy user’s point of view. You do not even need to clean things, the bot will do it for you :)

I think we just need to increase coverage. Everything you can do to build (even slightly) differently than other bots is good to have.

My 2 cents,

Hi Arnaud,

I agree building with { CMake, autoconf } x { Cold, Warm } will catch more
corner cases than defaulting all builds to the same standard. However,
relying on patchy distribution to achieve that is naive. Also, we don't
need to catch build corner cases on every commit...

A standard build system for buildbots and developers is beneficial because
you don't need to run around to fix bugs specific to a build system that is
not often used. The fact that people wanted to remove the MBlaze back-end
today is for that very reason. Generic changes on other parts demand
specific changes on a part of the code that is not used often.

That said, it is possible that some of the options we have with autoconf
are not available on a CMake build (I'm guessing here), and thus
deprecating autoconf entirely is not an option right now. If the reason is
strong enough to keep autoconf for the foreseeable future, then we do need
coverage.

But coverage means running both CMake and autoconf, both warm and cold, on
each variant that we care about. So, if that were true, I'd have to
have at least 4 buildbot configurations for every ARM platform I care
about. For now, I care about A9 and A15, so I'd have to have at least 8
bots. How much of that I can ignore depends on my interest in them,
availability of hardware, etc.

Thinking that I can get away with having { warm+autoconf on A9 } + {
cold+ninja on A15 } and save 6 bots is naive, at best. However, having {
warm+ninja } on both and, during weekends, doing one of each { warm+autoconf
}, { cold+ninja } and { cold+autoconf } on the same commit, then continuing
with the bot schedule, would at least give you a uniform, but not precise,
view of the build system failures. The three additional builds will rarely
give you real code errors, so it's ok to run them only once a week.
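
As a rough illustration of the arithmetic and the schedule above, the sketch below enumerates the full coverage matrix and the reduced per-bot plan. The names `full_coverage` and `configs_for` are hypothetical, and "ninja" is used as shorthand for CMake + Ninja, as in the text.

```python
from itertools import product

TREE_STATES = ("warm", "cold")
BUILD_SYSTEMS = ("ninja", "autoconf")  # "ninja" is shorthand for CMake + Ninja
PLATFORMS = ("A9", "A15")


def full_coverage(platforms=PLATFORMS):
    """Every {warm, cold} x {ninja, autoconf} combination on every platform."""
    return [
        f"{state}+{system} on {platform}"
        for platform, state, system in product(platforms, TREE_STATES, BUILD_SYSTEMS)
    ]


def configs_for(is_weekend: bool):
    """Configurations a single bot runs on one commit under the reduced plan."""
    everyday = ["warm+ninja"]
    weekend_extras = ["warm+autoconf", "cold+ninja", "cold+autoconf"]
    return (everyday + weekend_extras) if is_weekend else everyday


if __name__ == "__main__":
    print(len(full_coverage()), "bots for full coverage")
    print("weekday:", configs_for(False))
    print("weekend:", configs_for(True))
```

Full coverage needs one bot per entry of `full_coverage()`, while the reduced plan keeps one bot per platform and only widens what that bot runs at the weekend.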

I don't believe Buildbot is capable of such a strategy, though. Galina may
know of a way of doing this... But I'm ok with just running { warm+ninja }
for the foreseeable future...

cheers,
--renato

IMO, any functional/correctness difference between an incremental and
clean build should be considered a build system bug, especially for
C++ projects where incremental vs. clean can mean 10 second vs 30
minute build times.

-- Sean Silva

Thinking that I can get away with having { warm+autoconf on A9 } + { cold+ninja
on A15 } and save 6 bots is naive, at best. However, having { warm+ninja }
on both and, during weekends, doing one of each { warm+autoconf }, {
cold+ninja } and { cold+autoconf } on the same commit, then continuing with
the bot schedule, would at least give you a uniform, but not precise, view
of the build system failures. The three additional builds will rarely give
you real code errors, so it's ok to run them only once a week.

You're right - it's a tradeoff, & I think it comes down in favor of not
wasting resources validating all the different build tools on every commit.
If there are bugs in the warm builds of Ninja I think it's better to fix
those than to run clean builds - we should have immense confidence in
incremental rebuilds or we're going to be much slower as developers.

Especially for these slower hardware platforms it's much more
important to diagnose the more likely bugs (actual problems that only
arise on this hardware - since it's also hardware that is going to be
very infrequently run by the general developer community pre-commit)
and do so as fast (& with as small a commit/blame range) as
possible.

I don't believe Buildbot is capable of such strategy, though.

http://buildbot.net/buildbot/docs/0.8.0/Nightly-Scheduler.html#Nightly-Scheduler

IMO, any functional/correctness difference between an incremental and
clean build should be considered a build system bug,

If your (c)makefile underspecifies dependencies, there's nothing the
build system can do.

Indeed. Not specifying dependencies correctly is an example of what I
consider to be a bug. (by "build system" I mostly mean "the
(c)makefiles in our project" and to a much lesser extent
autoconf/cmake themselves).
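
A toy model may help show why an underspecified dependency makes incremental and clean builds disagree. This is a sketch only: `incremental_build` is a hypothetical, make-like timestamp check written for illustration, not how CMake, Ninja or autoconf are implemented, and the file names are made up.

```python
def incremental_build(targets, declared_deps, mtimes, now):
    """Rebuild only targets older than one of their *declared* dependencies.

    targets:       target names, in build order
    declared_deps: target -> list of dependencies listed in the build files
    mtimes:        file name -> logical modification time (updated in place)
    now:           logical time stamped on anything that gets rebuilt
    Returns the list of targets that were rebuilt.
    """
    rebuilt = []
    for target in targets:
        target_time = mtimes.get(target, -1)  # a missing target is always out of date
        if any(mtimes[dep] > target_time for dep in declared_deps[target]):
            mtimes[target] = now
            rebuilt.append(target)
    return rebuilt


if __name__ == "__main__":
    # foo.cpp includes foo.h, and foo.h was edited (time 3) after foo.o was built (time 2).
    mtimes = {"foo.cpp": 1, "foo.h": 3, "foo.o": 2}

    # Underspecified: foo.h is not declared, so foo.o is silently left stale.
    print(incremental_build(["foo.o"], {"foo.o": ["foo.cpp"]}, dict(mtimes), now=4))

    # Fully specified: the edit to foo.h triggers the rebuild.
    print(incremental_build(["foo.o"], {"foo.o": ["foo.cpp", "foo.h"]}, dict(mtimes), now=4))
```

A clean build hides the first case because it rebuilds everything regardless, which is why the fix belongs in the dependency list rather than in a permanent clean step.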

-- Sean Silva

+1.
It seems to make sense to me to leave the "make clean" configurations to
platforms that can build very quickly from scratch; and for the available
arm buildbots to set up builds so that they do incremental builds as
quickly as possible.

Kristof