Questions about the LLDB testsuite and improving its reliability

I’m not sold on this particular reason. Make is not the LLVM build system, CMake is. “I don’t know the build system of the project I actually work on, but I do know this other build system” is not a compelling argument.

(As an aside, not everyone knows Make that well, but it doesn’t actually matter because the amount of actual Make code is negligibly small, i.e. 1-2 lines per test in the vast majority of cases.)

I disagree that understanding CMake is required to build LLVM. When I build top-of-tree on Linux (as opposed to a build that is Hexagon-only), I make a build directory at the same level as my checkout and simply run “cmake ../llvm”. I don’t need to know anything.

Yeah, w.r.t. the actual builder part, it seems to me any option is going to be sufficiently simple to use that it would be hard for the incremental benefits to lldb developers to ever amortize the cost of switching. The only compelling reason to me is if one or the other tool made it much easier to get building the test cases out of tree working, but that seems unlikely.

Jim

Indeed. So just to close the loop on this, it sounds like everybody is in agreement that running the tests out-of-tree is a worthwhile goal. So I will focus on implementing this next. My rough idea is to generate a fresh directory for each test configuration in $builddir/path/to/testname.config and run "make -C $srcdir/path/to/testname" there. It looks like the right place to implement this is probably dotest.py.

I like the idea of running dotest from lit, but I'll save this for later.

-- adrian

I don’t know what would be involved in getting the tests building out of tree with Make. But I do know it would be simple with CMake. I’m sure it’s probably not terrible with Make either, I just don’t know enough about it to say.

One thing that I do like about CMake is that it can be integrated into the existing LLDB build configuration step, which already uses CMake, to build inferiors up front. This has the potential to speed up the test suite by an order of magnitude.

Can we get that same effect with a Make-based solution?

Everything sounds good on this thread. My two cents:

We should add some post-test verification after each test that looks for files that are left over after the "clean" phase. This can help us catch the tests that aren't cleaning up after themselves, and will help us weed out the bad tests and fix them ASAP. This can be done both for in-tree and out-of-tree solutions to verify there is no source pollution.

We can easily move build artifacts out of the source tree. Running the test suite remotely via "lldb-server platform" has code that creates directories for each test in the platform working directory. If the test runs fine and passes, it cleans up the entire numbered test directory, else it leaves the numbered directory there so we can debug any issues. Shouldn't be hard to enable this.

I like the current default of having a new directory, named with the date and time, with the results inside, as it allows comparing one test suite run to another.

Switching build systems to cmake is fine; if someone has the energy to do it, that would be great. I don't see why we would go with a lit-based system that manually specifies the arguments. Seems like a pain to get the right compiler flags for creating shared libs on different systems (or executables, frameworks, etc.). Seems like cmake is good at that and very simple.

I don't know what would be involved in getting the tests building out of tree with Make. But I do know it would be simple with CMake. I'm sure it's probably not terrible with Make either, I just don't know enough about it to say.

One thing that I do like about CMake is that it can be integrated into the existing LLDB build configuration step, which already uses CMake, to build inferiors up front. This has the potential to speed up the test suite by an order of magnitude.

Can we get that same effect with a Make-based solution?

You are going to have to muck with a bunch of tests to get this to work. For instance, only dotest currently knows what debug variants we are building for (they are often specified in the test file itself). Also, a number of tests (for things like rebuild & rerun, among others) build several times within the test. The test for basic types has a single source file with the basic type in a define, and it runs the build & debug once for each basic type, supplying the appropriate define each time.

So, for reasons not related to make vs. cmake, I think this is a harder thing than you are thinking it is.

Jim

One reason it’s nice is because you can specify more than just a compiler command line. You can take your input from an assembly file, for example, and compile it with llc.

You can mess around with files and stick them into an archive, or you can compile a DLL and then run a tool on it to strip something out of it. You can copy files around to set up a build directory a certain way. You’re only limited by your imagination.

When the compiler invocation is “just another command”, you can easily create test cases that are a lot more interesting than just “make an executable from this source code, and debug it”.

I think it can be done, and would be valuable if done right, but I do think getting it right would take some care.

I don’t know what would be involved in getting the tests building out of tree with Make. But I do know it would be simple with CMake. I’m sure it’s probably not terrible with Make either, I just don’t know enough about it to say.

One thing that I do like about CMake is that it can be integrated into the existing LLDB build configuration step, which already uses CMake, to build inferiors up front. This has the potential to speed up the test suite by an order of magnitude.

Since the tests in the LLDB testsuite are typically very small and don’t include a lot of headers, I’m not convinced that an incremental build of the tests will have a very big impact on the running time of the testsuite, but to be honest I also haven’t benchmarked it to see where the time is really spent.

– adrian

I don't see any of these operations that can't be done in a makefile; after all, you can run arbitrary commands there. We do make directories and dylibs, move and strip files, etc., in some of the makefiles in the test cases.

OTOH, it is pretty common to have a test directory that has a Test.py with a bunch of test cases that all build the same thing. If we use a command-style driver for building, each of these tests will do a full rebuild, whereas now make figures out what needs to be done, and the build only actually happens once. If we think avoiding extra compiles is important then you do want the build tool to be able to compute dependencies.
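A minimal sketch of the pattern Jim describes (file names here are illustrative, not taken from any actual test): because the binary's rule lists its source as a prerequisite, a second make invocation from another test method in the same Test.py is a no-op.

```makefile
# Hypothetical test Makefile sketch. The first "make" compiles a.out;
# subsequent invocations see that a.out is newer than main.cpp and do
# nothing, so only the first test method in the file pays for the build.
CXX ?= clang++
CXXFLAGS ?= -g -O0

a.out: main.cpp
	$(CXX) $(CXXFLAGS) -o $@ $<

clean:
	rm -f a.out
```

A command-style driver that always reruns the compiler line gets no such short-circuit; make's timestamp comparison is what makes repeated builds essentially free.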

Jim

It actually is pretty significant. Part of this is due to the fact that a single .py file has multiple tests, and the compile happens for every one of those tests, even though it just produces the exact same output every time.

Make doesn’t have built-in substitutions like %t or, even better, ones you can define yourself in Python. Make can obviously do anything (it can even run a shell script!), but when you have a DSL, it becomes much easier to do domain-specific things.

Sure, but I can't imagine anything we want to do here where that easiness delta is going to be significant, and if you do want to do complex things, Python also has the system command.

Jim

Ack, that doesn't seem necessary, right? The clean should happen as part of the test case object cleanup, and then make can figure out what needs to be built. This would have to be done with a little care, since it puts the responsibility on any test that mucks with the built product to clean up as part of its test cleanup, but I bet very few tests do that.

Again, this is only worthwhile after we actually measure the time spent in the various parts of the test run. But this seems not that hard to fix.

Jim

Everything sounds good on this thread. My two cents:

We should add some post-test verification after each test that looks for files that are left over after the "clean" phase. This can help us catch the tests that aren't cleaning up after themselves, and will help us weed out the bad tests and fix them ASAP. This can be done both for in-tree and out-of-tree solutions to verify there is no source pollution.

We can easily move build artifacts out of the source tree. Running the test suite remotely via "lldb-server platform" has code that creates directories for each test in the platform working directory. If the test runs fine and passes, it cleans up the entire numbered test directory, else it leaves the numbered directory there so we can debug any issues. Shouldn't be hard to enable this.

For completeness, I looked at this and it doesn't look like that is how it works. My understanding (and keep in mind that this is the first time I am looking at this code so I might be misunderstanding something here) is that the remote platform support works by building the test on the host in-tree and then _RemoteProcess.launch() ships over only the binary when called from Base.spawnSubprocess().

That's not a big deal, though. I will introduce the concept of a build directory (which has to be separate from the remote platform working directory) and find a way to pass the source directory to runBuildCommands().

-- adrian

Looks like I missed a party. :slight_smile:

I'll try to give my thoughts on some of the things that were said here:

make -C

I don't think make -C does what you think it does. "make -C foo" is
basically equivalent to "cd foo && make", which is what we are doing
now already. Of course, you can make this work, but you would have to
pass an extra OUTDIR=... argument to make and then modify the
Makefiles to reference $(OUTDIR) for its outputs:
$(OUTDIR)/a.out: main.cc
  $(CC) -o $(OUTDIR)/a.out main.cc ...

The standard way of doing an out-of-tree build with make is to have
the Makefile in the build-directory and to set the magic VPATH
variable in the Makefile (or as a part of make invocation). VPATH
alters make's search path, so when searching for a dependency foo and
the foo is not present in the current (build) directory, it will go
searching for it in the VPATH (source) directory. You still need to be
careful about paths in the command line (generally this means using
make variables like $@ and $< instead of bare file names), but our
makefiles are generally pretty good at this already. We even have a
couple of makefiles using VPATH already (see TestConcurrentEvents) --
Todd added this to speed up the build by spreading out tests over
different folders while sharing sources (the serial execution
problem).
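A hedged sketch of the VPATH pattern described above (SRCDIR is a hypothetical variable the harness would pass in; the actual test makefiles differ in detail):

```makefile
# Hypothetical out-of-tree sketch: this Makefile sits in the build
# directory, and VPATH tells make to look for prerequisites that are
# not present here (main.c) in the source directory instead.
SRCDIR ?= ../path/to/test/source   # assumption: provided by the test harness
VPATH = $(SRCDIR)

CC ?= clang

# $@ expands to the target (a.out) and $< to the first prerequisite as
# make resolved it, i.e. $(SRCDIR)/main.c -- using them instead of bare
# file names keeps the recipe correct wherever the source was found.
a.out: main.c
	$(CC) -g -o $@ $<
```

Running "make -f $(SRCDIR)/Makefile" from an empty build directory would then leave a.out in the build tree while reading sources from the source tree.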

I still fully support being able to build the tests out of tree, I
just think it may be a bit more involved than you realise.

cmake

I agree that using cmake for building tests would make some things simpler.
Building fully working executables is fairly tricky, especially when
you're cross-compiling. Take test/testcases/Android.rules for example.
This is basically a reimplementation of the android cmake toolchain
file distributed with the android ndk. If we had cmake for building
tests we could delete all of that and replace it with
-DCMAKE_TOOLCHAIN_FILE=$(ANDROID_NDK_HOME)/android.toolchain.cmake.
However, I only had to write Android.rules once, so it's not that
big of a deal for me.

explicit RUN lines:

Yes, it's true that all you need is custom CXXFLAGS (and LDFLAGS), but
those CXX flags could be quite complex. I'm not really opposed to
that, but I don't see any immediate benefit either (the only impact
for me would be that I'd have to translate Android.rules to python).

running clean after every test

Currently we must run "make clean" after every test because make does
not track the changes in its arguments. So, if you first run "make
MAKE_DWO=NO" and then "make MAKE_DWO=YES", make will happily declare
that it has nothing to do without building the DWO binary. (This will
go away if we run each test variant in a separate directory).
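To illustrate (a hypothetical reduction, not the actual test Makefile): nothing in the rule below depends on MAKE_DWO, so make considers a.out up to date even when the variable changes between runs.

```makefile
# Hypothetical sketch of the staleness problem. After
#   make MAKE_DWO=NO && make MAKE_DWO=YES
# the second invocation reports nothing to do, because make only
# compares file timestamps and never records the variables (and hence
# compiler flags) that produced a.out. Running each variant in its own
# directory, or a "make clean" in between, avoids the stale binary.
CC ?= clang
CFLAGS ?= -g

ifeq ($(MAKE_DWO),YES)
CFLAGS += -gsplit-dwarf
endif

a.out: main.c
	$(CC) $(CFLAGS) -o $@ $<
```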

I don't have any data, but I would expect that running make upfront
would make a significant performance improvement on windows (spinning
up tons of processes and creating/deleting a bunch of files tends to
be much slower there), but to have no noticeable difference on other
platforms.

Looks like I missed a party. :slight_smile:

I'll try to give my thoughts on some of the things that were said here:

make -C

I don't think make -C does what you think it does. "make -C foo" is
basically equivalent to "cd foo && make", which is what we are doing
now already. Of course, you can make this work, but you would have to
pass an extra OUTDIR=... argument to make and then modify the
Makefiles to reference $(OUTDIR) for its outputs:
$(OUTDIR)/a.out: main.cc
  $(CC) -o $(OUTDIR)/a.out main.cc ...

The standard way of doing an out-of-tree build with make is to have
the Makefile in the build-directory and to set the magic VPATH
variable in the Makefile (or as a part of make invocation). VPATH
alters make's search path, so when searching for a dependency foo and
the foo is not present in the current (build) directory, it will go
searching for it in the VPATH (source) directory. You still need to be
careful about paths in the command line (generally this means using
make variables like $@ and $< instead of bare file names), but our
makefiles are generally pretty good at this already. We even have a
couple of makefiles using VPATH already (see TestConcurrentEvents) --
Todd added this to speed up the build by spreading out tests over
different folders while sharing sources (the serial execution
problem).

This is of course correct. Thanks for pointing this out and for outlining all the necessary changes!

I still fully support being able to build the tests out of tree, I
just think it may be a bit more involved than you realise.

Sounds good.

cmake

I agree that using cmake for building tests would make some things simpler.
Building fully working executables is fairly tricky, especially when
you're cross-compiling. Take test/testcases/Android.rules for example.
This is basically a reimplementation of the android cmake toolchain
file distributed with the android ndk. If we had cmake for building
tests we could delete all of that and replace it with
-DCMAKE_TOOLCHAIN_FILE=$(ANDROID_NDK_HOME)/android.toolchain.cmake.
However, I only had to write Android.rules once, so it's not that
big of a deal for me.

explicit RUN lines:

Yes, it's true that all you need is custom CXXFLAGS (and LDFLAGS), but
those CXX flags could be quite complex. I'm not really opposed to
that, but I don't see any immediate benefit either (the only impact
for me would be that I'd have to translate Android.rules to python).

running clean after every test

Currently we must run "make clean" after every test because make does
not track the changes in its arguments. So, if you first run "make
MAKE_DWO=NO" and then "make MAKE_DWO=YES", make will happily declare
that it has nothing to do without building the DWO binary. (This will
go away if we run each test variant in a separate directory).

I don't have any data, but I would expect that running make upfront
would make a significant performance improvement on windows (spinning
up tons of processes and creating/deleting a bunch of files tends to
be much slower there), but to have no noticeable difference on other
platforms.

I had not thought of Windows being a possible bottleneck. That sounds plausible. I'm still wondering how useful incremental builds of the testsuite are going to be. When I look at our bots, almost all incoming commits are not in LLDB, but in LLVM or Clang. If we are modeling dependencies correctly, then each of these commits that changes the compiler should still trigger a full rebuild of all testcases, even with CMake. The only situation where incremental builds would be useful is a configuration where we are building the LLDB testsuite against a fixed compiler, such as a released/stable version of clang or gcc. That's not to say that that isn't an important use case too, of course, but it's not how the bots on Green Dragon are configured at the moment.

-- adrian

It wouldn't speed up the bots, but it may make the workflow of an lldb
developer (who changes only lldb code most of the time) faster.
However, I guess that doesn't matter, as that's not the direction
we're going now.

I uploaded my first attempt of implementing something along these lines to https://reviews.llvm.org/D42281 . Feedback of all kinds is very welcome!

-- adrian