Questions about the LLDB testsuite and improving its reliability

Hi lldb-dev!

I've been investigating some spurious LLDB test suite failures on the Jenkins bots that were caused by build artifacts from previous runs lying around in the test directories, and this prompted me to ask a couple of general noob questions about the LLDB testsuite.

My understanding is that all execution tests are compiled using `make` in-tree, i.e., the test driver (dotest.py) effectively executes something equivalent to `cd $srcdir/packages/.../mytest && make`. It does this serially for all configurations (dwarf, dSYM, dwo, ...) and relies on the `clean` target being implemented correctly.

I don't understand all the design decisions that went into the LLDB testsuite, but my naive intuition tells me that this is sub-optimal (because of the serialization of the configurations) and dangerous (because it relies on make clean being implemented correctly). It seems to me that a better approach would be to create a separate build directory for each test variant and then invoke something like `cd $builddir/test/mytest.dwarf && make -C $srcdir/packages/.../mytest`. This way all configurations can build in parallel, and we can simply nuke the build directory afterwards and this way get rid of all custom implementations of the `clean` target.

- Is this already possible, and/or am I misunderstanding how it works?
- Would this be a goal that is worthwhile to pursue?
- Is there a good reason why we are not already doing it this way?

thanks,
adrian
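
For illustration, the per-variant layout proposed above could be sketched roughly as follows. This is only a sketch under assumptions: the function names, the `test/` subdirectory, and the lowercase variant names are hypothetical, and the `make -C` invocation simply mirrors the command from the post. Since `make -C` switches into the source directory before reading the Makefile, the shared rules would still need to be taught where to write their outputs; that part is not shown.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Variant names taken from the post (dwarf, dSYM, dwo); lowercased here
# purely as a hypothetical directory-naming convention.
VARIANTS = ("dwarf", "dsym", "dwo")


def plan_variant_builds(test_src_dir, build_root, variants=VARIANTS):
    """Return one (working directory, argv) pair per debug-info variant."""
    src = Path(test_src_dir)
    plan = []
    for variant in variants:
        # e.g. $builddir/test/mytest.dwarf
        cwd = Path(build_root) / "test" / f"{src.name}.{variant}"
        plan.append((cwd, ["make", "-C", str(src)]))
    return plan


def run_plan(plan, jobs=4):
    """Run every planned build in its own directory, several at a time."""
    def build(step):
        cwd, argv = step
        cwd.mkdir(parents=True, exist_ok=True)
        return subprocess.run(argv, cwd=cwd).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(build, plan))


# Only print the plan here; run_plan(plan) would actually invoke make.
plan = plan_variant_builds("/src/packages/mytest", "/build")
for cwd, argv in plan:
    print(cwd, " ".join(argv))
```

Because every variant gets its own directory, cleaning up is a matter of deleting the build root rather than trusting each test's `clean` target.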

If we’re going to be making any significant changes to the way inferiors are compiled, why not use cmake? Make clean is already not implemented correctly in many places, leading to lots of remnants left over in the source tree after test runs. Furthermore, make is run every single time currently, leading to hundreds (if not thousands) of unnecessary compilations. Seems to me like all the inferiors should be compiled one time, up front, as part of the configure step, and into the build directory. This is nice because it already integrates perfectly into the existing LLVM “way” of building things.

I think this is a much better strategy. FWIW, I wouldn't object if you
want to switch to cmake entirely as LLVM is using it as its only true
build system, but that seems a much larger change.
In any case, whatever gets decided, happy to help you with that.

It would be really great to get all the binaries that you need for tests building outside of the test directory. It was done in tree originally for expediency - the tests need to know where their binaries are, and that task is simple if they are in CWD of the test. But it is annoying, both because it relies on each test to clean up after itself, and because you can't preserve one test run's results while making another, or preserve the debug variants. Plus, jamming stuff willy-nilly into the source tree is not something you should do.

It shouldn't be that hard to make a parallel hierarchy for the tests in the build directory, and pass that to the test as the root for products. That would be a valuable project!

Jim
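
A minimal sketch of the "parallel hierarchy" idea, assuming the test driver knows both the source root and a build root (the function name and the example paths are hypothetical, not part of dotest.py):

```python
from pathlib import Path


def build_dir_for(test_src_dir, src_root, build_root):
    """Mirror a test's location under src_root into build_root.

    Raises ValueError if the test does not live under src_root.
    """
    relative = Path(test_src_dir).relative_to(src_root)
    return Path(build_root) / relative


# Hypothetical paths, for illustration only.
products_root = build_dir_for(
    "/src/lldb/packages/functionalities/inferior-assert",
    "/src/lldb/packages",
    "/build/lldb-test-build",
)
print(products_root)
```

The resulting path is what would be handed to the test as the root for its products, instead of the test assuming its binaries are in the current working directory.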

As we're discussing lldb test suite changes, another detail that I
find a little weird is that every time you execute the test suite you
get a new build directory named after the time at which you run the
test.
It would be much much better IMHO to just have a `log/` generic
directory where the failures are logged, and those who want to
override this setting can just pass a flag.

I often use the fact that the logs of the past couple of runs are preserved, and often find this valuable after the fact, so I'd like it to continue to be the default.

If you want to change it, you can pass "-s log" or whatever you like to dotest.py.

Jim
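
The two behaviours being discussed (a timestamped directory by default, a fixed name on request) could look roughly like this. This is a sketch only: the function name, the timestamp format, and placing the directory under a build root are assumptions for illustration, not a description of what dotest.py does.

```python
import time
from pathlib import Path


def session_dir(build_root, override=None, now=None):
    """Pick the log directory for one test run.

    override: fixed directory name such as "log" (the flag-style choice).
    now:      optional time.struct_time, mainly so the result is testable.
    """
    if override is not None:
        return Path(build_root) / override
    # Assumed timestamp layout; one directory per run keeps older logs around.
    stamp = time.strftime("%Y-%m-%d-%H_%M_%S", now or time.localtime())
    return Path(build_root) / stamp


print(session_dir("/build", override="log"))
print(session_dir("/build"))
```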

(The logs should also be moved out of tree, FWIW).

To be honest, I have not considered the tests to be part of the build. Doing so is an interesting idea that I hadn't thought about so far. Here are some thoughts about this:
- CMake is more a replacement for autoconf than for make and I'm not sure if we need a better tool for the "configuration" part of the testsuite.
- Some of the tests purposefully do weird stuff, such as deleting or damaging one .o file to test LLDB's abilities to cope with incomplete debug info. To implement this we will need to micro-manage things at the "make" level that could be hard to express in CMake.
- The LLVM/CFE way of building *tests* is to hard-code the commands to recompile the testcases every time you run the testsuite.
- We probably want to be able to run the LLDB testsuite using many different compilers. It could be possible that CMake is helpful for this, but then again I doubt that we need much more than setting a custom CC/CXX/CFLAGS to support this.

I don't want to immediately shoot this idea down, but I think it is one step further than I would like to go at this time. Thanks for pointing it out though!

-- adrian

I would prefer having all of our test dependencies tracked by CMake for all the reasons Zach brought up, but I think we should defer that undertaking until after the bots are more stable. We have some immediate problems caused by stale in-tree test artifacts. As a first milestone, it'd be great to not have to run `git clean -fdx` anymore.

Building each test variant in its own build directory sgtm as a starting point.

vedant

If I'm going to move the test build artifacts out-of-source-tree the logs would naturally end up there, too. Let's discuss whether creating a timestamped log directory should be the default or an option in a different thread to keep things simple. This is entirely orthogonal.

-- adrian

I'm pretty sure I do not want to go all the way to CMake right now, but I am curious about your motivation: Why do you think that using CMake to build the tests in the testsuite is a good idea? In my opinion this adds a layer of complexity to the tests that makes it harder to understand what exactly is happening and test authors now need to understand CMake *and* the compiler invocations they want to execute in their tests. Do you also share Zachary's point of view that the tests should be build artifacts that should be kept after an incremental build?

-- adrian

I don't think new test authors really need to understand CMake any more than they currently need to understand Make, which is to say, not very much. Most Makefiles are currently 1-2 lines of code that do nothing other than include the common Makefile.

On the other hand, CMake defines a lot of constructs designed to support portable builds, so actually writing and maintaining that common CMake build file would be much easier. The existing Makefile-based system already doesn't require you to understand the specific compiler invocations you want. Here are 3 Makefiles, which are hopefully representative given that I pulled them completely at random.

breakpoint-commands/Makefile:
LEVEL = ../../../make
CXX_SOURCES := nested.cpp
include $(LEVEL)/Makefile.rules

functionalities/inferior-assert:
LEVEL = ../../make
C_SOURCES := main.c
include $(LEVEL)/Makefile.rules

types:
LEVEL = ../make
# Example:
#
# CXX_SOURCES := int.cpp
include $(LEVEL)/Makefile.rules

None of this is particularly interesting. There are very few tests that need to do something weird. I opened 10 other random Makefiles and still didn't find any. I don't believe it would be hard to support those cases.

So now instead of "understand Make" it becomes "understand CMake", which is already a requirement of building LLVM.
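
To make the point about how little these Makefiles say more concrete, here is a rough sketch that pulls the declarative content out of one of them. The parser and its name are hypothetical and deliberately naive: it only understands plain `VAR = value` / `VAR := value` assignments, `#` comments, and `include` lines, which is an assumption about what typical test Makefiles contain.

```python
import re

# NAME = value  or  NAME := value
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*(?::=|=)\s*(.*)$")


def parse_test_makefile(text):
    """Return (variables, includes) for a trivial test Makefile."""
    variables, includes = {}, []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line.startswith("include "):
            includes.append(line[len("include "):].strip())
            continue
        match = ASSIGNMENT.match(line)
        if match:
            variables[match.group(1)] = match.group(2).strip()
    return variables, includes


example = """LEVEL = ../../../make
CXX_SOURCES := nested.cpp
include $(LEVEL)/Makefile.rules
"""
variables, includes = parse_test_makefile(example)
print(variables, includes)
```

Everything a test like this declares is a list of sources plus a pointer to the shared rules, which is the kind of information any other build description would also need to carry.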

Fair point. I would suggest that I'll try to make LLDB's testsuite build out-of-tree using the existing Makefile system. That should be a generally useful first step. After doing this I will hopefully have a much better understanding of the requirements of the Makefiles and then we can revisit this idea with me actually knowing what I'm talking about :)

If our test suite was lit-based where you actually had to write compiler invocations into the test files, I would feel differently, but that isn't what we have today. We have something that is almost a direct mapping to using CMake.

Question: how would you feel about converting the Makefiles to LIT-style .test files with very explicit RUN-lines?

-- adrian

I'm not sure what you mean by this.

Jim

Instead of using a build system at all to build the tests, we would hard-code the compiler and linker invocations without encoding any dependencies. Because we still need this to be configurable, it would probably look something like this:

  RUN: %CXX test.cpp -O0 %CXXFLAGS -o test.exe
  RUN: %test_driver test.exe mytest.py

-- adrian
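
As a rough illustration of how such configurable RUN lines get turned into commands, here is a small substitution expander. It is a simplified sketch, not lit's actual implementation, and the substitution values (`clang++`, `-g`, `run-test`) are hypothetical. Longer keys are expanded first so that `%CXXFLAGS` is not mistaken for `%CXX` followed by `FLAGS`.

```python
def expand_run_line(line, substitutions):
    """Turn a 'RUN: ...' line into a shell command string."""
    prefix = "RUN:"
    stripped = line.strip()
    if not stripped.startswith(prefix):
        raise ValueError(f"not a RUN line: {line!r}")
    command = stripped[len(prefix):].strip()
    # Longest keys first, so %CXXFLAGS wins over its prefix %CXX.
    for key in sorted(substitutions, key=len, reverse=True):
        command = command.replace(key, substitutions[key])
    return command


# Hypothetical configuration values.
subs = {"%CXX": "clang++", "%CXXFLAGS": "-g", "%test_driver": "run-test"}
build_cmd = expand_run_line("  RUN: %CXX test.cpp -O0 %CXXFLAGS -o test.exe", subs)
drive_cmd = expand_run_line("  RUN: %test_driver test.exe mytest.py", subs)
print(build_cmd)
print(drive_cmd)
```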

As a general rule, I support moving towards explicit run lines and lit-style tests, but some care has to be taken. If you examine the common Makefiles, you’ll see that there’s already a lot of special logic for different platforms and compilers. It might be hard to maintain that if we go back to explicit run lines. I’m sure there’s a way and I’m happy to help brainstorm ideas for how to do it.

As a first idea, maybe we could have something like a REQUIRES line but call it COMPILER-SETTINGS instead. And you could write something like:

COMPILER-SETTINGS: pic, dwarf, shared-library

And that would be responsible for figuring out what compiler options to put depending on the platform and compiler.

The main challenge with using explicit run lines is going to be figuring out how to write run lines that work across all compilers and platforms. (Luckily we don’t have to care about MSVC, mostly just clang + gcc.)
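
One way to picture the COMPILER-SETTINGS idea is a lookup from abstract setting names to per-platform flags. The table below is a hypothetical sketch: the platform keys and the choice of `-g` for `dwarf` are assumptions, and a real version would also need to key on the compiler.

```python
# Abstract setting -> platform -> compiler/linker flags (illustrative only).
FLAG_TABLE = {
    "pic": {"linux": ["-fPIC"], "darwin": ["-fPIC"]},
    "dwarf": {"linux": ["-g"], "darwin": ["-g"]},
    "shared-library": {"linux": ["-shared"], "darwin": ["-dynamiclib"]},
}


def compiler_flags(settings_line, platform):
    """Translate a 'COMPILER-SETTINGS: a, b, c' line into a flag list."""
    key, _, value = settings_line.partition(":")
    if key.strip() != "COMPILER-SETTINGS":
        raise ValueError(f"not a COMPILER-SETTINGS line: {settings_line!r}")
    flags = []
    for name in (part.strip() for part in value.split(",")):
        if not name:
            continue
        try:
            flags.extend(FLAG_TABLE[name][platform])
        except KeyError:
            raise ValueError(
                f"unsupported setting {name!r} on {platform!r}"
            ) from None
    return flags


line = "COMPILER-SETTINGS: pic, dwarf, shared-library"
print(compiler_flags(line, "linux"))
print(compiler_flags(line, "darwin"))
```

Test authors would then only name the properties they need, and the platform-specific logic that lives in the common Makefiles today would move into a table like this one.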

I'm worried we'd back into building another make system over time. What advantage would we get from this?

Jim

Lit provides some helpful utilities which make it easier to write portable tests, e.g., %t for temporary test directories, and portable replacements for things like `diff -r`. This is how compiler-rt's end-to-end tests are structured, and we haven't needed any build-system-like functionality there.

vedant

(It's possible that this isn't the right trade-off, I'm just exploring ideas here)

Some advantages would be:
- remove the dependency on Make
- possibly easier to debug testcases because all actions are explicit
- potentially slightly faster build times because we don't need to spawn make, but note that it also means that these actions will always be unconditionally executed
- uniformity with other LLVM projects and thus a smaller cognitive burden for developers that touch both clang and lldb

On the other hand:
- everybody already knows make
- maybe we do want a full build system to allow incremental builds of testcases

-- adrian

Note that we’re going off topic from the original goal, and I just want to re-iterate that I fully support smaller, incremental changes. But since I like talking about lit so much, I can’t help but chime in :)

If we did want to move to a lit based system for the end to end based tests, the first step would be to make an LLDBTestFormat and teach it to literally just call dotest.py in single-process mode with the path to a specific test file.
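
A very rough sketch of what the core of such a format might compute is shown below. It is deliberately standalone: a real LLDBTestFormat would presumably subclass lit's test-format base class and report results through lit, which is not shown. The class name is hypothetical, and the dotest.py arguments used here (`--no-multiprocess`, and `-p` as a file-name filter followed by the directory to search) are assumptions that should be checked against the dotest.py in your tree.

```python
import sys
from pathlib import Path


class DotestFormatSketch:
    """Builds the command that would run one test file via dotest.py."""

    def __init__(self, dotest_path, python=sys.executable,
                 single_process_args=("--no-multiprocess",)):
        self.dotest_path = str(dotest_path)
        self.python = python
        # Assumed flag(s) for single-process mode; adjust as needed.
        self.single_process_args = list(single_process_args)

    def command_for(self, test_file):
        test_file = Path(test_file)
        return [
            self.python,
            self.dotest_path,
            *self.single_process_args,
            "-p", test_file.name,    # assumed: restrict to this file name
            str(test_file.parent),   # assumed: directory to search for tests
        ]


fmt = DotestFormatSketch("dotest.py", python="python")
cmd = fmt.command_for("/src/test/functionalities/TestExample.py")
print(" ".join(cmd))
```

With one such command per test file, lit would own scheduling and parallelism, and dotest.py would only ever see a single test at a time.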

That would be a great first step, and also very manageable and bite sized. I think we would see a measurable reduction in flakiness just from this change.

On the other side though, I still think it is very important to increase test coverage of non end-to-end tests. When you increase the granularity of the tests you’re able to write, a lot more things become possible. For non end-to-end based tests, we can still use a more traditional llvm style lit test based on lldb-test (which is still new, but the foundation is there).