[RFC] Cross-project lit test suite

Dear all,

Recently, a number of my colleagues and I have run into cases where we would like the ability to write tests that involve components from multiple LLVM projects, for example using both clang and LLD. Similarly, I have seen a few instances recently where tests would ideally make use of LLD, but only to help generate input objects for testing an LLVM tool such as llvm-symbolizer (see for example https://reviews.llvm.org/D88988). Currently, there is no location where lit tests that use both clang and LLD can be put, whilst the llvm-symbolizer cases I’ve hit are testing llvm-symbolizer (and not LLD), so don’t really fit in the LLD test suite.

I have therefore prototyped a lit test suite that would be part of the monorepo and which can support tests that use elements from multiple projects - see https://reviews.llvm.org/D95339. Tests could be added to this suite as needed. The suite is modelled as an additional top-level directory and is enabled via the “cross-project-tests” project in CMake. I have initially added support for both LLD and clang. At configuration time, tests that require LLD or clang are only enabled when the respective projects are enabled, so that developers continue to benefit from the subset of tests applicable to the projects they are building. Note that I am not especially familiar with CMake or lit, so this code may not be perfect, but it should be sufficient to demonstrate what it can do.
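
To give an idea of the mechanics, enabling the suite with the prototype applied is just a matter of adding it to the usual project list at configure time (the project name below is the one used in the patch; everything else is a standard monorepo build):

  cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lld;cross-project-tests" /path/to/llvm-project/llvm

The suite’s tests are then intended to run as part of check-all, with any test that needs a project you haven’t enabled simply not being run.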

One could argue that these sorts of tests belong in the (external to the monorepo) test-suite, but that is a) quite distant from the existing testing, and therefore easily forgotten, delaying feedback for any breakages and potentially resulting in duplicate testing, and b) not as easy to set up and run (since it isn’t part of the monorepo, isn’t connected to check-all, etc.), making it harder for developers to maintain the tests.

Back in October 2019, there was an extensive discussion on end-to-end testing and how such tests should be written (starting from https://lists.llvm.org/pipermail/cfe-dev/2019-October/063509.html). The suggestion was that these tests would be lit-based, run as part of check-all, and not live inside the clang tree, although there was some opposition. This concluded with a round table. Unfortunately, I am unaware of what the conclusion of that round table conversation was, so it’s possible that what I am proposing is redundant or being worked on by someone else.

Additionally, I don’t consider all classes of tests that the proposed lit suite would be useful for to be “end-to-end” testing. For example, llvm-symbolizer is usually used on linked output containing debug information. Tests that consume objects containing debug data usually rely on assembly that has been written by hand or generated by clang prior to commit, with a limited set able to use yaml2obj to generate the debug data instead. However, the output of these approaches is typically not a fully linked image (yaml2obj output can be made to look like one, but getting all the addresses to match up in a maintainable manner makes this approach not particularly viable). Being able to use LLD to link the object file produced would make such tests significantly more readable, just as using llvm-mc and assembly to generate test inputs is preferable to using prebuilt binaries. Such a test is ultimately no more an end-to-end test than an llvm-symbolizer test that just uses the object produced by the assembler directly.
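
To make this concrete, here is a rough sketch (not taken from the patch) of the kind of test I have in mind, written as an assembly (.s) lit test. It assumes the suite defines lit features for the enabled projects (so REQUIRES: lld works) and pins the text address so the queried address is deterministic; exact feature names and substitutions would be whatever the prototype ends up providing:

  # REQUIRES: lld, x86-registered-target
  ## LLD is only used to produce a linked input here; llvm-symbolizer is the
  ## tool actually under test.
  # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux -g %s -o %t.o
  # RUN: ld.lld -e foo -Ttext=0x1000 %t.o -o %t
  # RUN: llvm-symbolizer --obj=%t 0x1000 | FileCheck %s

  # CHECK: foo
  # CHECK-NEXT: {{.*}}.s:{{[0-9]+}}

  .text
  .globl foo
  foo:
    nop
    ret

The equivalent today needs either a prebuilt binary checked in as an input, or yaml2obj gymnastics to fake a linked image.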

What do people think?

James


Is this similar to what you are looking for?

https://github.com/opencollab/llvm-toolchain-integration-test-suite/

-Tom


Thanks for the tip. At first glance, this looks to have some implementation that could prove useful, in that it allows tests to be run dependent on available tools, so I certainly might be able to use it to improve the existing prototype. However, it doesn’t address the more fundamental issues of distance and ease (i.e. the tests are not part of the monorepo and aren’t part of check-all as far as I understand it).

Some concerns (the usual: things should be tested in isolation, things should be tested independently - but end-to-end tests have some value too), but generally seems good.

Though perhaps debuginfo-tests (this presumably already supports the multiple-subproject mechanical issue you’re discussing?) could be generalized/renamed to be all our cross-project lit testing (essentially an in-monorepo, lit-only, high-reliability/not-flakey/etc version of the test-suite).

- Dave

Indeed this is a usual concern: such tests shouldn’t be seen as replacing isolated lit tests (“unit tests”).
But I have another question about the cost of maintenance here: are we gonna revert patches to either project when one of the integration tests fails?
What about integration tests that need to be updated manually when changing another component?
I find the cost of maintenance of end-to-end tests is often hard to carry over time, especially since they are only supplementing and not replacing lit “unit-tests”.

Best,

Mehdi

Indeed this is a usual concern: such tests shouldn’t be seen as replacing isolated lit tests (“unit tests”).
But I have another question about the cost of maintenance here: are we gonna revert patches to either project when one of the integration tests fails?

Possibly, yeah. If they demonstrate a bug.

What about integration tests that need to be updated manually when changing another component?

If they need to be updated, because their failure isn’t representative of a bug, yes.

I find the cost of maintenance of end-to-end tests is often hard to carry over time, especially since they are only supplementing and not replacing lit “unit-tests”.

One of the nice things about end-to-end tests (as with all tests, if designed carefully - e.g. don’t take some arbitrary code, compile it with optimizations, and expect a very specific backtrace, since optimizations might lead to different line table/stack frame details (if some code was merged or moved, it might lose or gain a specific source file/line)) is that they can be pretty resilient to implementation details, so they are less likely to need updating when those details change. If someone changes the output format of llvm-symbolizer these tests would require updating, and I think it’d be reasonable to expect them to be updated rather than left failing.

- Dave

Indeed this is a usual concern: such tests shouldn’t be seen as replacing isolated lit tests (“unit tests”).

I completely agree. Indeed, I’d class the llvm-symbolizer test referenced in the review as an isolated test - LLD was just being used to generate the input for the test. The key here is that the thing being tested was the llvm-symbolizer behaviour, and not the linker behaviour. As mentioned, this isn’t really different from how llvm-mc or llc might be used to convert some input source (asm/IR etc) into the kind of input the tool under test runs on. Potentially, changes in those tools might break things, but as long as the input is specific enough, this shouldn’t happen often.

But I have another question about the cost of maintenance here: are we gonna revert patches to either project when one of the integration tests fails?

Possibly, yeah. If they demonstrate a bug.

That would be my intention - these tests should be classed as first-class citizens as much as any other lit test. They’re just unit tests that use other components (LLD, clang etc) to generate inputs.

What about integration tests that need to be updated manually when changing another component?

If they need to be updated, because their failure isn’t representative of a bug, yes.

Hopefully these sorts of failures would occur only infrequently. As noted, my aim here isn’t to provide a place in open-source LLVM to do integration testing, but rather to host unit tests that just can’t sit in the corresponding lit area, so input variability should be minimal.

That being said, the proposed suite could be used for integration testing, if the community agreed such testing belonged in the monorepo - indeed, I have plans for some downstream integration tests that would make use of this if it lands - but that isn’t the goal here.

I find the cost of maintenance of end-to-end tests is often hard to carry over time, especially since they are only supplementing and not replacing lit “unit-tests”.

One of the nice things about end-to-end tests (as with all tests, if designed carefully - e.g. don’t take some arbitrary code, compile it with optimizations, and expect a very specific backtrace, since optimizations might lead to different line table/stack frame details (if some code was merged or moved, it might lose or gain a specific source file/line)) is that they can be pretty resilient to implementation details, so they are less likely to need updating when those details change. If someone changes the output format of llvm-symbolizer these tests would require updating, and I think it’d be reasonable to expect them to be updated rather than left failing.

Right, just as we would change other tests for tools where the output format changed.


Though perhaps debuginfo-tests (this presumably already supports the multiple-subproject mechanical issue you’re discussing?) could be generalized/renamed to be all our cross-project lit testing (essentially an in-monorepo, lit-only, high-reliability/not-flakey/etc version of the test-suite).

The existing debug-info test area looks like it could work if it was more generalised. It looks like we’d want the ability to have tests that would still work without clang/lldb being built, and similarly which could use LLD, but I don’t think those are insurmountable issues - it would just require taking some of the ideas from my existing prototype (or equivalent from elsewhere) and merging them in.

Sounds good to me - might want some buy-in from other debuginfo-tests folks, though.

I don’t mind generalizing debuginfo-tests to support more use-cases, but I have some very practical concerns: Because of the hardware constraints on green-dragon (currently a single Intel Mac Mini running all LLDB tests) I would like to avoid getting into a situation where we need to build more than clang+lldb in order to run the debuginfo-tests. I also don’t want to have to run more than the debuginfo tests on that bot. So as long as a separate “check-…” target is used and the bots keep working, I’m fine with this.

– adrian

Build design question: How would you feel if it was the same check- target, but it dynamically chose what to run based on which projects were checked out/being built? I’m not sure if that’s a good design or not, but having a combinatorial set of explicit check-* names based on which projects are enabled would seem unfortunate.

(Though, inversely - if some set of tests (like the existing debug info ones) live in a subdirectory, then, like the current check-llvm-debuginfo-* etc. where you can specify a target that’s a subdirectory, you could run just the debuginfo ones - that wouldn’t be so bad. But maybe we’d eventually have tests under there that might require lld, for instance - and it’d still be nice for those tests to degrade gracefully to “unsupported” if you weren’t building lld (similarly, if you aren’t building clang, the existing debuginfo tests could degrade to “unsupported”).)

Build design question: How would you feel if it was the same check- target, but it dynamically chose what to run based on which projects were checked out/being built? I’m not sure if that’s a good design or not, but having a combinatorial set of explicit check-* names based on which projects are enabled would seem unfortunate.

Generally I’m not a fan of automatic behavior: you could have a misconfiguration that leaves out a target, and the bot would still run the same target and always be green, but wouldn’t run any tests. But I’m not going to stand in the way if such a setup makes the most sense here.

(Though, inversely - if some set of tests (like the existing debug info ones) live in a subdirectory, then, like the current check-llvm-debuginfo-* etc. where you can specify a target that’s a subdirectory, you could run just the debuginfo ones - that wouldn’t be so bad. But maybe we’d eventually have tests under there that might require lld, for instance - and it’d still be nice for those tests to degrade gracefully to “unsupported” if you weren’t building lld (similarly, if you aren’t building clang, the existing debuginfo tests could degrade to “unsupported”).)

That actually sounds nice. The unsupported category would show up in the test result xml file and thus any change would show in the Jenkins statistics.

– adrian

I just tested a build. After `ninja lldb` (~4000 actions), `ninja lld` requires just ~152 actions. `ninja llvm-symbolizer` requires 7 actions.
Perhaps the additional actions are fine? (Linking lld may need some resources, as it links in IR/CodeGen libraries for LLVM LTO.)

There are currently some magic `IN_LIST` checks in CMakeLists.txt for compiler-rt, lldb and mlir; the compiler-rt and mlir ones are only for one or two tiny targets.

I guess this is fine — that particular bot is already building the lld project.

-- adrian

In this particular case maybe, but this is a valid general concern for the approach: this RFC is about cross-project tests in general and not only this specific case.

In general the ability to not build and run the world to change a single component is quite valuable during development. Historically the testing of these components is very decoupled and it seems important to me to keep this property as much as possible.

Here I can see how this does seem harmless when presented as “LLD was just being used to generate the input for the test. The key here is that the thing being tested was the llvm-symbolizer behaviour, and not the linker behaviour”, but it is very easy to introduce coupling between these components, and increase the maintenance cost.

It’d be nice to have clearer guidelines about what is / isn’t OK there, and it isn’t obvious to me how to draw the line or identify when it is suitable to introduce cross-component dependencies for the sake of testing.

(My previous reply was about llvm-symbolizer. I think that case is fine - and thanks to Adrian for allowing it.

* lld/ELF and lld/COFF are very stable now.
* If we write specific llvm-symbolizer tests, the risk of lld changes causing trouble is low -- as long as we don't hard-code the addresses.
* Symbolization code is tolerant of section/segment address changes.

llvm-symbolizer on object files has comprehensive tests, but on linked images the coverage is probably looser.)
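
To illustrate the "don't hard-code the addresses" point, a test along these lines could extract the address to symbolize from the linked image at test time instead of baking it in (a sketch only - the REQUIRES features and exact substitutions depend on how the suite is configured):

  # REQUIRES: lld, x86-registered-target
  # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux -g %s -o %t.o
  # RUN: ld.lld -e foo %t.o -o %t
  ## Pull foo's address out of the linked image rather than hard-coding it,
  ## so lld remains free to change its default layout without breaking the test.
  # RUN: llvm-nm %t | sed -n 's/^\([0-9a-f]*\) T foo$/0x\1/p' > %t.addr
  # RUN: llvm-symbolizer --obj=%t < %t.addr | FileCheck %s

  # CHECK: foo

  .text
  .globl foo
  foo:
    ret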

I have a similar concern about where to draw the line for such cross-project testing. To give some concrete examples:

* For lld/ELF changes, I generally just run check-lld-elf. The tests passing gives me a lot of confidence that my change is good, and I commit most of my changes this way. When I know my change may affect other binary formats supported by lld, I may choose check-lld. lldb/test/Shell also has some `REQUIRES: lld` tests, but I rarely need to pay attention to them. In rarer cases I need to test a stage-2 build with -fuse-ld=lld.
* For many clang changes, check-clang is sufficient. When changing public APIs, technically clang-tools-extra/lldb/flang/polly may all be affected, but it seems that, as of now, the developer does not have to worry about lldb/flang/polly too much; breakage is rare. (ce5e21868c22479df62ebd8884adc1bd7c964433 is an example where I fixed lldb after an API was removed from clang.)
* For many llvm changes, check-llvm is sufficient. Public API changes sometimes require clang changes (in my experience), but that is very rare. (I think) developers can largely ignore polly if they don't use it. (It also looks like Polly has fewer eyes on it, so a breakage may last longer; personally I've only fixed that three times for others' LLVM commits.)
* If the new top-level project required tight entanglement with lld/ELF output, I'd certainly be concerned. (For lld, given the positive experience with the low maintenance cost of the lldb/test/Shell tests, I think such entanglement is not too likely, but I can imagine some potential friction for the less stable ports of lld.) But requiring `check-Y` in the workflow of contributors who mostly only develop/care about X - I can see that lowering productivity.

Yeah, I wouldn’t expect everyone to test this repo any more than I expect clang, llvm, or lld developers to test the lldb repo before they commit. But if it fails and there’s a reasonable fix someone can make, I think they should.

Similarly, as Fangrui points out, the fact that lldb’s tests haven’t caused a huge amount of friction for the subprojects lldb depends on suggests there’s a good chance this new repo could work out without being too much of a drag.

Still, there are certainly some open questions about what goes here. I’d never want something to go untested in its own repo, so this should only be for extra coverage, not a substitute for testing within a project. Hopefully it can help us check some end-to-end scenarios a bit more easily, to help ensure that the boundaries line up correctly (so we have fewer cases where Clang produces A and LLVM, given A, produces B - but then Clang changes to produce A’ and LLVM doesn’t handle A’ correctly, so B isn’t produced any more in the end-to-end situation).

Thanks all for the input! I’ve culled the thread history to avoid bloat. To summarise the concerns as I see them:

  1. We don’t want to have to test the world to make sure we haven’t broken tests somewhere else. E.g. ideally a change to LLD would only need to run the lld/test tests.

  2. Related to 1) - we don’t want to have to build the world to run the existing debug-info tests.

  3. It’s not clear what should be allowed into the cross-project test area.

Please let me know if I missed any.

For 1), maybe we just don’t expect people to run check-all for every change - in practice we don’t anyway, even if we claim we do. Build bots will report any problems (hopefully as a pre-merge check in Phabricator), and the change can then be fixed/reverted as needed. It’s possibly not ideal, but I would expect tests in this area to be relatively few in number and stable, so the chances of a tool change causing them to fail should be relatively slim. Indeed, in practice, that’s what we already do with other tools. For example, the LLVM binutils like llvm-readobj, llvm-objdump and so on are often used to verify other tools’ behaviour. However, I wouldn’t necessarily expect a developer who has made a small change to them to run the whole set of LLVM tests across all projects, especially if the change shouldn’t affect the output in 99% of cases. If there are odd tests that need updating later, they are usually easy to fix up.

For 2), my prototype adds REQUIRES tags for lld and clang, which are enabled if the relevant projects are enabled. I’d envision changing the existing debug-info tests to make use of these and other tags (lldb etc) in the same manner. These could be configured on a per-directory basis, to avoid needing to modify every test, if desired. If your CMake configuration doesn’t enable building LLD, for example, the check-debug-info target (or whatever it ends up being called) shouldn’t build LLD and would mark such tests as UNSUPPORTED. Alternatively/additionally, depending on how we address 3), we could also divide the test directory into separate sub-directories, allowing check-debug-info/check-lld-llvm-symbolizer/… to target just the specified directories.
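
For illustration, the gating might look something like the following lit configuration sketch (the variable and feature names are placeholders rather than necessarily what the prototype uses): the top-level config turns the enabled projects into lit features, and a per-directory lit.local.cfg can mark a whole subtree as unsupported so it degrades gracefully:

  # lit.cfg.py (sketch): expose the enabled projects as lit features so that
  # tests can say e.g. 'REQUIRES: lld'. 'enabled_projects' would be filled in
  # from CMake via lit.site.cfg.py; the name is illustrative.
  import lit.formats

  config.name = "cross-project-tests"
  config.test_format = lit.formats.ShTest(execute_external=False)
  config.suffixes = [".c", ".cpp", ".s", ".test"]

  for proj in getattr(config, "enabled_projects", "").split(";"):
      if proj:
          config.available_features.add(proj)

  # lit.local.cfg (sketch) for a subdirectory whose tests all need LLD: mark
  # everything below it unsupported when LLD wasn't built, so no individual
  # test needs its own REQUIRES line.
  if "lld" not in config.available_features:
      config.unsupported = True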

For 3), I don’t have a great answer. Whatever we decide, I think it should be documented, so that reviewers can point to the documentation to keep out things that don’t belong. Certainly, it seems like tests which are likely to be fragile to changes in tools (especially tools merely used to generate inputs) should be avoided. I think we’d need to leave it up to those involved in writing/reviewing any new tests to determine whether this is likely to be an issue for a given test. One option for now could be to limit the new tests specifically to llvm-symbolizer tests that make use of LLD, as that is a known hole in test coverage. We could then discuss each new category of tests (e.g. end-to-end testing as discussed in David Blaikie’s latest email) as the need for them arises. The doc could then say something like:

“Only tests that meet one of the following criteria should be added to this location:

  1. <one or more things about the existing debug-info tests, best written by the existing maintainers>.
  2. Tests for llvm-symbolizer and llvm-addr2line which require LLD to create a linked object as an input.

Other tests will be considered on a case-by-case basis, and should be discussed on the llvm-dev mailing list. Should other cases be approved, they should be added to this list.”

James Henderson via llvm-dev <llvm-dev@lists.llvm.org> writes:

Currently, there is no location where lit tests that use both clang and LLD can be put, whilst the llvm-symbolizer cases I’ve hit are testing llvm-symbolizer (and not LLD), so don’t really fit in the LLD test suite. I have therefore prototyped a lit test suite that would be part of the monorepo and which can support tests that use elements from multiple projects - see https://reviews.llvm.org/D95339. Tests could be added to this suite as needed. The suite is modelled as an additional top-level directory and is enabled via the “cross-project-tests” project in CMake.

This is fantastic!

Back in October 2019, there was an extensive discussion on end-to-end testing and how such tests should be written (starting from https://lists.llvm.org/pipermail/cfe-dev/2019-October/063509.html). The suggestion was that these tests would be lit-based, run as part of check-all, and not live inside the clang tree, although there was some opposition. This concluded with a round table. Unfortunately, I am unaware of what the conclusion of that round table conversation was, so it’s possible that what I am proposing is redundant or being worked on by someone else.

I started that thread and IIRC we ended up with the suggestion that such tests should live in test-suite. As you noted, having tests separated from the monorepo is less than ideal. I haven't done anything with this conclusion yet, mostly due to lack of time. If your proposal gains traction I would like to see if we could build end-to-end testing on top of it.

Additionally, I don’t consider all classes of tests that the proposed lit suite would be useful for to be “end-to-end” testing.

Agreed. There are various classes of tests that could make use of your proposed layout, one of which is "end-to-end." Your proposal doesn't provide end-to-end testing per se, but it does make adding end-to-end tests later on more straightforward.

-David

I think a useful distinction here is that lit tests are generally very focused on a specific feature/function, where test-suite has a much broader scope. Another slice at it would be that lit tests tend to be "regression" tests, while test-suite is more of an "integration" suite.

I am not a QA person so I may be abusing some of these terms, but that's how I look at it. Sometimes writing that focused lit test ends up depending on multiple tools, and the cross-project lit suite would be a good place to drop those; debuginfo-tests is a prime example.
--paulr

Given that the debuginfo tests already have cross-project dependencies, I figured I’d try adapting them instead. I’ve updated https://reviews.llvm.org/D95339 accordingly. Ideally, I think making the existing debug-info tests a subdirectory, and renaming the top-level directory, might be a good idea, but I haven’t really come to any conclusions about that yet.

I also found that several of the existing debuginfo-tests tests fail for me. Are these tests expected to work on Windows? If so, are there any slightly more unusual prerequisites that I might be missing?

What do people think?

James