[RFC][LLDB] Support remote run of Shell tests

This RFC proposes to extend LLDB test suite functionality by adding the ability to run Shell tests against the remote target platform.

Motivation

CLI is one of the main ways users interact with LLDB, along with LLDB Python API/SB API. Since some clients are interested in using it for remote debugging, e.g. for new platforms, it is important to ensure that their experience with the remote debugging will be as smooth as with the local one.

Currently, the remote debugging functionality and cases with complicated build scenarios are tested using API tests. However, it would be nice to be able to run Shell tests with remote execution as well, since after connecting to the remote, the state of LLDB and the behavior of the LLDB command interpreter changes (for example see [lldb] Disable find-module.test in case of a remote target by slydiman ¡ Pull Request #94165 ¡ llvm/llvm-project ¡ GitHub). Many Shell tests launch processes, so we want them to do so remotely in the case of a cross environment.

Also, it may reveal potential future bugs, extend the Shell testing capabilities, and draw LLDB contributors’ attention to the importance of considering remote debugging cases in their commits.

Implementation

To support cross environment with the LLDB Shell tests it seems to be sufficient to

  1. Add -O "platform select remote-linux" -O "platform connect <url>" to the %lldb lit substitution in case of remote test run configuration.

  2. Add proper --sysroot <...> flag to %clang substitution in case of cross-compilation.

  3. Disable some tests and make some minor changes that consider that the host and target triples may be different.

Since it is done in a unified way, this does not add to the tests complexity and seems to cause only a little maintenance overhead down the road. Test source files that require complex build scenarios are rather used in API tests anyway. I also think it is better to limit the supported toolchains to clang+lld for remote Shell tests run to avoid the explosive growth of possible build configurations.
Is there a better way of doing this?

The implementation for the proposed change can be found here [lldb][test] Support remote run of Shell tests by dzhidzhoev ¡ Pull Request #95986 ¡ llvm/llvm-project ¡ GitHub. Comments, reviews, other approaches, and ideas are welcomed!

As long as this configuration is run regularly either publicly or with prompt reporting upstream.

And we have a good way to reproduce these issues. So a public bot would be a good excuse to document that more than we have. Between Testing LLDB using QEMU - 🐛 LLDB and Testing - 🐛 LLDB most of the material is there but not in a logical order.

Can you show what a cmake command looks like when you’re setting all these extra things?

Did you find any existing shell tests that fit this description? That would benefit from becoming API tests.

What would explode here exactly? The amount of code to handle the configurations, or the results of builds with those configurations?

The PR you have supports only remote-linux but it’s not obvious to me why it couldn’t be substituting in any platform type.

But of course the obvious result if we allow that is “I ran X → Y testing and it’s broken, is it supposed to work?” which we can only answer with a public builder, and if that’s X → Linux then that’s the only one we can verify works.

So “you aren’t going to need it” kicks in at some point but we could also say “here is a generic mechanism for this, X → Linux is the only setup we as a project verify”.

For clang+lld as well, is it just easier to hardcode adding --sysroot than add a whole set of -DLLDB_EXTRA_CROSS_COMPILE_FLAGS to cmake?

And does this interact with -DLLDB_TEST_COMPILER at all?

It sounds a lot like libcxx does to support emulators and remote runs where the “executor” is replaced with some qemu or ssh command.

One might say why not copy lldb and lldb-server to the remote and run it all there but that’s not testing remote debugging that’s just testing native debugging at arms length. We already get that coverage by running buildbots on multiple platforms.

You could inject the platform select... via an init file but it amounts to the same thing really. In some ways doing it in lit is more clear because when the test fails you’d have the full command visible.

1 Like

I agree. This work is going on Add new LLDB remote-linux cross builder 'lldb-remote-linux-ubuntu'. by vvereschaka ¡ Pull Request #226 ¡ llvm/llvm-zorg ¡ GitHub.

cmake \
  -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=native \
  -DLLDB_CAN_USE_LLDB_SERVER=OFF \
  -DLLDB_TEST_ARCH=aarch64 \
  -DLLDB_TEST_COMPILER=/home/parallels/llvm-build-release/native/bin/clang \
  -DLLDB_TEST_SYSROOT=/home/parallels/fsroot \
  -DLLDB_PLATFORM_URL=<URL> \
  -DLLDB_PLATFORM_WORKING_DIR=/home/ubuntu/lldb-tests \
  -DLLDB_TEST_USER_ARGS='--env;CFLAGS=-fuse-ld=lld;--platform-name;remote-linux' \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_ENABLE_PROJECTS="llvm;clang;clang-tools-extra;lld;lldb;compiler-rt" \
  -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
  -DTOOLCHAIN_TARGET_SYSROOTFS=/home/parallels/fsroot \
  -DTOOLCHAIN_TARGET_TRIPLE=aarch64-unknown-linux-gnu \
  -DRUNTIMES_aarch64-unknown-linux-gnu_CMAKE_SYSTEM_NAME=Linux \
  -DRUNTIMES_aarch64-unknown-linux-gnu_CMAKE_SYSTEM_PROCESSOR=AArch64 \
  -DRUNTIMES_aarch64-unknown-linux-gnu_LIBCXX_HERMETIC_STATIC_LIBRARY=ON \
  -DRUNTIMES_aarch64-unknown-linux-gnu_LLVM_NATIVE_TOOL_DIR=/home/parallels/llvm-build-release/natvie/bin \
  -C ../llvm-project/clang/cmake/caches/CrossWinToARMLinux.cmake \
  ../llvm-project/llvm

ninja install
ninja check-lldb

It may look strange that I use CrossWinToARMLinux.cmake, but I’m reusing it’s capabilities for both Windows and Linux hosts to set up a cross toolchain.

@labath expressed a concern that the remote execution functionality for Shell tests will lead to the situation when the amount of code to handle different compiler/linker/etc combinations will grow too large (his comment on that [lldb][test] Support remote run of Shell tests by dzhidzhoev ¡ Pull Request #95986 ¡ llvm/llvm-project ¡ GitHub).

No. And my point is that we won’t struggle much with the necessity of supporting various compilers/linkers/etc since Shell tests don’t have complex build scenarios. Additionally, we can stick to clang+lld for launching Shell tests remotely to avoid that concern.

I’m not against that, but my primary focus was on the Linux target. My colleagues spent some time on making remote-linux work from Windows host, and even for API tests it revealed a lot of problems (and required us to overhaul Makefile.rules for them), which have been resolved by recent PRs.
Honestly, I’m not even sure if other targets but remote-linux/android/darwin are regularly tested by someone (but if so, I’d like to know more about it!). Also, it would have taken a lot of time to set up all possible build configurations/VMs. That’s why I started from remote-linux.

The only reason LLDB_TEST_SYSROOT is added is to avoid duplicating this “–sysroot” flags for API and Shell tests in the cmake arguments. Yes, it’s not necessary.

In the mainline, this flag is ignored by Shell tests now. There’s some logic in lldb/test/Shell/helper/build.py to choose the “best” compiler for the platform if it is not set explicitly. I didn’t wanna break it in “Remote Shell tests” PR, but I agree that it would be good to reuse -DLLDB_TEST_COMPILER flag in Shell tests if it won’t break anything.

We are going to host public buildbots for cross LLDB.

We start from x86_64 Linux cross AArch64 Linux. This activity is a part of preparation for that. Then x86_64 Windows cross AArch64 Linux will follow.

If everything will be as expected, at that point all known issues should be resolved, the infrastructure should be in place, and adding new configurations should be easy enough for those who will follow with more.

1 Like

These look at a glance to be things that will be built into the lldb binary. Like clang’s default target triple is.

Can they be LLDB_TEST_... ?

(otherwise the command is a lot neater than I’d expected)

Ok I see. It’s not that we couldn’t, but it could lead to a lot of test annotations. In that case I agree that it’s best to stick with as few configurations as possible.

I ask because I use it for API tests from an x86 host to an AArch64 VM. I’m using a host build of lldb, so the clang there won’t cross compile unless I give it a sysroot and all that, which is often harder than pointing at an existing gcc install.

But if we’re going to limit shell tests to clang then maybe it doesn’t make sense to have this global override. At least update the docstring for it, or even change the name. LLDB_TEST_API_TEST_COMPILER ? (I know it’s awkward but I like keeping the LLDB_TEST prefix.

So yeah as long as all these new options are documented on Testing - 🐛 LLDB I think I’m on board.

Though, if we’re limiting ourselves to clang and lld are we implying the latest version or one built from the same sources as the lldb under test? I assume the latter as that avoids the version annotations that Pavel brought up.

This doesn’t seem unreasonable. I imagine most users of this testing configuration are going to be shipping a toolchain and if I was doing that, I’d want everything to match anyway.

The limitation to ToT-clang/lld/etc. tests definitely eases my concerns, though I still feel like this is not a feature we really need right now. I totally believe it can increase the test coverage and find new bugs (those are monotonic functions), but I think it also exacerbates our problems with the reproducibility of lldb tests (see other thread in this category).

I think it’s a very bad experience that new contributors are not able to get a clean test run, because they’ve configured their build slightly differently from what the core lldb devs and buildbots use. The only way I know how to fix that is to write the tests in a way that isolates them from the environment as much as possible. Running tests remotely incentivises the exact opposite, as for a test to be “useful” remotely, it should in some way depend on the environment writ large.

Testing remote debugging is definitely useful, but we already have that in the form of API tests, so I just don’t see a need to extend that here. That said, I also don’t think I get a veto, and I appreciate you trying to address my concerns.

That makes sense. I’ll update that in an MR.

Yes, that is ok for us. But I also think Shell test sources shouldn’t be compiled with “the brand new” clang flags to avoid problems with different clang versions (e.g. mainline clang vs latest clang release).

I think we should check that

  1. clang compiler is used
  2. sysroot is set
  3. remote URL is specified

And launch Shell tests in remote mode only if all three conditions are met. Otherwise, just fall through and try to execute Shell tests the way they are executed currently. Does this sound like a good measure so as not to break current build configurations?

Documenting is important, thank you for reminding about that.

Our users can launch lldb command in their favorite shell and do something like this:

(lldb) platform select remote-linux
(lldb) platform connect connect://remote:1234
(lldb) platform process list
...

This is an expected and supported way of using LLDB, right? We have to have tests to make sure each of these commands works as expected and is not broken.

Could you elaborate on how do you propose to test this with API tests only, please?

The only feasible option I see is to test that each command actually invoke a correct set of API calls in the right order without checking the results of API execution in between the calls. Is this what you are suggesting?

But even so, this does not make end-to-end tests redundant. We can group them together and have them behind “expensive tests” flag for turning on/off or something, if proven to be slow.

This might be a different thread, though. I don’t want to hijack this one to move the discussion to the existing LLDB test suite redesign topic.

API tests can and do test the command line interface as well. A quick grep finds at least half a dozen hits for each of those commands.

Granted, these tests use SBCommandInterpreter::HandleCommand to pass the command strings rather than feeding them through stdin. However:

  • I think we’d have bigger problems (which would get noticed quickly) if this API did not do the same thing as entering the same command through stdin. And these particular commands (unlike e.g. expr, which accepts multi-line input) really don’t do anything special when it comes to their inputs/arguments
  • The “shell” tests aren’t the best mechanism for end-to-end testing anyway because they’re non-interactive (the input comes from a file or pipe rather than a terminal). We actually have a better mechanism for full end-to-end testing using pexpect – which is a subset of API tests. I’m not sure if these currently run remotely (they don’t look like they could complete successfully, but I also did not find anything that would prevent them from running) right now, but it should be possible to make them do so. I think most of them are not interesting to run remotely (e.g. tests for cursor movement, which should really work the same way), but I certainly can imagine running (some of?) them that way.

@DavidSpickett addressed your proposals in PR.

Considering that, I’ve added LLDB_SHELL_TESTS_DISABLE_REMOTE flag that can disable remote execution of Shell tests. I think it’s more clear that turning on remote execution basing on a compiler name. In that case, lit will attempt to run Shel tests locally, similarly to the current behavior.

So we have got to a point with the associated PR ([lldb][test] Support remote run of Shell tests by dzhidzhoev ¡ Pull Request #95986 ¡ llvm/llvm-project ¡ GitHub) where the code changes are ok (at least I think so).

We are still unsure if this feature is going to add enough value right now, and whether it will actually reduce value long term.

I think @labath 's comments on the PR speak to that (I won’t quote them here because they should be read in context).

I don’t think I have the experience to asses that myself.

Another question would be, if we commit this and it does have adverse side effects, does anyone have the authority to remove it. We could define a trial period? It would have to be >= 6 months though given the frequency of changes, and who knows whether anyone would have the appetite to deal with it when the time is up.

Perhaps some concrete data would be coverage for an API only vs. API+shell tests run. Would that be possible @avdzhidzhoev ? (getting lldb-server might be tricky, but lldb alone would be interesting)

Also the opinion of @JDevlieghere would be very useful here. As lead code owner and someone who also (I presume) supports users who do remote debugging.

(and would be the tie breaker if we needed one, but let’s see what we can do to avoid that :slight_smile: )

I haven’t chimed in here because I share Pavel’s concerns and his comments pretty much cover my stance.

Personally, I see the Shell tests are serving a relatively specific purpose, which the website does a good job at documenting. I share the concern that this adds undue complexity and IMHO goes against that spirit of those tests.

That said, I also don’t want to stand in the way here. @avdzhidzhoev has addressed the more concrete concerns and the resulting PR doesn’t seem too invasive, though I still think it’s taking the shell tests in the wrong direction.

1 Like

Thanks Jonas.

Pavel gave an example on the PR of an existing test that was tricky to get right cross platform, and I can’t find it now, but basically it was an example of what the docs already say:

Finally, the shell tests always run in batch mode. You start with some input and the test verifies the output. The debugger can be sensitive to its environment, such as the platform it runs on. It can be hard to express that the same test might behave slightly differently on macOS and Linux. Additionally, the debugger is an interactive tool, and the shell test provide no good way of testing those interactive aspects, such as tab completion for example.

Would folks be more comfortable with remote shell tests if we extended the docs to say:

  • “and this is why you should consider an API test for anything more complicated than…”
  • “although shell tests can be run remotely, as soon a shell test has to be modified significantly or disabled for remote testing, it should be converted to an API test so coverage is not lost”

Phrasing WIP, we can make that more or less strict. Of course this just gives us text to cite, it doesn’t mean anyone (including ourselves) will follow it necessarily.

Perhaps we keep this feature as long as there is a public bot and someone who is committed to helping fix the issues resulting from it (including doing it themselves if they have to). When that’s no longer the case, at the first instance it causes problems, it’s removed.

If we reject this proposal, the alternative for @avdzhidzhoev I think would be to use coverage data to identify shell tests that should be converted to API tests so that there is coverage for remote testing. However this does not address new features, as they already mentioned on the PR I think.

The mitigation there is perhaps to monitor coverage on an ongoing basis, and step in when something major needs an API test for remote testing coverage. This has no impact on upstream lldb, but it is intensive and likely noisy. So I personally wouldn’t bother with it.

Which is all to say, I’m inclined to accept the proposal on the basis that it gives @avdzhidzhoev and their colleagues an early warning mechanism that they need to pay attention to something related to remote testing.

I think it’s clear from others’ repsonses that once that motivation has gone, this feature will likely be removed quite quickly.

Would everyone else be ok with that conclusion?

I think we could accept this proposal because it provides a mechanism for the interested parties to identify places where shell tests are not a good fit for remote testing.

As opposed to accepting this because we want to make shell tests remote compatible as standard. I agree that that is not a good goal.

1 Like

Sorry for the delay, I had to take a leave.

Sounds interesting. I’ll check it out.

I like this idea.

I think that’s already said in the line “A good rule of thumb is to prefer shell tests when what is being tested is relatively simple.”

  • “although shell tests can be run remotely, as soon a shell test has to be modified significantly or disabled for remote testing, it should be converted to an API test so coverage is not lost”

I think that there should be a room left for “local-only” Shell tests. I expect Shell tests to be disabled mostly based on the target platform rather than on the remote/local execution type. I’d write it like

  • “although shell tests can be run remotely, as soon a shell test has to be complicated significantly for remote testing, it should be converted to an API test so coverage is not lost”

Thank you! I think that’s inevitable in case of rejection when we encounter remote testing problems in the future.

I would even go further and say that while it may be possible to run the shell tests remotely, you shouldn’t write a shell test to test remote behavior and that should definitely be an API test. I realize that’s pretty nuanced but hopefully that conveys that we don’t want folks to write new remote shell tests.

Yes, I think that’s the strongest argument in favor for this RFC. We’re generally supportive of things that help others as long as it doesn’t introduce an undue burden on the community. Vladislav, you have clearly made various efforts to do so and committed to maintain this, so in that context I’m happy to accept this.

1 Like

Sounds quite clear to me actually, that’s a good way to put it.

Assuming this RFC is accepted, shell tests can possibly be run against remote targets but this is only to check that scenarios common to remote and local testing work the same way. As soon as you’re into requires-remote or if remote do x else y then go to API testing.

Which isn’t really a change in rules, we just never had to write that one down because a remote shell test today adds 0 coverage for technical reasons.

@avdzhidzhoev perhaps you could add the wording suggestions you/I/Jonas have made, to the testing doc (Testing - 🐛 LLDB) in the PR.

2 Likes

Thank you!

Added.

Now that this ([lldb][test] Support remote run of Shell tests by dzhidzhoev ¡ Pull Request #95986 ¡ llvm/llvm-project ¡ GitHub) has landed, what is the state of the buildbot and what are the future plans for it?

We’re going to fix the configuration a bit (since during the review we’ve changed the names of flags) and send a couple of PRs to make Linux-Linux buildbot green.
We plan to run x86_64 Windows-AArch64 Linux as well.
@vvereschaka can elaborate on that.