[RFC] Enabling the Lit Internal Shell by Default

We propose to enable llvm-lit internal shell by default for all LLVM projects. We believe that by using the internal shell, we will eliminate platform dependencies, improve test run times by reducing the number of subprocess invocations, and improving cross-platform testing consistency.

Background

LLVM’s main test running tool, lit, is used as a test driver for end-to-end tests. It is designed to parse and execute shell-like commands, which are specified in the comments of source files. Currently, lit’s default behavior is to use the system’s shell when executing RUN commands. However, lit also has an internal shell implementation, which is currently the default for Windows systems. This internal shell can be enabled regardless of platform by turning on the LIT_USE_INTERNAL_SHELL environment variable.

Motivations

llvm-lit‘s dependency on the system shell introduces inconsistencies across the various platforms LLVM is expected to support. On POSIX-compliant systems, like Linux, the commands typically execute as expected. However, on non-POSIX platforms, such as Windows and Fuchsia, these commands sometimes fail due to differences in shell environment.

The LLVM community has been discussing ways to improve the usability and functionality of lit for a while now. There has been previous discussion[1] about making improvements to lit’s internal shell, and even to deprecate the use of external shells and convert lit’s internal shell to be the default. There are several reasons that suggest a transition to using lit’s internal shell:

1. Fewer CI Breakages: Inconsistencies in shell behavior across different platforms are a frequent cause of CI breakages. These issues result in increased maintenance efforts as someone has to manually identify and fix the platform-specific failures.

2. Improved run times: This was brought up in a previous discussion[2], where it was observed that using lit’s internal shell resulted in faster times when running lit tests.

3. Enhanced Debug Output: The current debug output often repeats shell commands, making it unnecessarily difficult to navigate the code segments. For example, shell commands appear twice and the execution trace is not always clear. However, in RFC: Improving lit’s debug output[1], the usability of debug output for lit’s internal shell was improved upon to address difficulties with the debug output for external shells.

Implementation Plan

The internal shell currently has missing capabilities necessary for it to run tests across all LLVM subprojects. We have identified 5 missing or incomplete features that would allow us to enable lit’s internal shell by default.

1. Built-in Commands (cat, diff, env, ulimit, unset)

  • Adding additional options to the cat, diff, and env commands.
  • Implementing the ulimit and unset commands.
  • Subprojects affected: clang-tools-extra, compiler-rt, lldb, llvm

2. Environment Variables (XRAY_OPTIONS, DEFAULT_TRIPLE, JITDUMPDIR, etc)

  • Implementing the ability to set environment variables without the use of the env command. For example:

    RUN: XRAY_OPTIONS="patch_premain=false" %run %t | FileCheck %s

  • Subprojects affected: clang, compiler-rt, lld, llvm

3. Brackets (“{}“, “()“, “[ ]“ )

  • Implementing bracket/parentheses/brace recognition and execution.
  • Subprojects affected: bolt, clang, compiler-rt, lld

4. Command Substitution

  • Implementing the use of $() syntax for command substitution
  • Subprojects affected: clang, compiler-rt

5. Command Output Redirection

  • Implementing recognition of |& syntax to redirect stdout and stderr to the following command
  • Subprojects affected: bolt

6. (Other) Python Implementation Fixes

  • Fixing bugs in current lit internal shell implementation (e.g. TypeError, Unsupported Redirect, No Such File or Directory)
  • Subprojects affected: bolt, compiler-rt, lldb, llvm

Implementation Strategy

We plan to address internal shell failures subproject by subproject, starting with LLVM. We hope that by doing this, we can switch on the internal shell for individual test suites as we fill in the missing functionalities, handling any regressions as we tackle each subproject. Our hope is to eventually be able to switch on the internal shell for all of the test suites and make it the default shell across all lit testing.

References

[1] RFC: Improving lit’s debug output RFC: Improving lit's debug output

[2] Why does llvm-lit use bash? Why does llvm-lit use bash?

CC: @connieyzhu @ilovepi

7 Likes

CC: @petrhosek @jayfoad

CC @jdenny-ornl @MaskRay

+1 to the goal of enabling the internal shell by default. I think I measured a ~ 10% speedup just from doing that, on some LLVM CodeGen tests.

As for your implementation plan, I don’t think we should necessarily implement all of the shell features you mentioned; in some cases we could and should change the RUN lines in the affected tests instead. E.g. the bashism foo |& bar can be rewritten as foo 2>&1 | bar and in some cases $() can be rewritten using temporary files.

Another option you could explore: currently lit is configured to either use bash for all tests, or to use the internal shell for all tests (skipping the tests that require an external shell). Instead, wouldn’t it be great if it used the internal shell for most tests and bash only for the tests that require it? One way to achieve that might be to change the tests that require an external shell to explicitly invoke bash on the RUN line, like:

// RUN: bash -c "command goes in here | FileCheck whatever"
3 Likes

This could be keyed off of REQUIRES: shell instead of rewriting the commands to explicitly invoke bash. The REQUIRES is still required, because Windows doesn’t have bash.

2 Likes

+1 to the idea. Avoiding the subprocesses would be a big performance boost on z/OS. I expect we will have concerns and things to hash out though. A number of the commands you mentioned like cat need to work on text files and binary files. For text files we need to make sure code page conversions and done the same way as the shell commands.

1 Like

Eliminating shell inconsistencies sounds good to me.

1 Like

So is there any downsides here? (you are listing lots of motivating upsides, but if there only were upsides then I guess there wouldn’t be much to discuss here)

If I remember correctly there were some concerns in RFC: Improving lit's debug output about making it harder to reproduce failed tests by simply taking the bash command line from the lit output and then run it directly in the shell without running it via llvm-lit.For example to run the test case in a debugger, or to redirect the output to a file, or to just run a single RUN line, etc). Maybe such things won’t be more complicated when using the internal shell? Or is there a risk that the RUN lines will start using features only available in the internal shell making it harder to re-run the test without having to wrap it inside an llvm-lit invocation?

2 Likes

I suspect that internal shell models UNIX shell, so reproducing tests on Windows might become more challenging.

I’m still in favor of RFC, though.

Windows already uses the internal shell by default. I’ve not had any issues with reusing command-lines in general, but then I rarely see the more esoteric features being used.

It’s a minor issue but when I saw this RFC I noticed in our LLVM subproject that we have lots of incompatible issues with this setting leading to failing tests. But I imagine we’ll just have to flip it off and fix those.

Speaking for former colleagues working with LLVM on Windows, I support this RFC, and it’s a big win for improving LLVM lit test portability. Developers often use bash-isms when writing lit tests, and in the end, it’ll be easier to write portable tests if we limit the shell dialect to just what lit supports.

2 Likes

That was my only, minor, source of reservations. But:

So +1 from me. Using LIT internal shell means consistency, and consistency is key :slight_smile:

If that’s a problem, how hard would it be to just provide an llvm-lit-shell tool?

2 Likes

This can be a concern for complicated pipelines, but lit prints the commands with an escaping strategy that makes them easy to copy-paste into cmd or bash. It’s usually not too much of a burden to reconstruct pipelines when they come up.

Personally, I usually don’t want to keep the pipe to FileCheck, I usually want to re-run the command and pipe the output to less or a file, so I kind of prefer the way the internal shell prints each process command line separately.

2 Likes

Yes, this would be a good idea to replace ‘|&’ with ‘2>&1 |’ to minimize the use of bashisms. For $() though, we do think supporting this syntax would provide a better developer experience so that we don’t constantly have to utilize intermediary temp files.

We agree that this could be a valuable improvement to make things more efficient. However, it does complicate matters because REQUIRES affect the entire file. This is definitely something to consider in the future. However, in our opinion, the long term goal should be to eliminate uses of REQUIRES: shell, so that our tests are portable across all supported platforms.

We had the same understanding of the internal shell - since it’s already the default for Windows, for consistency’s sake we would like to use it across other platforms as well.

While we plan to make the internal shell the default for LLVM, we recognize the ongoing need to support native shell invocation for debugging. Ensuring that the generated native shell invocation matches the internal shell execution logic will require thorough testing.
And it’s probably worth considering the suggested llvm-lit-shell, to run commands in the same way as tests are run. We imagine that it would improve usability. However, it is probably better left for a future project or another contributor. As students and interns with limited time, our immediate goal is to improve lit’s internal shell to make it the default, and we don’t want to give the false impression that we have the ability to deliver that tooling at the moment.

Our approach is to address the internal shell’s deficiencies before turning it on for each subproject. This will ensure that we resolve any existing issues and improve compatibility. By doing so, we can better manage and mitigate the risk of test failures and other complications.

Using temporary files on disk is a common practice in our lit tests. Indeed, in some (many/all?) cases, it is actually useful for debugging purposes, because it is possible to inspect the intermediate files if a test fails, to see what the problem is, without needing to modify and rerun the test. As a result, I see supporting this syntax as low priority personally. Do you have an idea of how many tests use the $() syntax and where those tests are? It’s important we get feedback from those who maintain the tests that use that syntax, to ensure we are meeting their needs.

The proposal appears to combine two goals: “changing the default for non-Windows platforms” and “implementing additional features.”

I’ve noticed recent patches from @jayfold to enhance the internal lit shell for non-Windows platforms. Could you provide an estimate of the remaining issues or any areas where you might need assistance from the community?

Perhaps the primary problem is the 100+ tests that do REUIQRES: shell. We could migrate them or implement a temporary option in lit that uses the external shell for these specific tests on non-Windows platforms.

% rg 'REQUIRES: .*shell' -l | wc -l
154

The “Implementation Plan” in this proposal outlines features that I believe aren’t crucial for changing the default.
While supporting some features might improve the developer experience, I’d like to see concrete examples.

1 Like

I’ve encountered issues with using temp files in tests before when they run in highly parallel manner and by chance some tests get the same tmp file name which leads to all sorts of annoying failures, so I’d rather avoid relying on tmp files unless strictly necessary.

Also, don’t see an issue with supporting some common bash-isms (like |&) if it’s trivial to do.

That makes me think there is a problem with how those tests generate their temp file names. Lit can generate per-test-unique names, if the test uses %t there should be no problem. Unless you are running multiple lit sessions in the same tree concurrently.

2 Likes