RFC: A cross platform way of using shell commands in lit tests

I see many tests that are prefixed with “requires: shell”. A quick search over the codebase shows 101 tests that contain this directive. This basically means there are 101 tests that say “I don’t support running on Windows”.

I would like to get this number to 0.

Ironically, many of these tests can be made to run on Windows simply by removing the requires: shell line. As in, it doesn’t actually require a shell. This is because the meaning of requires shell is not intuitive. GnuWin32 is already a required dependency, so any shell command that you are familiar with probably exists on Windows already. For example, consider the following test in clang-tidy:

// REQUIRES: shell
// RUN: mkdir -p %T/compilation-database-test/include
// RUN: mkdir -p %T/compilation-database-test/a
// RUN: mkdir -p %T/compilation-database-test/b
// RUN: echo ‘int *AA = 0;’ > %T/compilation-database-test/a/a.cpp
// RUN: echo 'int AB = 0;’ > %T/compilation-database-test/a/b.cpp
// RUN: echo 'int BB = 0;’ > %T/compilation-database-test/b/b.cpp
// RUN: echo 'int BC = 0;’ > %T/compilation-database-test/b/c.cpp
// RUN: echo 'int HP = 0;’ > %T/compilation-database-test/include/header.h
// RUN: echo ‘#include “header.h”’ > %T/compilation-database-test/b/d.cpp
// RUN: sed ‘s|test_dir|%T/compilation-database-test|g’ %S/Inputs/compilation-database/template.json > %T/compile_commands.json
// RUN: clang-tidy --checks=-
,modernize-use-nullptr -p %T %T/compilation-database-test/b/not-exist -header-filter=.

// RUN: clang-tidy --checks=-
,modernize-use-nullptr -p %T %T/compilation-database-test/a/a.cpp %T/compilation-database-test/a/b.cpp %T/compilation-database-test/b/b.cpp %T/compilation-database-test/b/c.cpp %T/compilation-database-test/b/d.cpp -header-filter=.
-fix
// RUN: FileCheck -input-file=%T/compilation-database-test/a/a.cpp %s -check-prefix=CHECK-FIX1
// RUN: FileCheck -input-file=%T/compilation-database-test/a/b.cpp %s -check-prefix=CHECK-FIX2
// RUN: FileCheck -input-file=%T/compilation-database-test/b/b.cpp %s -check-prefix=CHECK-FIX3
// RUN: FileCheck -input-file=%T/compilation-database-test/b/c.cpp %s -check-prefix=CHECK-FIX4
// RUN: FileCheck -input-file=%T/compilation-database-test/include/header.h %s -check-prefix=CHECK-FIX5

This test does not require a shell! Remove the line (and change the %S and %T on the sed line to %/S and %/T) and this test already works on Windows!

I have inspected around 20 tests so far that claim to require shell, and so far only about 3 of them are more complicated than removing the line.

“Great, so just remove the lines! Why are you posting this?”

Well, It would be nice if we could remove the GnuWin32 dependency as well. Lots of tests use grep, sed, cd, ln, or whatever. Anyone who’s ever tried downloading and installing GnuWin32 knows what a pain it is. It installs programs called, for example, “link” which interfere with standard windows tools, and puts them in your path. There are other examples of this. And it’s also HUGE. And nobody can ever figure out how to download it correctly. Plus there are some minor differences in the tools behavior.

I’m proposing that we replace shell commands in tests with a tool that we create which implements the most common shell commands and options.

it doesn’t need to be a complete implementation of the commands. Nobody cares about echo -n, or sed -u, for example. We could keep it simple with the most common commands, most of which would be trivially implementable using the llvm Support library already. Complicated commands like awk, for example, could be absent. Chances are you need to simplify your test if you’re using something like that (presently exactly 1 test in all of LLVM uses it). What I’m imagining is something like this:

D:\src\llvmbuild\ninja>bin\llvm-shutil --help
OVERVIEW: LLVM Shell Utility Program

USAGE: llvm-shutil.exe [subcommand] [options]

SUBCOMMANDS:

cd - Change current working directory
mkdir - Make a directory
ln - Create a link
grep - Search for content in a file
sort - Sort lines
// etc, rest of commands go here.

Type “llvm-shutil.exe -help” to get more help on a specific subcommand

OPTIONS:

Generic Options:

-stdout= - Redirect stdout to the specified file
-stderr= - Redirect stderr to the specified file
-stdin= - Read stdin from the specified file
-help - Display available options (-help-hidden for more)
-help-list - Display list of available options (-help-list-hidden for more)
-version - Display the version of this program

With something like this, we could kill two birds with one stone. GnuWin32 dependency is gone, and there is no more confusion about what “requires: shell” actually means.

The reason we wrote lit the way we did is so that everyone could be speaking the same language when they write tests, no matter which platform they’re on. So it would be nice to eliminate the last area where this is still not the case.

thoughts?

I'm all for both:
Removing all meaningless instances of requires: shell, and remove
calls to external programs if they're not needed;
Removing reliance on anything that "requires: shell".

Annoying things like creating/removing directories would become easier
to always get right, and have them work everywhere.
There are harder things to support, like backticks in some tests:
http://llvm.org/klaus/compiler-rt/blob/master/test/asan/TestCases/Posix/coverage.cc#L-5
This one is especially annoying, as any test you make that would end
up telling you if you have the "shell" feature available will apply to
the host, not the target.

Unsure if you'll be able to do the "cd" command in shutil, but we
already deal with it in lit: _executeShCmd in
utils/lit/lit/TestRunner.py:148
I'd rather have lit do most of it than doing a totally separate
utility for the shell commands which has a probability of being
misused and have people start relying on it as a compatibility layer.
Only if it doesn't become a big burden to keep everything in lit, of
course. I'm not opposed to something like you suggested (llvm-shutil).

Thank you,
Filipe

I see many tests that are prefixed with "requires: shell". A quick search
over the codebase shows 101 tests that contain this directive. This
basically means there are 101 tests that say "I don't support running on
Windows".

I would like to get this number to 0.

Ironically, many of these tests can be made to run on Windows simply by
removing the requires: shell line. As in, it doesn't *actually* require a
shell. This is because the meaning of requires shell is not intuitive.
GnuWin32 <Getting Started with the LLVM System using Microsoft Visual Studio — LLVM 18.0.0git documentation; is already
a required dependency, so any shell *command *that you are familiar with
probably exists on Windows already. For example, consider the following
test in clang-tidy:

// REQUIRES: shell
// RUN: mkdir -p %T/compilation-database-test/include
// RUN: mkdir -p %T/compilation-database-test/a
// RUN: mkdir -p %T/compilation-database-test/b
// RUN: echo 'int *AA = 0;' > %T/compilation-database-test/a/a.cpp
// RUN: echo 'int *AB = 0;' > %T/compilation-database-test/a/b.cpp
// RUN: echo 'int *BB = 0;' > %T/compilation-database-test/b/b.cpp
// RUN: echo 'int *BC = 0;' > %T/compilation-database-test/b/c.cpp
// RUN: echo 'int *HP = 0;' > %T/compilation-database-test/
include/header.h
// RUN: echo '#include "header.h"' > %T/compilation-database-test/b/d.cpp
// RUN: sed 's|test_dir|%T/compilation-database-test|g'
%S/Inputs/compilation-database/template.json > %T/compile_commands.json
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T
%T/compilation-database-test/b/not-exist -header-filter=.*
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T
%T/compilation-database-test/a/a.cpp %T/compilation-database-test/a/b.cpp
%T/compilation-database-test/b/b.cpp %T/compilation-database-test/b/c.cpp
%T/compilation-database-test/b/d.cpp -header-filter=.* -fix
// RUN: FileCheck -input-file=%T/compilation-database-test/a/a.cpp %s
-check-prefix=CHECK-FIX1
// RUN: FileCheck -input-file=%T/compilation-database-test/a/b.cpp %s
-check-prefix=CHECK-FIX2
// RUN: FileCheck -input-file=%T/compilation-database-test/b/b.cpp %s
-check-prefix=CHECK-FIX3
// RUN: FileCheck -input-file=%T/compilation-database-test/b/c.cpp %s
-check-prefix=CHECK-FIX4
// RUN: FileCheck -input-file=%T/compilation-database-test/include/header.h
%s -check-prefix=CHECK-FIX5

This test does not require a shell! Remove the line (and change the %S
and %T on the sed line to %/S and %/T) and this test already works on
Windows!

I have inspected around 20 tests so far that claim to require shell, and
so far only about 3 of them are more complicated than removing the line.

"Great, so just remove the lines! Why are you posting this?"

Well, It would be nice if we could remove the GnuWin32 dependency as
well. Lots of tests use grep, sed, cd, ln, or whatever. Anyone who's ever
tried downloading and installing GnuWin32 knows what a pain it is. It
installs programs called, for example, "link" which interfere with standard
windows tools, and puts them in your path. There are other examples of
this. And it's also HUGE. And nobody can ever figure out how to download
it correctly. Plus there are some minor differences in the tools behavior.

I'm proposing that we replace shell commands in tests with a tool that we
create which implements the most common shell commands and options.

it doesn't need to be a complete implementation of the commands. Nobody
cares about echo -n, or sed -u, for example. We could keep it simple with
the most common commands, most of which would be trivially implementable
using the llvm Support library already.

Or implement the shutil as a python script based on python's shutil?

Zachary Turner via llvm-dev <llvm-dev@lists.llvm.org> writes:

I see many tests that are prefixed with "requires: shell". A quick search
over the codebase shows 101 tests that contain this directive. This basically
means there are 101 tests that say "I don't support running on Windows".

I would like to get this number to 0.

Ironically, many of these tests can be made to run on Windows simply by
removing the requires: shell line. As in, it doesn't *actually* require a
shell. This is because the meaning of requires shell is not intuitive.
GnuWin32 is already a required dependency, so any shell command that you are
familiar with probably exists on Windows already. For example, consider the
following test in clang-tidy:

Anecdotally, most of the times I've seen people add "REQUIRES: shell" on
a test they really meant "disable on windows". I think/hope that's
usually temporary while they investigate problems, but it might explain
why so many tests say they require shell when they don't.

// REQUIRES: shell
// RUN: mkdir -p %T/compilation-database-test/include
// RUN: mkdir -p %T/compilation-database-test/a
// RUN: mkdir -p %T/compilation-database-test/b
// RUN: echo 'int *AA = 0;' > %T/compilation-database-test/a/a.cpp
// RUN: echo 'int *AB = 0;' > %T/compilation-database-test/a/b.cpp
// RUN: echo 'int *BB = 0;' > %T/compilation-database-test/b/b.cpp
// RUN: echo 'int *BC = 0;' > %T/compilation-database-test/b/c.cpp
// RUN: echo 'int *HP = 0;' > %T/compilation-database-test/include/header.h
// RUN: echo '#include "header.h"' > %T/compilation-database-test/b/d.cpp
// RUN: sed 's|test_dir|%T/compilation-database-test|g' %S/Inputs/
compilation-database/template.json > %T/compile_commands.json
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T %T/
compilation-database-test/b/not-exist -header-filter=.*
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T %T/
compilation-database-test/a/a.cpp %T/compilation-database-test/a/b.cpp %T/
compilation-database-test/b/b.cpp %T/compilation-database-test/b/c.cpp %T/
compilation-database-test/b/d.cpp -header-filter=.* -fix
// RUN: FileCheck -input-file=%T/compilation-database-test/a/a.cpp %s
-check-prefix=CHECK-FIX1
// RUN: FileCheck -input-file=%T/compilation-database-test/a/b.cpp %s
-check-prefix=CHECK-FIX2
// RUN: FileCheck -input-file=%T/compilation-database-test/b/b.cpp %s
-check-prefix=CHECK-FIX3
// RUN: FileCheck -input-file=%T/compilation-database-test/b/c.cpp %s
-check-prefix=CHECK-FIX4
// RUN: FileCheck -input-file=%T/compilation-database-test/include/header.h %s
-check-prefix=CHECK-FIX5

This test does not require a shell! Remove the line (and change the %S and %T
on the sed line to %/S and %/T) and this test already works on Windows!

I have inspected around 20 tests so far that claim to require shell, and so
far only about 3 of them are more complicated than removing the line.

"Great, so just remove the lines! Why are you posting this?"

Well, It would be nice if we could remove the GnuWin32 dependency as well.
Lots of tests use grep, sed, cd, ln, or whatever. Anyone who's ever tried
downloading and installing GnuWin32 knows what a pain it is. It installs
programs called, for example, "link" which interfere with standard windows
tools, and puts them in your path. There are other examples of this. And
it's also HUGE. And nobody can ever figure out how to download it correctly.
Plus there are some minor differences in the tools behavior.

I'm proposing that we replace shell commands in tests with a tool that we
create which implements the most common shell commands and options.

To some degree, we already do this. See for example r231017, which adds
'cd' to lit's "internal shell". Adding some more common things here
seems reasonable, as long as they aren't too complicated.

it doesn't need to be a complete implementation of the commands. Nobody cares
about echo -n, or sed -u, for example. We could keep it simple with the most
common commands, most of which would be trivially implementable using the llvm
Support library already. Complicated commands like awk, for example, could be
absent. Chances are you need to simplify your test if you're using something
like that (presently exactly 1 test in all of LLVM uses it). What I'm
imagining is something like this:

From a quick grep, I'm not even convinced one test in llvm uses awk -

there's a test (X86/hoist-spill) with a comment that has a shell
pipeline including awk, but it isn't in a RUN: line at all :confused:

I see many tests that are prefixed with "requires: shell". A quick search
over the codebase shows 101 tests that contain this directive. This
basically means there are 101 tests that say "I don't support running on
Windows".

I would like to get this number to 0.

Ironically, many of these tests can be made to run on Windows simply by
removing the requires: shell line. As in, it doesn't *actually* require a
shell. This is because the meaning of requires shell is not intuitive.
GnuWin32 is already a required dependency, so any shell command that you are
familiar with probably exists on Windows already. For example, consider the
following test in clang-tidy:

// REQUIRES: shell
// RUN: mkdir -p %T/compilation-database-test/include
// RUN: mkdir -p %T/compilation-database-test/a
// RUN: mkdir -p %T/compilation-database-test/b
// RUN: echo 'int *AA = 0;' > %T/compilation-database-test/a/a.cpp
// RUN: echo 'int *AB = 0;' > %T/compilation-database-test/a/b.cpp
// RUN: echo 'int *BB = 0;' > %T/compilation-database-test/b/b.cpp
// RUN: echo 'int *BC = 0;' > %T/compilation-database-test/b/c.cpp
// RUN: echo 'int *HP = 0;' > %T/compilation-database-test/include/header.h
// RUN: echo '#include "header.h"' > %T/compilation-database-test/b/d.cpp
// RUN: sed 's|test_dir|%T/compilation-database-test|g'
%S/Inputs/compilation-database/template.json > %T/compile_commands.json
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T
%T/compilation-database-test/b/not-exist -header-filter=.*
// RUN: clang-tidy --checks=-*,modernize-use-nullptr -p %T
%T/compilation-database-test/a/a.cpp %T/compilation-database-test/a/b.cpp
%T/compilation-database-test/b/b.cpp %T/compilation-database-test/b/c.cpp
%T/compilation-database-test/b/d.cpp -header-filter=.* -fix
// RUN: FileCheck -input-file=%T/compilation-database-test/a/a.cpp %s
-check-prefix=CHECK-FIX1
// RUN: FileCheck -input-file=%T/compilation-database-test/a/b.cpp %s
-check-prefix=CHECK-FIX2
// RUN: FileCheck -input-file=%T/compilation-database-test/b/b.cpp %s
-check-prefix=CHECK-FIX3
// RUN: FileCheck -input-file=%T/compilation-database-test/b/c.cpp %s
-check-prefix=CHECK-FIX4
// RUN: FileCheck -input-file=%T/compilation-database-test/include/header.h
%s -check-prefix=CHECK-FIX5

This test does not require a shell! Remove the line (and change the %S and
%T on the sed line to %/S and %/T) and this test already works on Windows!

I have inspected around 20 tests so far that claim to require shell, and so
far only about 3 of them are more complicated than removing the line.

"Great, so just remove the lines! Why are you posting this?"

Well, It would be nice if we could remove the GnuWin32 dependency as well.
Lots of tests use grep, sed, cd, ln, or whatever. Anyone who's ever tried
downloading and installing GnuWin32 knows what a pain it is. It installs
programs called, for example, "link" which interfere with standard windows
tools, and puts them in your path. There are other examples of this. And
it's also HUGE. And nobody can ever figure out how to download it
correctly. Plus there are some minor differences in the tools behavior.

I'm proposing that we replace shell commands in tests with a tool that we
create which implements the most common shell commands and options.

it doesn't need to be a complete implementation of the commands. Nobody
cares about echo -n, or sed -u, for example. We could keep it simple with
the most common commands, most of which would be trivially implementable
using the llvm Support library already. Complicated commands like awk, for
example, could be absent. Chances are you need to simplify your test if
you're using something like that (presently exactly 1 test in all of LLVM
uses it). What I'm imagining is something like this:

D:\src\llvmbuild\ninja>bin\llvm-shutil --help
OVERVIEW: LLVM Shell Utility Program

USAGE: llvm-shutil.exe [subcommand] [options]

SUBCOMMANDS:

  cd - Change current working directory
  mkdir - Make a directory
  ln - Create a link
  grep - Search for content in a file
  sort - Sort lines
  // etc, rest of commands go here.

  Type "llvm-shutil.exe <subcommand> -help" to get more help on a specific
subcommand

OPTIONS:

Generic Options:

  -stdout=<file> - Redirect stdout to the specified file
  -stderr=<file> - Redirect stderr to the specified file
  -stdin=<file> - Read stdin from the specified file
  -help - Display available options (-help-hidden for more)
  -help-list - Display list of available options (-help-list-hidden for
more)
  -version - Display the version of this program

With something like this, we could kill two birds with one stone. GnuWin32
dependency is gone, and there is no more confusion about what "requires:
shell" actually means.

The reason we wrote lit the way we did is so that everyone could be speaking
the same language when they write tests, no matter which platform they're
on. So it would be nice to eliminate the last area where this is still not
the case.

thoughts?

I would *love* it if we got rid of the GnuWin32 dependency! Making lit
tests more uniform is also a really great win, but anything we can do
to reduce our dependencies on Windows is a win, and as you pointed
out, this particular one is complicated and comes with downsides when
you install it.

~Aaron

Anecdotally, most of the times I've seen people add "REQUIRES: shell" on
a test they really meant "disable on windows". I think/hope that's
usually temporary while they investigate problems, but it might explain
why so many tests say they require shell when they don't.

In at least one case I can think of
("clang/test/Format/style-on-command-line.cpp") the "REQUIRES: shell" is
there to work around annoying intermittent failures on Windows where 'rm'
was failing with permission denied errors just occasionally enough to be a
problem. I've definitely seen this happen in a number of other places
previously although I can't think whether this affects any other lit tests
off-hand, but I know that personally I've had to implement 'robust rm'
(basically, keep on trying until the error goes away on its own) on Windows
in a number of different systems to work around this sort of problem
before. Replacing gnuwin32 with something where we could easily implement
something like 'robust <cmd>' versions for Windows when we encounter these
sorts of issues would be fantastic, so I'm definitely in favour of removing
the dependency.

-Greg

That’s a good point, I’ve definitely experienced that problem as well (and arrived at the same solution). It would be easy to make it retry a few times on Windows

You can use "REQUIRES: can-remove-opened-file" to express this requirement
more precisely.

That’s more like a workaround though. The real problem is that file deletion is racy on Windows. I can’t count the number of times I’ve encountered this error, only to find out that adding a retry fixed the problem. Sure, if you can remove opened files then there’s no problem to begin with, but even if you can’t, you can still remove the file by retrying again 100ms later or so. It seems silly to disable a test which could be testing useful functionality because of some quirk of an OS.

If you had your own robust rm command, it can retry a few times, allowing the test to run everywhere.

You can use msys2 rather than gnuwin32.
Nevertheless it would be great to get rid of this dependency, as well as speedup the regressions tests. Process creation is slower on Windows than Linux and so the tests run much slower.

Unfortunately the proposal here doesn’t get rid of process creation, it just changes the process being created from a GnuWin32 one to one that we write ourselves. Happy to entertain suggestions for how to get rid of process creation entirely, but it doesn’t seem simple to me.

Yes, the lit internal shell already interprets several shell commands by itself, such as env. These are the processes meant.

One other thing to note is that the dependency problem is going to go away very soon just a result of moving to git. The normal Windows install of git comes with all the utilities you need to run the LLVM tests anyway. I haven’t bothered to install gnuwin32 on my new machine.

I think we do want to rewrite a few key utilities, like ‘rm’, to make them reliable on Windows, but for most things like ‘grep’ and ‘sed’, we should use the standard things. There’s no reason to want to have our own subtly different versions of these tools.

If we do go all the way to avoid process creation, then it becomes interesting to have our own version of these things, and at that point, maybe having a fork or checked-in copy of busybox is the way to go.

That’s only if you use git bash, not cmd.

I selected the pants-on-head crazy button of “put all utilities on PATH”, and it works from cmd too. We could probably teach lit to find these utilities by searching relative to ‘git’, which is presumably already on PATH.

No wonder your power supply blew up if you’re doing crazy stuff like that!

Although it’s more work, honestly I like the idea of running as much as we can from within lit to avoid excess process creation. We should probably see how far that can get is and then take a look at the remaining dependencies.

How about just using busybox-w32 for find, grep, sed, and other
standard UNIX commands https://github.com/rmyorston/busybox-w32. It's
a single exe < 500KiB.

Alex

If you went this route, you’d really want toybox, which is BSD licensed, and not written by people with a penchant for litigation :slight_smile:
www.landley.net/toybox/about.html

I had considered this as well, and someone also asked the same question offline. Here’s the response I gave:

The real problem is that file deletion is racy on Windows.

That’s actually not strictly accurate. I filed a bug about this a while ago: https://llvm.org/bugs/show_bug.cgi?id=27386

The reasons are really complex/subtle, and you can see more at the bug tracker, or: Niall Douglas "Racing The File System"https://youtu.be/uhRWMGBjlO8 …I haven’t gotten to writing the actual patch yet. If anybody wants to pick it up, go right ahead.

Note: sent twice because I am unreasonably bad at mailing lists, and replied wrong on the first try.