FileCheck --allow-empty

Hi all,

By default, FileCheck will emit an error if you try to get it to check an empty file. This doesn’t seem like a bad idea to me, since it might protect against certain unintended usages. However, in every use-case I know of, it’s also combined with checks that essentially say “don’t just allow the output to be empty, fail if it isn’t”. In some cases, those checks are slightly more specific (e.g. checking that a specific string is not present), but the presence of --allow-empty seems to me to imply that the output is actually empty, and therefore you could make the test less brittle against output changes by a more general check.

For example, the following could be used to check that an invocation doesn’t produce a warning:

RUN: llvm-objcopy a.o b.o 2>&1 | FileCheck %s --allow-empty

CHECK-NOT: warning:

llvm-objcopy doesn’t produce any output, so the --allow-empty is required. However, what if LLVM decided that warnings should be printed as “WARNING:” rather than “warning:”? This test would then start spuriously passing, even if llvm-objcopy started emitting warnings. A more robust version of the test would be:

RUN: llvm-objcopy a.o b.o 2>&1 | FileCheck %s --allow-empty --implicit-check-not={{.}}

There’s nothing stopping this being written now, but at this point, the --allow-empty seems a bit pointless, since the check implies the output should be empty.

I propose one of two options:

  1. Remove --allow-empty completely, and permit empty output. From the original commit (http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140804/229905.html) it was added to allow pure check-not files, but for some reason count 0 didn’t work. It’s possible the motivation no longer applies (I don’t know what these “guard malloc” bots actually are, and whether they still exist). I’m not convinced by this option, since the empty output check may still be useful.

  2. Have --allow-empty imply --implicit-check-not={{.}}. I’d probably rename the option to --expect-empty or something similar. This wouldn’t allow the case where a user doesn’t care whether the output is actually empty or not, as long as it doesn’t have a specific string, but I haven’t come across any such test yet myself, nor do I think such tests are especially robust (see my example above), so this would be my preferred option.

Thoughts?

James

I assume the "guard malloc" issue still exists - reading
https://developer.apple.com/library/archive/documentation/Performance/Conceptual/ManagingMemory/Articles/MallocDebug.html
- I'm guessing what happens is that some bots are running with the
feature enabled and Clang/LLVM's tendency to never release some memory
may be being flagged as memory leaks, so the processes print some
debugging output about those leaks. (I'm /guessing/ if that's the
case, people aren't especially intentionally running LLVM with these
checks enabled (because they'd be too noisy to be actionable on every
failure - but maybe they're still usable enough if you only look when
you're investigating some other failure) but maybe have them enabled
for all development work)

So I think (2) would not address the motivation for the original
feature, which I /think/ is probably still a valid motivation. (CC'd
Bogner to weigh in here)

And (1) doesn't necessarily seem like an improvement to me - still
seems like a good chance of accidentally empty output being
problematic and the sort of thing we'd want to opt-in to rather than
have by default.

(that said - the existence of mechanisms that lead to extra output
from LLVM programs under test seems a bit problematic to me - so if
Bogner/others are up for making that not a thing, that'd be nice
probably (perhaps the noisy output cases (leaks, rather than actual
memory corruption/UB) can be disabled?))

(Sorry for the duplicate. Original didn’t make it to the list)

I don’t specifically remember what the relationship between `guard malloc` and `count 0` was here, but I’m pretty sure it was related to the count part more so than the llvm tools. Option (2) sounds like it avoids the “rely on external tools that may differ between environments” problem, and the rest of James’ motivation makes sense, so it sounds like the best way forward to me.

Thanks for the comments. I’ll look into option 2) in more detail in the next few weeks once I get a chance, if there are no further comments.