teaching FileCheck to handle variations in order

Hello all,

For the hexagon target, we have a couple of tests that are failing due to variations in the order of checked text. In these cases the ordering is not directly relevant to the functionality being tested.

For example:

; CHECK: memw(##a)
; CHECK: memw(##b)

%0 = load i32* @a, align 4
%1 = load i32* @b, align 4

requires that the compiler emit the memory operations for ‘a’ and ‘b’ in that order, even though the intent of the test might simply be to ensure that each ‘load’ results in a memory read.

I’d like to teach FileCheck to allow tests to be more tolerant of these incidental variations.

The attached patch implements one possible solution. It introduces a position stack and a couple of directives:

  • ‘CHECK-PUSH:’ pushes the current match position onto the stack.
  • ‘CHECK-POP:’ pops the top value off of the stack and uses it to set the current match position.
    The above test can now be re-written as:

; CHECK-PUSH:
; CHECK: memw(##a)
; CHECK-POP:
; CHECK: memw(##b)

%0 = load i32* @a, align 4
%1 = load i32* @b, align 4

which handles either ordering of memory reads for ‘a’ and ‘b’.

Thoughts?

Thanks,
Matthew Curtis

0001-Teach-FileCheck-to-handle-variations-in-order.patch (6.71 KB)

I'm not sure if I got the details of how the tests work, but wouldn't this allow false positives?

Take this, for example:

; CHECK-PUSH:
; CHECK: memw(##a)
; CHECK-POP:
; CHECK: memw(##b)

%0 = load i32* @a, align 4
%1 = load i32* @b, align 4

; CHECK: memw(##a)

%2 = load i32* @a, align 4

My understanding is that this is supposed to detect:
- 2 loads in any order, one from @a, the other from @b, followed by
- 1 load from @a,
that is, a total of 3 loads.

At the same time I believe that it would positively match this code:
   ... = memw(##b)
   ... = memw(##a)
i.e. only two loads. The second load would match two different CHECKs.

-K

You are correct. If the intent is to detect: the example does not do that. It matches additional sequences as well. I don’t believe the PUSH/POP functionality is expressive enough to handle this case. You would have to modify the test in some way or add more functionality to FileCheck to handle this. Thinking of possible solutions … We could keep track of a “high water mark” and add a directive to set the position to it: In this case CHECK-HWM (need a better name) will set the position to just after ‘memw(##a)’ or ‘memw(##b)’ whichever is later in the file. I’ve attached a new patch with support for the additional directive. Matthew Curtis.

0002-Teach-FileCheck-to-handle-variations-in-order.patch (7.69 KB)

Hello all,

For the hexagon target, we have a couple of tests that are failing due to
variations in the order of checked text. In these cases the ordering is not
directly relevant to the functionality being tested.

For example:

; CHECK: memw(##a)
; CHECK: memw(##b)

%0 = load i32* @a, align 4
%1 = load i32* @b, align 4

requires that the compiler emit the memory operations for 'a' and 'b' in
that order, even though the intent of the test might simply be to ensure
that each 'load' results in a memory read.

I'd like to teach FileCheck to allow tests to be more tolerant of these
incidental variations.

So, I'm not a huge fan of this, but for a strange reason.

Fundamentally, I agree with you that these ordering dependencies are
unfortunate in tests. However, I would make a few observations:

1) The order of instructions, unlike some things such as register
allocation's selection of registers, should ideally not vary much.
Currently it does more that we would like due to the inability to unit test
single pieces of the backend, but I don't think we should make FileCheck
better to cope with that, I think we should fix the underlying problem.

2) It is usually fairly obvious when two such checks don't actually have an
ordering constraint that is important, and where it isn't, a comment makes
the intent clear.

3) By checking the ordering, we gain at least some early detection of
non-determinism in the code generator. I definitely caught several bugs in
the block placement pass while I was working on it due to this very
"feature" of the test suite.

Given these points, the value tradeoff here of making FileCheck
*significantly* more complicated (and the tests significantly more complex
as well) versus the time savings updating test cases when the ordering
changes...

But maybe others who have spent more time hammering against these problems
in the backend have a different sense of the magnitude of the problem?

To be honest, I’m fairly new to LLVM and Clang so I don’t have a good feel for how often this will come up. I’ve encountered two tests over the past month and a half that have regressed due to ordering changes. In one case the order varies based on target, so I would have to duplicate the test and enable/disable one or the other based on target. Not great, but if it’s just one test probably acceptable. As far as complexity, I didn’t feel like the complexity added to FileCheck was that significant, though relative to how often the feature would (or should) get used perhaps it is. I was more concerned about adding 2-3 new directives to the user interface. In the end, that acceptable given that users would only see the additional complexity if they needed it. If no one else feels like they would benefit much from this, I’ll probably just wait and see how much of a maintenance burden this really is. If it continues to be an issue I can bring it back up to the community.

I disagree. A testcase should test what it was meant to test, and nothing else. Instruction ordering, for example, is not something that we can expect to remain the same. Various optimizations can affect it (changes in the scheduler being the most obvious case), causing spurious failures that would need to be investigated and addressed.

As a matter of fact, we should make an effort to make our tests independent of any unrelated (but valid) optimizations that may be added at some point in the future. I realize that this is not always possible to achieve completely, but the more precise we get with the tests, the better off we will be in the long run.

I don't think that FileCheck is modified as frequently as the compiler itself, so the added complexity (which apparently is not large in this case) is not likely to add a significant recurring cost.

-K

My 2c:

I think that ability to pattern match “out-of-order” in tests is critical. I currently have to disable instruction scheduling on several Hexagon tests in order to preserve “expected” order of patterns, which has nothing to do with actual correctness of the test… So if nothing else, scheduler is left untested for those cases. Needless to say, this feature must maintain status quo for all existing tests, and only introduce “new” functionality for future, or selected existing tests.

Sergei Larin

Just adding to the clamor for FileChecks ability to pattern match out-of-order (match for mere presence); Just in the last 2 weeks, I have come across at least a couple instances when I was unable to add small unit tests to the testsuite because of this deficiency in FileCheck. Also, I agree with Krzysztof about the lack of any real recurring overhead.

Can this feature please be added to FileCheck ?

Pranav

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

Just adding to the clamor for FileChecks ability to pattern match
out-of-order (match for mere presence); Just in the last 2 weeks, I have
come across at least a couple instances when I was unable to add small unit
tests to the testsuite because of this deficiency in FileCheck. Also, I
agree with Krzysztof about the lack of any real recurring overhead.

Can you explain why you were unable to add a small unit test to the
testsuite?

All we have here are claims that missing this feature is blocking something
from happening, but neither myself nor several other developers have ever
hit a situation where they were unable to test because of missing this.

Can you explain why you were unable to add a small unit test to the testsuite?

This is the case I have.

I'm really not understanding what is wrong with adding this now, and adding
nice comments to it clarifying that these two things don't need to be in
any particular order, and are just used to bound the CHECK-NOT above.

I mean, it's inconvenient, mostly for the person changing the scheduler,
but provided there are helpful comments for them, I suspect they're not
going to have a hard time updating the test cases.

Why would they have to do that in the first place? Like I said, tests should not be checking things that they were not specifically meant to test. Hard time or not, I don't think anyone would be excited about having to go through a list of potential false positives.

-K

Coming back to this after a little while, I think I'd like to change my position somewhat.

Given the feedback on this thread, as well as one other recent request <http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-September/053540.html> for the same functionality, I'd really like to see this added to FileCheck. It doesn't have to be in the exact form I've proposed but I'd like some solution to handling incidental variations in order.

The current proposal*,
- does not (IMO) complicate the implementation of FileCheck much,
- does not change the behavior of existing tests, and
- does not complicate or change the writing of tests if you do not use the new functionality

Chandler, do you disagree? Are you still opposed to adding this feature?

Thanks,
Matthew C.

[*]I actually have an alternate solution that I will post soon, but it doesn't change the assertions that I'm making and is a bit cleaner from a test writer's perspective.

becomes:

0001-FileCheck-Save-Restore.patch (9.24 KB)