RFC: FileCheck Enhancements

Hi everyone,

There is an idea to add new directives to FileCheck:

  1. A directive to use some patterns as a named template, with or without parameters.

  2. CHECK-INCLUDE - A directive to include a file of checks in another file.

  3. Expression repeats for CHECK - If a statement should be checked several times, the repeat modifiers {n}, {n,m}, {,n}, {n,}, * and + can be used.

  4. Repeats in regexes - Repetition counts should become available by using {n}, {n,m}, {,n}, {n,}.

  5. CHECK-LABEL-DAG - Labels in non-sequential order.

  6. Check statements for whole words only - // CHECK-WORD, // CHECK-WORD-NEXT, // CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.

  7. Wildcard for prefixes - If some statements should be checked regardless of prefix, use //{{*}}, //{{*}}-NEXT, //{{*}}-SAME, etc.

  8. Prefixes with regular expressions - If a statement should be checked when the prefix matches some regular expression, use {{regex}}:, {{regex}}-NEXT, etc.

More information is in the document https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.

We now have a prototype with these features. It has been tested on LLVM 3.8.

A directive that was previously unsupported turned up in an old test. The bug about this is https://llvm.org/bugs/show_bug.cgi?id=27852.

There is about a 6% slowdown with the new features when we tested them on 3.8.

I see that FileCheck in LLVM 3.9 has also changed and gained new features. We can publish a patch for 3.8, and it can be adapted for LLVM 3.9. Is this interesting to anyone? And would it be better to publish the patch against 3.8 or against 3.9?

Thanks,

Elena.

Hi everyone,

There is an idea to add new directives to FileCheck:

1. A directive to use some patterns as a named template, with or without
parameters.

2. CHECK-INCLUDE - A directive to include a file of checks in another file.

3. Expression repeats for CHECK - If a statement should be checked several
times, the repeat modifiers {n}, {n,m}, {,n}, {n,}, * and + can be used.

4. Repeats in regexes - Repetition counts should become available
by using {n}, {n,m}, {,n}, {n,}.

5. CHECK-LABEL-DAG - Labels in non-sequential order.

6. Check statements for whole words only - // CHECK-WORD, // CHECK-WORD-NEXT,
// CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.

What does this ^ do?

7. Wildcard for prefixes - If some statements should be checked
regardless of prefix, use //{{*}}, //{{*}}-NEXT, //{{*}}-SAME,
etc.

I'm not a fan of this ^ feature. I think it'll make testcases much harder to understand.

8. Prefixes with regular expressions - If a statement should be checked
when the prefix matches some regular expression, use {{regex}}:,
{{regex}}-NEXT, etc.

I'm not a fan of this ^ feature. I think it'll make testcases much harder to understand.

More information is in the document
https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.

We now have a prototype with these features. It has been tested on LLVM 3.8.

A directive that was previously unsupported turned up in an old test. The
bug about this is https://llvm.org/bugs/show_bug.cgi?id=27852.

There is about a 6% slowdown with the new features when we tested them on 3.8.

I see that FileCheck in LLVM 3.9 has also changed and gained new
features. We can publish a patch for 3.8, and it can be adapted for
LLVM 3.9. Is this interesting to anyone? And would it be better to
publish the patch against 3.8 or against 3.9?

Patches that apply on trunk are preferred (assuming the community accepts these changes).

Jon

Hi,

CHECK-WORD - Use this if you want to find some string in the file and be sure that the string is a separate word.

There are examples in the document.
Prefixes described by regular expressions have to be turned on with the -regex-prefixes option. By default, you can't use them.

Thanks for your comments.

Hi,

CHECK-WORD - Use this if you want to find some string in the file and be sure that the string is a separate word.

Is this functionally equivalent to doing:

// CHECK: {{\s}}whatever{{\s}}

Or is there some other subtlety about it?

Jon

It's equivalent to {{\b}}whatever{{\b}}. I'm not sure whether the \b assertion is supported.
\s will not match at the start or end of a line, but the word should still be matched there.
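
To illustrate the difference, here is a minimal sketch in Python. Python's re module is only a stand-in here (it does support \b); whether FileCheck's own regex library supports \b is exactly the open question, and the helper names are made up for this example.

```python
import re

def matches_with_space(text, word):
    # Requires a whitespace character on both sides of the word.
    return re.search(r"\s" + re.escape(word) + r"\s", text) is not None

def matches_with_boundary(text, word):
    # Requires only a word boundary on both sides of the word.
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None

# A word that fills the whole line has no surrounding whitespace,
# so only the \b version finds it.
print(matches_with_space("whatever", "whatever"))
print(matches_with_boundary("whatever", "whatever"))
```

Both versions reject "whatevers", since neither a space nor a word boundary follows "whatever" there.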

Elena.

Hi everyone,

I published a patch for the changes described in this RFC: D20668, FileCheck Enhancements.
But I don't know who can review this patch.

Thanks,
Elena.

  7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.

Technically that will be possible, but it will be time consuming to figure
out which regular expressions should be highlighted, probability of mistake
will be higher, etc.

Okay, if this feature is undesirable to most people, I can remove it from the published patch. But first I would like to know the opinion of the people developing FileCheck.

Elena.

It's also an entirely unnecessary feature: you can use multiple
--check-prefix arguments on the test run to accomplish the same thing, and
many tests do that today. (e.g. "FileCheck --check-prefix=CHECK
--check-prefix=SSE --check-prefix=SSE3").

But then I should write

// CHECK: something

// SSE: something

// SSE3: something

With this feature it can be written as // {{[A-Z0-9]+}}: something

I don't see why you need this. Why can't you just write:

// CHECK: something.

And all these will match it:

FileCheck
FileCheck --check-prefix=CHECK --check-prefix=SSE
FileCheck --check-prefix=CHECK --check-prefix=SSE --check-prefix=SSE2
FileCheck --check-prefix=CHECK --check-prefix=SSE --check-prefix=SSE2 --check-prefix=SSE3
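
A rough Python sketch of that selection rule, for illustration only: a check line is active when its prefix is one of the prefixes passed on the command line, and CHECK is the default when none is given. The parser below is hypothetical and far simpler than FileCheck's; in particular it ignores suffixes such as -NEXT.

```python
import re

# Hypothetical, simplified directive syntax: "// PREFIX: pattern".
DIRECTIVE = re.compile(r"//\s*([A-Za-z0-9_-]+):\s*(.*)")

def active_checks(test_source, prefixes=("CHECK",)):
    """Return (prefix, pattern) pairs whose prefix is enabled."""
    active = []
    for line in test_source.splitlines():
        m = DIRECTIVE.search(line)
        if m and m.group(1) in prefixes:
            active.append((m.group(1), m.group(2)))
    return active

source = "// CHECK: something\n// SSE: sse only\n// SSE3: sse3 only\n"
# The CHECK line is active in every one of these runs.
print(active_checks(source))
print(active_checks(source, ("CHECK", "SSE")))
print(active_checks(source, ("CHECK", "SSE", "SSE2", "SSE3")))
```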

  7. Wildcard for prefixes - If some statements should be checked regardless prefix, it should be used //{{*}}, //{{*}}-NEXT, //{{*}}-SAME and etc.

For this one I agree that multiple check prefixes already provide this. The MIPS tests frequently have something like ‘--check-prefix=ALL --check-prefix=FOO’ on one command and ‘--check-prefix=ALL --check-prefix=BAR’ on another.

  8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc.

The previous example isn’t very compelling but I can see how this feature could be useful to me. I have a number of tests that do something like:

// O32: something for O32

// N32: something for N32

// N64: something for N64

// NEW: something for both N32 and N64

But this is a bit clearer:

// O32: something for O32

// N32: something for N32

// N64: something for N64

// {{N32|N64}}: something for both N32 and N64

There are also some that define O32, O32EL, and O32EB which could drop the O32 and do:

// {{O32(EL|EB)}}: any endian

// {{O32(EL)}}: little endian

// {{O32(EB)}}: big endian

In this example, I’ve included redundant parentheses so that vim’s ‘*’ key can find me all the O32 lines, all the little endian lines, etc.

One last example is that I have some tests that define MIPS32R1, MIPS32R2, MIPS32R3, MIPS32R5, MIPS32R6 and MIPS64 equivalents of each.

{{MIPS32R[2-5]}} would match MIPS32R2 through to MIPS32R5

{{MIPS(32|64)R6}} would match MIPS32R6 and MIPS64R6

{{MIPS64.*}} would match any MIPS64

This would remove a lot of redundancy but it’s starting to harm readability. I’m not sure where I draw the line on that trade-off but I definitely wouldn’t want complicated regexes.
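
Which prefixes each of those patterns selects can be checked mechanically. The sketch below uses Python's re module as a stand-in and assumes the prefix regex has to match the whole prefix name; how the prototype actually anchors the match is an assumption, and the select helper is made up for this example.

```python
import re

# The prefixes from the example: R1, R2, R3, R5, R6 for MIPS32 and MIPS64.
PREFIXES = [f"MIPS{bits}R{rev}" for bits in (32, 64) for rev in (1, 2, 3, 5, 6)]

def select(pattern, prefixes=PREFIXES):
    """Return the prefixes that the pattern matches in full."""
    rx = re.compile(pattern)
    return [p for p in prefixes if rx.fullmatch(p)]

print(select(r"MIPS32R[2-5]"))    # MIPS32R2, MIPS32R3, MIPS32R5
print(select(r"MIPS(32|64)R6"))   # MIPS32R6, MIPS64R6
print(select(r"MIPS64.*"))        # every MIPS64 prefix
```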

For all of these changes: why/where are they actually useful?

Previous enhancements of FileCheck have mostly been added when needed by a test. It would greatly help your case for adding these new enhancements to show some tests which would be improved, or have greater clarity, by the use of these features.

I’m particularly skeptical about “3. Expressions repeat for CHECK”, “5. CHECK-LABEL-DAG”, and (as noted before) “7. Wildcard for prefixes” and “8. Prefix with regular expressions”.

For “6. Check statement for words only” – I think it might be better to just make that be the ONLY behavior, rather than an additional option – if you intended to end a match in the middle of a word, stick {{[^ ]*}} on it.

Hi James, all,

For all of these changes: why/where are they actually useful?
Previous enhancements of FileCheck have mostly been added when needed by a test. It would greatly help your case for adding these new enhancements to show some tests which would be improved, or have greater clarity, by the use of these features.

Let me clarify. Elena’s proposals are driven by the development and testing of a “downstream” commercial toolchain for embedded processors.
The proposed enhancements have practical use cases in our internal code base, and we decided to upstream them.
However, we haven’t looked for representative examples in the upstream repository (yet).

“3. Expressions repeat for CHECK”

Tests containing duplicated CHECK lines: e.g. you want to match an expression K times, or maybe skip over it to find a specific expression.
Currently we have to duplicate the CHECK lines, or use grep and sed/awk scripts to count patterns and then compare the numbers later.
E.g. counting the number of LD instructions inside a function or loop body. This extension can help to describe that as a CHECK pattern.
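
As a sketch of the counting that currently has to be scripted outside FileCheck, here it is in Python. The function name, the sample instructions and the {lo,hi} interpretation are assumptions for illustration, not the semantics of the prototype.

```python
import re

def check_repeat(lines, pattern, lo, hi=None):
    """True if the number of matching lines is within [lo, hi].

    hi=None means no upper bound, like the {n,} modifier.
    """
    count = sum(1 for line in lines if re.search(pattern, line))
    return count >= lo and (hi is None or count <= hi)

# Hypothetical loop body with two LD instructions.
body = ["ld r1, [r2]", "add r3, r1, r1", "ld r4, [r5]", "st r3, [r6]"]
print(check_repeat(body, r"\bld\b", 2, 2))  # exactly two, like {2}
print(check_repeat(body, r"\bld\b", 3))     # at least three, like {3,}
```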

“5. CHECK-LABEL-DAG”

CHECK-LABEL is used to annotate top-level entities, e.g. functions.
Suppose your tool can reorder output fragments (e.g. functions in ASM) compared to the original source file.
LABEL-DAG can help.
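
The idea can be sketched in Python as follows: split the output into blocks at the labels, whatever order the labels appear in, so that each block can then be checked on its own. This is only an illustration under assumed semantics; the helper and the sample output are hypothetical.

```python
def split_by_labels(output_lines, labels):
    """Map each label found in the output to the lines that follow it."""
    blocks = {}
    current = None
    for line in output_lines:
        hit = next((label for label in labels if label in line), None)
        if hit is not None:
            # A label line starts a new block.
            current = hit
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return blocks

# The tool emitted bar before foo; the blocks are found either way.
out = ["bar:", "  ret", "foo:", "  nop", "  ret"]
print(split_by_labels(out, ["foo:", "bar:"]))
```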

"8. Prefix with regular expressions - If statement should be checked if prefix matches some regular expression, it should be used {{regex}}:, {{regex}}-NEXT and etc. "

Our processors are highly configurable, and we compile and apply the same test for many ISA extension options.
Imagine hundreds of variants of the same CHECK statements for various ISA configurations.
CHECK regexp prefixes can express that logic in a compact and readable way, making it easier to understand.

Use cases are similar to Daniel’s example for MIPS:

{{MIPS32R[2-5]}} would match MIPS32R2 through to MIPS32R5

{{MIPS(32|64)R6}} would match MIPS32R6 and MIPS64R6

{{MIPS64.*}} would match any MIPS64

Thanks,

Sergey

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Elena Lepilkina via llvm-dev
Sent: Tuesday, May 24, 2016 6:51 AM
To: llvm-dev
Subject: [llvm-dev] RFC: FileCheck Enhancements

Hi everyone,

There is an idea to add new directives to FileCheck:
1. A directive to use some patterns as a named template, with or without parameters.

Seems plausible, although I find the proposed syntax for a template call
with parameters to be very awkward.

2. CHECK-INCLUDE - A directive to include a file of checks in another file.
3. Expression repeats for CHECK - If a statement should be checked several times, the repeat modifiers {n}, {n,m}, {,n}, {n,}, * and + can be used.
4. Repeats in regexes - Repetition counts should become available by using {n}, {n,m}, {,n}, {n,}.
5. CHECK-LABEL-DAG - Labels in non-sequential order.

After reading through the google doc I see what you are trying to do.

The motivation for CHECK-DAG is to reduce future test "churn" in case
someone makes a change that influences the output order of something,
but the order is not important to the test. The motivation is NOT to
support instability in the output order across runs of the same compiler.
Just so we're clear on that.

In principle I could see how the order of blocks or entire functions
could change over time, with no real relevance to a test, but how
necessary is it really? I have seen work go into Clang/LLVM over the
years to ensure stable output order, and my impression has been that
these changes typically have little complexity or performance effect
while significantly simplifying test effort.

6. Check statements for whole words only - // CHECK-WORD, // CHECK-WORD-NEXT, // CHECK-WORD-SAME, // CHECK-WORD-DAG, // CHECK-WORD-NOT.

I would expect a regex package to provide a word-break meta symbol,
in which case this feature would be redundant. I admit I have not
checked whether the package we use has this feature, or how it would
be spelled.

7. Wildcard for prefixes - If some statements should be checked regardless of prefix, use //{{*}}, //{{*}}-NEXT, //{{*}}-SAME, etc.
8. Prefixes with regular expressions - If a statement should be checked when the prefix matches some regular expression, use {{regex}}:, {{regex}}-NEXT, etc.

More information is in the document https://docs.google.com/document/d/1wAKNzU7-S2EeK1-aADwgP8dEiKfByKNazonybCQW3zs/edit?usp=sharing.

I noticed the google doc stated that multi-line patterns are not
supported. That's not actually the case, although it's a bit obscure:
the [[:space:]] character-class will match EOL and allow you to write
a multi-line CHECK.
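
A small Python sketch of the same idea, for illustration. Python's re module has no POSIX bracket classes, so \s stands in for [[:space:]] here (an assumption of this sketch, not FileCheck syntax); like [[:space:]], it also matches the newline. The sample output is made up.

```python
import re

# Two-line tool output that a single pattern should span.
output = "define void @foo()\n  ret void\n"

# \s+ plays the role of [[:space:]]+ and consumes the end of line.
multi_line = re.compile(r"@foo\(\)\s+ret void")

def spans_lines(text):
    """True if the pattern matches and the match crosses a line break."""
    m = multi_line.search(text)
    return m is not None and "\n" in m.group(0)

print(spans_lines(output))
```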

We now have a prototype with these features. It has been tested on LLVM 3.8.
A directive that was previously unsupported turned up in an old test. The bug about this is https://llvm.org/bugs/show_bug.cgi?id=27852.

There is about a 6% slowdown with the new features when we tested them on 3.8.

Is that a 6% slowdown just in FileCheck, or when running the entire
Clang/LLVM test suite? Making the entire test run 6% more expensive
seems like a lot.
--paulr

Hi Paul,

Thank you for the information about the [[:space:]] character-class.
About performance: I tested on the Clang/LLVM test suite. I tried profiling, and the problem is that I used regular expressions a lot to support some of the new features, and the functions in your regex library are very slow. The regex library is very old and quite awkward, in my opinion.
Maybe you will see some ways to improve performance if some of the features are accepted into LLVM FileCheck and I publish patches for each feature separately (I saw your comment on the published patch).
I see that there are a lot of opinions, so maybe it would be better to vote. I suggest voting in the spreadsheet https://docs.google.com/spreadsheets/d/1p8Hi_PH3Nd2kEtYveCwKXENmJGOzYWW6us0XK7eVvOw/edit?usp=sharing. It keeps a history, so it's possible to check that nobody votes twice. Then somebody can close the vote at some point, and I will be able to understand which patches I should prepare and publish, and whether there are changes you would like.

Thanks,
Elena.

Hi all,

I'll be glad to hear more opinions, and maybe some suggestions on how to improve the new features (maybe there are ideas on how template descriptions can be made simpler). Afterwards we will try to take your ideas and opinions into account to make the new FileCheck features better.
I will make the changes and publish separate patches a month from now, because I'll be on holiday in June.

Thanks,
Elena.

Hi all,

Voting may be a bad idea. I'll be glad to hear more opinions here, and maybe some suggestions on how to improve the new features (maybe there are ideas on how template descriptions can be made simpler). Afterwards I will try to take your ideas and opinions into account to make the new FileCheck features better.
There was an idea to change the regex library to support the \b assertion. Are there any real plans to change the regex library?
I will make the changes and publish separate patches a month from now, because I'll be on holiday in June.

Thanks,
Elena.

Also: CHECK-SAME.

Jon