[RFC][FileCheck] New option to negate check patterns

Hi all,

There have been a few cases recently where I’ve noticed two test cases in the same lit test that do the same thing except invert the CHECK, to show that something is NOT present. I’m talking about something like the following:

RUN: llvm-sometool --print-string | FileCheck %s --check-prefix=STRING

RUN: llvm-sometool --no-print-string | FileCheck %s --check-prefix=NO-STRING

STRING: This is the string

NO-STRING-NOT: This is the string

In such cases, as can be seen, the CHECK line effectively has to be duplicated (either in an explicit check like in the above example, or via --implicit-check-not). Duplication is generally bad, especially in this sort of case, as it only takes a typo in the NOT pattern, or a careless developer/reviewer pair changing the output to cause the NOT pattern to no longer be useful.

I’d like to propose a new FileCheck option (e.g. --check-not-prefix/–check-not-prefixes) which allows implicitly converting a check prefix to a -NOT version of the same prefix. That would allow writing the above example as:

RUN: llvm-sometool --print-string | FileCheck %s --check-prefix=STRING

RUN: llvm-sometool --no-print-string | FileCheck %s --check-not-prefix=STRING

STRING: This is the string

If there was a typo or somebody changed the string output, this mechanism would ensure there is no chance of the pattern rotting.

Caveat: I don’t know what would be the appropriate way of handling non-trivial checks, i.e. existing CHECK-NOT/CHECK-NEXT/CHECK-SAME etc. I’d appreciate any ideas on this.

Thoughts?

James

Hi James,

I feel it might be confusing to have a CHECK becomes effectively a CHECK-NOT, especially if the RUN line is far from the CHECK line (which is often the case when a single RUN line drives several groups of CHECK directives (e.g. code generation tested for several functions for a specific feature, like PIC). You also loose control on where the NOT should be: it would have to be at the same location as the CHECK even though for the NOT case you might want to check it somewhere else.

How about having a concept of regex variables where you give a name to a given directive’s pattern which you could reuse as another pattern. Something like (syntax TBD):

CHECK: mov [[REG:r[0-9]+]], #42
CHECK-NOT:

Best regards,

Thomas

​Hi all,

I feel it might be confusing to have a CHECK becomes effectively a CHECK-NOT,

especially if the RUN line is far from the CHECK line (which is often the case when

a single RUN line drives several groups of CHECK directives (e.g. code generation

tested for several functions for a specific feature, like PIC). You also loose control

on where the NOT should be: it would have to be at the same location as the

CHECK even though for the NOT case you might want to check it somewhere else.

I think I agree with Thomas.

  • the relationship with “CHECK-NOT/CHECK-NEXT/CHECK-SAME” might make things overcomplicated probably.

How about having a concept of regex variables where you give a name
to a given directive’s pattern which you could reuse as another pattern. Something like (syntax TBD):

CHECK: mov [[REG:r[0-9]+]], #42
CHECK-NOT:

I.e. without adding a new optinons for FileCheck, something like the following?

RUN: llvm-sometool --print-string | FileCheck %s --check-prefix=CHECK1

RUN: llvm-sometool --no-print-string | FileCheck %s --check-prefix=CHECK2

CHECK1: mov [[REG:r[0-9]+]], #42

CHECK2-NOT:

It might work probably. We already have the ability to name parts of

the output checked:

// CHECK: Dynamic Relocations {
// CHECK-NEXT: {{.}} R_AARCH64_RELATIVE - [[BAR_ADDR:.]]

// CHECK: Symbols [

// CHECK-NEXT: Value: [[BAR_ADDR]]

So adding a way for naming the whole line does not look
an unreasonable/inconsistent extention to me I think.

Best regards,​
George | Developer | Access Softek, Inc​​​​​​​

Seems to me there was a proposal (possibly years ago now) to allow defining named patterns, from someone who had (IIRC) implemented such a feature in a downstream project. I don’t remember the details of their use-case, but apparently by itself it wasn’t compelling enough to get the encouragement to proceed.

FileCheck directives are already easy to get wrong, and adding a command-line option to change their interpretation just seems like asking for trouble. But adding a way to define a pattern that is independent of the input text seems like it could be useful.

I’d suggest that initially at least, the define-a-pattern directive would take only “immediate” text, no embedded regexes. That is, you could do

DEFPAT[MYPATTERN]: Define a pattern here

but you couldn’t do

DEFPAT[MYPATTERN]: Define {{some|any}} pattern here

although it might be reasonable to allow

DEFPAT[PATTERN1]: some

DEFPAT[PATTERN2]: Define [[PATTERN1]] pattern here

as the [] substitution can be done when the directive is read.

My $.02,

–paulr

Thanks for the suggestions. I think the naming the whole line idea is okay, but it feels a bit clunky. Either we’d have to have a syntax that FileCheck would recognise without caring about the prefix (which seems to be against the ethos of FileCheck, and also makes it less flexible), or in the case I’m referring to, we’d have to have an extra line that does nothing other than define the pattern that can then be used later, and we’d still have some duplication (namely of the NOT check) i.e. something like:

RUN: tool --print-string | FileCheck %s --check-prefixes=COMMON,CHECK
RUN: tool --no-print-string | FileCheck %s --check-prefixes=COMMON,NO

COMMON-DEFINE-MYNAME: some string
CHECK: [[MYNAME]]
NO-NOT: [[MYNAME]]

Assuming we go with this modified proposal, I’d be tempted for something like the above as the syntax, namely “-DEFINE-”.

Perhaps a little less verbose, and more obvious could be the following:

CHECK-DEFINE: {{MYNAME:some string}}
or even
CHECK-DEFINE: {{VAR1:a literal}} {{VAR2:a.*pattern}}

Where “DEFINE” is a new kind of directive which says “don’t actually match anything, but do define regex patterns as defined on the line, so that they can be used in subsequent checks”. Patterns defined in this way would then be applied in the same way as {{.*}} style patterns, unlike other variables, which just match the earlier string they captured (ignoring Thomas’s recent changes at least).

Hi James,

Having a new DEFINE directive which also gets selected by check prefix sounds sensible to me. However I think we might want to allow later to give a name to any kind of pattern, even with variable (string or numeric) definition and use. Therefore I would suggest to stick to a single pattern name definition per DEFINE directive since it allows to get rid of the {{}} syntax. If it becomes too verbose we can allow later several pattern definition per DIRECTIVE with a new syntax to avoid confusion with the string substitution syntax. That would allow thing such as:

DEFINE: mov r{{[0-9]+}}, r{{[0-9]+}}

or even:

DEFINE: mov [[FORBIDDEN_REG]], r{{[0-9]+}}

Do you agree with that concern?

Best regards,

Thomas

It seems reasonable to stick to one pattern per DEFINE directive. However, what’s unclear to me is how you’d name the pattern in your suggestion. “CHECK-DEFINE-PATTERN: some pattern” seems non-obvious to me as a way to define “PATTERN”, whereas something like “CHECK-DEFINE: [[PATTERN:some pattern]]” or similar is clearer to me.