lit: deprecating trailing \ in RUN lines

Alp_Toker · December 8, 2013, 10:11am

I’d like to propose deprecating and shortly thereafter removing the lit test runner feature that concatenates RUN lines ending in a trailing \

Rationale:

Trailing \ has a special meaning in various language standards that we support nowadays. In the C preprocessor, for example, it’s handled before comments. Various compilers handle this differently and it introduces needless incompatibilities.
Forgetting the trailing \ between two RUN lines causes the lines to be run individually. People have checked in tests which they believed were getting run whereas the features being tested were actually silently broken. I’ve been committing fixes for some of these but it’s exceptionally time-consuming to hunt them down after the fact.
Removing trailing \ will introduce the neat property that one RUN line corresponds precisely to one command that’s executed. This is good for humans and will enable simplifications in the test runner.
Eliminating the trailing \ syntax will unblock work on improved failure source locations in lit. Right now, when the builders encounter a test RUN failure it’s a matter of guesswork as to which RUN line is failing, and the cycle of commit-fix-and-watch-buildbots is laborious. We’ve all wasted time with this at some point and can totally do better.
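To make the second rationale point concrete, here is a minimal sketch of how trailing-backslash concatenation behaves and what happens when the backslash is forgotten. It is only a model of the behaviour described in this thread, not lit's actual parser; `collect_commands` and `RUN_RE` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical model only: this is not lit's real implementation.
RUN_RE = re.compile(r"RUN:(.*)$")

def collect_commands(test_source):
    """Return (first_line_number, command) pairs, joining lines that end in a backslash."""
    commands = []
    pending, start = None, None
    for lineno, line in enumerate(test_source.splitlines(), start=1):
        match = RUN_RE.search(line)
        if match is None:
            continue                      # not a RUN line
        text = match.group(1).strip()
        if pending is None:
            pending, start = "", lineno   # a new command begins on this line
        continued = text.endswith("\\")
        if continued:
            text = text[:-1].strip()      # drop the continuation marker
        pending = (pending + " " + text).strip()
        if not continued:
            commands.append((start, pending))
            pending = None
    if pending is not None:               # dangling continuation at end of file
        commands.append((start, pending))
    return commands

wrapped = "// RUN: %clang_cc1 %s -emit-llvm -o - | \\\n// RUN:   FileCheck %s\n"
forgotten = "// RUN: %clang_cc1 %s -emit-llvm -o - |\n// RUN:   FileCheck %s\n"
print(collect_commands(wrapped))    # one command spanning two source lines
print(collect_commands(forgotten))  # two separate commands
```

In this model the wrapped input yields a single command, while the input with the forgotten backslash yields two commands that would be run individually, which is the silent split described above.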

Considerations:

Some run lines will now go over 80 columns. I suggest living with it as a fact of life, or simplifying the lines on a case by case basis – exceptionally long RUN lines and pipe sequences can often be replaced with a more idiomatic sequence of commands that would help isolate test failures more rapidly.
There are external projects relying on RUN lines with a trailing \. It’ll be difficult to fix them all in one go.

For these reasons, I propose fixing the core LLVM modules to begin with and diagnosing with a lit warning for everyone else. That would be sufficient to switch on the enhanced lit diagnostics* with line information on the llvm.org builders, which is where most time is wasted today.

* Separate patch, will post after this one is resolved.

Alp.

Chandler_Carruth · December 8, 2013, 11:22am

I'd like to propose deprecating and shortly thereafter removing the lit
test runner feature that concatenates RUN lines ending in a trailing \

I'm really opposed to this. Especially for the Clang test suite where run
lines are often *very* long and hard to organize, read, and edit without
this feature.

*Rationale:*

   - Trailing \ has a special meaning in various language standards that
   we support nowadays. In the C preprocessor, for example, it's handled
   _before_ comments. Various compilers handle this differently and it
   introduces needless incompatibilities.

What incompatibilities? I've never had this be an issue.

   - Forgetting the trailing \ between two RUN lines causes the lines to
   be run individually. People have checked in tests which they believed were
   getting run whereas the features being tested were actually silently
   broken. I've been committing fixes for some of these but it's exceptionally
   time-consuming to hunt them down after the fact.

I'd like to understand the rate at which this happens (per RUN-line? per
test-file?). It's never been a problem for me, but that is in part because
I check that my tests fail without my change in addition to passing with my
change.

   - Removing trailing \ will introduce the neat property that one RUN
   line corresponds precisely to one command that's executed. This is good for
   humans and will enable simplifications in the test runner.

FWIW, I've never really had a problem that needed this. The RUN: forms a
prefix of a shell script in my head, and I know how to read shell scripts
including multiple lines.

   - Eliminating the trailing \ syntax will unblock work on improved
   failure source locations in lit. Right now, when the builders encounter a
   test RUN failure it's a matter of guesswork as to which RUN line is
   failing, and the cycle of commit-fix-and-watch-buildbots is laborious.
   We've all wasted time with this at some point and can totally do better.

While I would very much enjoy better failure reporting, I don't really
understand why it needs this. We have an in-Python parser for the RUN: lines
which understands what lines a "command" spans?

Anyways, even though I would *really* like better failure reporting from
lit, not at the cost of less readable tests. That's the tail wagging the
dog IMO.

Alp_Toker · December 8, 2013, 12:24pm

It’s an issue if you try to run the clang tests against other compilers, say to check compatibility with MSVC.

The problem is that “the trailing backslash on a continued line is commonly referred to as a backslash-newline” – i.e. it’s handled by the preprocessor, so has significance rather than being part of the comment.

That causes dissonance between what the compiler sees and what lit.py sees for no particularly good reason. One of the nice properties of lit tests is that they’re also valid compiler inputs, so trailing slash is a bit unfortunate.

It’s less of a problem if you’re just consuming the suite with make check-all, more of a problem for authoring.

If only everyone did check their changes.

See r194071 where trailing backslash caused a test to always succeed, and r194073 for the kind of long-term effect a broken test has on code quality.

Looking at the SVN log for test/ in clang, and to a lesser extent LLVM core, I’ve been doing nearly all the no-op test fixes in the last few months. Not all of them are related to trailing \ but those are the most pernicious kinds.

It’s a bad idea to gamble that I or someone else will always be around taking the time to manually verify old tests to see if they do what they’re meant to do.

The transformations lit does are really too complex and there’s at least one known bug to do with closed pipes that’s contributing to no-op tests (think the discussion thread was on cfe-dev).

In a nutshell, the script output lit forms right now is likely not the pipeline you had in your head.

We need to simplify this stuff to fix no-op test issues, and also to achieve improved source line information.

So, my contention is that the \ is not making the long lines more readable, just pasting over the complexity and hiding bugs.

After all, long pipelines aren’t how people use the LLVM tools in the real world and they totally miss out on testing file IO, losing stdout/stderr distinctions etc.

Another option is to use a different break marker and require RUN-NEXT: on continuation lines. But my view is that long RUN lines could do with simplification anyway, so removing the feature is a better way forward.

I’ll throw the ball in your court to see if you have a better solution going forward?

Alp.

Chandler_Carruth · December 8, 2013, 1:12pm

I'd like to propose deprecating and shortly thereafter removing the lit
test runner feature that concatenates RUN lines ending in a trailing \

I'm really opposed to this. Especially for the Clang test suite where
run lines are often *very* long and hard to organize, read, and edit
without this feature.

*Rationale:*

   - Trailing \ has a special meaning in various language standards that
   we support nowadays. In the C preprocessor, for example, it's handled
   _before_ comments. Various compilers handle this differently and it
   introduces needless incompatibilities.

What incompatibilities? I've never had this be an issue.

It's an issue if you try to run the clang tests against other compilers,
say to check compatibility with MSVC.

The problem is that "the trailing backslash on a continued line is
commonly referred to as a backslash-newline" -- ie. it's handled by the
preprocessor, so has significance rather than being part of the comment.

That causes dissonance between what the compiler sees and what lit.py sees
for no particularly good reason. One of the nice properties of lit tests is
that they're also valid compiler inputs, so trailing slash is a bit
unfortunate.

AFAIK, the only interesting pattern of RUN lines remains valid compiler
input:

// RUN: ... \
// RUN: ...

This is fine because while the '\' may "surprisingly" continue the
prior comment line, the next line consisted entirely of a comment, so
whatever. Clang's warning is even silenced here, and while MSVC and GCC may
warn, they still are required to accept the code.

That said, I don't think that we should make tests harder to read or write
just to work around problems in other compilers.

It's less of a problem if you're just consuming the suite with make
check-all, more of a problem for authoring.

   - Forgetting the trailing \ between two RUN lines causes the lines to
   be run individually. People have checked in tests which they believed were
   getting run whereas the features being tested were actually silently
   broken. I've been committing fixes for some of these but it's exceptionally
   time-consuming to hunt them down after the fact.

I'd like to understand the rate at which this happens (per RUN-line?
per test-file?). It's never been a problem for me, but that is in part
because I check that my tests fail without my change in addition to passing
with my change.

If only everyone did check their changes.

See r194071 where trailing backslash caused a test to always succeed, and
r194073 for the kind of long-term effect a broken test has on code quality.

Looking at the SVN log for test/ in clang, and to a lesser extent LLVM
core, I've been doing nearly all the no-op test fixes in the last few
months. Not all of them are related to trailing \ but those are the most
pernicious kinds.

It's a bad idea to gamble that I or someone else will always be around
taking the time to manually verify old tests to see if they do what they're
meant to do.

This mostly seems to address the challenge of fixing existing tests, which
certainly is hard, but I'm more interested in the challenge of writing a
new test. That is, I'm not worried about the rate at which we clean this
up, but the rate at which we mess this up. If there are only a few cases of
this today after 10 years, then I actually think the problem isn't very
bad. Put another way, if this only happens once or twice a year, is this
really the biggest problem with our test suite?

   - Removing trailing \ will introduce the neat property that one RUN
   line corresponds precisely to one command that's executed. This is good for
   humans and will enable simplifications in the test runner.

FWIW, I've never really had a problem that needed this. The RUN: forms
a prefix of a shell script in my head, and I know how to read shell scripts
including multiple lines.

The transformations lit does are really too complex and there's at least
one known bug to do with closed pipes that's contributing to no-op tests
(think the discussion thread was on cfe-dev).

In a nutshell, the script output lit forms right now is likely not the
pipeline you had in your head

I understand that you think this is too complex, but I'm suggesting that
this particular aspect of lit does not seem too complex to at least one
other developer, and thus you shouldn't assume it to be true.

We need to simplify this stuff to fix no-op test issues, and also to
achieve improved source line information.

   - Eliminating the trailing \ syntax will unblock work on improved
   failure source locations in lit. Right now, when the builders encounter a
   test RUN failure it's a matter of guesswork as to which RUN line is
   failing, and the cycle of commit-fix-and-watch-buildbots is laborious.
   We've all wasted time with this at some point and can totally do better.

While I would very much enjoy better failure reporting, I don't really
understand why it needs this. We have an in-Python parser for the RUN: lines
which understands what lines a "command" spans?

Anyways, even though I would *really* like better failure reporting from
lit, not at the cost of less readable tests. That's the tail wagging the
dog IMO.

So, my contention is that the \ is not making the long lines more
readable, just pasting over the complexity and hiding bugs.

After all, long pipelines aren't how people use the LLVM tools in the real
world

Zero of the long *lines* that I care about involve long *pipelines*. They
all involve exactly one pipeline from Clang to FileCheck.

They are absolutely how Clang is used in the real world: as a compiler
accepting a huge number of flags from a build system.

Another option is to use a different break marker and require RUN-NEXT: on
continuation lines. But my view is that long RUN lines could do with
simplification anyway, so removing the feature is a better way forward.

I'll throw the ball in your court to see if you have a better solution
going forward?

Uh, no, the ball is in your court to demonstrate that a pervasive change to
the testing infrastructure of LLVM is the right direction going forward. So
far you've presented the following as I see it:

1) Supposed compatibility, but I don't see *any* compatibility issues in
practice and I don't know why this is a priority.
2) Risk of programmer error resulting in false-pass tests. Legitimate
concern, still looking to quantify how bad this problem is.
3) Simplicity of the RUN-lines themselves. I disagree with your assessment,
so there at least doesn't appear to be clear agreement here.
4) Ability to provide accurate source locations. However, it doesn't seem
necessary to me as lit already correctly understands the set of lines
concatenated.

Of these, #2 is the one I really agree with. However, there are a *large*
number of ways to mess this up. It's not clear that this is the most common
and should thus be the priority.

On the other hand, I've presented a couple of reasons why the status quo
seems a good thing:

a) We need the ability to wrap long RUN lines for readability. This is a
real need in Clang's test suite, and I've seen it used well in LLVM's as
well.
b) The '\' character is widely used in shell, and RUN lines are (for better
or worse) a small subset of shell, so it seems reasonable to use them for
familiarity.

Alp_Toker · December 8, 2013, 2:04pm

          * Removing trailing \ will introduce the neat property that
            one RUN line corresponds precisely to one command that's
            executed. This is good for humans and will enable
            simplifications in the test runner.

    FWIW, I've never really had a problem that needed this. The RUN:
    forms a prefix of a shell script in my head, and I know how to
    read shell scripts including multiple lines.

    The transformations lit does are really too complex and there's at
    least one known bug to do with closed pipes that's contributing to
    no-op tests (think the discussion thread was on cfe-dev).

    In a nutshell, the script output lit forms right now is likely
    not the pipeline you had in your head

I understand that you think this is too complex, but I'm suggesting that this particular aspect of lit does not seem too complex to at least one other developer, and thus you shouldn't assume it to be true.

It's great if we've made it all look simple to you. Unfortunately for the developers there's an ongoing problem with algorithmic complexity in lit hiding problems that lead to broken tests.

In particular, there are constructs that would error out in a shell but get silently accepted by the lit runner.

See Sean Silva's observation of one of these cases in the thread on no-op tests on cfe-dev:

Although it doesn't eliminate the hassle of having to manually fix this, it seems like at least one issue worth fixing in its own right is the fact that RUN lines ending with a | are silently accepted by our test infrastructure.
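As an illustration of the kind of check Sean's observation suggests, the following sketch flags RUN lines whose command ends in a bare `|` with nothing after it. It is a standalone, hypothetical helper (`find_dangling_pipes` is not part of lit) and only looks at individual lines, so a line ending in `| \` is treated as a legitimate continuation.

```python
import re

def find_dangling_pipes(test_source):
    """Return the line numbers of RUN lines whose command ends with '|'."""
    hits = []
    for lineno, line in enumerate(test_source.splitlines(), start=1):
        match = re.search(r"RUN:(.*)$", line)
        if match is None:
            continue                      # not a RUN line
        command = match.group(1).strip()
        # A trailing backslash means the pipe is continued on the next line,
        # so only a command that literally ends in '|' is reported.
        if command.endswith("|"):
            hits.append(lineno)
    return hits

example = (
    "// RUN: %clang_cc1 %s -emit-llvm -o - | \\\n"
    "// RUN:   FileCheck %s\n"
    "// RUN: %clang_cc1 %s -emit-llvm -o - |\n"
    "// CHECK: @x\n"
)
print(find_dangling_pipes(example))
```

A real fix would presumably live inside the test runner and turn such lines into errors; this sketch only shows that detection itself is cheap.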

This proposal is about resolving a class of these problems, and I'd like to see if we can get it done early in the 3.5 cycle given there's a degree of churn.

It's ultimately about weighing up the cost/benefit of wrapped RUN lines. The benefit you pointed out so far has been visual wrapping in the editor, but the cost of that is very high. Perhaps your editor has a virtual wrapping mode?

With one-to-one mapping, it becomes possible to use simple tools like grep to validate common mistakes like %clang / %clang_cc1 mixups, a missing -o flag and so on.

Right now there's no obvious way to do those checks and we've ended up without an easy way to lint for broken tests as a result. Each broken test has a high cost so we need to continually look at ways to improve the situation.

Alp.

ddunbar · December 8, 2013, 4:41pm

I agree with Chandler here, I think the existing \ support is very useful and do not see a compelling argument to eliminate it.

Please also keep in mind that lit is used for a number of other test suites which are not C/C++ based, so language specific arguments don’t get a lot of weight from me.

I also think you are too quick to dismiss the value of being able to split very long run lines into separate easy to read and perhaps conceptually grouped sections.

Daniel

ddunbar · December 8, 2013, 4:43pm

           * Removing trailing \ will introduce the neat property that
            one RUN line corresponds precisely to one command that's
            executed. This is good for humans and will enable
            simplifications in the test runner.

    FWIW, I've never really had a problem that needed this. The RUN:
    forms a prefix of a shell script in my head, and I know how to
    read shell scripts including multiple lines.

    The transformations lit does are really too complex and there's at
    least one known bug to do with closed pipes that's contributing to
    no-op tests (think the discussion thread was on cfe-dev).

    In a nutshell, the script output lit forms right now is likely
    not the pipeline you had in your head

I understand that you think this is too complex, but I'm suggesting that
this particular aspect of lit does not seem too complex to at least one
other developer, and thus you shouldn't assume it to be true.

It's great if we've made it all look simple to you. Unfortunately for the
developers there's an ongoing problem with algorithmic complexity in lit
hiding problems that lead to broken tests.

In particular, there are constructs that would error out in a shell but
get silently accepted by the lit runner.

Then we should just fix that bug. I would be very happy to see lit's shell
support improved to more closely match the shell for validation.

See Sean Silva's observation of one of these cases in the thread on no-op
tests on cfe-dev:

Although it doesn't eliminate the hassle of having to manually fix this,
it seems like at least one issue worth fixing in its own right is the fact
that RUN lines ending with a | are silently accepted by our test
infrastructure.

This proposal is about resolving a class of these problems, and I'd like
to see if we can get it done early in the 3.5 cycle given there's a degree
of churn.

It's ultimately about weighing up the cost/benefit of wrapped RUN lines.
The benefit you pointed out so far has been visual wrapping in the editor,
but the cost of that is very high. Perhaps your editor has a virtual
wrapping mode?

With one-to-one mapping, it becomes possible to use simple tools like grep
to validate common mistakes like %clang / %clang_cc1 mixups, a missing -o
flag and so on.

Right now there's no obvious way to do those checks and we've ended up
without an easy way to lint for broken tests as a result. Each broken test
has a high cost so we need to continually look at ways to improve the
situation.

Can you elaborate on exactly what kind of checks you want to do?

- Daniel

Alp_Toker · December 8, 2013, 7:47pm

Basically, it’s for asking questions about what’s currently tested…

… experimenting with new features or refactoring existing tests…

… and fixing broken tests…

I committed the results of this run just a few minutes ago in r196729 and r196730. One was a test that had been disabled since 2007.

Losing 80 columns is a small price to pay if it’ll help make tests easier to write, understand and validate in my opinion.

(The two editors I use, vim and Xcode, both do a good job of virtual line wrapping so it turns out long RUN lines aren’t that much of a big deal in practice. But I can see why it may be a bigger problem with editors that don’t support wrapping. Is this still a problem in 2013?)

Alp.

ddunbar · December 9, 2013, 4:59pm

Ok, that makes sense. I don’t see this as a good enough argument to remove backslash support though.

For problems like the clang_cc1 substitution mistakes, it would be much better to just improve the substitution support so that those cause immediate test failures. I would definitely support a move to make lit’s substitution machinery more strict. For problems like your first search, I can appreciate how wrapped lines make this less trivial, but this isn’t enough of a use case to motivate removing the feature in my opinion.

My opinion can be summarized as:

1. If you are worried about bogus or broken tests, but you have an automatic way to detect the brokenness, then the right way to attack this is to build the check into the test runner so no one ever can write the test to pass again.
2. If you are trying to do more fuzzy searches that are not going to be fully automatic, the individual developer can deal with the existence of wrapped lines. Or we could add a way to have lit run a regexp over all of the (collapsed) test script lines if this is a really important use case for you. Either way, it’s not a good enough argument to remove a useful feature.
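The idea of running a regexp over the collapsed test script lines can be sketched as follows. This is a standalone model, not an existing lit feature: `collapse_run_lines`, `lint` and the sample rule (a command that uses `-emit-llvm` without any `-o` argument) are all hypothetical and chosen only for illustration.

```python
import re

def collapse_run_lines(test_source):
    """Yield (first_line_number, command) with trailing-backslash lines joined."""
    parts, start = [], None
    for lineno, line in enumerate(test_source.splitlines(), start=1):
        match = re.search(r"RUN:(.*)$", line)
        if match is None:
            continue
        text = match.group(1).strip()
        if start is None:
            start = lineno
        if text.endswith("\\"):
            parts.append(text[:-1].strip())   # keep collecting this command
            continue
        parts.append(text)
        yield start, " ".join(parts)
        parts, start = [], None
    if parts:                                  # dangling continuation at end of file
        yield start, " ".join(parts)

def lint(test_source, pattern):
    """Return the collapsed commands that match the given regular expression."""
    regex = re.compile(pattern)
    return [(line, cmd) for line, cmd in collapse_run_lines(test_source)
            if regex.search(cmd)]

# Hypothetical rule: the command mentions -emit-llvm but never passes -o.
MISSING_OUTPUT = r"^(?=.*-emit-llvm)(?!.*\s-o\s)"

example = (
    "// RUN: %clang_cc1 %s -emit-llvm \\\n"
    "// RUN:   -triple x86_64-unknown-linux | FileCheck %s\n"
    "// RUN: %clang_cc1 %s -emit-llvm -o - | FileCheck %s\n"
)
for line, cmd in lint(example, MISSING_OUTPUT):
    print(f"line {line}: {cmd}")
```

Because the rule runs after the lines are joined, wrapping does not hide a match, and the reported line number is the first source line of the wrapped command.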

Daniel

Alp_Toker · December 9, 2013, 6:09pm

Ok, that makes sense. I don't see this as a good enough argument to remove backslash support though.

For problems like the clang_cc1 substitution mistakes, it would be much better to just improve the substitution support so that those cause immediate test failures. I would definitely support a move to make lit's substitution machinery more strict.

Totally -- I have a patch for this. Unfortunately the work to add BusyBox support for native Windows testing, ninja test driver and lit enhanced diagnostics patches all got squashed into a single commit due to a screwup and need to be split up before posting :-/

For problems like your first search, I can appreciate how wrapped lines make this less trivial, but this isn't enough of a use case to motivate removing the feature in my opinion.

My opinion can be summarized as:

1. If you are worried about bogus or broken tests, but you have an automatic way to detect the brokenness, then the right way to attack this is to build the check into the test runner so no one ever can write the test to pass again.

Already doing this where possible, like r194919 which forbids direct use of the frontend in the driver tests.

This is great because it educates patch contributors rather than fixing up after the fact but we're obviously limited in the checks we can automate without preventing legitimate testing.

2. If you are trying to do more fuzzy searches that are not going to be fully automatic, the individual developer can deal with the existence of wrapped lines. Or we could add a way to have lit run a regexp over all of the (collapsed) test script lines if this is a really important use case for you. Either way, it's not a good enough argument to remove a useful feature.

This hits on the problem precisely.

For any given common mistake 'hypothesis', there are a large number of false positives which legitimately test incorrect usage. So the kind of validations I'm running are inherently fuzzy.

You're right that it's trivial to /detect/ these by putting a regex just after the directives are combined.

The crux of the problem is that detection isn't sufficient to achieve real fixes on this scale.

There are 26,631 test runs between LLVM and clang. The only practical workflow I've found to fix these is to do an in-place bulk edit using an assortment of command-line tools, then run and review the changes in context using word diff. Before this, we had no workflow, so it's something of a step forward.

Some background: I've been working my way down a list of potential problems and committed ~150 fixes or cleanups to clang tests in the last few weeks, several of which were hiding crasher or invalid codegen regressions.

To put that in context, I only reached the letter 'a' in that list of potential issues a few days ago -- it's clear that there are more tests in a state of disrepair than anyone imagined and a lot of tests we thought were fine just aren't working.

I like my 80 columns as much as the next guy, but in this case they're blocking important work that's already detected hundreds of issues in limited trial runs.

So in this light I think we need to take a close look at the cost/benefit of continuing to support split RUN lines.

Alp.

_sean_silva · December 10, 2013, 4:20am

           * Removing trailing \ will introduce the neat property that
            one RUN line corresponds precisely to one command that's
            executed. This is good for humans and will enable
            simplifications in the test runner.

    FWIW, I've never really had a problem that needed this. The RUN:
    forms a prefix of a shell script in my head, and I know how to
    read shell scripts including multiple lines.

    The transformations lit does are really too complex and there's at
    least one known bug to do with closed pipes that's contributing to
    no-op tests (think the discussion thread was on cfe-dev).

    In a nutshell, the script output lit forms right now is likely
    not the pipeline you had in your head

I understand that you think this is too complex, but I'm suggesting that
this particular aspect of lit does not seem too complex to at least one
other developer, and thus you shouldn't assume it to be true.

It's great if we've made it all look simple to you. Unfortunately for the
developers there's an ongoing problem with algorithmic complexity in lit
hiding problems that lead to broken tests.

In particular, there are constructs that would error out in a shell but
get silently accepted by the lit runner.

See Sean Silva's observation of one of these cases in the thread on no-op
tests on cfe-dev:

Although it doesn't eliminate the hassle of having to manually fix this,
it seems like at least one issue worth fixing in its own right is the fact
that RUN lines ending with a | are silently accepted by our test
infrastructure.

This proposal is about resolving a class of these problems, and I'd like
to see if we can get it done early in the 3.5 cycle given there's a degree
of churn.

It's ultimately about weighing up the cost/benefit of wrapped RUN lines.
The benefit you pointed out so far has been visual wrapping in the editor,
but the cost of that is very high. Perhaps your editor has a virtual
wrapping mode?

With one-to-one mapping, it becomes possible to use simple tools like grep
to validate common mistakes like %clang / %clang_cc1 mixups, a missing -o
flag and so on.

Right now there's no obvious way to do those checks and we've ended up
without an easy way to lint for broken tests as a result. Each broken test
has a high cost so we need to continually look at ways to improve the
situation.

The classic way to do this sort of checking is by hacking into the tool
that actually interprets it (i.e. lit in this case). Considering that lit
is Python, it should be pretty easy to insert an ad-hoc regex check (or
even something substantially more sophisticated). E.g. insert code into
TestRunner.py's parseIntegratedTestScript function.

-- Sean Silva

Alp_Toker · December 10, 2013, 5:40am

Hi Sean,

Let's try to work on that then. It'll be brilliant if we can make this work without churn in the test suite.

I had some progress trying to get that working by building a source line table and moving substitutions to later in the processing pipeline so all the source information to map changes back are actually available now.

The difficulty is in getting the regular expression engine to use that to preserve source locations so we can apply the changes back to the original source.

I've been digging into the depths of the Python regex implementation and tried inserting line break markers together with the multiline option but it didn't work out -- if you can help find a way to do that then I think all the information we need is there to map back to the original lines and apply changes.
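One way to keep source locations without touching the regex engine is to record, while collapsing, the offset at which each source line's text begins in the joined command, and then translate match offsets back afterwards. The sketch below assumes that approach; it is not the patch described above, and `collapse_with_table` and `locate` are hypothetical names.

```python
import bisect
import re

RUN_RE = re.compile(r"RUN:\s*")

def collapse_with_table(test_source):
    """Return (command_text, table) pairs.

    Each table entry is (offset_in_command, line_number, column) for one
    source line that contributed text to the command. Columns are 0-based.
    """
    commands = []
    text, table = "", []
    for lineno, line in enumerate(test_source.splitlines(), start=1):
        match = RUN_RE.search(line)
        if match is None:
            continue
        column = match.end()               # where the command text starts on this line
        piece = line[column:].rstrip()
        continued = piece.endswith("\\")
        if continued:
            piece = piece[:-1].rstrip()
        if text:
            text += " "                    # a single space stands in for backslash-newline
        table.append((len(text), lineno, column))
        text += piece
        if not continued:
            commands.append((text, table))
            text, table = "", []
    if table:                              # dangling continuation at end of file
        commands.append((text, table))
    return commands

def locate(table, offset):
    """Map an offset in the collapsed command back to (line_number, column)."""
    starts = [entry[0] for entry in table]
    index = bisect.bisect_right(starts, offset) - 1
    start, lineno, column = table[index]
    return lineno, column + (offset - start)

source = (
    "// RUN: %clang_cc1 %s -verify \\\n"
    "// RUN:   -emit-llvm -o - | FileCheck %s\n"
)
command, table = collapse_with_table(source)[0]
hit = re.search(r"FileCheck", command)
print(locate(table, hit.start()))          # (line, column) in the original file
```

With such a table, a regex match in the collapsed text can be reported, or edited, at its original line and column, so the ordinary `re` module can be used unchanged.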

Alp.

James_Grosbach · December 10, 2013, 6:03pm

It’s an issue if you try to run the clang tests against other compilers, say to check compatibility with MSVC. The problem is that “the trailing backslash on a continued line is commonly referred to as a backslash-newline” – ie. it’s handled by the preprocessor, so has significance rather than being part of the comment.

It’s translation phase 1. If you’re seeing differences in behavior about this between compilers, that’s a huge bug in those compilers. Can you cite a specific example?

That causes dissonance between what the compiler sees and what lit.py sees for no particularly good reason. One of the nice properties of lit tests is that they’re also valid compiler inputs, so trailing slash is a bit unfortunate.

How does the backslash break this in any way?

Alp_Toker · December 10, 2013, 7:26pm

The backslash is interpreted by lit and the compiler in different and incompatible ways.

Spot the problem in this reduced example…

It is easier to get this wrong than it is to get it right – in effect, lit is encouraging use of backslash line continuations which are guaranteed to change the meaning of the following line in C/C++ with a silent failure mode. This is concerning for a C/C++ compiler test suite.

It’s problematic because the feature hides test issues in and of itself by being incompatible with C, but moreover because it breaks stock tools used to check code and comments due to the complex sub-grammar introduced.

It’s really important to me that there’s buy-in for this though – the last thing I want is for people to say “Alp made my tests hard to read” after the fact(!)

Cheers, Alp.

James_Grosbach · December 10, 2013, 7:47pm

The backslash is interpreted by lit and the compiler in different and incompatible ways.

I disagree that this is different or incompatible.

In any case, you didn’t answer the more important of my two questions. What compilers interpret this code differently?

Spot the problem in this reduced example…

// expected-no-diagnostics

// RUN: %clang_cc1 %s -verify -emit-llvm -o - | \
__attribute__ ((packed)) // no warning

int x;

// RUN: FileCheck %s
// CHECK: @x = common global i32 0, align 4

The attribute line is a comment, which any reasonable syntax highlighting editor will show you if you’re not used to looking for these sorts of things.

lit should generate a diagnostic (probably an error) for this code, however. It’s malformed.
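The effect on the compiler's view of the reduced example can be modelled roughly in a few lines. The sketch assumes only that backslash-newline splicing happens before `//` comments are removed; it ignores string literals, `/* */` comments and everything else a real preprocessor handles, and `visible_code` is a hypothetical helper.

```python
import re

def visible_code(source):
    """Rough model: splice backslash-newline pairs, then drop // comments."""
    spliced = source.replace("\\\n", "")          # line splicing comes first
    stripped = (re.sub(r"//.*", "", line).strip() for line in spliced.split("\n"))
    return [line for line in stripped if line]    # keep only non-empty lines

with_backslash = (
    "// RUN: %clang_cc1 %s -verify -emit-llvm -o - | \\\n"
    "__attribute__ ((packed)) // no warning\n"
    "int x;\n"
)
without_backslash = with_backslash.replace(" \\\n", "\n")

print(visible_code(with_backslash))     # the attribute line is swallowed by the comment
print(visible_code(without_backslash))  # the attribute line is ordinary code again
```

In this model the attribute disappears from the compiled code when the RUN line ends in a backslash, which is why the test produces no warning.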

It is easier to get this wrong than it is to get it right – in effect, lit is encouraging use of backslash line continuations which are guaranteed to change the meaning of the following line in C/C++ with a silent failure mode. This is concerning for a C/C++ compiler test suite.

It’s problematic because the feature hides test issues in and of itself by being incompatible with C, but moreover because it breaks stock tools used to check code and comments due to the complex sub-grammar introduced.

It’s really important to me that there’s buy-in for this though – the last thing I want is for people to say “Alp made my tests hard to read” after the fact(!)

If you necessitate test run lines longer than 80 characters, you will be making tests much harder for me to read.

This isn’t a problem with the lit RUN line syntax. It’s a problem with lit being far too permissive and not screaming when it sees obviously malformed code.

Alp_Toker · December 10, 2013, 9:24pm

Hi Jim,

There are lots of ways line continuations are interpreted differently between compilers and even different versions from the same vendor.

This is inherent because each frontend has a different take on fundamental issues like where lines and comments begin and end, and even the semantics of what translation phases are vary between compilers.

Here’s one quick example of how compilers interpret this code differently:

James_Grosbach · December 10, 2013, 9:30pm

Hi Jim, There are lots of ways line continuations are interpreted differently between compilers and even different versions from the same vendor.

This is inherent because each frontend has a different take on fundamental issues like where lines and comments begin and end, and even the semantics of what translation phases are vary between compilers.

No, they really don’t, modulo bugs. This is standards compliance territory. If the compilers aren’t conformant implementations, I have zero sympathy.

Here’s one quick example of how compilers interpret this code differently:
```
$ printf '//\ \nint x=0;\nint x=0;' > f.c
$ cat f.c
//\ 
int x=0;
int x=0;
$ clang -fsyntax-only f.c
$ gcc-4.9 -fsyntax-only f.c
$ cl f.c
Microsoft (R) C/C++ Optimizing Compiler Version 18.00.21005.1 for x64
Copyright (C) Microsoft Corporation. All rights reserved.
f.c(3) : error C2374: 'x' : redefinition; multiple initialization
f.c(2) : see declaration of 'x'
```

That’s a bug in either MSVC or in whatever you’re using to get a bash prompt on windows. Probably line-ending related. It’s incorrectly not recognizing the continuation character at all.

-Jim

Caldarale_Charles_R · December 10, 2013, 9:43pm

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Alp Toker
Subject: Re: [LLVMdev] lit: deprecating trailing \ in RUN lines

//\
int x=0;
int x=0;

$ gcc-4.9 -fsyntax-only f.c

Try gcc with -Wall, and you'll see the appropriate warning. Also try the MS compiler after removing the trailing space after the backslash; I don't have one available to play with, so I don't know if it makes any difference.

- Chuck

Alp_Toker · December 10, 2013, 9:53pm

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu]
On Behalf Of Alp Toker
Subject: Re: [LLVMdev] lit: deprecating trailing \ in RUN lines
//\
int x=0;
$ gcc-4.9 -fsyntax-only f.c

Try gcc with -Wall, and you'll see the appropriate warning. Also try the MS compiler after removing the trailing space after the backslash; I don't have one available to play with, so I don't know if it makes any difference.

Exactly. EDG and MSVC parse this one way, and gcc/clang do it another way depending on various factors like whitespace.

It may be a bug like Jim says but to me the EDG/MSVC handling is closer to the spec. Not a big deal either way.

For directives in a C/C++ test suite to rely on something contentious like trailing newlines that have a dual meaning depending on whether they're being parsed by the compiler or the testing tool is problematic though.

Alp.

James_Grosbach · December 10, 2013, 10:15pm

Doh. I missed the trailing space. That makes it a bit odd, to say the least. If we have any files with that construct in it, we should totally just run a regex over them to fix it. That’s just broken.
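A regex pass of the kind suggested here could look like the sketch below. It assumes the files use `\n` line endings and treats only spaces and tabs after a final backslash as the problem; `find_backslash_whitespace` and `fix_backslash_whitespace` are hypothetical helper names.

```python
import re

# A backslash followed by spaces or tabs at the end of a line.
BACKSLASH_THEN_SPACE = re.compile(r"\\[ \t]+$", re.MULTILINE)

def find_backslash_whitespace(source):
    """Return the line numbers where whitespace follows a trailing backslash."""
    return [lineno
            for lineno, line in enumerate(source.split("\n"), start=1)
            if BACKSLASH_THEN_SPACE.search(line)]

def fix_backslash_whitespace(source):
    """Remove the whitespace so the backslash is the last character on the line."""
    return BACKSLASH_THEN_SPACE.sub(lambda match: "\\", source)

example = "//\\ \nint x=0;\nint x=0;\n"
print(find_backslash_whitespace(example))
print(repr(fix_backslash_whitespace(example)))
```

Whether the backslash should then be kept or dropped is a separate decision for each test; the sketch only normalises the construct so that every compiler sees the same thing.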

This would also totally have been prevented if we had a post-commit hook to strip trailing whitespace. </trollchris>
