I'd like to propose deprecating and shortly thereafter removing the lit
test runner feature that concatenates RUN lines ending in a trailing \
I'm really opposed to this. Especially for the Clang test suite where
run lines are often *very* long and hard to organize, read, and edit
without this feature.
*Rationale:*
- Trailing \ has a special meaning in various language standards that
we support nowadays. In the C preprocessor, for example, it's handled
_before_ comments. Various compilers handle this differently and it
introduces needless incompatibilities.
What incompatibilities? I've never had this be an issue.
It's an issue if you try to run the clang tests against other compilers,
say to check compatibility with MSVC.
The problem is that "the trailing backslash on a continued line is
commonly referred to as a backslash-newline" -- ie. it's handled by the
preprocessor, so has significance rather than being part of the comment.
That causes dissonance between what the compiler sees and what lit.py sees
for no particularly good reason. One of the nice properties of lit tests is
that they're also valid compiler inputs, so trailing slash is a bit
unfortunate.
AFAIK, the only interesting pattern of RUN lines remains valid compiler
input:
// RUN: ... \
// RUN: ...
This is fine because while the '\' may "surprisingly" make continue the
prior comment line, the next line consisted entirely of a comment, so
whatever. Clang's warning is even silenced here, and while MSVC and GCC may
warn, they still are required to accept the code.
That said, I don't think that we should make tests harder to read or write
just to work around problems in other compilers.
It less of a problem if you're just consuming the suite with make
check-all, more of a problem for authoring.
- Forgetting the trailing \ between two RUN lines causes the lines to
be run individually. People have checked in tests which they believed were
getting run whereas the features being tested were actually silently
broken. I've been committing fixes for some of these but it's exceptionally
time-consuming to hunt them down after the fact.
I'd like to understand the rate at which this happens (per RUN-line?
per test-file?). It's never been a problem for me, but that is in part
because I check that my tests fail without my change in addition to passing
with my change.
If only everyone did check their changes.
See r194071 where trailing backslash caused a test to always succeed, and
r194073 for the kind of long-term a broken test has on code quality.
Looking at the SVN log for test/ in clang, and to a lesser extent LLVM
core, I've been doing nearly all the no-op test fixes in the last few
months. Not all of them are related to trailing \ but those are the most
pernicious kinds.
It's a bad idea to gamble that I or someone else will always be around
taking the time to manually verify old tests to see if they do what they're
meant to do.
This mostly seems to address the challenge of fixing existing tests, which
certainly is hard, but I'm more interested in the challenge of writing a
new test. That is, I'm not worried about the rate at which we clean this
up, but the rate at which we mess this up. if there are only a few cases of
this today after 10 years, then I actually think the problem isn't very
bad. Put another way, if this only happens once or twice a year, is this
really the biggest problem with our test suite?
- Removing trailing \ will introduce the neat property that one RUN
line corresponds precisely to one command that's executed. This is good for
humans and will enable simplifications in the test runner.
FWIW, I've never really had a problem that needed this. The RUN: forms
a prefix of a shell script in my head, and I know how to read shell scripts
including multiple lines.
The transformations lit does are really too complex and there's at least
one known bug to do with closed pipes that's contributing to no-op tests
(think the discussion thread was on cfe-dev).
In a nutshell, the script output lit forms right now is not likely not the
pipeline you had in your head 
I understand that you think this is too complex, but I'm suggesting that
this particular aspect of lit does not seem too complex to at least one
other developer, and thus you shouldn't assume it to be true.
We need to simplify this stuff to fix no-op test issues, and also to
achieve improved source line information.
- Eliminating the trailing \ syntax will unblock work on improved
failure source locations in lit. Right now, when the builders encounter a
test RUN failure it's a matter of guesswork as to which RUN line is
failing, and the cycle of commit-fix-and-watch-buildbots is laborious.
We've all wasted time with this at some point and can totally do better.
While I would very much enjoy better failure reporting, I don't really
understand why it needs this. We have a in-python parser for the RUN: lines
which understands what lines a "command" spans?
Anyways, even though I would *really* like better failure reporting from
lit, not at the cost of less readable tests. That's the tail wagging the
dog IMO.
So, my contention is that the \ is not making the long lines more
readable, just pasting over the complexity and hiding bugs.
After all, long pipelines aren't how people use the LLVM tools in the real
world
Zero of the long *lines* that I care about involve long *pipelines*. They
all involve exactly one pipeline from Clang to FileCheck.
They are absolutely how Clang is used in the real world: as a compiler
accepting a huge number of flags from a build system.
Another option is to use a different break marker and require RUN-NEXT: on
continuation lines. But my view is that long RUN lines could do with
simplification anyway, so removing the feature is a better way forward.
I'll throw the ball in your court to see if you have a better solution
going forward?
Uh, no, the ball is in your court to demonstrate that a pervasive change to
the testing infrastructure of LLVM is the right direction going forward. So
far you've presented the following as I see it:
1) Supposed compatibility, but I don't see *any* compatibility issues in
practice and I don't know why this is a priority.
2) Risk of programmer error resulting in false-pass tests. Legitimate
concern, still looking to quantify how bad this problem is.
3) Simplicity of the RUN-lines themselves. I disagree with your assessment,
so there at least doesn't appear to be clear agreement here.
4) Ability to provide accurate source locations. However, it doesn't seem
necessary to me as lit already correctly understands the set of lines
concatenated.
Of these, #2 is the one I really agree with. However, there are a *large*
number of ways to mess this up. It's not clear that this is the most common
and should thus be the priority.
On the other hand, I've presented a couple of reasons why the status quo
seems a good thing:
a) We need the ability to wrap long RUN lines for readability. This is a
real need in Clang's test suite, and I've seen it used well in LLVM's as
well.
b) The '\' character is widely used in shell, and RUN lines are (for better
or worse) a small subset of shell, so it seems reasonable to use them for
familiarity.