LIT Verbose

Folks,

Some of our CMake buildbots are failing to timeout, and I believe it's
something to do with how the output comes from the LIT tests, even
though we add -v to LIT_ARGS.

When the "check-all" stage runs, the output stops at the "Running
tests" message and only prints the rest of the output (including all
tests that pass, fail, etc) at the end.

I believe this has to do with ninja holding off the stream until the
job actually finishes, but I'm not seeing this when I run ninja
manually, which points to how the terminal is set up for buildbot.

Does anyone have a good hint for me to fix this?

cheers,
--renato

I think this will help one facet of your problem: http://reviews.llvm.org/D6584

Jon

If I remember correctly, there was some talk in ninja's github issues about not buffering when there's only one command to run. Is it possible that your manual runs are using a newer version of ninja?

Also, I remember there being some talk about printing 'still waiting...' messages when a command hasn't finished for a while. That would stop the timeouts but it would also remove the protection against infinite loops.

I don't think so, because the tests don't time out, it's a buffering issue...

--renato

> If I remember correctly, there was some talk in ninja's github issues about not buffering when there's only one command to run. Is it possible that your manual runs are using a newer version of ninja?

My laptop version is actually older... :) 1.3.0.git against 1.3.4

It's strange, as LIT actually prints the whole thing, so I can't fix
it by adding more verbosity to lit, I have to see what ninja is doing
wrong.

I'll have a look on github, thanks!

> Also, I remember there being some talk about printing 'still waiting...' messages when a command hasn't finished for a while. That would stop the timeouts but it would also remove the protection against infinite loops.

Yeah, I'd rather fix the buffering issue, not create another one. :)

cheers,
--renato

> > I think this will help one facet of your problem:
> > http://reviews.llvm.org/D6584
>
> I don't think so, because the tests don't time out, it's a buffering issue...

I gather that the tests don't time out, but you want them to because they're taking too long, and whatever you're currently using to do that doesn't work. The patch I mentioned causes tests that take too long to be killed, so I still think it solves part of your problem.

OTOH, you observe a buffering issue. Why do you think that gets in the way of timeouts killing your tests? ISTM that buffering should be orthogonal to timeouts... What are the buildbots currently using to do timeouts?

Jon

+pcc

Renato Golin <renato.golin@linaro.org> writes:

> Folks,
>
> Some of our CMake buildbots are failing to timeout, and I believe it's
> something to do with how the output comes from the LIT tests, even
> though we add -v to LIT_ARGS.
>
> When the "check-all" stage runs, the output stops at the "Running
> tests" message and only prints the rest of the output (including all
> tests that pass, fail, etc) at the end.

IIUC, the problem is that cmake+ninja doesn't actually output anything
while tests are running. I think what you're hoping for is r222181,
which teaches cmake that the test targets output arbitrary stuff on the
terminal, and that ninja should show that output.

Unfortunately, this needs a really new (unreleased?) version of cmake to
work. If you don't have a new enough cmake, you might be better off
running the lit command directly for now, though I realize this is a bit
of a pain for check-all since the list of targets depends on what
exactly you've checked out.

> IIUC, the problem is that cmake+ninja doesn't actually output anything
> while tests are running. I think what you're hoping for is r222181,
> which teaches cmake that the test targets output arbitrary stuff on the
> terminal, and that ninja should show that output.

Hum, I'm using CMake 2.8...

> Unfortunately, this needs a really new (unreleased?) version of cmake to
> work. If you don't have a new enough cmake, you might be better off
> running the lit command directly for now, though I realize this is a bit
> of a pain for check-all since the list of targets depends on what
> exactly you've checked out.

I don't mind updating the cmake by hand (compile+install) if that's
what it takes. I'll try.

thanks!
--renato

My understanding was the other way around. The tests timeout but they shouldn't. However, re-reading the original email I see that my mind inserted a word that isn't there.
Renato, just to double check: Is it failing _due_ to timeout? Or failing to timeout?

Each test runs correctly and successfully, but all together take
longer to run than the overall timeout.

Unlike the make version, ninja doesn't print each test as they run,
but after they have all finished, so you only have one output for
several minutes "Running tests".

The buildmaster kills the slave after a long time without receiving
input, but that doesn't mean the slave is locked. It just means that
the slave is not sending progress updates as it works through the
tests.

What I'm trying to enable is ninja to print each test as they run, not
all at the end.

Though, I'll be out until January without my laptop, so that'll be
after the holidays. :)

cheers,
--renato
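The difference described above, output held until the job finishes versus output forwarded as it is produced, can be illustrated with a minimal Python sketch. This is not how ninja is implemented; the function names `run_buffered` and `run_streaming` are hypothetical and only model the two behaviours as seen from the buildmaster's side.

```python
import subprocess
import sys


def run_buffered(cmd):
    """Collect all output and hand it over only after cmd has exited.

    This models a runner that holds a job's output until the job finishes.
    """
    done = subprocess.run(cmd, stdout=subprocess.PIPE, text=True)
    return done.stdout.splitlines()


def run_streaming(cmd):
    """Yield each output line as soon as the child process writes it."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Stand-in for a test run; "-u" keeps the child itself unbuffered.
    child = [sys.executable, "-u", "-c", "print('PASS: a'); print('PASS: b')"]
    for line in run_streaming(child):
        print(line, flush=True)
```

With the buffered variant, a watcher sees nothing between "Running tests" and the end of the run, which is what trips an inactivity timeout even though every test is making progress.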

> My understanding was the other way around. The tests timeout but they
> shouldn't. However, re-reading the original email I see that my mind
> inserted a word that isn't there.
> Renato, just to double check: Is it failing _due_ to timeout? Or failing
> to timeout?

> Each test runs correctly and successfully, but all together take
> longer to run than the overall timeout.
>
> Unlike the make version, ninja doesn't print each test as they run,
> but after they have all finished, so you only have one output for
> several minutes "Running tests".

A recent change (to lit? to cmake? to ninja?) causes this not to be the
case, at least on interactive builds. I actually now get the lit progress
bar UI & everything.

Piping that output to a file... I get this:

Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..

So maybe that's working too? (though it's not printing out every file, it's
true)

Hi David,

Just checked and it still doesn’t print the percents during the run, and still prints the test results only after the run is finished.

I’ll fiddle with it more when I come home from holidays.

Cheers,
Renato

I’ve looked at it a bit more and an updated ninja is part of it. You also need build.ninja to contain ‘pool = console’ on the lit rules. I added it manually to try it out but presumably a cmake update is needed to add this automatically.

Hi Daniel,

What do you mean by "pool = console"? There are no pools named
console, is that some kind of special rule that you created?

I'd want to get CMake to generate a "pool = linkers" on all link jobs,
and populate it from an argument's value, irrespective of -j argument.
This is very useful for distcc. But that's for later.

cheers,
--renato

From: Renato Golin [renato.golin@linaro.org]
Sent: 07 January 2015 10:13
To: Daniel Sanders
Cc: David Blaikie; LLVM Dev
Subject: Re: [LLVMdev] LIT Verbose

> I've looked at it a bit more and an updated ninja is part of it. You also
> need build.ninja to contain 'pool = console' on the lit rules. I added it
> manually to try it out but presumably a cmake update is needed to add this
> automatically.

> Hi Daniel,
>
> What do you mean by "pool = console"? There are no pools named
> console, is that some kind of special rule that you created?
>
> I'd want to get CMake to generate a "pool = linkers" on all link jobs,
> and populate it from an argument's value, irrespective of -j argument.
> This is very useful for distcc. But that's for later.
>
> cheers,
> --renato

Ninja 1.5 added a pre-defined pool named console. See http://martine.github.io/ninja/manual.html#_the_literal_console_literal_pool

Oh, right... Ubuntu still uses Ninja 1.3.4... :(

I could install the newest one by hand, but changing CMake would have
to be conditional on an argument.

Thanks,
--renato

Hi Daniel,

So, I got back looking at it, and I have a question.

It seems that CMake only added support for creating ninja pools in the
3.0 version, while most arches use 2.8 or less. But worse, it seems,
that CMake only understands numeric pools, so far, ruling out the
console pool.

I'm not sure we would be able to add the console pool to the lit rules
from CMake, and I don't want to add an extra step on all CMake builds,
but I could do so on the buildbot (some smart sed one-liners on the
rules.ninja file).

Do you know of any alternative? If not, I may have to do that... :/

cheers,
--renato

From: Renato Golin [mailto:renato.golin@linaro.org]
Sent: 22 January 2015 11:37
To: Daniel Sanders
Cc: David Blaikie; LLVM Dev
Subject: Re: [LLVMdev] LIT Verbose

> I've looked at it a bit more and an updated ninja is part of it. You also
> need build.ninja to contain 'pool = console' on the lit rules. I added it
> manually to try it out but presumably a cmake update is needed to add this
> automatically.

> Hi Daniel,
>
> So, I got back looking at it, and I have a question.
>
> It seems that CMake only added support for creating ninja pools in the
> 3.0 version, while most arches use 2.8 or less. But worse, it seems,
> that CMake only understands numeric pools, so far, ruling out the
> console pool.

Support for the console pool should appear in cmake 3.2 (http://www.cmake.org/gitweb?p=cmake.git;a=commitdiff;h=444f61e0) and our CMakeLists.txt files should be ready for it (http://llvm.org/viewvc/llvm-project?view=revision&revision=222181). I'm on cmake 3.0.2 so I haven't tried it for myself yet.

> I'm not sure we would be able to add the console pool to the lit rules
> from CMake, and I don't want to add an extra step on all CMake builds,
> but I could do so on the buildbot (some smart sed one-liners on the
> rules.ninja file).
>
> Do you know of any alternative? If not, I may have to do that... :/
>
> cheers,
> --renato

I'm afraid the sed approach is the only thing that I can think of.
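As a rough alternative to sed one-liners, the same post-processing step could be sketched in Python. This is only an illustration under several assumptions: that the lit targets can be recognized by `check-` in the output name of a `build` statement, that the variables of a build statement follow it as indented lines, and that the ninja in use is 1.5 or newer so the `console` pool exists. The function name `add_console_pool` and the `target_pattern` parameter are hypothetical, and the sketch has not been run against a real CMake-generated file.

```python
import re


def add_console_pool(ninja_text, target_pattern=r"check-"):
    """Return ninja_text with 'pool = console' added to matching build statements."""
    lines = ninja_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        out.append(line)
        i += 1
        # Only look at the outputs, i.e. the part before the first unescaped ':'.
        outputs = re.split(r"(?<!\$):", line, maxsplit=1)[0]
        if not (line.startswith("build ") and re.search(target_pattern, outputs)):
            continue
        # Copy any '$'-continued lines that belong to the build statement itself.
        while out[-1].endswith("$") and i < len(lines):
            out.append(lines[i])
            i += 1
        # Collect the indented variable block that follows the statement.
        block = []
        while i < len(lines) and lines[i].startswith(" ") and lines[i].strip():
            block.append(lines[i])
            i += 1
        # Skip statements that already name a pool, so reruns change nothing.
        if not any(re.match(r"\s+pool\s*=", entry) for entry in block):
            block.append("  pool = console")
        out.extend(block)
    return "\n".join(out) + "\n"
```

On a buildbot this would run once after the CMake step: read build.ninja, pass its contents through the function, and write the result back. Because the function skips statements that already carry a pool, running it twice is harmless.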

We're using a script with a background process printing dots to keep
the channel open, but that's ugly. Will do for now, though.

cheers,
--renato
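The keep-alive workaround mentioned above could look roughly like the following Python sketch. It is not the script actually used on the buildbots; the name `run_with_heartbeat` and its parameters are hypothetical. Since printing dots also defeats the inactivity timeout that protects against hung tests (the concern raised earlier in the thread), the sketch adds an optional overall limit, `max_seconds`, after which the command is killed.

```python
import subprocess
import sys
import time


def run_with_heartbeat(cmd, interval=60.0, max_seconds=None, out=sys.stdout):
    """Run cmd, writing a dot to `out` every `interval` seconds until it exits.

    Returns the command's exit code. If max_seconds is given and exceeded,
    the command is killed and subprocess.TimeoutExpired is re-raised.
    """
    start = time.monotonic()
    proc = subprocess.Popen(cmd)
    while True:
        try:
            return proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            if max_seconds is not None and time.monotonic() - start > max_seconds:
                proc.kill()
                proc.wait()
                raise
            # Still running: emit a heartbeat so the channel stays active.
            out.write(".")
            out.flush()


if __name__ == "__main__":
    # Example: python heartbeat.py ninja check-all
    sys.exit(run_with_heartbeat(sys.argv[1:]))
```

The child inherits the wrapper's stdout, so whatever ninja eventually prints still reaches the log; the dots only fill the silent period in between.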