Behaviour of outs()?

I was just fixing a bug that was caused by `stdout` being closed
before the runtime has done `fflush(stdout)` [or however this is
implemented in the runtime].

The cause of this seems to be that `outs()` returns a static object
created from `raw_fd_stream(STDOUT_FILENO, true)` - the `true` being
the `shouldClose` parameter.

Surely LLVM is not supposed to close `stdout` as part of its operations?

I was just fixing a bug that was caused by `stdout` being closed
before the runtime has done `fflush(stdout)` [or however this is
implemented in the runtime].

The cause of this seems to be that `outs()` returns a static object
created from `raw_fd_stream(STDOUT_FILENO, true)` - the `true` being
the `shouldClose` parameter.

Surely LLVM is not supposed to close `stdout` as part of its operations?

Looks like this was added in r111643:

commit 5d56d9df928c48571980efe8d4205de8ab557b7c
Author: Dan Gohman <gohman@apple.com>

    Make outs() close its file when its stream is destructed, so that
    pending output errors are detected.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@111643 91177308-0d34-0410-b5e6-96231b3b80d8

Certainly looks intentional. Comment in the code is:

    // Delete the file descriptor when the program exits, forcing error
    // detection. If you don't want this behavior, don't use outs().

I don't really have an opinion on whether this is a good idea, but it
does suggest that any use of `outs()` in `lib/` is a bug.

$ git grep -l 'outs()' -- lib/
lib/ExecutionEngine/Interpreter/ExternalFunctions.cpp
lib/IR/GCOV.cpp
lib/Support/CommandLine.cpp
lib/Support/FormattedStream.cpp
lib/Support/TargetRegistry.cpp
lib/Support/raw_ostream.cpp
lib/Target/CppBackend/CPPBackend.cpp

All of these cases seem to be for outputting help text of some sort. But
probably they should be passed a reference to a `raw_ostream`, so that the
caller gets to choose whether or not to use `outs()`.

Were you hitting one of these?

We’ve had this argument before. IMO LLVM should not be in the business of closing stdout, and no code in lib/ should print to stdout because users may expect output there (-o -).

So, I'm only "fixing the compiler code" (by setting `shouldClose =
false`), but in essence the code we have is something like this (much
simplified):

int main(...)
{
      printf("Some stuff\n");
      ...
      use_clang_to_compile_opencl();
      run_opencl();
      ..
      printf("Some more stuff\n");
}

and then `myprog > file.txt`

In this case, clang and llvm do not output anything (as far as I'm
aware), but the "Some stuff" and "Some more stuff" does not appear in
file.txt. Using strace, my colleague found the problem as `close(1)`
followed by `write(1, "...") = EBADF`.

[The ACTUAL code is a benchmark that is part of our test-code, and the
test is failing]

I'm fairly sure that whether llvm actually uses `outs` or not, it will
be constructed and destroyed based on it's existance. I will have a
look at this in more detail, but the general question still remains:

Should LLVM close `STDOUT_FILENO` in a destructor? I don't think so. If so, why?

I can confirm that llvm/clang does not output anything to stdout, but
the fix of setting shoulClose to false solves the problem. When I
dissassembled my clang, it shows 137 occurrences of _ZN4llvm4outsEv (a
few are the actual `outs` function itself, but I thing 134 actual
calls are made - I did not check if any of those are done in a scope
that doesn't actually write to the resulting stream, but I expect that
is what happens).

Exact strace:

... thousands of other lines removed...
munmap(0xb46e4000, 3149824) = 0
close(1) = 0
munmap(0xb1524000, 528384) = 0
write(1, "<elided as irrelevant>", 391) = -1 EBADF (Bad file descriptor)

As you can see, it's not "far off" that it works - only two system
calls apart... And it is the ONLY write with filenumber 1.

Actually, no it won't. It's only constructed (and destroyed) if `outs()`
is called.

You can confirm this with toy programs using the same logic.

Both of those SGTM.

I'm fairly sure that whether llvm actually uses `outs` or not, it will
be constructed and destroyed based on it's existance. I will have a
look at this in more detail, but the general question still remains:

Not quite.

raw_ostream.cpp:
00704 raw_ostream &llvm::outs() {
00705 // Set buffer settings to model stdout behavior.
00706 // Delete the file descriptor when the program exits, forcing error
00707 // detection. If you don't want this behavior, don't use outs().
00708 static raw_fd_ostream S(STDOUT_FILENO, true);
00709 return S;
00710 }

S's constructor is called once (and only once) on the first call to llvm::outs(), and then the destructor will get called along with all the other globals' around atexit() time. If llvm::outs() doesn't get called, then the destructor should never be invoked.

Cheers,

Jon

I too tried replicating this, by running gdb on a unchanged [but quite
old] clang (-cc1 and stuff), with a breakpoint on `outs`, and it is
never called called.

So, I will have to investigate if something in our code is calling
llvm::outs(), without actually writing anything.

(By the way, at some point in the past, someone [1] had altered this
code to have shouldClose to false, I changed it back to true as that
is how upstream is, and then found out "the hard way" that this is not
a good change - it passed all sorts of local testing and regression
tests, only some time later someone discovered that some benchmarks
were no longer producing output)

[1] the history is murky due to changes in our process&version control
system, so can't say who, and it's quite possibly someone that is no
longer working here anyway, so can't really ask "why is it this way".

I've fixed (ie, removed) the uses of outs() in lib/IR/GCOV.cpp in
r226961.

"Duncan P. N. Exon Smith" <dexonsmith@apple.com> writes: