Need help reproducing a sanitizer buildbot failure

I recently broke a sanitizer buildbot but I am unable to reproduce the failure. The buildbot that failed is

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2959

I searched the logs for the config/build commands that would reproduce the failure. The problem is that the bot seems to be using a script which I don’t have access to (buildbot_bootstrap.sh), so I looked for cmake/ninja calls instead. AFAICT, the bot does a bootstrap and then runs the testsuite with the final build configured with -DLLVM_USE_SANITIZER=Memory.

I tried that locally, but the build dies very early in tblgen:

FAILED: cd /ssd/dnovillo/llvm/bld/tools/clang/include/clang/Driver && /ssd/dnovillo/llvm/bld/bin/llvm-tblgen -gen-opt-parser-defs -I /ssd/dnovillo/llvm/llvm/tools/clang/include/clang/Driver -I /ssd/dnovillo/llvm/llvm/lib/Target -I /ssd/dnovillo/llvm/llvm/include /ssd/dnovillo/llvm/llvm/tools/clang/include/clang/Driver/Options.td -o /ssd/dnovillo/llvm/bld/tools/clang/include/clang/Driver/Options.inc.tmp
==12630== WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f1ea9b01729 (/ssd/dnovillo/llvm/bld/bin/llvm-tblgen+0x8f729)
#1 0x7f1ea9a9cbe0 (/ssd/dnovillo/llvm/bld/bin/llvm-tblgen+0x2abe0)
#2 0x7f1ea9f7b69c (/ssd/dnovillo/llvm/bld/bin/llvm-tblgen+0x50969c)
#3 0x7f1ea82356ff (/lib/x86_64-linux-gnu/libc.so.6+0x216ff)
#4 0x7f1ea9b01280 (/ssd/dnovillo/llvm/bld/bin/llvm-tblgen+0x8f280)

SUMMARY: MemorySanitizer: use-of-uninitialized-value ??:0 ??
Exiting

I’m not sure how to proceed from here. The bot is clearly building things in a different way, but I don’t know how to duplicate it. Is there a way for me to use the same script that the bot is using? What is the general advice on reproducing buildbot failures?

Thanks. Diego.

msan isn’t usable without an instrumented C++ standard library.

The script in question is here:
https://code.google.com/p/address-sanitizer/source/browse/trunk/build/scripts/slave/buildbot_bootstrap.sh

They appear to use a prebuilt libstdc++ shared object.

I think the bot usually gives a readable error report, but it doesn’t work for this test because the test is passing stderr to FileCheck. Lots of tests do that, and we should find a way to make that work. We might want to pass MSAN_OPTIONS=log_path=/tmp/something.log and then cat that file from lit if it’s non-empty.
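A minimal sketch of the log_path idea, written in Python since lit is a Python tool. It is not lit's actual implementation: the helper name `run_with_sanitizer_log` and the way it is called are assumptions made for illustration. It relies on the sanitizer runtime appending the process id to the `log_path` value, so the report files are matched by prefix.

```python
import glob
import os
import subprocess
import sys


def run_with_sanitizer_log(cmd, log_prefix, options_var="MSAN_OPTIONS"):
    """Run cmd with log_path set; return (exit code, collected report text).

    Hypothetical helper, not part of lit.
    """
    env = dict(os.environ)
    # Keep any options that are already set; sanitizer options are
    # separated by colons.
    existing = env.get(options_var)
    log_opt = "log_path=" + log_prefix
    env[options_var] = existing + ":" + log_opt if existing else log_opt

    result = subprocess.run(cmd, env=env)

    # The runtime writes to "<log_path>.<pid>", so look for every file
    # that starts with the prefix and keep only the non-empty ones.
    reports = []
    for path in sorted(glob.glob(glob.escape(log_prefix) + ".*")):
        with open(path, errors="replace") as f:
            text = f.read()
        if text:
            reports.append("--- " + path + " ---\n" + text)
    return result.returncode, "\n".join(reports)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python this_script.py ./bin/opt test.ll -S
    code, report = run_with_sanitizer_log(sys.argv[1:], "/tmp/something.log")
    if report:
        print(report, file=sys.stderr)
    sys.exit(code)
```

Because the report goes to a file rather than stderr, it survives even when the test pipes stderr into FileCheck.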

Thanks, Reid. I've gotten the script and I'm now running it locally. It's
running into trouble in llvm_build2_asan, so I'll have to kick it a bit
first.

I think the bot usually gives a readable error report, but it doesn't work
for this test because the test is passing stderr to FileCheck. Lots of
tests do that, and we should find a way to make that work. We might want
to pass MSAN_OPTIONS=log_path=/tmp/something.log and then cat that file
from lit if it's non-empty.

Yeah. From the output, I can see part of the msan failure, but I can't see
the whole thing. Worse, I can't even replicate it.

Diego.

OK, so now I've gotten a build but the output from asan is less than
helpful:

$ llvm/x/llvm_build_asan/./bin/opt llvm/x/llvm/test/Other/optimization-remarks-inline.ll -inline -pass-remarks=inline -S

==6791==ERROR: AddressSanitizer: heap-use-after-free on address 0x6040000016a8 at pc 0x1e70553 bp 0x7fff29de4fb0 sp 0x7fff29de4fa8
READ of size 13 at 0x6040000016a8 thread T0
    #0 0x1e70552 (/ssd/dnovillo/llvm/x/llvm_build_asan/bin/opt+0x1e70552)
    #1 0x1e6f3d3 (/ssd/dnovillo/llvm/x/llvm_build_asan/bin/opt+0x1e6f3d3)
    #2 0x7ab722 (/ssd/dnovillo/llvm/x/llvm_build_asan/bin/opt+0x7ab722)
    #3 0x19eacc5 (/ssd/dnovillo/llvm/x/llvm_build_asan/bin/opt+0x19eacc5)
    #4 0x1892f92 (/ssd/dnovillo/llvm/x/llvm_build_asan/bin/opt+0x1892f92)
[ ... ]

Is there an option for asan to show me source file locations? Or at least
function names. I'm not sure what to do with this.

You need one of these:
https://code.google.com/p/address-sanitizer/wiki/CallStack

(Hmm. I thought that on our bot the reports are symbolized... No?)

You need llvm-symbolizer in PATH.
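As a rough illustration of that requirement, the sketch below checks whether llvm-symbolizer can be found before running an instrumented binary, and returns a warning if it cannot. The helper name `symbolizer_env` is hypothetical. ASAN_SYMBOLIZER_PATH is the environment variable the asan runtime consults for an explicit symbolizer location; setting it here is a convenience, not something the thread says the bot does.

```python
import os
import shutil


def symbolizer_env(environ=None, tool="llvm-symbolizer"):
    """Return (env, warning) for running an asan-instrumented binary.

    Hypothetical helper. If the tool is found on PATH, the returned env
    has ASAN_SYMBOLIZER_PATH pointing at it and warning is None.
    Otherwise env is unchanged and warning explains why stack traces
    will show only raw addresses.
    """
    env = dict(os.environ if environ is None else environ)
    if env.get("ASAN_SYMBOLIZER_PATH"):
        # Respect an explicit setting made by the caller.
        return env, None

    found = shutil.which(tool, path=env.get("PATH", ""))
    if found is None:
        return env, ("warning: %s not found in PATH; "
                     "sanitizer reports will not be symbolized" % tool)

    env["ASAN_SYMBOLIZER_PATH"] = found
    return env, None


if __name__ == "__main__":
    _, warning = symbolizer_env()
    if warning:
        print(warning)
```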

Ah, I see.
The bot actually symbolizes asan’s output just fine:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2959/steps/annotate/logs/stdio

 #1 0x2888c1d in llvm::raw_ostream::write(char const*, unsigned long) /home/dtoolsbot/build/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/Support/raw_ostream.cpp:332

But since this is a lit test, we don’t see the whole output, so diagnosing the failure requires a local rebuild and rerun
(and then you need llvm-symbolizer in PATH).

I wonder if we can configure the lit test runner to print the (tail of) test output on failure.

–kcc

Thanks. I've now found and fixed the bug. Another thing that may help in
the future is to have asan display a warning/note when it can't find
llvm-symbolizer. This would help with figuring out why it's giving an
unreadable trace.

Diego.

You'd have to teach FileCheck, actually, since it's the one that consumes
stderr in this case. Rather than doing that, why not use
*SAN_OPTIONS=log_path=blah.txt, and teach lit to dump that on failure?

I have recently wished that FileCheck would print its input upon failure. It currently prints only a single line with “possible intended match: blah”, which may or may not be the line you’re interested in. I can work on this if people think it would be a good feature.

Adam

That would be great. I often find myself cutting and pasting the failed
command without the FileCheck pipe so that I can see exactly what failed.

Diego.
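The cut-and-paste step can be automated with something like the sketch below, which drops the FileCheck stage from a pipeline so the raw tool output is visible. The function name `strip_filecheck` is hypothetical, and the splitting is deliberately naive: it assumes a simple pipeline with no `||` operators and no `|` characters inside quotes.

```python
import os


def strip_filecheck(run_line):
    """Remove every pipeline stage whose command is FileCheck.

    Naive sketch: splits on '|' without understanding shell quoting.
    """
    kept = []
    for stage in run_line.split("|"):
        words = stage.split()
        # Match both a bare "FileCheck" and a path ending in it.
        if words and os.path.basename(words[0]) == "FileCheck":
            continue
        kept.append(stage.strip())
    return " | ".join(kept)


if __name__ == "__main__":
    # Illustrative command modelled on the opt invocation in this thread;
    # the test's real RUN line may differ.
    line = "opt %s -inline -pass-remarks=inline -S 2>&1 | FileCheck %s"
    print(strip_filecheck(line))
```

Redirections such as `2>&1` belong to the preceding stage, so they are preserved and the sanitizer report still reaches the terminal.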