Adding asan poison to Recycler and ArrayRecycler

Hi all,

I intend to add address sanitizer (un)poison calls to Recycler and ArrayRecycler since I spent a few hours tracking down a bug in the AMDGPU backend that turned out to be a use-after-free that would have been detected by asan if it weren't for the Recycler. See https://reviews.llvm.org/D25313.

Naturally, such a change exposes a bunch of bugs or things that are dodgy and happen not to be problematic today, but might easily break in the future, across all backends. I have prepared patches to fix all the issues in the CodeGen/AMDGPU lit tests; all the non-AMDGPU fixes are rolled into the patch above (although it probably makes sense to commit them first as a separate change before switching asan on). But it's clearly not my job to fix the inevitable build bot regressions in other backends.

How do you want to handle this?

Cheers,
Nicolai

Hi all,

I intend to add address sanitizer (un)poison calls to Recycler and ArrayRecycler

That’s a great thing to do! Looking forward for it.

since I spent a few hours tracking down a bug in the AMDGPU backend that turned out to be a use-after-free that would have been detected by asan if it weren't for the Recycler. See https://reviews.llvm.org/D25313.

Naturally, such a change exposes a bunch of bugs or things that are dodgy and happen not to be problematic today, but might easily break in the future, across all backends. I have prepared patches to fix all the issues in the CodeGen/AMDGPU lit tests; all the non-AMDGPU fixes are rolled into the patch above (although it probably makes sense to commit them first as a separate change before switching asan on). But it's clearly not my job to fix the inevitable build bot regressions in other backends.

Unfortunately, that does not sound as clear to me as it sounds to you apparently.

If you want to change the Recycler/ArrayRecycler in a way that it is exposing bugs with ASAN in other places in LLVM, you’ll have first to fix these bugs.

Of course the alternative is to fill an entry in Bugzilla for each bug you find and wait for them to be fixed before you can land your patch.

Nicolai Hähnle via llvm-dev <llvm-dev@lists.llvm.org> writes:

Hi all,

I intend to add address sanitizer (un)poison calls to Recycler and
ArrayRecycler since I spent a few hours tracking down a bug in the
AMDGPU backend that turned out to be a use-after-free that would have
been detected by asan if it weren't for the Recycler. See
https://reviews.llvm.org/D25313.

Naturally, such a change exposes a bunch of bugs or things that are
dodgy and happen not to be problematic today, but might easily break
in the future, across all backends. I have prepared patches to fix all
the issues in the CodeGen/AMDGPU lit tests; all the non-AMDGPU fixes
are rolled into the patch above (although it probably makes sense to
commit them first as a separate change before switching asan on). But
it's clearly not my job to fix the inevitable build bot regressions in
other backends.

How do you want to handle this?

Changes like this are kind of tricky to stage, as you've noticed :wink:

You have a few options, ranging from least work to fastest:

1. File bugs for the other backends and attach the patch that adds the
   annotations.

2. Include the ASAN backtraces in those bugs (Presumably there are a lot
   of duplicates, so you can attach one backtrace per backend and
   possibly have to repeat, or spend some time trying to pick out the
   duplicates and file a bug each)

3. Fix everyone else's backends yourself

Unfortunately, both (1) and (2) tend to take a depressingly long time,
so it's kind of hard to actually do these things without resorting to
(3). A lot of the failures look like they're in X86 and ARM though, and
there are a lot of contributors to those, so (2) might not be too bad in
this case.

The point is that the bugs are already there, it's just that nobody notices them. I'm totally behind a rule that says not to introduce any regressions, but what I'm proposing doesn't do that.

What I'm proposing is basically equivalent to introducing a buildbot with a previously untested configuration. Actually, that makes me think that there should be an XFAIL-based approach to this.

Anyway, I'm going to go with Justin's (1) + (2) bug-filing suggestion first, and see how quick people are at fixing the issues.

Cheers,
Nicolai