Unifying TSan blacklist and no_sanitize_thread

Hi,

I'm considering reducing the use of the blacklist in the sanitizer instrumentation passes and doing the necessary work in the frontend (Clang) instead.

Some of it is already implemented: e.g. Clang will attach the “sanitize_address” attribute to a function definition only if the function is not blacklisted. For a blacklisted function the attribute is absent, so the ASan instrumentation pass won’t instrument its memory accesses and there’s no need to look at the blacklist again.

The TSan pass does the following:

  1. instruments plain memory accesses
  2. instruments atomic accesses
  3. instruments memory intrinsic calls
  4. adds function entry/exit callbacks

If a function doesn’t have the sanitize_thread attribute (e.g. it was annotated with __attribute__((no_sanitize_thread))), then the TSan pass still does (2), (3) and (4). If a function is blacklisted, the TSan pass does nothing. Shouldn’t the behavior be the same? I think we must always do (4) to get good stack traces, and probably (2) as well (we may report races on atomics in this case, but otherwise we may miss synchronization). Thoughts?
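For readers who want to see what the annotation looks like at the source level, here is a minimal sketch. The `NO_TSAN` macro, the counters, and `bump_counters` are hypothetical names made up for illustration; only `__attribute__((no_sanitize_thread))` comes from the message above. The comments map each statement to the numbered items in the list, assuming the pass behaves as described in this message.

```cpp
#include <atomic>

// Fall back to an empty macro on compilers that lack the attribute.
#if defined(__has_attribute)
#  if __has_attribute(no_sanitize_thread)
#    define NO_TSAN __attribute__((no_sanitize_thread))
#  endif
#endif
#ifndef NO_TSAN
#  define NO_TSAN
#endif

static int plain_counter = 0;               // touched via plain accesses, item (1)
static std::atomic<int> atomic_counter{0};  // touched via atomic accesses, item (2)

// Hypothetical function carrying the attribute. Per the description above,
// the pass would skip (1) here but still handle (2), (3) and (4).
NO_TSAN void bump_counters() {
  ++plain_counter;                                         // plain access
  atomic_counter.fetch_add(1, std::memory_order_relaxed);  // atomic access
}
```

The attribute has no effect on program semantics; it only changes what the instrumentation pass emits when building with -fsanitize=thread.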

I don't think we should do (2).

Sounds like your plan would let us drop blacklist from *SanitizerPass
arguments, right? That's great.

We don't know how this functionality should work because we don't use it.
It's possible to invent use cases for all 16 combinations of 1/2/3/4.
I think we need to stick with omitting instrumentation of memory
accesses (and intrinsics, at least with best effort), but keep
instrumentation of atomics and func enter/exit. This covers probably
the most common use cases of suppressing known races and improving
performance. Atomics are generally needed to prevent false positives,
so it's risky to remove them. Func enter/exit are needed for good
reports.
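The policy proposed in this reply can be written down as a small decision table. The sketch below is a toy model, not code from LLVM: `TsanPlan` and `plan_for` are hypothetical names, and the only input is whether the function carries the sanitize_thread attribute.

```cpp
// Toy model of the proposed policy; none of these names exist in LLVM.
struct TsanPlan {
  bool plain_accesses;   // (1) plain memory accesses
  bool atomics;          // (2) atomic accesses
  bool mem_intrinsics;   // (3) memory intrinsic calls
  bool func_entry_exit;  // (4) function entry/exit callbacks
};

// Functions without sanitize_thread lose (1) and (3) but keep (2) and (4):
// atomics are kept to avoid false positives, entry/exit for good reports.
constexpr TsanPlan plan_for(bool has_sanitize_thread) {
  return TsanPlan{has_sanitize_thread, true, has_sanitize_thread, true};
}
```

With this model, `plan_for(false)` yields a plan that keeps only atomics and entry/exit callbacks, while `plan_for(true)` enables all four.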

I agree. Implemented this in r209940.

> I don't think we should do (2).
>
> Sounds like your plan would let us drop blacklist from *SanitizerPass
> arguments, right? That's great.

Yes. We can already do that for TSan and MSan: the sanitize_memory and
sanitize_thread attrs are not added to functions by Clang if the functions
are blacklisted. There's still some stuff to do in ASan w.r.t. global
variables.

BTW, are we still interested in having "-mllvm -msan-blacklist" and
"-mllvm -tsan-blacklist", or can we kill these flags as well and rely only
on a frontend Clang compiler flag?

Now that the blacklist implementation has been moved to the frontend, will there be any change in behavior w.r.t. name mangling? Previously only the mangled names were matched against the blacklist. I think it makes more sense to use the demangled names, or both.

Yes, we will be able to match unmangled (and, hopefully, fully-qualified)
names from the blacklist. But first we need to match the correct source
file names (which we should get from Clang's SourceManager) instead of LLVM
module identifiers, which can be arbitrary.
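To make the mangled-versus-unmangled distinction concrete, here is a rough sketch of matching an entry against both forms of a name. It is not how Clang does or will do it: `demangle` and `matches_entry` are hypothetical helpers, plain substring matching stands in for whatever pattern syntax the blacklist uses, and `abi::__cxa_demangle` assumes an Itanium-ABI toolchain (GCC or Clang with libstdc++ or libc++abi).

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI symbol; return the input unchanged on failure.
std::string demangle(const std::string& mangled) {
  int status = 0;
  char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) return mangled;
  std::string result(out);
  std::free(out);  // the buffer returned by __cxa_demangle is malloc-allocated
  return result;
}

// Hypothetical matcher: an entry hits if it is a substring of either the
// mangled or the demangled name (the "or both" option from the question).
bool matches_entry(const std::string& entry, const std::string& mangled) {
  return mangled.find(entry) != std::string::npos ||
         demangle(mangled).find(entry) != std::string::npos;
}
```

For example, `demangle("_ZN2ns3barEi")` gives `ns::bar(int)`, so an entry written as `ns::bar` matches even though that string never appears in the mangled symbol.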