TL;DR
I propose that, where TBAA metadata is incorrect, TypeSanitizer should follow the correct behaviour rather than reporting this implementation error to the user as though it were a genuine, standards-breaking strict-aliasing violation.
Problem
Clang sometimes emits incorrect TBAA metadata (example TBAA issue). TypeSanitizer (TySan) instrumentation is generated from the TBAA metadata Clang emits, so this sometimes causes false positives (example TySan issue). However, it is this same incorrect TBAA metadata that the alias analysis used by optimization passes runs on.
The benefit of basing TySan instrumentation on TBAA metadata is that if an optimization pass makes an assumption your code violates, you will certainly get a warning. The downside is that in cases where the emitted TBAA is incorrect, you get what appears to be a false positive:
float *f = new float(0.0f);
int *i = new (f) int(0); // Current TBAA believes i is a float
*i = 5; // ERROR: TypeSanitizer: type-aliasing-violation
// WRITE of size 4 at ... with type int accesses an existing object of type float
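To try this locally, the fragment above can be wrapped into a complete function. This is a minimal sketch: the function name `reuse_float_storage_as_int` and the `static_assert` are illustrative additions, not part of the original report, and the clean-up via `::operator delete` is just one way to release the storage.

```cpp
#include <cassert>
#include <new>

// Assumption for this sketch: an int fits in, and is suitably aligned for,
// the storage of a float (true on common platforms).
static_assert(sizeof(int) <= sizeof(float) && alignof(int) <= alignof(float),
              "example assumes an int fits in a float's storage");

// Hypothetical helper: reuses a heap-allocated float's storage for an int,
// writes through the int pointer, and returns the value read back.
int reuse_float_storage_as_int() {
    float *f = new float(0.0f);
    int *i = new (f) int(0);  // placement new: the storage now holds an int
    assert(static_cast<void *>(i) == static_cast<void *>(f));
    *i = 5;                   // the write reported in the example above
    int result = *i;
    ::operator delete(i);     // int is trivially destructible; just free the storage
    return result;
}
```

Without the sanitizer this function simply returns 5. Built with a Clang that supports TySan (`-fsanitize=type`), the write through `i` is where the report described above would be expected to appear.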
Altering TySan’s instrumentation to remove these false positives means that a user may have code which produces no TySan warnings yet still behaves differently when optimized, because the optimizer continues to rely on the incorrect TBAA metadata.
The ideal scenario is to implement something like @atrick’s proposal in the TBAA issue, where we add something like a @llvm.tbaa.rebind intrinsic, making TBAA data temporal rather than static. That would fix this problem while keeping TySan’s and LLVM’s views on type aliasing the same. This would be a lengthy and disruptive change, though…
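The difference between static and temporal type information can be illustrated with a toy model. Everything in this sketch is hypothetical: `ToyShadow`, `rebind`, `access` and `int_write_accepted` are made-up names, not LLVM or compiler-rt APIs, and the model is far simpler than TySan’s real shadow memory. It only shows the idea that a rebind marker at the placement new would let a checker accept the later int write.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy model only: none of these names exist in LLVM or compiler-rt.
enum class Ty { Float, Int };

class ToyShadow {
public:
    // Models a hypothetical rebind marker: the storage at `addr` now holds `t`.
    void rebind(std::uintptr_t addr, Ty t) { types_[addr] = t; }

    // Models an instrumented access. The first access to an address records
    // its type; later accesses are accepted only if the type matches.
    bool access(std::uintptr_t addr, Ty t) {
        auto it = types_.find(addr);
        if (it == types_.end()) {
            types_[addr] = t;
            return true;
        }
        return it->second == t;
    }

private:
    std::unordered_map<std::uintptr_t, Ty> types_;
};

// Replays the placement-new example: a float store, then an int store to the
// same address. `with_rebind` selects whether the placement new is visible
// to the checker. Returns true if the int write is accepted.
bool int_write_accepted(bool with_rebind) {
    ToyShadow shadow;
    const std::uintptr_t addr = 0x1000;           // arbitrary illustrative address
    bool first = shadow.access(addr, Ty::Float);  // new float(0.0f)
    assert(first);
    if (with_rebind) {
        shadow.rebind(addr, Ty::Int);             // new (f) int(0)
    }
    return shadow.access(addr, Ty::Int);          // *i = 5
}
```

In this model `int_write_accepted(false)` is rejected, mirroring the report above, while `int_write_accepted(true)` is accepted because the recorded type changed at the point where the storage was reused.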
Proposal
I propose that, by default, TySan emits the extra instrumentation needed to follow the correct behaviour in cases where the TBAA metadata is incorrect. This means no apparent “false positives” are reported, and TySan users no longer have to care about LLVM TBAA implementation details. It also means that genuine bugs in Clang aren’t misdiagnosed as user error, and can instead be reported on GitHub.
Ideally for LLVM, these TBAA changes would eventually happen, and TySan’s instrumentation could go back to being based only on TBAA metadata. For TySan users, though, this is a good stopgap change.
For the sake of debugging and helping future work, a flag such as -fsanitize-type-clang-tbaa-only, which instruments based only on TBAA metadata, could be added.