Uninstrumented LLVM build with AddressSanitizer instrumented user

llvm/include/llvm/Support/Allocator.h and a few other files call __asan_poison_memory_region in AddressSanitizer builds and __msan_poison in MemorySanitizer builds.
These identifiers are defined as empty macros in llvm/include/llvm/Support/Compiler.h.

To be compliant with the One Definition Rule, the values must match when building LLVM itself and when building a downstream project.
Put it in another way, (a) uninstrumented LLVM build with instrumented user and (b) instrumented LLVM build with uninstrumented user are not supported (ODR violations).

[Support] Bugprone usage of __has_feature(address_sanitizer) in Allocator · Issue #83566 · llvm/llvm-project · GitHub describes that a downstream group wants to do the following, which is currently unsupported:

We build LLVM + MLIR without sanitizers enabled, but have a downstream project that we build with ASAN.

[CMake][ASAN] Add support for ADDRESS_SANITIZER_BUILD compile option by Dinistro · Pull Request #83595 · llvm/llvm-project · GitHub attempts to support this scenario by recording the LLVM build time __has_feature() value (through LLVM_ADDRESS_SANITIZER_BUILD) to the generated llvm-config.h.
(Note: this mix-and-match scenario doesn’t work with MemorySanitizer/ThreadSanitizer. False positives would occur, as these sanitizers generally cannot work with partially instrumented programs.)

How do folks feel about this LLVM_USE_SANITIZER change? Are we comfortable with the increased support surface?

2 Likes

I’m not sure just changing this for LLVM works. The standard libraries currently also don’t support this, which has bitten some libc++ early adopters recently.

When properties like this become “infectious” across dependency boundaries it becomes significantly harder to maintain supported configurations.

Let’s take NDEBUG as an example. It will often be the case that mixing and matching (Debug project linked against Release LLVM) will cause technical ODR violations (inline functions in headers compiled with asserts into the client, but compiled without them into LLVM). In some projects it will sometimes also cause practical ODR violations (structs changing their layout).

I need to have a debug configuration for my project, just as I need to have a sanitizers configuration for it. Must that debug-with-sanitizers build link against a debug-with-sanitizers build of LLVM (and libstdc++ or libc++ for that matter)?

Which build parameters have to become part of the public interface (ABI-affecting, like rtti/exceptions, which at least seem to cause link errors on mismatch), and which do not?

1 Like

ASan is generally understood (citation required) to support partial program instrumentation. This is a big part of what makes it a much more adoptable, practical tool than MSan and TSan, which require full program instrumentation. I think it makes sense for LLVM to collectively support partially instrumented builds of ASan, both in libc++ and LLVM.

I’m usually in favor of holding the line against adding config settings of all kinds, but I think this one is justified. The other direction we could go here is to use weak symbols to sink the allocation strategy and poison/unpoison logic out of line into the LLVM build, but that would require careful consideration to ensure it doesn’t negatively impact allocator performance.

Historically, we have done various cleanups across LLVM & Clang headers to ensure that NDEBUG only controls asserts, and does not impact struct layout or ABI compatibility. This is why we have the separate LLVM_ENABLE_ABI_BREAKING_CHECKS variable.

1 Like

Are you referring to a case where libc++ was instrumented and user code lacked instrumentation, or the other case?

As @rnk mentioned, I was as well under the impression that the case of uninstrumented library code + instrumented user code is expected to work (not making any statement about the mirror case).

In the case that we want to explicitly not support these cases, I suggest to add some kind of check in LLVMConfig.cmake that warns on faulty usage.

I’m talking about the general case where some TUs are compiled with ASan and some without. Since most standard library code is in the headers, this results in some TUs where the memory of standard library containers are (un)poisoned but others aren’t. This can result in memory not being properly unpoisoned, e.g. when growing a vector.

Thanks for the clarification. As far as I understand, this is a more fundamental issue than I initially anticipated. Doesn’t this force one to link against an instrumented libc++ build when building arbitrary TUs, or does libc++ not use any of it’s templates in code that ends up in libc++.so?

Doesn’t that imply that if I install and use libc++ from e.g. apt.llvm.org, I am completely precluded from using ASan in C++ code that uses the stdlib? That I only get to use ASan when I compile libc++ myself? That seems very difficult.

This also reminds me of MSVC Debug Iterators, which is a safety feature that is probably exposed to the same ODR caveats as this one.

Still, I don’t think this should prevent LLVM from supporting the additional use-case, since not everyone uses libc++ and the problems seem orthogonal enough to me. A libLLVM that can be sanitized independently of client code seems a strict improvement over the current state of affairs, libc++ notwithstanding.

Until recently that wasn’t a problem at all, since only vector was annotated and that isn’t part of the dylib ABI. We’ve recently added string annotations, but they are only enabled when the dylib was built with ASan. I hope we can transparently link against a different dylib when using -fsanitize=address (or a similar flag) at some point.

I have no knowledge about libc++ ABI management at all, so forgive my potentially nonsensical question, but if “built with ASan” becomes part of the dylib ABI, doesn’t that property need to be baked into _LIBCPP_ODR_SIGNATURE for instance? Wouldn’t that force one to “freeze” the ASan-ness of the library itself into the exposed ABI, thereby not forcing the user to match the ASan-ness of the library?

2 Likes

There is a simple solution to that: don’t guarantee ABI stability for sanitized builds. ASan shouldn’t be used in production AFAIK, so we can simply require that all the code is built against the same library. This is already pretty much a requirement because of the aforementioned issues.

1 Like

While I understand the concerns, I do not yet understand why this should stop us from encoding this information in llvm-config.h?

As a downstream user, I would like to act upon this information, and throw a proper error in case of a instrumentation mismatch. Currently, this mismatch just leads to a non-deterministic instability, which I would rather avoid explaining to any more of my coworkers.

Currently, mix-and-matching can lead to silent breakage, which is undesirable.
To address this, I agree we should encode sanitizer information in llvm-config.h.

Error reporting vs. allowance:

I believe reporting an error at compile time is preferable for the LLVM_BUILD_LLVM_DYLIB=off configuration.
For LLVM_BUILD_LLVM_DYLIB=on, I believe many issues could be more benign, so perhaps we should just report a configure-time warning.


libc++ refers to the per-translation-unit ABI insulation as the ability to safely link two relocatable object files built with different versions of libc++ into the same executable or DSO.
This requires careful symbol annotations:

__attribute__((__visibility__("hidden"))) __attribute__((__exclude_from_explicit_instantiation__)) __attribute__((__abi_tag__(...)))

To achieve similar compatibility for sanitizers, we could either encode sanitizer in abi_tag or leverage always_inline.

I believe that even if we put aside the libc++ issue, the LLVM complexity alone could be overly intricate and challenging for most contributors to understand and maintain.
While the current [CMake][ASAN] Add support for ADDRESS_SANITIZER_BUILD compile option by Dinistro · Pull Request #83595 · llvm/llvm-project · GitHub could make certain mix-and-match scenarios work, it remains fragile.

Should I attempt to add this to my PR then? Note that reporting an error might lead to issues with many downstream projects, but this is not supported, so it might be fine?

I don’t think this is actually feasible. It is too easy to construct a std::vector in instrumented code, poison the container memory between [size, capacity), pass it to uninstrumented code, destroy it, and fail to unpoison it, leading to subsequent false positive reports. Maybe this particular edge case works in practice, but I think you can construct other examples where even if you hide the conflicting inline function definitions, the data moves between instrumentation modes, and invariants get broken.

Ultimately, I think container poisoning must be a publicly visible libc++ configuration setting.

I don’t think it’s sufficient to live with the status quo (or what I think the status quo is, correct me if wrong) of silent ABI incompatibility between instrumented and uninstrumented third party code when using a shared libc++. I don’t think that’s an acceptable user experience.

If we want to support the ASan project goal of supporting mixing instrumented and uninstrumented code, libc++ needs to have a global configuration setting to control poisoning. It can’t use a naked #if __has_feature check in headers to control poisoning.

Yes, passing a std::vector around could lead to false positives.
The fragility of different mix-and-matching configurations varies.
In practice the troublesome code patterns (like passing std::vector around) might be rare enough that the project appears to work in practice.

We have limited experience with the potential issues of inherently brittle configurations which often work in practice.
When using the llvm-config.h marker to identify unsupported configurations, we must carefully consider what to reject.

The ideal outcome is perhaps to default to rejecting mix-and-matching unless users explicitly specify an option to accept the potential risks.

For LLVM_BUILD_LLVM_DYLIB=off and msan/tsan, I believe we can definitely start with an error.
For LLVM_BUILD_LLVM_DYLIB=on asan, I am unclear whether an error could be disruptive.

I think the best solution would be to add a -finstrument-library=address flag (or something like that) to use a sanitized build of the library. That could also change the inline namespace to something different to make the ABI break at least a bit more explicit. The idea would be to use the multilib features for this.