Memory sanitizer porting

Hello,

I am currently porting memory sanitizer to a custom platform, and I have run into some surprising behavior in the existing implementation.

  1. clang/llvm currently hardcode the list of supported platforms and do not allow a standalone msan implementation.
    I suppose the solution is to submit a patch similar to https://reviews.llvm.org/D18865, which would provide the arguments needed to configure the layout.
    I have such a patch ready. Would this approach be acceptable to the llvm dev team, and may I post it for review?

  2. There is the -fsanitize-blacklist argument, which is supposed to exclude source locations from sanitizer instrumentation.
    Normally this works without issues, but I discovered that you cannot compile the compiler-rt msan implementation with -fsanitize=memory even if the whole location is blacklisted (I confirmed that it is blacklisted from the invocations of the CodeGenModule::isInSanitizerBlacklist function).
    What happens is that, for some reason, memory sanitizer:
    — still tries to partially instrument the blacklisted code;
    — does not check whether its global memory storage variables are already defined.
    The second issue, present in MemorySanitizer::initializeCallbacks, adds a second copy of the storage global variables when compiling msan.cc (e.g. __emutls_v.__msan_retval_tls.63, __emutls_v.__msan_param_tls.65), and this results in an undefined reference at link time.

The question here is what the intended behavior is. I know that compiling blacklisted asan runtime code with -fsanitize=address works fine, and this is what Apple actually does in the XNU KASAN implementation.
I expected the same approach to be correct for msan as well; is this just a bug? If it is not, should I compile the msan runtime without -fsanitize=memory in this case, and, in fact, the asan runtime too?

  3. Other than that, I see that memory sanitizer is currently implemented only for 64-bit platforms. I am aware that msan needs a lot of memory, but are there any other reasons why 32-bit is not supported?

Best regards,
Vit

Hi,

1. This patch adds an internal (-mllvm) option, which is basically
meant for debugging. If your custom platform has a target triple, you
could submit changes to llvm, clang and compiler-rt to specify any
platform-specific offsets and other details.
2. Blacklist is meant to disable checking for bugs in certain
functions, not to remove all instrumentation. With ASan, these are the
same. With MSan, it places instrumentation in a "safe" mode where, for
example, a function that reads from A and stores to B will (1) not
check A and (2) make B fully initialized even if data being stored
comes from an uninitialized location.

Building MSan runtime library with MSan is not going to work, blacklist or not.
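
To illustrate the "safe" mode described in point 2, here is a minimal toy model. It is only a sketch, not MSan's actual implementation: ToyMemory, copy_instrumented, copy_blacklisted and is_poisoned are hypothetical names, and the one-shadow-byte-per-data-byte layout is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (assumption): one shadow byte per application byte.
// A nonzero shadow byte means "uninitialized".
struct ToyMemory {
    std::vector<std::uint8_t> data;
    std::vector<std::uint8_t> shadow;
    // All memory starts out uninitialized in this model.
    explicit ToyMemory(std::size_t n) : data(n, 0), shadow(n, 0xFF) {}
};

bool is_poisoned(const ToyMemory &m, std::size_t i) {
    return m.shadow[i] != 0;
}

// Regular instrumentation: the store carries the source's shadow along,
// so uninitialized-ness propagates from A to B.
void copy_instrumented(ToyMemory &m, std::size_t dst, std::size_t src) {
    m.data[dst] = m.data[src];
    m.shadow[dst] = m.shadow[src];
}

// Blacklisted ("safe") mode: the source is not checked, and the
// destination is marked fully initialized regardless of the source.
void copy_blacklisted(ToyMemory &m, std::size_t dst, std::size_t src) {
    m.data[dst] = m.data[src];
    m.shadow[dst] = 0;
}
```

In this model, copying from an uninitialized slot leaves the destination poisoned under copy_instrumented and clean under copy_blacklisted. The blacklisted function still contains instrumentation (the shadow store), which is why blacklisting does not remove all of it.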

3. No, not really. If you manage to allocate shadow and origin, each of
which must be the same size as the application-accessible region, and
define a mapping function between them, it should be possible to make
MSan work.
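
As a rough sketch of what such a mapping function could look like: the struct, its field names, the mask-and-offset formula and the constants in the usage note are all assumptions chosen for illustration, not values taken from any real platform or from this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout description (all fields are assumptions).
struct ToyMapParams {
    std::uint64_t and_mask;     // bits cleared from the application address
    std::uint64_t xor_mask;     // bits flipped to move into the shadow range
    std::uint64_t shadow_base;  // added to the offset to get the shadow address
    std::uint64_t origin_base;  // added to the offset to get the origin address
};

// Common offset shared by the shadow and origin mappings.
constexpr std::uint64_t app_to_offset(const ToyMapParams &p, std::uint64_t addr) {
    return (addr & ~p.and_mask) ^ p.xor_mask;
}

// Shadow address for an application address (same size, byte for byte).
constexpr std::uint64_t app_to_shadow(const ToyMapParams &p, std::uint64_t addr) {
    return app_to_offset(p, addr) + p.shadow_base;
}

// Origin address, rounded down to 4 bytes on the assumption that
// origin ids are stored as 32-bit values.
constexpr std::uint64_t app_to_origin(const ToyMapParams &p, std::uint64_t addr) {
    return (app_to_offset(p, addr) + p.origin_base) & ~std::uint64_t{3};
}
```

For example, with the illustrative parameters {0, 0x500000000000, 0, 0x100000000000}, the application address 0x700000001006 maps to shadow 0x200000001006 and origin 0x300000001004. A 32-bit port would need three non-overlapping regions of equal size that all fit into the 4 GiB address space.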

Hi,

1. No, there is no custom triple for the platform. It currently uses the Linux triple, and I do not think it is feasible to upstream such a small set of changes that way. On the other hand, Apple uses the -mllvm asan option to implement KASAN in XNU, so I think it would be fine to upstream a similar option, which, I guess, could also be used for debugging and may be helpful to other people prototyping their runtimes.

2. Thank you for this clarification. It makes sense now.

3. That is what I assumed to be the case, thanks.

Best regards,
Vit

Hi,

I see. So you want a flag or a set of LLVM flags to specify custom
MemoryMapParams. This is fine, feel free to send the patch for review.