LLVMdev Digest, Vol 89, Issue 60

Daniel, Kostya,

We had a meeting with the Clang people (Chris, Doug, Ted) on Thursday before the llvmdev meeting about adding dynamic checking tools into Clang -- IOC for undefined integer behaviors, and SAFECode for memory safety. SAFECode has similar goals to AddressSanitizer, though at least for now it has more checks, but is slower.

The main conclusion was that we should have a common run-time API for various tools to report errors. This API can be shared by tools like IOC, SAFECode, and (we hope) AddressSanitizer. The benefits are that users have to learn only one error reporting style, and the code base can use a common set of mechanisms to track and report execution information, so that improvements by any one project will benefit the other projects as well.

We're happy to keep the API simple for now, and to wrap the SAFECode and IOC run-times to use it. Let us know if you'd like to help define and use this API.

--Vikram
Professor, Computer Science
University of Illinois at Urbana-Champaign
http://llvm.org/~vadve

> Daniel, Kostya,
>
> We had a meeting with the Clang people (Chris, Doug, Ted) on Thursday before the llvmdev meeting about adding dynamic checking tools into Clang -- IOC for undefined integer behaviors, and SAFECode for memory safety. SAFECode has similar goals to AddressSanitizer, though at least for now it has more checks,

[Off topic: could you please educate me on the kinds of bugs that SAFECode finds and asan does not? Maybe in a separate thread.]

> but is slower.
>
> The main conclusion was that we should have a common run-time API for various tools to report errors. This API can be shared by tools like IOC, SAFECode, and (we hope) AddressSanitizer. The benefits are that users have to learn only one error reporting style, and the code base can use a common set of mechanisms to track and report execution information, so that improvements by any one project will benefit the other projects as well.
>
> We're happy to keep the API simple for now, and to wrap the SAFECode and IOC run-times to use it. Let us know if you'd like to help define and use this API.

I would like to have some level of sharing between all these tools (add ThreadSanitizer to the mix).
But as I just discussed with John Regehr in a separate thread, the tools (asan and IOC) are quite different and sharing is limited.

Let me describe what asan (AddressSanitizer) and tsan (ThreadSanitizer) need and also my understanding of IOC’s needs. Then we’ll see how much sharing is possible between these and SAFECode.

asan:
– unwind stack on malloc/free (must be fast; libunwind is too slow, so asan currently uses a custom unwinder based on frame pointers)
– symbolize stack. Currently uses an external script and addr2line, which sucks. Maybe we could reuse some of the lldb code.
– intercept functions (malloc, pthread, etc)
– small set of utility functions (e.g. printf, mmap, strlen) that do not use libc.
– the functions that are called on error take strictly one parameter and never return (for speed and code size). Another option is to report an error by causing SIGILL using a call-free instruction sequence (5% faster).
– dies on first error.
– reports to stderr. Someone may need to customize it, but we have not had such requests yet.
– report style: see example at http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer#Introduction . The format of the reports is similar to that of tsan, which in turn is similar to that of valgrind. I am reluctant to change the format because quite a few existing users of asan and tsan (and memcheck) rely on it.
– no suppressions
– supports a blacklist file (and we may also need something like attribute(no-address-sanitizer))
– major user-visible controls: unwind depth, redzone size (passed via env. var ASAN_OPTIONS)
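
To make the last item concrete, here is a minimal sketch of how a run-time could read integer controls from an options string such as the value of an environment variable. The option names (unwind_depth, redzone_size), their defaults, and the space/colon-separated key=value syntax are assumptions made for illustration; they are not a description of what asan actually accepts in ASAN_OPTIONS. For brevity the sketch uses libc, which a real asan run-time would avoid.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option set; names and defaults are illustrative only. */
struct tool_options {
    long unwind_depth;  /* max frames recorded per stack trace */
    long redzone_size;  /* bytes of padding around each allocation */
};

/* Look for "name=<integer>" among space- or colon-separated tokens.
   Returns 1 and stores the value on success, 0 otherwise. */
static int parse_long_option(const char *str, const char *name, long *out)
{
    size_t name_len = strlen(name);
    const char *p = str;

    while (*p) {
        while (*p == ' ' || *p == ':')          /* skip separators */
            p++;
        if (strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            const char *digits = p + name_len + 1;
            char *end;
            long value = strtol(digits, &end, 10);
            if (end == digits)                  /* no digits after '=' */
                return 0;
            *out = value;
            return 1;
        }
        while (*p && *p != ' ' && *p != ':')    /* skip this token */
            p++;
    }
    return 0;
}

/* Fill opts with defaults, then override from str (which may be NULL). */
static void parse_tool_options(const char *str, struct tool_options *opts)
{
    assert(opts != NULL);
    opts->unwind_depth = 30;
    opts->redzone_size = 128;
    if (str == NULL)
        return;
    parse_long_option(str, "unwind_depth", &opts->unwind_depth);
    parse_long_option(str, "redzone_size", &opts->redzone_size);
}
```

A tool would call something like parse_tool_options(getenv("TOOL_OPTIONS"), &opts) once at startup, where TOOL_OPTIONS is again a made-up name. Unknown keys are silently ignored, so several tools could share one variable.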

tsan:
– keep shadow call stack (unwinding is too slow)
– symbolize stack (same problem as with asan). The gcc-based variant uses libbfd, but I don’t like that code. The valgrind-based variant uses valgrind’s symbolizer (ditto for the PIN-based one).
– intercept functions (malloc, pthread, etc)
– small set of utility functions (e.g. printf, mmap, strlen) that do not use libc.
– suppressions (syntax is similar to valgrind’s)
– ignore file (aka blacklist of functions/files)
– reports to stderr, may change to another stream.
– report style: see above. Examples are here: http://code.google.com/p/data-race-test/wiki/UnderstandingThreadSanitizerReports#The_report
– major user-visible controls: stack depth, sensitivity options, resource usage options. compiler-based variant gets the options via an env. var.
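
The shadow call stack in the first item can be sketched roughly as follows: instrumented code pushes a program counter on function entry and pops it on exit, so producing a trace at report time is just a copy with no unwinding. All names here (shadow_stack, shadow_enter, shadow_exit, shadow_snapshot) and the fixed capacity are assumptions for illustration and are not taken from tsan; a real tool would keep one such stack per thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_STACK_MAX 1024   /* illustrative capacity */

struct shadow_stack {
    uintptr_t pcs[SHADOW_STACK_MAX];
    size_t depth;   /* logical depth; may exceed SHADOW_STACK_MAX */
};

/* Called by instrumentation on function entry. Frames deeper than the
   capacity are counted but not stored. */
static void shadow_enter(struct shadow_stack *s, uintptr_t pc)
{
    if (s->depth < SHADOW_STACK_MAX)
        s->pcs[s->depth] = pc;
    s->depth++;
}

/* Called by instrumentation on function exit. */
static void shadow_exit(struct shadow_stack *s)
{
    assert(s->depth > 0);
    s->depth--;
}

/* Copy up to max_frames stored frames into out, innermost first.
   Returns the number of frames copied. */
static size_t shadow_snapshot(const struct shadow_stack *s,
                              uintptr_t *out, size_t max_frames)
{
    size_t top = s->depth < SHADOW_STACK_MAX ? s->depth : SHADOW_STACK_MAX;
    size_t n = 0;

    while (top > 0 && n < max_frames)
        out[n++] = s->pcs[--top];
    return n;
}
```

The cost moves from report time to every call and return, which is the trade-off behind "unwinding is too slow": a tool that records a stack on very many events pays for the push/pop once per call instead of walking frames each time.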

IOC:

  • does not need unwinding (does not show the stack)
  • does not need symbolization (adds all info to the error-reporting call at compile time, which is questionable from a performance point of view)
  • may need suppression, removal of duplicates, etc
  • (John, what else is important here?)
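
As a rough illustration of the IOC items above (my understanding only, not IOC's actual code): the compiler emits a static descriptor with the source location for each check and passes it to the reporting call, so the run-time needs neither unwinding nor symbolization, and a per-site flag is one simple way to remove duplicates. The names check_site, report_check_failure and checked_add, and the message format, are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical per-check descriptor that the compiler would emit as
   static data next to each instrumented operation. */
struct check_site {
    const char *file;
    int line;
    const char *what;   /* description of the failed check */
    int reported;       /* nonzero once this site has been reported */
};

/* Print the first failure at a site; later ones are dropped as duplicates.
   Returns 1 if a report was printed, 0 if it was suppressed. */
static int report_check_failure(struct check_site *site, FILE *out)
{
    assert(site != NULL && out != NULL);
    if (site->reported)
        return 0;
    site->reported = 1;
    fprintf(out, "%s:%d: %s\n", site->file, site->line, site->what);
    return 1;
}

/* Roughly what instrumented code for `a + b` could look like. */
static int checked_add(int a, int b)
{
    static struct check_site site = {
        "example.c", 42, "signed integer overflow in addition", 0
    };

    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        report_check_failure(&site, stderr);
    /* Wrap via unsigned arithmetic so the sketch itself avoids
       undefined behavior. */
    return (int)((unsigned)a + (unsigned)b);
}
```

The flag update is not thread-safe, and every check carries a descriptor plus a call on the failure path, which is where the performance question comes from.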

Currently, I can see that SAFECode can share these items with asan/tsan:

  • unwinding
  • symbolizing
  • function interception mechanism
  • option parsing
  • maybe style of error reports
  • maybe some low-level code (e.g. asan’s tiny libc replacement)
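
To give the discussion of a common run-time API something concrete to argue about, here is one very small shape it could take: each tool fills in a tool-independent record and hands it to a replaceable sink. This is purely a sketch under my own assumptions; none of these names (error_report, report_sink_fn, set_report_sink, emit_report) exist in asan, tsan, IOC or SAFECode, and the output format is made up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical tool-independent description of one detected error. */
struct error_report {
    const char *tool;         /* "asan", "tsan", "ioc", ... */
    const char *kind;         /* short error class */
    const uintptr_t *frames;  /* raw return addresses, or NULL if none */
    size_t num_frames;
};

typedef void (*report_sink_fn)(const struct error_report *r, void *ctx);

/* Default sink: a header line plus one line per unsymbolized frame. */
static void stderr_sink(const struct error_report *r, void *ctx)
{
    (void)ctx;
    fprintf(stderr, "ERROR: %s: %s\n", r->tool, r->kind);
    for (size_t i = 0; i < r->num_frames; i++)
        fprintf(stderr, "    #%zu 0x%" PRIxPTR "\n", i, r->frames[i]);
}

/* Alternative sink that only counts reports; ctx points to an int. */
static void counting_sink(const struct error_report *r, void *ctx)
{
    (void)r;
    ++*(int *)ctx;
}

static report_sink_fn g_sink = stderr_sink;
static void *g_sink_ctx = NULL;

/* Replace the sink, e.g. to redirect reports to another stream. */
static void set_report_sink(report_sink_fn fn, void *ctx)
{
    assert(fn != NULL);
    g_sink = fn;
    g_sink_ctx = ctx;
}

/* Entry point each tool's run-time would call when it finds an error. */
static void emit_report(const struct error_report *r)
{
    assert(r != NULL);
    g_sink(r, g_sink_ctx);
}
```

Policy is deliberately left out: whether to die on the first error (asan), apply suppressions (tsan) or drop duplicates (IOC) would stay in each tool or in a layer above the sink.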

–kcc