Modifying address-sanitizer to prevent threads from sharing memory

Hi llvm-dev!

I'm writing my master's thesis on sandboxing/isolation of plugins
running in a multithreaded environment. These plugins run in a real-time
environment where paying the cost of IPC/context switching and being at
the scheduler's mercy is not an option. There can be many plugin
instances running, and each has to perform some computation and return
the result to the main thread on an audio-buffer callback.

These need to be isolated, primarily from the main thread, but
preferably from each other as well. I'm thinking that modifying
address-sanitizer for this purpose could be feasible.

The shadow byte could also be split so that part of it holds a 'short-id'
identifying the thread/plugin the memory belongs to. This would of
course limit the number of plugin/thread short-ids available, and in
theory false negatives could arise if two plugins are given the same
short-id and then access each other's memory, though the error would be
detected if it occurs again when they don't share a short-id.

The modification would be slightly different depending on whether n
threads drive n plugins or a thread pool of n threads drives m plugins.

This would of course not work for globals, which naturally would be
owned by the main thread. Also any kind of communication between plugins
or back to the main thread would have to be mediated by the main thread
or using uninstrumented/unsafe code.

It's intended that the main code and plugin code are instrumented with
different asan passes.

From what I have thought of so far (but I'd love feedback!), these
changes are needed:

1. Storing+checking thread/plugin id in shadow byte.
2. Modified stack instrumentation to set up these shadow bytes.
3. Graceful shutdown of plugins preferred: freeing the plugin's heap and
   signaling back to the main thread instead of terminating the process.

Also, an optional compile flag could select the instrumentation's
granularity: if memory blocks are assumed to be allocated in multiples
of 8, there is less code blowup, and the shadow bytes are essentially
booleans. This isn't directly related to the changes I'd need to make,
though.

=== Heap part ===

shadow_byte k: 0 0 0 0 0 | 0 0 0
               (upper 5 bits: short-id part | lower 3 bits: shadow part)

short-id part: 0: main thread
             1-30: plugin/thread short-ids
             31 = 0x1F, all bits set: unallocated

shadow part: 0-7, same encoding as original.
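
To make the split concrete, here is a minimal sketch of packing and
unpacking such a shadow byte. The helper names are hypothetical, and the
5-bit/3-bit layout is an assumption inferred from the ranges above
(short-ids 0-31, shadow part 0-7) and from the `k >> 3` in the concept
code; nothing like this exists in ASan itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum { SHORT_ID_MAIN = 0, SHORT_ID_UNALLOCATED = 0x1F };

/* Pack a 5-bit short-id (upper bits) and a 3-bit shadow part (lower bits). */
static inline uint8_t pack_shadow(uint8_t short_id, uint8_t shadow)
{
    assert(short_id <= 0x1F && shadow <= 7);
    return (uint8_t)((short_id << 3) | shadow);
}

/* Upper 5 bits: which thread/plugin owns the 8-byte granule. */
static inline uint8_t shadow_short_id(uint8_t k) { return k >> 3; }

/* Lower 3 bits: 0 = all 8 bytes addressable, 1-7 = first n bytes only. */
static inline uint8_t shadow_part(uint8_t k) { return k & 0x07; }
```

With this layout, pack_shadow(5, 3) gives 0x2B, and a byte with all bits
set (0xFF) decodes to the unallocated short-id 0x1F.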

== Original instrumentation code (ASan USENIX2012 paper) ==

* All instrumented code:

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;

  if (k != 0 && (Addr & 7) + AccessSize > k)
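
For reference, the condition above can be factored into a small C
predicate that is testable without real shadow memory. This is only a
sketch: the function name is hypothetical, and the shadow byte is passed
in directly instead of being loaded from (Addr >> 3) + offset. The
shadow byte is treated as signed, since the original encoding uses
negative values for fully poisoned granules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the access should be reported.
 * k is the shadow byte of the 8-byte granule containing addr:
 *   0     = all 8 bytes addressable
 *   1..7  = only the first k bytes addressable
 *   < 0   = whole granule poisoned                                   */
static bool asan_access_is_bad(uintptr_t addr, int access_size, int8_t k)
{
    assert(access_size >= 1 && access_size <= 8);
    /* Cast so the comparison is done in signed int, not unsigned. */
    return k != 0 && (int)(addr & 7) + access_size > k;
}
```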

== Concept code (code blowup, though) ==

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;
  alloc_id = k >> 3;
  shadow = k & 0x07;

* Thread/plugin code:

  if (alloc_id != my_short_id || // alloc belongs to other thread
      (shadow != 0 && (Addr & 7) + AccessSize > shadow))

* Main code:

  if (alloc_id == 0x1F || // unallocated memory
      shadow != 0 && (Addr & 7) + AccessSize > shadow)
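
The two concept checks can be written out as C predicates to see how
they differ. This is a sketch under assumptions: the function names are
hypothetical, the shadow byte is passed in instead of being loaded from
shadow memory, and the shadow part is taken to be the low 3 bits of k.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { UNALLOCATED_ID = 0x1F }; /* all five short-id bits set */

/* Plugin-side check: the granule must be owned by this plugin and the
 * access must fit inside the addressable prefix of the granule. */
static bool plugin_access_is_bad(uintptr_t addr, int access_size,
                                 uint8_t k, uint8_t my_short_id)
{
    assert(access_size >= 1 && access_size <= 8);
    uint8_t alloc_id = k >> 3;
    uint8_t shadow = k & 0x07;
    return alloc_id != my_short_id ||
           (shadow != 0 && (int)(addr & 7) + access_size > shadow);
}

/* Main-thread check: any allocated granule is acceptable, whoever owns
 * it; only unallocated memory or an overrun is reported. */
static bool main_access_is_bad(uintptr_t addr, int access_size, uint8_t k)
{
    assert(access_size >= 1 && access_size <= 8);
    uint8_t alloc_id = k >> 3;
    uint8_t shadow = k & 0x07;
    return alloc_id == UNALLOCATED_ID ||
           (shadow != 0 && (int)(addr & 7) + access_size > shadow);
}
```

Note the asymmetry: the main thread may touch memory owned by any
plugin, while a plugin may only touch memory tagged with its own
short-id.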

== Less granularity: assume+enforce multiples of 8, quicker/smaller ==

shadow byte = short-id: 0 = main id
                        1-254: short-ids
                        255 = 0xFF: unallocated

  ShadowAddr = (Addr >> 3) + offset;
  k = *ShadowAddr;

* Thread/plugin code:

  if (k != my_short_id) // allocated/set from different thread

* Main code:

  if (k == 0xFF) // unallocated memory
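
A toy model of the coarse variant, to show the tagging and the two
checks together. Everything here is an assumption made for
illustration: the names are hypothetical, a small static array stands in
for shadow memory, and offsets into a 256-byte arena stand in for
addresses (so there is no `+ offset` term).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { GRANULE = 8, ARENA_BYTES = 256, ID_MAIN = 0, ID_UNALLOCATED = 0xFF };

/* One shadow byte per 8-byte granule of the toy arena. */
static uint8_t toy_shadow[ARENA_BYTES / GRANULE];

/* Mark the whole arena as unallocated. */
static void toy_shadow_reset(void)
{
    memset(toy_shadow, ID_UNALLOCATED, sizeof toy_shadow);
}

/* Tag [offset, offset + size) as owned by short_id. Both values must be
 * multiples of 8, which is what the coarse variant assumes and enforces. */
static void toy_tag(size_t offset, size_t size, uint8_t short_id)
{
    assert(offset % GRANULE == 0 && size % GRANULE == 0);
    assert(offset + size <= ARENA_BYTES);
    memset(&toy_shadow[offset / GRANULE], short_id, size / GRANULE);
}

/* Plugin check: one load and one compare. */
static bool coarse_plugin_access_is_bad(size_t offset, uint8_t my_short_id)
{
    assert(offset < ARENA_BYTES);
    return toy_shadow[offset >> 3] != my_short_id;
}

/* Main-thread check: only unallocated memory is rejected. */
static bool coarse_main_access_is_bad(size_t offset)
{
    assert(offset < ARENA_BYTES);
    return toy_shadow[offset >> 3] == ID_UNALLOCATED;
}
```

The access size drops out of the check entirely, which is where the
smaller and quicker instrumentation comes from.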

=== Stack part ===

This part would be different depending on whether there's a 1-to-1
mapping between threads and plugins.

* 1-to-1 mapping:

  Since the plugin owns the thread stack, all of the corresponding
  shadow can be initially filled with the shadow byte indicating that
  that thread can access all of it.

  Redzones would still have to be poisoned, but unpoisoning (and the
  initial setup) would not set the shadow to zero (except for the main
  stack). Instead it would memset each shadow byte back to
  (short_id << 3), which indicates that the plugin with that short_id
  can read/write all of the corresponding bytes.

* n-to-m mapping:

  If the stack is shared, it can't be poisoned/unpoisoned back to a
  state readable by the next plugin using that stack space. Instead,
  when stack variables are allocated, all corresponding shadow bytes
  have to be set to readable by the plugin currently running on that
  stack. It may be possible, though, to give each plugin its own stack
  and use the same mapping as above.
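
A minimal sketch of the stack shadow handling described above, with
hypothetical helper names. In the 1-to-1 case the unpoison helper would
run once over the whole thread stack's shadow and again whenever a
redzone is released; in the n-to-m case the same fill would presumably
run for each stack variable as it is allocated. The value used to poison
redzones is an assumption (the unallocated short-id, so that both the
plugin check and the main check reject the access); the proposal does
not fix one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { UNALLOCATED_ID = 0x1F };

/* Initial setup / unpoison: every granule becomes fully accessible to
 * short_id (shadow part 0). For the main thread (short_id 0) this
 * degenerates to the usual fill with zero. */
static void stack_shadow_unpoison(uint8_t *shadow, size_t n_granules,
                                  uint8_t short_id)
{
    assert(short_id < UNALLOCATED_ID);
    memset(shadow, (uint8_t)(short_id << 3), n_granules);
}

/* Poison a redzone. Assumption: tag it with the unallocated short-id. */
static void stack_shadow_poison(uint8_t *shadow, size_t n_granules)
{
    memset(shadow, (uint8_t)(UNALLOCATED_ID << 3), n_granules);
}
```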

Hi Peter,

Have you looked at ThreadSanitizer (-fthread-sanitizer)?
It does not do exactly what you want, but something similar. It will
detect data races between threads (when data is accessed w/o proper