Contributing a new sanitizer for pointer casts

Hi all,

Some of you might remember that at EuroLLVM last year in Barcelona,
Chris Diamand and I gave a talk about Clang/libcrunch, a run-time
checking system which can be thought of as another flavour of sanitizer.
It checks pointer casts, using run-time type information. Roughly the
check is that the pointer really points to an instance of the target
type, though there are refinements to deal with various idioms violating
that. <http://www.llvm.org/devmtg/2016-03/#presentation9>

(I dropped a mention of this in the recent TBAA sanitizer thread, but
consensus was that on balance it's a different enough tool to want
both.)

My current research funding has some room for tech transfer activity, so
I've been spending some time on improving the code, with a hope of
eventually contributing it to LLVM.

This mail is just to get a handle on two questions: how much interest is
there in this, and what changes are most important in order to get
something contributable?

The system is a bit complex, so let me give you an overview of how it
currently works. (If you want full technical details, there are a couple
of research papers you could read -- see the bottom.)

- Instrumentation: this adds checks on (most) pointer casts, and in a
few other places. It also does a little source-level analysis to dump
information about allocation sites. We have both (my original) CIL and
(Chris's) Clang/LLVM implementations of this. The Clang version is not
too pretty at present: it uses -include'd inline helper functions
written in C and shared with the CIL implementation. It also requires a
bit of a hack to propagate certain type info (in uses of "sizeof")
onwards to LLVM so it can be used in a data-flow analysis.

- Hints from the programmer: these are necessary to declare allocation
functions, besides standard ones (malloc etc.). This is currently done
with an environment variable (LIBALLOCS_ALLOC_FNS) though I've thought
of adding a command-line option too. These declarations have effect at
both compile and link time.

- Compiler wrapper and helper tools: currently a mixture of shell,
Python and C++ helpers building on a pile of my own libraries
(libdwarfpp, dwarfidl, liballocstool), for DWARF analysis and
postprocessing. Roughly, these are responsible for generating and
linking the run-time type information itself.

- Type information. This is autogenerated uniqued / COMDAT'd instances
of a moderately complex (but compact) C struct for each distinct data
type. The model of type info (but not the representation) is somewhat
DWARF-inspired.

- Runtime. This is a preloadable shared library which does the
dispatching of the checks. It also gets its hooks into various places to
load type info as necessary, and to observe various kinds of allocation
happening within the process. Again it builds on a pile of my other
stuff (liballocs, which builds on trap-syscalls, mallochooks,
libdlbind).

Currently, my plan in a nutshell is to eliminate the C inline helpers in
favour of fully IR-level instrumentation, and also eliminate the
compiler wrapper in favour of a gold plugin (and maybe a bit of help in
the clang driver). This should result in a contributable diff that adds
a new sanitizer option (currently "-fsanitize=crunch", but name
negotiable :-). Binaries built this way will also require the gold
plugin and runtime (both out-of-tree) to do useful checking.

I don't intend to port the runtime. Although in principle this could
share code with the sanitizer runtimes, that's a lot of work and I don't
have the resource to visit this right now... barring major rewrites, the
runtime pretty much has to be GPL-licensed anyway, since it borrows code
from glibc and Xen (for purposes I'm pretty sure are not covered by the
sanitizer runtimes).

So my questions for you are whether this contribution would be welcome,
and in particular any red lines about how to do instrumentation, how to
factor everything, and how to deal with the external dependencies. As I
currently envisage things, the gold plugin must live out-of-tree since
it will require my libraries to build; I don't believe equivalent
library support exists within LLVM. This being out-of-tree seems not a
huge loss given that the runtime also will be.

Oh, and runtime support exists for x86-64/Linux only at the moment,
though there is a bit of code for FreeBSD.

For the interested, here are the research papers I mentioned.

"Dynamically diagnosing run-time type errors in unsafe code" (OOPSLA '16)
http://www.cl.cam.ac.uk/~srk31/#oopsla16a

"Towards a dynamic object model within Unix processes" (Onward! '15)
http://www.cl.cam.ac.uk/~srk31/#onward15

Code: <https://github.com/stephenrkell/liballocs>
<https://github.com/stephenrkell/libcrunch>
<https://github.com/stephenrkell/clangcrunch>.

All thoughts appreciated... let me know if you see any obstacles to
contribution, or if you're able to help, or just if you have questions.
Much obliged,

Stephen.

Hi Stephen,

I enjoyed reading through your EuroLLVM slides and OOPSLA paper. Detecting the creation of contract-violating pointers is an interesting idea, and your paper demonstrates that the checking can be comprehensive and effective.

However, I have concerns about the quality of diagnostics, the complexity of the driver, and about maintaining an out-of-tree runtime.

First, I'm concerned about libcrunch diagnostics which are morally correct, but which do not point out imminent correctness problems (for convenience I'll refer to these as false positives). You address this in the "Accommodating Sloppy C" section of your paper with the __like_a() runtime check. In my experience, users are reluctant to deploy the type of suppressions described here, and are more likely to abandon the sanitizer instead. What has your experience been with FPs? The results from your SPEC testing look promising, but I'm curious about any FP-related feedback you've gotten from other developers, or about the results of a libcrunch-enabled world build of FreeBSD. I've certainly seen code at Apple which would fail libcrunch's "is-a" checks, but which "works fine" according to the code owners. Would you be open to reducing the number of cases libcrunch is able to diagnose in order to reduce FP rates?

Second, do you see any way to simplify the wrappers required to invoke libcrunch? E.g is the runtime preload step necessary? Apart from the runtime shared object, can the other tools required to use libcrunch be consolidated into the clang binary?

Third, how do you plan on keeping the libcrunch integration into clang in-sync with the runtime, if the runtime is to live outside of the llvm tree? How will the feature be tested? Currently, the sanitizer runtimes ship with official LLVM downloads: what would the deployment story for libcrunch be (how would users get it)?

I don't mean any of this as a "do not proceed" message. These are simply some of the issues which I think are worth discussing early on.

best,
vedant

Hi Vedant,

I enjoyed reading through your EuroLLVM slides and OOPSLA paper.
Detecting the creation of contract-violating pointers is an
interesting idea, and your paper demonstrates that the checking can
be comprehensive and effective.

Glad you enjoyed them. :slight_smile:

However, I have concerns about the quality of diagnostics, the
complexity of the driver, and about maintaining an out-of-tree
runtime.

First, I'm concerned about libcrunch diagnostics which are morally
correct, but which do not point out imminent correctness problems
(for convenience I'll refer to these as false positives). You address
this in the "Accommodating Sloppy C" section of your paper with the
__like_a() runtime check. In my experience, users are reluctant to
deploy the type of suppressions described here, and are more likely
to abandon the sanitizer instead. What has your experience been with
FPs? The results from your SPEC testing look promising, but I'm
curious about any FP-related feedback you've gotten from other
developers, or about the results of a libcrunch-enabled world build
of FreeBSD.

Great question. The truth is that I don't yet have enough experience to
answer this, and part of the point of contributing this to LLVM would
be to get more people to try it out and contribute improvements.
However, I would say that I am already pleasantly surprised with how
on-point the warnings are... although there are false positives, they
tend to be about genuinely nasty/naughty code.

I take your point about the danger that "people would turn it off".
As an aside -- more below -- one thing I've worked quite hard on is
to ensure that you don't need to rebuild to turn it off, which at least
prevents a hard exit.

Beyond that, there is quite a long tail of mitigation measures against
false positives, including some not mentioned in the paper (and, er,
some not implemented). The big one, and one that isn't in the paper, is
"trap pointers" -- I outlined this in response to Hal in the TBAA
sanitizer thread. In short, it delays warnings from the time of cast to
the time of use. I only implemented this later, to support bounds
checking rather than cast checking, but it applies to the latter too.
(Sneak preview: I'm working on bounds checker I believe will offer a
different, possibly better, precision/performance trade-off than ASan,
but that's still in progress.)

Another possible mitigation is to use "__like_a" by default in some
circumstances, perhaps helped by an up-front static analysis pass to
find things like structure prefixing. That's pure idea-stuff right now.

Doing a build-world of FreeBSD or similar, as you suggest, is exactly
the sort of experience that is required, and inevitably, in doing so
there will be a lot of handle-turning to fix bugs and the like. I'd
rather go through this on an implementation that has a chance of
getting adopted... so I think the LLVM port has to happen first, though
I understand that the contribution per se may have to wait.

I've certainly seen code at Apple which would fail
libcrunch's "is-a" checks, but which "works fine" according to the
code owners. Would you be open to reducing the number of cases
libcrunch is able to diagnose in order to reduce FP rates?

I'm curious -- can you give me a sketch of what this code is doing? If
it's just a case of "creating wrong-typed pointers that aren't
deref'd", then I think we're good. In general I'd rather fix the FPs
precisely, rather than introduce underapproximation, and I've yet to
see cases where that really can't be done.

Second, do you see any way to simplify the wrappers required to
invoke libcrunch? E.g is the runtime preload step necessary? Apart
from the runtime shared object, can the other tools required to use
libcrunch be consolidated into the clang binary?

Great question. The plan is to consolidate the tools, yes -- but
probably mostly into the gold plugin. Only a couple of steps work on
the individual compilation units, and they can be deferred until link
time anyhow. It does mean that each compilation unit's allocation site
information has to get dumped somewhere, to be picked up at link time.
This is currently in a .i.allocs text file, but that is a bit untidy...
I've thought about a DWARF extension (DW_TAG_allocation_site or similar)
instead, which would be cleaner.

About preloading -- I happen to like this design, modulo the
clunky invocation, because it means you can easily turn the checking on
or off. For executables, you could simply link the preload stuff in,
but that would lose the on/off switch, and doesn't work for shared
libraries. Another idea I've had is to use a customised loader (ld.so)
instead of a preload library. So you would do

$ ./myprogram # runs normally

or

$ /path/to/crunchld.so ./myprogram # runs +crunch

This would more-or-less replicate the usage model of Valgrind tools.
(In general, my feeling is that extending the ld.so is more sane than
using LD_PRELOAD, although I realise it will seem a bit off-the-wall.)

Third, how do you plan on keeping the libcrunch integration into
clang in-sync with the runtime, if the runtime is to live outside of
the llvm tree?

I think that part is no problem. The interface between instrumented
code and the runtime is pretty narrow (basically just the __is_a(),
__like_a() and similar functions) and does not embed any LLVM stuff, so
it would be stable under LLVM churn. The same would hold for the gold
plugin... LLVM/clang's involvement is mostly just to do the
instrumentation, which is a pretty well-isolated task. The allocation
site metadata I just mentioned is probably the only change-prone thing.

Of course, there is a problem of breakage/churn in the
libcrunch/liballocs stuff... in particular, things like new DWARF
features can break things. I have no good answer to that, except that
more hands make light(er) work.

How will the feature be tested?

On that, I have no idea, and I'd have to take your lead on that. To
make an analogy with another out-of-tree dependency, how do you test
LTO against GNU binutils? I'm hoping, perhaps naively, that it could
work similarly.

Currently, the
sanitizer runtimes ship with official LLVM downloads: what would the
deployment story for libcrunch be (how would users get it)?

Initially they would build the libcrunch tree, which would build for
them the gold plugin and runtime. Again, I think it would be rather
like the LTO story (a web page with "follow these instructions"). And
of course distributors could build binary packages... again, it helps
that the interface between compiler and plugin/runtime is fairly simple.

(Full disclosure: currently you have to build both the liballocs tree
and the libcrunch tree, separately. Eventually the former will be a
submodule of the latter. I have too many levels of submodule going
on....)

I don't mean any of this as a "do not proceed" message. These are
simply some of the issues which I think are worth discussing early on.

Of course. :slight_smile: Thanks for the detailed comments! Hoping my answers make
sense, and happy to follow up,

Stephen.

Hi Vedant,

I enjoyed reading through your EuroLLVM slides and OOPSLA paper.
Detecting the creation of contract-violating pointers is an
interesting idea, and your paper demonstrates that the checking can
be comprehensive and effective.

Glad you enjoyed them. :slight_smile:

However, I have concerns about the quality of diagnostics, the
complexity of the driver, and about maintaining an out-of-tree
runtime.

First, I'm concerned about libcrunch diagnostics which are morally
correct, but which do not point out imminent correctness problems
(for convenience I'll refer to these as false positives). You address
this in the "Accommodating Sloppy C" section of your paper with the
__like_a() runtime check. In my experience, users are reluctant to
deploy the type of suppressions described here, and are more likely
to abandon the sanitizer instead. What has your experience been with
FPs? The results from your SPEC testing look promising, but I'm
curious about any FP-related feedback you've gotten from other
developers, or about the results of a libcrunch-enabled world build
of FreeBSD.

Great question. The truth is that I don't yet have enough experience to
answer this, and part of the point of contributing this to LLVM would
be to get more people to try it out and contribute improvements.
However, I would say that I am already pleasantly surprised with how
on-point the warnings are... although there are false positives, they
tend to be about genuinely nasty/naughty code.

What it would take to get to a point where there are no crunch "FPs" by design? E.g given the following code:

  A *a = malloc(sizeof(A));
  UnrelatedToA *uta = (UnrelatedToA *)a;
  LikeA *la = (LikeA *)uta;
  deref(la);

Could the diagnostic for the assignment to "uta" be suppressed, since "uta" isn't dereferenced?

I take your point about the danger that "people would turn it off".
As an aside -- more below -- one thing I've worked quite hard on is
to ensure that you don't need to rebuild to turn it off, which at least
prevents a hard exit.

How does the ability to disable sanitization without rebuilding help prevent hard exits?

Beyond that, there is quite a long tail of mitigation measures against
false positives, including some not mentioned in the paper (and, er,
some not implemented). The big one, and one that isn't in the paper, is
"trap pointers" -- I outlined this in response to Hal in the TBAA
sanitizer thread. In short, it delays warnings from the time of cast to
the time of use.

Oh, neat! Would it handle the example above?

I only implemented this later, to support bounds
checking rather than cast checking, but it applies to the latter too.
(Sneak preview: I'm working on bounds checker I believe will offer a
different, possibly better, precision/performance trade-off than ASan,
but that's still in progress.)

Another possible mitigation is to use "__like_a" by default in some
circumstances, perhaps helped by an up-front static analysis pass to
find things like structure prefixing. That's pure idea-stuff right now.

I'm not sure what it would take to prototype and evaluate this, but it sounds like a promising mitigation.

Doing a build-world of FreeBSD or similar, as you suggest, is exactly
the sort of experience that is required, and inevitably, in doing so
there will be a lot of handle-turning to fix bugs and the like. I'd
rather go through this on an implementation that has a chance of
getting adopted... so I think the LLVM port has to happen first, though
I understand that the contribution per se may have to wait.

I've certainly seen code at Apple which would fail
libcrunch's "is-a" checks, but which "works fine" according to the
code owners. Would you be open to reducing the number of cases
libcrunch is able to diagnose in order to reduce FP rates?

I'm curious -- can you give me a sketch of what this code is doing?

It's the example I provided above. It might also take some work to teach crunch about type relationships in ObjC, or at least to make crunch ignore ObjC types.

If it's just a case of "creating wrong-typed pointers that aren't
deref'd", then I think we're good. In general I'd rather fix the FPs
precisely, rather than introduce underapproximation, and I've yet to
see cases where that really can't be done.

Makes sense.

Second, do you see any way to simplify the wrappers required to
invoke libcrunch? E.g is the runtime preload step necessary? Apart
from the runtime shared object, can the other tools required to use
libcrunch be consolidated into the clang binary?

Great question. The plan is to consolidate the tools, yes -- but
probably mostly into the gold plugin.

Would it be feasible to introduce a clang tool that runs after -cc1 to do some of the heavy lifting? This might be more portable than a linker plugin.

Only a couple of steps work on the individual compilation units, and they can be deferred until link
time anyhow. It does mean that each compilation unit's allocation site
information has to get dumped somewhere, to be picked up at link time.

Sorry for the naive question, but could you explain how the allocation site information is generated? Is this something that could be deferred until link time?

This is currently in a .i.allocs text file, but that is a bit untidy...
I've thought about a DWARF extension (DW_TAG_allocation_site or similar)
instead, which would be cleaner.

About preloading -- I happen to like this design, modulo the
clunky invocation, because it means you can easily turn the checking on
or off. For executables, you could simply link the preload stuff in,
but that would lose the on/off switch, and doesn't work for shared
libraries. Another idea I've had is to use a customised loader (ld.so)
instead of a preload library. So you would do

$ ./myprogram # runs normally

or

$ /path/to/crunchld.so ./myprogram # runs +crunch

This would more-or-less replicate the usage model of Valgrind tools.
(In general, my feeling is that extending the ld.so is more sane than
using LD_PRELOAD, although I realise it will seem a bit off-the-wall.)

I have a preference for building separately when enabling sanitizers. I feel this way because I don't have reasons to disable sanitization once I've enabled it in one build tree, and because the performance of the sanitized binary is impacted. Having this as the supported workflow may be more convenient if it makes it possible to skip the preload step.

Third, how do you plan on keeping the libcrunch integration into
clang in-sync with the runtime, if the runtime is to live outside of
the llvm tree?

I think that part is no problem. The interface between instrumented
code and the runtime is pretty narrow (basically just the __is_a(),
__like_a() and similar functions) and does not embed any LLVM stuff, so
it would be stable under LLVM churn. The same would hold for the gold
plugin... LLVM/clang's involvement is mostly just to do the
instrumentation, which is a pretty well-isolated task. The allocation
site metadata I just mentioned is probably the only change-prone thing.

Of course, there is a problem of breakage/churn in the
libcrunch/liballocs stuff... in particular, things like new DWARF
features can break things. I have no good answer to that, except that
more hands make light(er) work.

How will the feature be tested?

On that, I have no idea, and I'd have to take your lead on that. To
make an analogy with another out-of-tree dependency, how do you test
LTO against GNU binutils? I'm hoping, perhaps naively, that it could
work similarly.

There are some in-tree tests which use libLTO.dylib and ld64. The tests depend on those tools being available. For end-to-end testing of crunch, you could do something similar. The instrumentation and driver changes could be tested in isolation.

My preference would be to have the sanitizer runtime live in compiler-rt, which is where end-to-end testing of the other sanitizers currently happens.

Currently, the
sanitizer runtimes ship with official LLVM downloads: what would the
deployment story for libcrunch be (how would users get it)?

Initially they would build the libcrunch tree, which would build for
them the gold plugin and runtime. Again, I think it would be rather
like the LTO story (a web page with "follow these instructions"). And
of course distributors could build binary packages... again, it helps
that the interface between compiler and plugin/runtime is fairly simple.

(Full disclosure: currently you have to build both the liballocs tree
and the libcrunch tree, separately. Eventually the former will be a
submodule of the latter. I have too many levels of submodule going
on....)

IMO a model where all of the sanitizer components live in-tree would make testing and deploying the sanitizer easier. It may also make maintenance easier, e.g if it's ever necessary to change the libcrunch <-> clang interface.

I know you mentioned up front that you don't intend to port the runtime to compiler-rt, as this would require a major rewrite. However, I think that would be the right way to go, and I'd be willing to help out with reviews etc.

I'm curious to hear what others think about having an out-of-tree sanitizer runtime. I'm not sure that the situation with LTO and gold is precedent for this, but others may feel differently.

vedant

What it would take to get to a point where there are no crunch "FPs"
by design? E.g given the following code:

  A *a = malloc(sizeof(A));
  UnrelatedToA *uta = (UnrelatedToA *)a;
  LikeA *la = (LikeA *)uta;
  deref(la);

Could the diagnostic for the assignment to "uta" be suppressed, since
"uta" isn't dereferenced?

Yes, it could. I'm sticking my neck out slightly since this isn't
currently implemented. But what should happen is

- uta is trapped as "invalid / do-not-use" in the normal __is_a() cast
  check

- on the cast to LikeA*, the (de-trapped) pointer is re-checked against
  the underlying object. Assuming "LikeA" is in the list of types for
  which __like_a() has been turned on, this would pass and yield a
  valid pointer.

How to get LikeA into that list automatically is a matter for thought,
and probably need a fuller example to work through.

If we're talking about structs, code that needs __like_a() is almost
always code that requires -fno-strict-aliasing. (An exception is
sockaddr-like types with padding in.)

The other relaxation of __like_a() is for signed/unsigned sloppiness,
which is more common. That could probably be turned on by default; I
just personally like being stricter.

For completeness, I should say the other thing that messes with
libcrunch is casting to a completely unrelated aggregate type (i.e.
wouldn't satisfy __like_a()) just for the sake of using some particular
member to do an access. So if I have

struct S { int x; ... };

some code might cast any old int* to S*, if it only wants to do ->x on
the pointer. With a lot of gymnastics I could support this, either by
rewriting certain field accesses as casts or by some trap-side logic. I
don't know any code that does this (though I'd love to be pointed at
some) so it hasn't seemed worth supporting so far.

> I take your point about the danger that "people would turn it off".
> As an aside -- more below -- one thing I've worked quite hard on is
> to ensure that you don't need to rebuild to turn it off, which at
> least prevents a hard exit.

How does the ability to disable sanitization without rebuilding help
prevent hard exits?

My poor wording; I meant "hard exit" of the user, i.e. it's easy for
the user to turn it back on later, without rebuilding. There's not much
performance drawback to building with the checks in, if you don't
preload the runtime.

Though also, I forgot to say that failed checks in libcrunch don't
(unless you want them to) cause termination of the program. That is
another thing that makes occasional false positives easier to deal with
(-- not that I believe there need to be any false positives, but
still...).

> Beyond that, there is quite a long tail of mitigation measures
> against false positives, including some not mentioned in the paper
> (and, er, some not implemented). The big one, and one that isn't in
> the paper, is "trap pointers" -- I outlined this in response to Hal
> in the TBAA sanitizer thread. In short, it delays warnings from the
> time of cast to the time of use.

Oh, neat! Would it handle the example above?

Yes, modulo the complexity of turning on __like_a() automagically.

> I only implemented this later, to support bounds
> checking rather than cast checking, but it applies to the latter
> too. (Sneak preview: I'm working on bounds checker I believe will
> offer a different, possibly better, precision/performance trade-off
> than ASan, but that's still in progress.)
>
> Another possible mitigation is to use "__like_a" by default in some
> circumstances, perhaps helped by an up-front static analysis pass to
> find things like structure prefixing. That's pure idea-stuff right
> now.

I'm not sure what it would take to prototype and evaluate this, but
it sounds like a promising mitigation.

Indeed. From my perspective, the blocker to developing and evaluating
this stuff is it's already assuming uncommon code, and often the
properties are not syntactically obvious... so getting a good set of
evaluation cases together is like gathering needles from a haystack.
Maybe it could be useful to collect a corpus of "hard to sanitize" code
(e.g. perl would count :-). Alternatively (and less convincing but...)
there are already synthetic tests, like the Cerberus de-facto tests,
and maybe older things like C-torture. I have run libcrunch on the
Cerberus tests before and from an initial scan did not see any
surprises (i.e. it complains at the things I expected) although it has
acquired more tests since then, and I should reserve final judgement
til I go through more carefully.

> Doing a build-world of FreeBSD or similar, as you suggest, is
> exactly the sort of experience that is required, and inevitably, in
> doing so there will be a lot of handle-turning to fix bugs and the
> like. I'd rather go through this on an implementation that has a
> chance of getting adopted... so I think the LLVM port has to happen
> first, though I understand that the contribution per se may have to
> wait.
>
>> I've certainly seen code at Apple which would fail
>> libcrunch's "is-a" checks, but which "works fine" according to the
>> code owners. Would you be open to reducing the number of cases
>> libcrunch is able to diagnose in order to reduce FP rates?
>
> I'm curious -- can you give me a sketch of what this code is doing?

It's the example I provided above. It might also take some work to
teach crunch about type relationships in ObjC, or at least to make
crunch ignore ObjC types.

Ah yes. I don't know Objective-C well enough to say what's best there.

> If it's just a case of "creating wrong-typed pointers that aren't
> deref'd", then I think we're good. In general I'd rather fix the FPs
> precisely, rather than introduce underapproximation, and I've yet to
> see cases where that really can't be done.

Makes sense.

>> Second, do you see any way to simplify the wrappers required to
>> invoke libcrunch? E.g is the runtime preload step necessary? Apart
>> from the runtime shared object, can the other tools required to use
>> libcrunch be consolidated into the clang binary?
>
> Great question. The plan is to consolidate the tools, yes -- but
> probably mostly into the gold plugin.

Would it be feasible to introduce a clang tool that runs after -cc1
to do some of the heavy lifting? This might be more portable than a
linker plugin.

There is quite a bit of stuff that is movable, so yes, such a tool
could take over some of the work. I suppose you could even generate
type information on a per-CU basis, at the cost of generating more
comdat linker input... now I think about it, doing the dedup there
may be faster.

The main things that have to happen at link time are wrapping of
allocator functions (each is basically linked with --wrap, together
with some generated code) and emitting DSOs with the type metadata. I
suppose the "separate metadata objects" model could be ditched
entirely, though it is quite useful to be able to generate metadata for
binaries you didn't build yourself (e.g. libc).

> Only a couple of steps work on the individual compilation units,
> and they can be deferred until link time anyhow. It does mean that
> each compilation unit's allocation site information has to get
> dumped somewhere, to be picked up at link time.

Sorry for the naive question, but could you explain how the
allocation site information is generated? Is this something that
could be deferred until link time?

No problem. Logically, the allocation site analysis is a data flow
analysis finding where "sizeof"-derived values reach the "size" argument
positions of allocation calls. That's why Chris (Diamand) had to
come up with the hack that preserves sizeof into the LLVM IR (via dummy
function calls).

This information comes out in a simple bunch of tuples (file, line,
type). At link time, the debugging information is used to extend these
tuples with the actual address of the allocation call. This goes into
the loadable metadata object and is used by the runtime.

So you can defer it only as long as you have "sizeof" around... of
course if you don't mind re-reading the source, you can even do it
way after build time, and that is sometimes useful (again, thinking
libc).

A radical approach might be to give up classifying allocation sites
altogether, and just say "the first cast determines the type". I am
uncomfortable with this, but it would simplify things.

(I'm uncomfortable because I feel that allocation sites have meaning,
from a debugging and comprehension point of view -- not limited to
their type info, but thinking e.g. about their use as alias analysis
classes -- so there is value in reifying them at debug/sanitize time
and having allocation site info available for diagnostics/inspection.)

> This is currently in a .i.allocs text file, but that is a bit
> untidy... I've thought about a DWARF extension
> (DW_TAG_allocation_site or similar) instead, which would be cleaner.
>
> About preloading -- I happen to like this design, modulo the
> clunky invocation, because it means you can easily turn the
> checking on or off. For executables, you could simply link the
> preload stuff in, but that would lose the on/off switch, and
> doesn't work for shared libraries. Another idea I've had is to use
> a customised loader (ld.so) instead of a preload library. So you
> would do
>
> $ ./myprogram # runs normally
>
> or
>
> $ /path/to/crunchld.so ./myprogram # runs +crunch
>
> This would more-or-less replicate the usage model of Valgrind tools.
> (In general, my feeling is that extending the ld.so is more sane
> than using LD_PRELOAD, although I realise it will seem a bit
> off-the-wall.)

I have a preference for building separately when enabling sanitizers.
I feel this way because I don't have reasons to disable sanitization
once I've enabled it in one build tree, and because the performance
of the sanitized binary is impacted. Having this as the supported
workflow may be more convenient if it makes it possible to skip the
preload step.

That's fair -- I can see both ways. My preference has been to minimise
the "reasons to disable" in the first place. In this case, the slowdown
is usually minimal if you don't load the library, so it's job mostly
done. That model doesn't work for all sanitizers, of course.

(Still, long-term, I feel my job is only really done when deployed
binaries have the checks compiled in -- at least outside of HPC-style
workloads.)

>> Third, how do you plan on keeping the libcrunch integration into
>> clang in-sync with the runtime, if the runtime is to live outside
>> of the llvm tree?
>
> I think that part is no problem. The interface between instrumented
> code and the runtime is pretty narrow (basically just the __is_a(),
> __like_a() and similar functions) and does not embed any LLVM
> stuff, so it would be stable under LLVM churn. The same would hold
> for the gold plugin... LLVM/clang's involvement is mostly just to
> do the instrumentation, which is a pretty well-isolated task. The
> allocation site metadata I just mentioned is probably the only
> change-prone thing.
>
> Of course, there is a problem of breakage/churn in the
> libcrunch/liballocs stuff... in particular, things like new DWARF
> features can break things. I have no good answer to that, except
> that more hands make light(er) work.
>
>> How will the feature be tested?
>
> On that, I have no idea, and I'd have to take your lead on that. To
> make an analogy with another out-of-tree dependency, how do you test
> LTO against GNU binutils? I'm hoping, perhaps naively, that it could
> work similarly.

There are some in-tree tests which use libLTO.dylib and ld64. The
tests depend on those tools being available. For end-to-end testing
of crunch, you could do something similar. The instrumentation and
driver changes could be tested in isolation.

My preference would be to have the sanitizer runtime live in
compiler-rt, which is where end-to-end testing of the other
sanitizers currently happens.

I can see why that's desirable. And it could ultimately be achieved, no
doubt, although it feels like more work than I can reasonably get done
myself, at least on this occasion.

I should add that the libcrunch runtime is not really just that; it's
mostly liballocs, which is designed as a general-purpose way to
add run-time type information to a Unix-like process -- with
applications beyond sanitizer-like tools. I see it as a run-time
service in its own right, not "just" an adjunct to a compiler-based
tool. Some of the funkier experimental applications of liballocs have
been on avoiding FFIs (in a V8 extension), adding type info to
memory-mapped files, and (internally) doing JIT-like dynamic code
generation in a way that is transparent to the dynamic linker.

Conversely, if one focuses only on this one use case, it would
probably be straightforward-ish to reimplement a stripped-down
replacement specialised to type checking -- especially if you don't
mind forgoing things like nested/custom allocators, resuming after
using a trap pointer (some hairy instruction-decoding gymnastics using
a lot of borrowed GPL'd code) or really high coverage of the address
space (e.g. liballocs can answer queries on pointers pointing into the
auxiliary vector, or sometimes into ld.so/libc-internal structures,
etc.). This replacement would be a better fit for compiler-rt. I'm
still wary of saying I would have the resource to implement it....

>> Currently, the
>> sanitizer runtimes ship with official LLVM downloads: what would
>> the deployment story for libcrunch be (how would users get it)?
>
> Initially they would build the libcrunch tree, which would build for
> them the gold plugin and runtime. Again, I think it would be rather
> like the LTO story (a web page with "follow these instructions").
> And of course distributors could build binary packages... again, it
> helps that the interface between compiler and plugin/runtime is
> fairly simple.
>
> (Full disclosure: currently you have to build both the liballocs
> tree and the libcrunch tree, separately. Eventually the former will
> be a submodule of the latter. I have too many levels of submodule
> going on....)

IMO a model where all of the sanitizer components live in-tree would
make testing and deploying the sanitizer easier. It may also make
maintenance easier, e.g if it's ever necessary to change the
libcrunch <-> clang interface.

I know you mentioned up front that you don't intend to port the
runtime to compiler-rt, as this would require a major rewrite.
However, I think that would be the right way to go, and I'd be
willing to help out with reviews etc.

I can understand this preference. Thanks for being willing to help on
your side.

I'm curious to hear what others think about having an out-of-tree
sanitizer runtime. I'm not sure that the situation with LTO and gold
is precedent for this, but others may feel differently.

(Of course it's different, but I'd argue in the same spirit as e.g.
using the host system linker until the LLVM project found the
time/resources to implement a replacement.)

Stephen