[RFC] Extending and improving Clang's undefined behavior checking

Hi,

There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang’s undefined behavior checking:

  1. Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don’t catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

I would like to add support for checking the following additional undefined behaviors, which Clang does not currently appear to be able to check for:

  • Using a null pointer or empty glvalue (a dereferenced null pointer) as the object expression in a class member access expression or member function call.
  • Using an object of polymorphic class type where either the static and dynamic types are incompatible or the object’s lifetime has not started or has ended, in a delete-expression, class member access expression, member function call, derived-to-virtual-base conversion, dynamic_cast, or typeid, or using the value of a reference bound to such a glvalue.
  • Base-to-derived cast on wrong polymorphic type
  • Binding a reference to an inappropriately-aligned address or to an empty lvalue
  • VLA size evaluates to a nonpositive value
  • Reaching the end of a function with a non-void return type, in C++ (in C, this is only UB if the caller uses the return value)
  • Overflow in conversion from floating point to smaller floating point or to integral type
  • Overflow in conversion from integral type to floating point (int128 to float, or when converting to __fp16)
  • Converting an out-of-range integer to an enumerated type
  • Floating-point arithmetic overflow
  • Pointer arithmetic which leaves the bounds which can be determined by llvm.objectsize
  • Pointer subtraction between pointers which can be determined to be in different objects
  • Signed << where the LHS is negative, or where the result doesn’t fit in the result type (C99, C11) or doesn’t fit in the corresponding unsigned type (C++11)
  • Assignment where the LHS and RHS overlap but are not equal

Regehr et al’s IOC has code to handle a few of these checks already, which should be straightforward to apply to Clang trunk (assuming the IOC guys are still happy with that?). Low overhead would be an explicit goal of all of these checks.

  1. Command-line interface. We currently have the following options to enable various flavors of undefined behavior checks:
    -ftrapv
    -fcatch-undefined-behavior
    -faddress-sanitizer
    -fthread-sanitizer

I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument. IOC adds the -ftrapv checks to -fcatch-undefined-behavior (unless we’re in -fwrapv mode), so we can easily pick up that part of this change.

IOC also adds:
-fcatch-undefined-ansic-behavior
-fcatch-undefined-c99-behavior
-fcatch-undefined-cxx0x-behavior
-fcatch-undefined-cxx98-behavior
-fcatch-undefined-nonarith-behavior

I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.

Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don’t have a concrete proposal for that.

  1. Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier. The other checks just emit @llvm.trap(), which is far from ideal.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a problem, but also to call a function which can produce a value for the operation and recover. I’m not proposing adding such a feature to Clang at this time (I’ll leave that to the IOC folks).

I would like to propose adding a runtime library for the -fcatch-undefined-behavior checks, and emit calls to relevant diagnostic functions instead of calls to @llvm.trap. I’m planning on having one library entry point per flavor of check, named _ubsan_report<check_name>, which would be passed relevant operand values, and a pointer to a structure containing information about the context (file, line and column number, operand types as strings, and possibly a code snippet to display with the diagnostic).

As a strawman proposal for runtime error output, we could aim to mirror the formatting style of compile-time diagnostics (since we’ve already got one way of beautifully presenting diagnostic information, why invent another?). For instance:

$ clang++ bad.cc -o bad -fcatch-undefined-behavior
$ ./bad
test.cc:4:12: runtime error: left shift count 33 >= width of type ‘int’ (32 bits)
return i << n;
^ ~
33
test.cc:7:30: note: in call to ‘f’ here
Illegal instruction
$

It would seems sensible to provide a common library for producing such diagnostic output and also use it in the sanitizers. Again, strawman output:

/usr/include/c++/4.6.x-google/bits/stl_algobase.h:192:10: runtime error: global buffer overflow reading address 0x2b29a5e899f4
if (__b < __a)
^
:0x2b29a5e899f4: note: contents of nearby memory:
… e8 a5 29 2b 00 00 01 00 00 00 gg gg gg gg gg gg gg gg gg gg gg gg gg gg …

...ace)::GetBucket(int)::threshold read
**<memory>:0x2b29a5e899e0: note:** read address is 0 bytes after global variable '(anonymous namespace)::GetBucket(int)::threshold'
... gg gg gg gg 08 00 00 00 01 00 00 00 f4 99 e8 a5 29 2b 00 00 01 00 00 00 ...
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(anonymous namespace)::GetBucket(int)::threshold
**foo/bar.cc:863: note:** in call to 'std::min<int>(const int &, const int &)' here
**foo/bar.cc:894: note:** in call to '(anonymous namespace)::GetBucket(int)' here
[...]
**gunit/gunit_main.cc:59: note:** in call to 'testing::UnitTest::Run()' here
**<__libc_start_main>:0x2b29cb224d5d: note:** in call to 'main' here

This proposal intends to incorporate some, but not all, of IOC into mainline Clang. The parts which I'm not currently intending to address as part of this work are:
* -fuse-intrinsic. This allows the choice of whether to use LLVM's overflow intrinsics. I intend to use such intrinsics unconditionally.
* -fuse-random-value. This allows recovery from detected undefined behavior by providing a value for the operation, which is not part of my proposal.
* -fhandler-null. This selects the usage of an alternate runtime library.
* -fchecks-num. This provides logging output when checks are inserted, and would not be valuable to my target audience.
* -fcatch-non-ubc-type. This is designed to catch unsigned integer overflow, and some other cases which don't have undefined behavior, and so is not directly connected to this work.

Does this sound good?
Richard

This sounds positively awesome. And the explicit goal of low overhead is a definite plus. I could perfectly imagine using those checks as parts of our test builds at work if it turns out fast enough.

– Matthieu

I like the idea. I would recommend a minor addition. I think it'd be
nice to have documentation that lists undefined behaviors, whether
they are supported or not and how to enable checking, but also
undefined behaviors that aren't being checked and ideally, a blurb on
what makes it prohibitive/impossible to check it.

Richard,

I think adding the runtime undefined behavior checking and unifying the diagnostic output format is a great idea.

This would probably be of interest to the LLVM Dev list as well.

Anna.

There are three different (and mostly orthogonal, design-wise) areas where I would like
to make improvements to Clang's undefined behavior checking:

This is all great and will be a huge benefit to Clang users.

Regehr et al's IOC has code to handle a few of these checks already, which should be
straightforward to apply to Clang trunk (assuming the IOC guys are still happy with
that?). Low overhead would be an explicit goal of all of these checks.

Will Dietz, a grad student working with Vikram, is currently hacking IOC into shape for inclusion. Will, can you make sure Richard has access to our github?

Richard, if you're willing to help us get this into the trunk we'd be super happy.

IOC also adds:
-fcatch-undefined-ansic-behavior
-fcatch-undefined-c99-behavior
-fcatch-undefined-cxx0x-behavior
-fcatch-undefined-cxx98-behavior
-fcatch-undefined-nonarith-behavior

We are currently revising the command line interface to IOC. We very much want it to fit into the bigger undefined behavior checker picture.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a
problem, but also to call a function which can produce a value for the operation and
recover. I'm not proposing adding such a feature to Clang at this time (I'll leave that
to the IOC folks).

IOC is fairly modular in the sense that the checking logic doesn't care that much about the trap handling logic.

Our experience with integer overflow checking is that it is totally useless without verbose diagnostics because nobody (including us) can understand what is going on given just a line number.

The logic for recovering could be very handy, for example to implement AIR integers. But our priority is to just get the basic checking + trap handling added.

test.cc:4:12: runtime error: left shift count 33 >= width of type 'int' (32 bits)
return i << n;
^ ~
33
test.cc:7:30: note: in call to 'f' here
Illegal instruction

This is a good level of verbosity, I think.

* -fuse-intrinsic. This allows the choice of whether to use LLVM's overflow intrinsics.
I intend to use such intrinsics unconditionally.

Yes, good, though I'll note that last time we measured, the intrinsics were not uniformly a performance win over protecting maybe-undefined operations using explicit checks. As far as we could tell, this is because an insufficient number of LLVM passes know what to do with the overflow intrinsics. Chris encouraged us to file bugs but it's very hard to track down these missed optimizations.

* -fuse-random-value. This allows recovery from detected undefined behavior by providing
a value for the operation, which is not part of my proposal.

Good.

* -fhandler-null. This selects the usage of an alternate runtime library.

Good.

* -fchecks-num. This provides logging output when checks are inserted, and would not be
valuable to my target audience.

You mean to always terminate the program on undefined behavior? This is probably reasonable as long as you are aware that basically zero C/C++ programs will actually finish normally when you turn this on. Many developers will be so annoyed by this that they will simply not use the checker.

* -fcatch-non-ubc-type. This is designed to catch unsigned integer overflow, and some
other cases which don't have undefined behavior, and so is not directly connected to this
work.

We like these checks but it is totally reasonable to not put them in at first. Once the rest of the infrastructure is in the trunk, it should be easy for us to push this in separetely.

Thanks again.

John

Regehr et al’s IOC has code to handle a few of these checks already, which should be

straightforward to apply to Clang trunk (assuming the IOC guys are still happy with
that?). Low overhead would be an explicit goal of all of these checks.

Will Dietz, a grad student working with Vikram, is currently hacking IOC into shape for inclusion. Will, can you make sure Richard has access to our github?

Richard, if you’re willing to help us get this into the trunk we’d be super happy.

Absolutely. Assuming no-one objects to the direction, I’d be more than happy to provide patch review and other assistance to get the relevant pieces of IOC committed.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a

problem, but also to call a function which can produce a value for the operation and
recover. I’m not proposing adding such a feature to Clang at this time (I’ll leave that
to the IOC folks).

IOC is fairly modular in the sense that the checking logic doesn’t care that much about the trap handling logic.

Our experience with integer overflow checking is that it is totally useless without verbose diagnostics because nobody (including us) can understand what is going on given just a line number.

This was what I expected, but it’s great that you have evidence to back up my intuition. Ideally, I’d like for most reported issues to be understandable and fixable without any need to descend into a debugger.

The logic for recovering could be very handy, for example to implement AIR integers. But our priority is to just get the basic checking + trap handling added.

Great. My concern here is primarily performance: adding the ability to resume after a failure is very likely to restrict the optimizations that can be performed on the generated IR. Adding recovery seems like a fine idea to me so long as it’s optional.

  • -fchecks-num. This provides logging output when checks are inserted, and would not be

valuable to my target audience.

You mean to always terminate the program on undefined behavior?

No, I mean logging at IR generation time indicating how many checks it has inserted. But I was also intending to always terminate the program on caught undefined behavior.

This is probably reasonable as long as you are aware that basically zero C/C++ programs will actually finish normally when you turn this on. Many developers will be so annoyed by this that they will simply not use the checker.

With the current proposal, there are two ways to mitigate this problem:

  1. A goal of this work (which I think I forgot to state explicitly) is ABI compatibility, so it will be possible to build a subset of TUs with -fcatch-undefined-behavior to exclude problematic (possibly third-party) code.
  2. The -fcatch-undefined-behavior=… options will allow individual checks to be disabled, where they are blocking the user from finding other issues.

As I mentioned above, I’m concerned about possible performance implications for the non-failing codepaths from allowing the program to resume after the checks. However, I don’t have concrete evidence to support this worry, and it would certainly seem wise to measure the impact of allowing resumption, and make the decision on that basis.

Thanks!
Richard

This was what I expected, but it's great that you have evidence to back up my
intuition. Ideally, I'd like for most reported issues to be understandable and
fixable without any need to descend into a debugger.

Absolutely. This was one of the IOC design points too. It makes a big difference in usability (though most people don't dislike interactive debuggers as much as I do...).

Great. My concern here is primarily performance: adding the ability to resume
after a failure is very likely to restrict the optimizations that can be
performed on the generated IR. Adding recovery seems like a fine idea to me so
long as it's optional.

Sounds good, though I'm not sure that there will be a measurable penalty from recovery. Either way the code is basically this:

(result, overflow) = op input1, input2
if (overflow) call handler
use result

The dominating effect is whether the overflow can be shown to never happen. If you can't do that, I don't see how the extra branch back from the handler matters. I could be wrong of course.

Anyway, I'm not arguing for executing past the handler in the short term.

If you're the one doing the work, it's your call anyway!

No, I mean logging at IR generation time indicating how many checks it has
inserted.

Got it. Yes, this is useless in general-- we had some esoteric reason for it.

With the current proposal, there are two ways to mitigate this problem:
1) A goal of this work (which I think I forgot to state explicitly) is ABI
compatibility, so it will be possible to build a subset of TUs with
-fcatch-undefined-behavior to exclude problematic (possibly third-party) code.
2) The -fcatch-undefined-behavior=... options will allow individual checks to
be disabled, where they are blocking the user from finding other issues.

Sure. It'll be interesting to see how this goes. It's even possible that taking a harder line on stuff like signed overflows will cause people to take them more seriously (as opposed to just ignoring log messages).

John

Hi,

There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang’s undefined behavior checking:

  1. Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don’t catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

I would like to add support for checking the following additional undefined behaviors, which Clang does not currently appear to be able to check for:

  • Using a null pointer or empty glvalue (a dereferenced null pointer) as the object expression in a class member access expression or member function call.
  • Using an object of polymorphic class type where either the static and dynamic types are incompatible or the object’s lifetime has not started or has ended, in a delete-expression, class member access expression, member function call, derived-to-virtual-base conversion, dynamic_cast, or typeid, or using the value of a reference bound to such a glvalue.
  • Base-to-derived cast on wrong polymorphic type
  • Binding a reference to an inappropriately-aligned address or to an empty lvalue
  • VLA size evaluates to a nonpositive value
  • Reaching the end of a function with a non-void return type, in C++ (in C, this is only UB if the caller uses the return value)
  • Overflow in conversion from floating point to smaller floating point or to integral type
  • Overflow in conversion from integral type to floating point (int128 to float, or when converting to __fp16)
  • Converting an out-of-range integer to an enumerated type
  • Floating-point arithmetic overflow
  • Pointer arithmetic which leaves the bounds which can be determined by llvm.objectsize
  • Pointer subtraction between pointers which can be determined to be in different objects
  • Signed << where the LHS is negative, or where the result doesn’t fit in the result type (C99, C11) or doesn’t fit in the corresponding unsigned type (C++11)
  • Assignment where the LHS and RHS overlap but are not equal

Regehr et al’s IOC has code to handle a few of these checks already, which should be straightforward to apply to Clang trunk (assuming the IOC guys are still happy with that?). Low overhead would be an explicit goal of all of these checks.

  1. Command-line interface. We currently have the following options to enable various flavors of undefined behavior checks:
    -ftrapv
    -fcatch-undefined-behavior
    -faddress-sanitizer
    -fthread-sanitizer

I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument. IOC adds the -ftrapv checks to -fcatch-undefined-behavior (unless we’re in -fwrapv mode), so we can easily pick up that part of this change.

IOC also adds:
-fcatch-undefined-ansic-behavior
-fcatch-undefined-c99-behavior
-fcatch-undefined-cxx0x-behavior
-fcatch-undefined-cxx98-behavior
-fcatch-undefined-nonarith-behavior

I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.

Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don’t have a concrete proposal for that.

  1. Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier.

Can you be more specific?

The other checks just emit @llvm.trap(), which is far from ideal.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a problem, but also to call a function which can produce a value for the operation and recover. I’m not proposing adding such a feature to Clang at this time (I’ll leave that to the IOC folks).

I would like to propose adding a runtime library for the -fcatch-undefined-behavior checks, and emit calls to relevant diagnostic functions instead of calls to @llvm.trap. I’m planning on having one library entry point per flavor of check, named _ubsan_report<check_name>, which would be passed relevant operand values, and a pointer to a structure containing information about the context (file, line and column number, operand types as strings, and possibly a code snippet to display with the diagnostic).

Do you plan to have these library functions to be marked as never-return? (if not, it will cost quite a bit of performance)
Also note, that asan/tsan use the debug info to get file/line info. Do you plan to do the same, or you want to pass this info explicitly to the callback? (this will cost additional binary size and maybe some perf)

Great stuff overall! (Finally, someone else will be playing with the compiler-rt build system :slight_smile: )

–kcc

Hi,

There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang’s undefined behavior checking:

  1. Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don’t catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

I would like to add support for checking the following additional undefined behaviors, which Clang does not currently appear to be able to check for:

  • Using a null pointer or empty glvalue (a dereferenced null pointer) as the object expression in a class member access expression or member function call.
  • Using an object of polymorphic class type where either the static and dynamic types are incompatible or the object’s lifetime has not started or has ended, in a delete-expression, class member access expression, member function call, derived-to-virtual-base conversion, dynamic_cast, or typeid, or using the value of a reference bound to such a glvalue.
  • Base-to-derived cast on wrong polymorphic type
  • Binding a reference to an inappropriately-aligned address or to an empty lvalue
  • VLA size evaluates to a nonpositive value
  • Reaching the end of a function with a non-void return type, in C++ (in C, this is only UB if the caller uses the return value)
  • Overflow in conversion from floating point to smaller floating point or to integral type
  • Overflow in conversion from integral type to floating point (int128 to float, or when converting to __fp16)
  • Converting an out-of-range integer to an enumerated type
  • Floating-point arithmetic overflow
  • Pointer arithmetic which leaves the bounds which can be determined by llvm.objectsize
  • Pointer subtraction between pointers which can be determined to be in different objects
  • Signed << where the LHS is negative, or where the result doesn’t fit in the result type (C99, C11) or doesn’t fit in the corresponding unsigned type (C++11)
  • Assignment where the LHS and RHS overlap but are not equal

Regehr et al’s IOC has code to handle a few of these checks already, which should be straightforward to apply to Clang trunk (assuming the IOC guys are still happy with that?). Low overhead would be an explicit goal of all of these checks.

  1. Command-line interface. We currently have the following options to enable various flavors of undefined behavior checks:
    -ftrapv
    -fcatch-undefined-behavior
    -faddress-sanitizer
    -fthread-sanitizer

I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument. IOC adds the -ftrapv checks to -fcatch-undefined-behavior (unless we’re in -fwrapv mode), so we can easily pick up that part of this change.

IOC also adds:
-fcatch-undefined-ansic-behavior
-fcatch-undefined-c99-behavior
-fcatch-undefined-cxx0x-behavior
-fcatch-undefined-cxx98-behavior
-fcatch-undefined-nonarith-behavior

I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.

Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don’t have a concrete proposal for that.

  1. Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier.

Can you be more specific?

Sure. Here’s a few things which I think could be improved:

  • The point at which the error is trapped doesn’t include a column number. (I’d also like a code snippet with a caret, where possible, but there are tradeoffs to be carefully considered there…)

  • Symbols don’t get demangled (I know, this is WIP).

  • The shadow byte output is currently only really useful if you understand how ASAN works in a reasonable amount of detail. We don’t provide any explanation of what they mean in the output, and even the documentation doesn’t explain what the various redzone values mean. I think we can do a lot better on this particular point; I gave some sample output in my original message which shows the direction I’m thinking of.

  • The ‘stats’ output does not seem relevant in the case where ASAN traps an issue.

  • There seems to be a tension between producing extremely detailed output (including things like bp and sp), which would be useful for a failure in a long-running process, and producing readable output (omitting such things, which are usually not interesting, and are redundant if the problem can be reproduced in a debugger). I’m not really sure what the right tradeoff is here. I certainly don’t want to remove any useful information from the output.

In essence, I’d like the ASAN output to be as accessible as our compile-time diagnostics. And yes, I am volunteering to help resolve the above points, if you agree that we should improve these things.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a problem, but also to call a function which can produce a value for the operation and recover. I’m not proposing adding such a feature to Clang at this time (I’ll leave that to the IOC folks).

I would like to propose adding a runtime library for the -fcatch-undefined-behavior checks, and emit calls to relevant diagnostic functions instead of calls to @llvm.trap. I’m planning on having one library entry point per flavor of check, named _ubsan_report<check_name>, which would be passed relevant operand values, and a pointer to a structure containing information about the context (file, line and column number, operand types as strings, and possibly a code snippet to display with the diagnostic).

Do you plan to have these library functions to be marked as never-return? (if not, it will cost quite a bit of performance)

Yes, that was certainly my initial plan (although John Regehr has suggested that there may be value in being able to continue after such a check failure). I would especially welcome your input into ways to emit the checks in IR, and how to design the runtime library interface, to minimize the performance burden of the checks, since I know you’ve done a lot of research into that.

Also note, that asan/tsan use the debug info to get file/line info. Do you plan to do the same, or you want to pass this info explicitly to the callback? (this will cost additional binary size and maybe some perf)

I would like to preserve more information than can easily be extracted from the debug info (in particular, line numbers and column ranges for operands of problematic operations), so I’m intending on passing that info explicitly to the callback, but naturally if the measured impact of this is too high, I’ll need to reconsider.

I think the perf impact can be mitigated by forcing the check call out of the hot path, either through just block placement, or by pushing the extra arg emission into a separate noinline function, which would be called on a check failure with a minimal set of arguments. Something like:

int f(int x, int y) {
return x << y;
}

=>
linkonce_odr unnamed_addr const char __ubsan_str_8dfd1c41b32683bab10e108ac2851f2a = " return x << y;\n";

attribute((noreturn, noinline)) static void __ubsan_check_failure_1(int a, int b) {

static const struct LocationInfo __ubsan_locs_1[] = {

FILE, 2, 11, 12, // ‘<<’
FILE, 2, 9, 9, // LHS
FILE, 2, 14, 14 // RHS

};

__ubsan_report_left_shift_out_of_range(a, b, 32, “int”, __ubsan_locs_1, str_8dfd1c41b32683bab10e108ac2851f2a);

}

int f(int x, int y) {
if (__builtin_expect((unsigned)y >= 32 || ((unsigned)x >> (31-y)), 0))
__ubsan_check_failure_1(x, y);

return x << y;
}

  1. Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier.

Can you be more specific?

Sure. Here’s a few things which I think could be improved:

  • The point at which the error is trapped doesn’t include a column number. (I’d also like a code snippet with a caret, where possible, but there are tradeoffs to be carefully considered there…)

Agree. In some cases this could be very valuable, e.g. when reporting a bug on expressions like a->b->c = x->y->z;
With your help we may be able to implement this as part of our symbolizer work.

  • Symbols don’t get demangled (I know, this is WIP).

Yep.

  • The shadow byte output is currently only really useful if you understand how ASAN works in a reasonable amount of detail. We don’t provide any explanation of what they mean in the output, and even the documentation doesn’t explain what the various redzone values mean. I think we can do a lot better on this particular point; I gave some sample output in my original message which shows the direction I’m thinking of.

Yea… The main reason we output the shadow bytes is to provide some means of debugging the tool itself if something goes wrong, especially if we have a report but don’t have a way to reproduce it.
Still, there were a few cases where the shadow bytes allowed us to understand a tricky true positive report quicker.
Additional docs/explanations may help, but my experience that users don’t read long docs :frowning:
So, my intention is to have less but better docs.

Same for other bits of output (sp, bp, stats).

  • The ‘stats’ output does not seem relevant in the case where ASAN traps an issue.

  • There seems to be a tension between producing extremely detailed output (including things like bp and sp), which would be useful for a failure in a long-running process, and producing readable output (omitting such things, which are usually not interesting, and are redundant if the problem can be reproduced in a debugger). I’m not really sure what the right tradeoff is here. I certainly don’t want to remove any useful information from the output.

This tradeof is always tricky.
If you omit some info by default, you may get into a situation where you need it, but you can’t reproduce the report again.
We’ve been bitten by this a lot :frowning:

–kcc

  1. Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier.

Can you be more specific?

Sure. Here’s a few things which I think could be improved:

  • The point at which the error is trapped doesn’t include a column number. (I’d also like a code snippet with a caret, where possible, but there are tradeoffs to be carefully considered there…)

Agree. In some cases this could be very valuable, e.g. when reporting a bug on expressions like a->b->c = x->y->z;
With your help we may be able to implement this as part of our symbolizer work.

FYI column numbers are already there in DWARF debug info, and we don’t have to do anything specific to fetch them. They are missing
in ASan error reports because addr2line doesn’t output them.

<catching up on old email>

There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang's undefined behavior checking:

This is truly awesome, I would really love to see this. Your design proposal is somewhat startlingly similar to an approach I pitched to John and Vikram sometime last year. :slight_smile: I really like separating out the orthogonal pieces and pulling together the various disparate pieces into something coherent.

1) Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don't catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

Yes. Nuno's bounds checking work can also be pulled into this eventually, as could stack canaries and "fortify source".

2) Command-line interface. We currently have the following options to enable various flavors of undefined

I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument.

+1

I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.

+1

Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don't have a concrete proposal for that.

That would make sense to me.

3) Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier. The other checks just emit @llvm.trap(), which is far from ideal.

Yes, this should be orthogonal from the checks and interface, and should be consistent across all the checks.

IOC is in some ways quite ambitious here, and provides options to not only diagnose a problem, but also to call a function which can produce a value for the operation and recover. I'm not proposing adding such a feature to Clang at this time (I'll leave that to the IOC folks).

I agree.

I would like to propose adding a runtime library for the -fcatch-undefined-behavior checks, and emit calls to relevant diagnostic functions instead of calls to @llvm.trap. I'm planning on having one library entry point per flavor of check, named __ubsan_report_<check_name>, which would be passed relevant operand values, and a pointer to a structure containing information about the context (file, line and column number, operand types as strings, and possibly a code snippet to display with the diagnostic).

As a strawman proposal for runtime error output, we could aim to mirror the formatting style of compile-time diagnostics (since we've already got one way of beautifully presenting diagnostic information, why invent another?).

This would be pretty cool.

This proposal intends to incorporate some, but not all, of IOC into mainline Clang. The parts which I'm not currently intending to address as part of this work are:
* -fuse-intrinsic. This allows the choice of whether to use LLVM's overflow intrinsics. I intend to use such intrinsics unconditionally.
* -fuse-random-value. This allows recovery from detected undefined behavior by providing a value for the operation, which is not part of my proposal.
* -fhandler-null. This selects the usage of an alternate runtime library.
* -fchecks-num. This provides logging output when checks are inserted, and would not be valuable to my target audience.
* -fcatch-non-ubc-type. This is designed to catch unsigned integer overflow, and some other cases which don't have undefined behavior, and so is not directly connected to this work.

Makes sense to me. I'd like to have these checks be simplified down, even if it means that we lose some fine-grained control. I'd really like to get to the point where we can recommend people to turn on -fcatch-undefined-behavior (thus getting some sane subset of checks) when they see strange behavior.

-Chris

<catching up on old email>

There are three different (and mostly orthogonal, design-wise) areas where I would like to make improvements to Clang's undefined behavior checking:

This is truly awesome, I would really love to see this. Your design proposal is somewhat startlingly similar to an approach I pitched to John and Vikram sometime last year. :slight_smile: I really like separating out the orthogonal pieces and pulling together the various disparate pieces into something coherent.

1) Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don't catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

Yes. Nuno's bounds checking work can also be pulled into this eventually, as could stack canaries and "fortify source".

IMHO, I don't think ad-hoc techniques like stack canaries are suitable for this particular application. Stack canaries are better suited for run-time protection against stack buffer-overflow attacks (if they're suited for anything at all). I don't think canaries really tell you where in the code the stack is being smashed.

For memory-related undefined behaviors, I think it would make sense to have various "levels" of checks in which each level adds more overhead but checks more things accurately. ASan would be a good first or second level; it finds invalid loads and stores and can catch some out-of-bounds array accesses and dangling pointers. Another level could be ASan with SAFECode's array checks and points-to set checking. A final level could be something like SoftBound + CETS which provides real dangling pointer detection in addition to the previously mentioned checks.

2) Command-line interface. We currently have the following options to enable various flavors of undefined
I would like for us to have a single argument which enables all undefined behavior checking which can be performed cheaply and with no changes to address space layout or ABI; I propose we extend -fcatch-undefined-behavior to be that argument.

+1

It may make sense to have an equivalent of the -O option in which each level of undefined behavior checking adds more checks. For example, -UD0 is no checks, -UD1 is some really fast checks, -UD2 is more stringent checks, etc, etc.

I think we should support this kind of configuration through a mechanism similar to warning flags: -fcatch-undefined-behavior=c++11 -fno-catch-undefined-behavior=null-reference, for instance.

+1

Also, I think we should consider renaming this switch (and/or possibly the -f*-sanitizer switches) to provide a consistent set of names, but I don't have a concrete proposal for that.

That would make sense to me.

3) Handling and reporting of undefined behavior, once caught. The sanitizers produce decent information (which is both useful and detailed), but it could be prettier. The other checks just emit @llvm.trap(), which is far from ideal.

Yes, this should be orthogonal from the checks and interface, and should be consistent across all the checks.

One feature that you may want is an option to log errors to a file. We've found that useful in SAFECode for when we want to benchmark some program that has memory safety errors in it. Valgrind has such an option as well, so it seems people find it useful.

-- John T.

1) Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don't catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

Yes. Nuno's bounds checking work can also be pulled into this eventually, as could stack canaries and "fortify source".

IMHO, I don't think ad-hoc techniques like stack canaries are suitable for this particular application. Stack canaries are better suited for run-time protection against stack buffer-overflow attacks (if they're suited for anything at all). I don't think canaries really tell you where in the code the stack is being smashed.

The current implementation doesn't, but that's because its reporting mechanism is currently hard coded to "abort()". It could (in principle, not saying this is important) be extended to report better source-level diagnostics.

For memory-related undefined behaviors, I think it would make sense to have various "levels" of checks in which each level adds more overhead but checks more things accurately. ASan would be a good first or second level; it finds invalid loads and stores and can catch some out-of-bounds array accesses and dangling pointers. Another level could be ASan with SAFECode's array checks and points-to set checking. A final level could be something like SoftBound + CETS which provides real dangling pointer detection in addition to the previously mentioned checks.

Right. Some would be on by default with -fcatch-undefined-behavior, some would be opt-inable with -fcatch-undefined-behavior=something-specific

-Chris

1) Completeness of checks. There are integer undefined behaviors which -ftrapv and -fcatch-undefined-behavior don't catch, and there is almost no checking available for any other undefined behaviors outside of those and the ones caught by {Address,Thread,Memory} Sanitizer.

Yes. Nuno's bounds checking work can also be pulled into this eventually, as could stack canaries and "fortify source".

IMHO, I don't think ad-hoc techniques like stack canaries are suitable for this particular application. Stack canaries are better suited for run-time protection against stack buffer-overflow attacks (if they're suited for anything at all). I don't think canaries really tell you where in the code the stack is being smashed.

The current implementation doesn't, but that's because its reporting mechanism is currently hard coded to "abort()". It could (in principle, not saying this is important) be extended to report better source-level diagnostics.

You're right about the call to abort() and adding better diagnostics, but the problem goes deeper than that. Most run-time checks (such as ASan's, SAFECode's, and SoftBound's) can usually tell the user *which* instruction is causing memory corruption and, in some cases, which variables are involved. Stack canaries, I think, can only report which return instruction detected the corruption; it can't tell you which previously executed instruction actually caused the corruption.

In fact, I think most stack canaries will only detect run-time errors that smash the return pointer or function arguments, and such an error is likely to be visible even when the canaries are disabled, so I don't see it helping much as an undefined behavior diagnostic. If you put canaries between stack objects, then that might detect otherwise latent errors.

Note that I'm making a distinction between run-time protection from attacks and detecting undefined behavior. The former is a hardening technique to automatically make software resistant to attack; the latter is a debugging technique to help developers ferret out bugs. Stack canaries are a hardening technique, and they have some success in doing that. I don't think they'd be very useful as a diagnostic technique for debugging.

For memory-related undefined behaviors, I think it would make sense to have various "levels" of checks in which each level adds more overhead but checks more things accurately. ASan would be a good first or second level; it finds invalid loads and stores and can catch some out-of-bounds array accesses and dangling pointers. Another level could be ASan with SAFECode's array checks and points-to set checking. A final level could be something like SoftBound + CETS which provides real dangling pointer detection in addition to the previously mentioned checks.

Right. Some would be on by default with -fcatch-undefined-behavior, some would be opt-inable with -fcatch-undefined-behavior=something-specific

Do you have any opinions on my "undefined behavior levels" idea? In most cases, I don't think users want to fiddle around with a whole bunch of options for a whole bunch of checks that they don't understand; I think a dial that trades off run-time performance for more strict checking (in the same way that -O is a dial that trades off compile-time for faster code) is easier to use.

That said, having separate options for each type of check would be useful for experts, but a "default" set of checks and then having specify a whole bunch of separate options for additional checks might be unnecessarily cumbersome.

-- John T.

The current implementation doesn't, but that's because its reporting mechanism is currently hard coded to "abort()". It could (in principle, not saying this is important) be extended to report better source-level diagnostics.

You're right about the call to abort() and adding better diagnostics, but the problem goes deeper than that. Most run-time checks (such as ASan's, SAFECode's, and SoftBound's) can usually tell the user *which* instruction is causing memory corruption and, in some cases, which variables are involved. Stack canaries, I think, can only report which return instruction detected the corruption; it can't tell you which previously executed instruction actually caused the corruption.

Sure, I understand how stack canaries work.

Right. Some would be on by default with -fcatch-undefined-behavior, some would be opt-inable with -fcatch-undefined-behavior=something-specific

Do you have any opinions on my "undefined behavior levels" idea? In most cases, I don't think users want to fiddle around with a whole bunch of options for a whole bunch of checks that they don't understand; I think a dial that trades off run-time performance for more strict checking (in the same way that -O is a dial that trades off compile-time for faster code) is easier to use.

I agree with you that only experts (or compiler hackers :slight_smile: should be expected to twiddle detailed options. I think that there are a few really important levels that we should focus on in order of priority:

1. The "on by default" configuration. Checks that are cheap enough and important enough to be included even in production builds, like certain cases of stack canaries and "fortify source" are now.

2. The default -fcatch-undefined-behavior configuration. This should be an obvious blend of checks that doesn't cause a tortuous performance hit (I think valgrind's 20x hit would be too much, but maybe 2-4x would be ok) but that find a lot of bugs that people care about. We probably don't want to include checks for implementation-defined behavior in this, because that is likely to find a lot of bugs that "people don't care about".

3. Specific modes of -fcatch-undefined-behavior that are intended to find certain classes of problems. If I've got a memory corruption issue that I'm trying to debug and can't figure out, I may be willing to throw arbitrary amounts of runtime and compile time at the problem. A 20x slowdown would be fine. We'd want a high level flag like "-fcatch-undefined-behavior=extensive-memory-checks" or something like that.

4. The hacker/tweaker case of wanting to enable/disable individual checks, tweak parameters of checks, etc.

-Chris

The current implementation doesn’t, but that’s because its reporting mechanism is currently hard coded to “abort()”. It could (in principle, not saying this is important) be extended to report better source-level diagnostics.

You’re right about the call to abort() and adding better diagnostics, but the problem goes deeper than that. Most run-time checks (such as ASan’s, SAFECode’s, and SoftBound’s) can usually tell the user which instruction is causing memory corruption and, in some cases, which variables are involved. Stack canaries, I think, can only report which return instruction detected the corruption; it can’t tell you which previously executed instruction actually caused the corruption.

Sure, I understand how stack canaries work.

Right. Some would be on by default with -fcatch-undefined-behavior, some would be opt-inable with -fcatch-undefined-behavior=something-specific

Do you have any opinions on my “undefined behavior levels” idea? In most cases, I don’t think users want to fiddle around with a whole bunch of options for a whole bunch of checks that they don’t understand; I think a dial that trades off run-time performance for more strict checking (in the same way that -O is a dial that trades off compile-time for faster code) is easier to use.

I agree with you that only experts (or compiler hackers :slight_smile: should be expected to twiddle detailed options. I think that there are a few really important levels that we should focus on in order of priority:

  1. The “on by default” configuration. Checks that are cheap enough and important enough to be included even in production builds, like certain cases of stack canaries and “fortify source” are now.

  2. The default -fcatch-undefined-behavior configuration. This should be an obvious blend of checks that doesn’t cause a tortuous performance hit (I think valgrind’s 20x hit would be too much, but maybe 2-4x would be ok) but that find a lot of bugs that people care about. We probably don’t want to include checks for implementation-defined behavior in this, because that is likely to find a lot of bugs that “people don’t care about”.

Related to this, in the company I work in we experimented with a lightweight memory leak detector approach (which relied on substituting the malloc implementation by LD_PRELOAD) and the consensus was that we could afford to run our test servers with a x2 penalty, but not much more. The problem is not that the servers answer a bit more slowly, it’s that if they are slower you need more of them to cope with the same traffic. We never quite really managed to get this level of performance though (got stuck at x4/x5) and the project aborted.

I think the approach suffered from not being tunable enough, since all malloc related traffic was kept around, even though most memory was managed by RAII constructs (std::string, std::vector…) and thus could not leak by design; I hope that by getting these checks into the compiler the optimizer (LLVM) will be able to discard a lot of them.

  1. Specific modes of -fcatch-undefined-behavior that are intended to find certain classes of problems. If I’ve got a memory corruption issue that I’m trying to debug and can’t figure out, I may be willing to throw arbitrary amounts of runtime and compile time at the problem. A 20x slowdown would be fine. We’d want a high level flag like “-fcatch-undefined-behavior=extensive-memory-checks” or something like that.

  2. The hacker/tweaker case of wanting to enable/disable individual checks, tweak parameters of checks, etc.

-Chris

– Matthieu

I have followed this discussion with great interest.

In the "find a lot of bugs that 'people don't care about'" category, might I suggest that floating point divide by zero be something that can be toggled on/off. See also:

<http://llvm.org/bugs/show_bug.cgi?id=11854>

Cheers,

Concrete proposal (based on ideas from Chandler): we run with the branding which -faddress-sanitizer and -fthread-sanitizer have established. Our new switch for all optional runtime checking is “-fsanitize=…”. This should behave analogously to -W, and in particular, we should have sanitizer groups (as discussed elsewhere on this thread) and -fsanitize=no-.

-faddress-sanitizer becomes a synonym for -fsanitize=address
-fthread-sanitizer becomes a synonym for -fsanitize=thread
-fcatch-undefined-behavior becomes a synonym for -fsanitize=undefined-fast
-ftrapv becomes a synonym for -fsanitize=signed-overflow

For compatibility with GCC and previous versions of Clang, these existing flags are retained. -fsanitize=address and -fsanitize=thread continue to be incompatible (a diagnostic will be produced for any attempt to combine them).

Thoughts?

Makes sense to me,

-Chris