-fcatch-undefined-behavior testing

If people want to try a neat new feature, go ahead and compile your software with -fcatch-undefined-behavior and then test your software. If there are any traps at runtime, investigate your code and see if it is trying to execute some undefined behavior. See http://clang.llvm.org/docs/UsersManual.html#codegen for some details on the checks that are inserted.

Questions, comments, concerns... let me know. If it catches any real bugs, that'd be interesting to know. If you find any bugs, I'd be interested in those as well.

Very cool, I've long wanted this feature.

Have you considered running this on an LLVM nightly test run? That
might be a good way to flush out either undefined behavior in the
tests (which would be nice to fix) or bugs in the code.

- Daniel

Hi,

This is a very interesting feature!

I tested it (after reverting r91380 locally) on ClamAV.
It found a shift > bitwidth error, and got a SIGILL. So far so good.

However I found no way to skip this error, to look for more errors.
Would it be possible to emit something else instead of __builtin_trap()?
Perhaps a debugger breakpoint, or something else that can be skipped.

Also would it be possible to warn about shift exceeding width only if
they can change the outcome?
Consider this code from ClamAV, which for b == 0 does a left shift by 32
on a 32-bit var:
#define CLI_ROR(a,b) a = (( a >> (b % (sizeof(a)<<3) )) | (a << (
(sizeof(a)<<3) - (b % (sizeof(a)<<3 )) ) ))

If the shift of 32 is implemented as shifting out all the bits, then the
result is a|0 -> a.
If the shift of 32 is implemented as a shift considering the low bits
only, then its a shift by 0, and the result is a|a -> a.
So regardless how the shift of 32 is implemented the outcome is a, right?
The intent of the macro is to emulate the x86 ror instruction, and both
llvm/gcc know to turn it into a ror when compiled
if b is constant. However neither knows to turn it into a ror if b is
not a constant...
The shift of 32 could be worked around by checking whether b is 0, but
that adds another branch.

Best regards,
--Edwin

Love to see the results; my email was to solicit others into trying the feature out on their favorite code.

Also, I guess I should say, if people have undefined behavior that they would just love to see runtime checks for, maybe to catch portability problems, or to catch security problems or just random bugs in code, I've love to hear the ideas.

This is a very interesting feature!

However I found no way to skip this error, to look for more errors.

Yeah, limitation of the current codegen. Trivially, one could imagine logging interface codegen. It could include column and line numbers, filenames and what rule was violated and how (the shift amount was 128 bits). I think filing a PR for a logging system would make a nice addition. Great for large codebases when build test cycle times are in the 1-3 day range.

Also would it be possible to warn about shift exceeding width only if they can change the outcome?

Well, the notion that some undefined behavior is defined and perfectly ok, is an odd notion. Rather, think of it this way, the runtime check is trying to tell you that the `portable' version of ROR doesn't work on ppc and that you need to fix your code to be portable. After it is fixed, there will be no runtime failure. In fact, one of the purposes of the check was to find exactly this type of non-portable code.

The shift of 32 could be worked around by checking whether b is 0, but that adds another branch.

A branch on a constant is relatively free. A smart optimizer, having a ror instruction on the target, would always eliminate the branch as well.

Consider what happens when a is constant, and the emulation for shift at compile time doesn't match the runtime behavior of shift, the wrong value is computed. Might not happen on a native compiler, but odds of this happening under cross compilation go up.

  

This is a very interesting feature!
    
However I found no way to skip this error, to look for more errors.
    
Yeah, limitation of the current codegen. Trivially, one could imagine logging interface codegen. It could include column and line numbers, filenames and what rule was violated and how (the shift amount was 128 bits). I think filing a PR for a logging system would make a nice addition. Great for large codebases when build test cycle times are in the 1-3 day range.
  
Ok. As a short term solution could it generate an int3 though instead of
an illegal instruction?

Also would it be possible to warn about shift exceeding width only if they can change the outcome?
    
Well, the notion that some undefined behavior is defined and perfectly ok, is an odd notion. Rather, think of it this way, the runtime check is trying to tell you that the `portable' version of ROR doesn't work on ppc and that you need to fix your code to be portable. After it is fixed, there will be no runtime failure. In fact, one of the purposes of the check was to find exactly this type of non-portable code.
  
For the case where the shift amount is constant, I agree that it is
undefined, for (uint32_t x=...; x << 32) the compiler can say that x is
42 (or some other random value).

However when the shift amount is not constant, the compiler has no
choice but to implement the shift with CPU instructions (it may or may
not recognize the ror pattern, currently it doesn't).
In that case the undefined behavior for the shift becomes an
implementation defined behavior of the CPU.

For the CLI_ROR case, the only scenario where the undefined behavior
might occur (and I don't consider it a bug) is if shift amount == bitwidth,
in which case it depends on the CPU what happens
(X86 appears to shift by 0, PPC shifts all bits out and result is 0). In
either case the end result of the CLI_ROR macro is the same, since a|a
== a|0.

For higher shift amounts the result may not be the same, but I think for
shift amount == bitwidth the result is still correct.

Indeed when the rotate amount passes to CLI_ROR is greater than
bitwidth, that should be fixed, see here:
https://wwws.clamav.net/bugzilla/show_bug.cgi?id=1778

The shift of 32 could be worked around by checking whether b is 0, but that adds another branch.
    
A branch on a constant is relatively free. A smart optimizer, having a ror instruction on the target, would always eliminate the branch as well.

Consider what happens when a is constant, and the emulation for shift at compile time doesn't match the runtime behavior of shift, the wrong value is computed. Might not happen on a native compiler, but odds of this happening under cross compilation go up.

Yeah the branch for the constant case would be fine, however it would
mean a penalty for the non-constant shift case.

Best regards,
--Edwin

Well some of the ideas I have would overlap with what SAFEcode already does.

Ideally it should catch all the undefined behavior that would cause
different behavior when program is compiled with -O0 vs compiled with
-O2. Or in other words, any undefined behavior that could break (the
user's expectations) due to compiler optimizations.
That "idea" is too vague to be useful, so below are some specific ideas,
some already crash at runtime but a better error message
would be very useful.

A general idea first: please allow for individual tests to be turned
on/off, for example I may want to test only for
array overflows, only for shift overflows, etc.
Or if I compile with -fno-strict-aliasing I wouldn't care about aliasing
violations, etc.

1. support something similar to gcc's "-fmudflapth":
  It is useful on occasion when valgrind can't find errors (such as
overrunning a stack allocated buffer). You already do something similar,
however I'd be interested to catch bugs when the buffer (of constant
size) is declared in one function, and the buggy store/load is in
another function.
2. alignment testing
This is something I haven't found a tool for. Especially important for
sparc, but thats needs a sparc CPU for testing.
It'd be nice if we could catch these bugs by running on x86, and
assuming sparc alignment rules. Even on x86, vectorized variables need
to be aligned, or it crashes.
3. pointer overflows, undefined behavior in userspace
4. aliasing rule violations via casts
Although LLVM doesn't have TBAA yet, clang could catch some TBAA
errors, when the dynamic type of a casted value doesn't match
5. other aliasing violations
- restrict qualifier not obeyed (i.e. a noalias pointer indeed aliases
something)
- two pointers have equal value, yet according to aliasing rules they
never alias (for example overflowing from one struct field to another)
6. pure/const qualifier checks
7. calling convention mismatches
8. noreturn that returns
9. division by zero due to overflow (32-bit division of -2147483648 by -1)
10. Access to a variable from multiple threads without proper
locking/atomic instructions
11. weak symbols with an initializer, overriden with another
initializer at link time (optimizer may assume the initial value it has
seen)
12. use of uninitialized variable
13. calling function using wrong prototype (for example library compiled
with certain flags, function has 5 params, user
compiles with other flags, function has 4 params. At runtime it will
crash/misbehave for C.
Even for C++ its a problem if structures have different
layout/elements). The types' structure should match, including pointed
to types

I had a list with some more ideas, but I can't find that file right now,
if I find it, I'll follow up.

Best regards,
--Edwin

This is a very interesting feature!

However I found no way to skip this error, to look for more errors.

Yeah, limitation of the current codegen. Trivially, one could imagine logging interface codegen. It could include column and line numbers, filenames and what rule was violated and how (the shift amount was 128 bits). I think filing a PR for a logging system would make a nice addition. Great for large codebases when build test cycle times are in the 1-3 day range.

Ok. As a short term solution could it generate an int3 though instead of
an illegal instruction?

Also would it be possible to warn about shift exceeding width only if they can change the outcome?

Well, the notion that some undefined behavior is defined and perfectly ok, is an odd notion. Rather, think of it this way, the runtime check is trying to tell you that the `portable' version of ROR doesn't work on ppc and that you need to fix your code to be portable. After it is fixed, there will be no runtime failure. In fact, one of the purposes of the check was to find exactly this type of non-portable code.

For the case where the shift amount is constant, I agree that it is
undefined, for (uint32_t x=...; x << 32) the compiler can say that x is
42 (or some other random value).

However when the shift amount is not constant, the compiler has no
choice but to implement the shift with CPU instructions (it may or may
not recognize the ror pattern, currently it doesn't).
In that case the undefined behavior for the shift becomes an
implementation defined behavior of the CPU.

This isn't the correct understanding of undefined behavior. The
compiler is perfectly free to optimize all subsequent code in your
program as if the shift value is in the range [0,32), since if it
wasn't the behavior would have be undefined. This has larger
implications than just the result of the shift.

OTOH I agree this is mostly a pedantic point, but overall I agree with
Mike -- the flag should catch undefined behavior. If the programmer
wants to say "oh, yeah, but no compiler will *ever* make me regret
that" then one reasonable way out would be to disable it with a pragma
or attribute.

- Daniel

14. malloc of possibly wrapped value: x = malloc(a+b); memset(x, 0, a);
memset(x+a, 0, b); // if a+b overflowed, then a+b < a, or a+b < b
15. incorrect buffer limit checks:
     if (tainted_signed_value <= (long) some_limit)
a[tainted_signed_value]; //<--- code should check for negative values as
well
     same situation if comparison becomes signed due to integer
promotion rules.
16. a warning (not undef behavior, but could be a mistake) for integer
promotion of signed value to unsigned when target variable is signed:
int x=-2; unsigned y=1; long a0 = x + y; value of a0 will be 4294967295,
though a0 is a signed variable!

Best regards,
--Edwin

Ah, this one we already have check for, in the case where the memory that backs a is known to the optimizer (after llvm implements more of the object_size builtin).

Ok. As a short term solution could it generate an int3 though instead of an illegal instruction?

Which llvm intrinsic is that?

Yeah the branch for the constant case would be fine, however it would mean a penalty for the non-constant shift case.

As I said, for an optimizing compiler, it could omit the penalty, if it cared.

  

Ok. As a short term solution could it generate an int3 though instead of an illegal instruction?
    
Which llvm intrinsic is that?
  
No target independent intrinsic, I only know of a target-dependent way
for x86:
tail call void asm sideeffect "int3", "~{dirflag},~{fpsr},~{flags}"()
nounwind

Yeah the branch for the constant case would be fine, however it would mean a penalty for the non-constant shift case.
    
As I said, for an optimizing compiler, it could omit the penalty, if it cared.

Ok.

Best regards,
--Edwin