ubsan: problem with detecting UB in shift operation with large RHS

Hello,

I’m using ubsan together with creduce to ensure that reduced test doesn’t have UB. And I’ve noticed that ubsan is not detecting UB in case of large shift amount, particularly when rhs of shift has larger type that lhs. To reproduce, rhs also has to have the value that is non-zero, but after truncation to lhs type the value becomes zero.

Consider the following example.

#include <stdio.h>

// 1 is in 33rd bit.
unsigned long long l = 0x100000000ULL;

int main() {
// Ubsan doesn’t fire
int res = 123 >> 0x100000000ULL;
// Ubsan doesn’t fire
res += 123 >> l;
// Ubsan does fire
res += 123ULL >> l;
printf(“result = %d\n”, res);
return 0;
}

Changing the constant to the value, which fit 32 bits makes ubsan firing.

I understand where the problem comes from - LLVM IR definition requires both operands of the shift to be of the same integer type. And ubsan actually check already truncated value. But it doesn’t match C/C++ standard definition of UB in shift operation.

Is it possible to fix in current infrastructure? Should I file a bug for this problem?

Note, it’s not a theoretical problem, it’s very practical one, which pops up during automatic test reduction relatively frequently for me.

Dmitry.

Is it actually undefined behavior?
What does the standard say about " int32_value >> int64_value" ?
If int64_value should be first cast to int32, then there is no UB in:

res += 123 >> l;

As for the first buggy line, there is a FE warning:

b.c:6:19: warning: shift count >= width of type [-Wshift-count-overflow]
int res = 123 >> 0x100000000ULL;

–kcc

FE warning is good, but the case with the constant is for demonstration purposes only :slight_smile:

The standard doesn’t make any connection between left and right hand side types, neither limitations for rhs type size. And it matches to how clang interprets this:

-VarDecl 0x7ff53d0de998 <col:5, col:22> col:9 used res 'int' cinit -BinaryOperator 0x7ff53d0dea38 <col:15, col:22> ‘int’ ‘>>’

-IntegerLiteral 0x7ff53d0de9f8 col:15 ‘int’ 123
`-IntegerLiteral 0x7ff53d0dea18 col:22 ‘unsigned long long’ 4294967296

Here’s 5.8.2 (from C++17):
The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand. The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.

So I treat it as ubsan missing UB in this case by working on the IR with truncation for RHS, as the standard doesn’t assume this truncation.

Dmitry.

Is it actually undefined behavior?

Yes.

What does the standard say about " int32_value >> int64_value" ?
If int64_value should be first cast to int32, then there is no UB in:
    res += 123 >> l;

The standard says that for >>, "integral promotions are performed", which
basically has no effect here (assuming int is 32 bits), and then says "The
type of the result is that of the promoted left operand". It does not
specify any conversion or casting of the right operand, other than integral
promotion.

Hello,

I'm using ubsan together with creduce to ensure that reduced test doesn't
have UB. And I've noticed that ubsan is not detecting UB in case of large
shift amount, particularly when rhs of shift has larger type that lhs. To
reproduce, rhs also has to have the value that is non-zero, but after
truncation to lhs type the value becomes zero.

Consider the following example.

#include <stdio.h>

// 1 is in 33rd bit.
unsigned long long l = 0x100000000ULL;

int main() {
    // Ubsan doesn't fire
    int res = 123 >> 0x100000000ULL;
    // Ubsan doesn't fire
    res += 123 >> l;
    // Ubsan does fire
    res += 123ULL >> l;
    printf("result = %d\n", res);
    return 0;
}

Changing the constant to the value, which fit 32 bits makes ubsan firing.

I understand where the problem comes from - LLVM IR definition requires
both operands of the shift to be of the same integer type. And ubsan
actually check already truncated value. But it doesn't match C/C++ standard
definition of UB in shift operation.

Is it possible to fix in current infrastructure? Should I file a bug for
this problem?

Yes and yes. I would expect that we just need to reorder the check to
before we truncate the RHS.

Before filing the bug I’ve looked for existing bug reports and found one for exactly this problem: 27271.

Could someone have a look at this bug, please?

https://llvm.org/bugs/show_bug.cgi?id=27271