Unsigned Bitwise Shift for Bit-field Structure

Hi,

I have a question about unsigned bitwise shift.

According to C99 6.5.7/4:

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated
bits are filled with zeros. If E1 has an unsigned type, the value of
the result is E1 × 2^E2, reduced modulo one more than the maximum
value representable in the result type.

So if

unsigned b = 0x80000000;
unsigned a = b << 1;
a will equal 0, because a = (b × 2) mod 2^32,
where 2^32 is UINT_MAX + 1 (for a 32-bit unsigned).
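
As a quick check of the modulo rule quoted above, here is a minimal sketch that shifts the highest bit out of an unsigned. The helper name shift_top_bit_out is made up for illustration, and the sketch uses UINT_MAX / 2 + 1 instead of a hard-coded 0x80000000 so that it does not assume a 32-bit unsigned.

```c
#include <limits.h>

/* Hypothetical helper (not from the thread): shift the top bit of an
 * unsigned out of range and return what is left. */
unsigned shift_top_bit_out(void)
{
    /* Only the highest value bit is set; this is 0x80000000 when
     * unsigned is 32 bits wide. */
    unsigned b = UINT_MAX / 2 + 1;

    /* b * 2 reduced modulo UINT_MAX + 1, as described in C99 6.5.7/4. */
    unsigned a = b << 1;

    return a;
}
```

Calling shift_top_bit_out() should return 0 on any conforming implementation, since no bit-field and no promotion is involved here.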

For the bit-field structure defined as
struct foo
{
  unsigned long long b:40;
} x;

According to C99 6.7.2.1
A bit-field is interpreted as a signed or unsigned integer type
consisting of the specified number of bits.
So x.b will be treated as a 40-bit unsigned integer and should follow 6.5.7/4.

If x.b = 0x8000000000, then
x.b << 1 = (x.b × 2) mod 2^40
So x.b << 1 should be 0, right?

Please correct me if I have misunderstood something.

Thanks,
Shiva

Adding the clang developer list and removing the llvm developer list. Clang folks are better placed to answer a C/C++ language question such as this.

This test program prints 0.

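#include <stdio.h>

struct foo
{
  unsigned long long b:40;
} x;

int main() {
  struct foo f;              /* "struct foo" so this also compiles as C */
  f.b = 0x8000000000ULL;     /* bit 39, the top bit of the field */
  f.b <<= 1;                 /* result is stored back into the 40-bit field */
  /* cast so the argument matches %llx whatever type the bit-field has */
  printf("%llx\n", (unsigned long long)f.b);
  return 0;
}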

Which does not answer the question. The question is whether (f.b << 1) is 0.

Test:
int printf(const char *, ...);
int main(void) {
  struct F { unsigned long long b : 40; } f;
  f.b = 0x8000000000ull;
  printf("0x%016llx\n", (unsigned long long)(f.b << 1));
}

GCC says 0. Clang says otherwise.
Online compiler: https://wandbox.org/permlink/MNTBbjv2F96N4boA

Now, the use of unsigned long long is itself a bit of an extension;
however, yes, the wording does make it so that (f.b << 1) does not perform
promotion of the bitfield type to unsigned long long.

Hi Hubert and Craig,

Thank you for making the case clearer.

"(f.b << 1) does not perform promotion to unsigned long long"
So (f.b << 1) is a 40-bit unsigned integer and should be 0, right?

But the Bugzilla entry at
https://bugs.llvm.org/show_bug.cgi?id=17299
says this is a GCC bug.

So I'm a little confused.

Here `E1` is your bit field and is an lvalue, so it is converted to an rvalue of type `unsigned long long` (see N1256 6.3.2.1/1), and left-shifting it by one yields an rvalue with the value `0x10000000000` and the type `unsigned long long`. The rvalue is then converted and stored in the bit field. Since the bit field is unsigned and said to have a 'pure binary notation' (see N1256 6.2.6.1/2), its value ranges from 0 to (2^40-1). So you are right, the result is stored modulo 2^40, yielding zero (see N1256 6.3.1.3/2).
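
To make the store-back step concrete, here is a minimal sketch, assuming a compiler that accepts unsigned long long as a bit-field base type (GCC and Clang both do). The helper name shift_and_store is hypothetical. Whichever type the shift itself is carried out in, the conversion on assignment reduces the value modulo 2^40.

```c
/* unsigned long long as a bit-field base type is an extension in C99;
 * this sketch assumes the compiler accepts it. */
struct foo {
    unsigned long long b : 40;
};

/* Hypothetical helper: shift, then store the result back into the field. */
unsigned long long shift_and_store(void)
{
    struct foo f;

    f.b = 0x8000000000ULL;  /* bit 39 set, the highest bit of the field */
    f.b = f.b << 1;         /* the store reduces the result modulo 2^40 */

    return f.b;             /* 0, however the shift itself was typed */
}
```

This only covers the case where the result goes back into the bit-field; it says nothing about the value of (f.b << 1) used directly in a wider context.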

It is hardly clear that the rvalue is of type unsigned long long (and not
some special bit-field type).
See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1260.htm.
I have been unable to find further discussion of the issue after
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1270.pdf.

You are right. Let's assume such an implementation exists. Even with that type, the result doesn't change: the (intermediate) value is merely truncated earlier, yielding the same value after it is stored into the bit-field member.

I also doubt the existence of such an implementation: 6.7.2.1/4 doesn't say bit-field types need to exist. Their existence would also lead to problems: Would people benefit from them? What happens if such an rvalue is passed to a function taking variable arguments? What is `sizeof(it+1)` where `it` has a bit-field type?

The type of the field is long long; the bitfield size modifier is not part of the type. x.b is a 64-bit integer, not a 40-bit integer.

-Eli

This discussion got moved to cfe-dev.

-Eli

> You are right. Let's assume such an implementation exists. Even with
> that type, the result doesn't change: the (intermediate) value is merely
> truncated earlier, yielding the same value after it is stored into the
> bit-field member.

Where the implementation divergence occurs is when the value is not stored
back into the bit-field member.

> I also doubt the existence of such an implementation: 6.7.2.1/4 doesn't
> say bit-field types need to exist. Their existence would also lead to
> problems: Would people benefit from them? What happens if such an rvalue
> is passed to a function taking variable arguments? What is
> `sizeof(it+1)` where `it` has a bit-field type?

I agree that it leads to problems. This is the reason why I had the cast to
unsigned long long in my code example (which does demonstrate the existence
of such an implementation, namely GCC): because, without the cast, the
argument may not match with the conversion specification.

I think answering these questions is a job for WG 14 (the C committee).

Hi Hubert and Liu Hao,

Thanks for your guidance and for making the question clearer.
When the value is not stored back into the bit-field member, integer
promotion is involved.
The integer promotion rule for bit-fields is defined in C99 6.3.1.1/2.
However, the promotion rule for implementation-defined bit-field types
(wider than int, such as an unsigned long long bit-field) is not clear.
So the question becomes whether an implementation-defined bit-field
type should be promoted.
GCC chooses not to promote; Clang does.
There is some discussion in
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_315.htm
As Hubert said, the answer will remain open until the C standard states
the rule for implementation-defined bit-field types.
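
For contrast, the uncontroversial part of 6.3.1.1/2 can be sketched with a bit-field narrower than int. The struct and helper names below are made up for illustration. Because every value of an 8-bit unsigned field fits in int, the field is promoted to int before the arithmetic, so the result is not reduced modulo 2^8.

```c
struct narrow {
    unsigned s : 8;   /* all values 0..255 fit in int, so s promotes to int */
};

/* Hypothetical helper: shift a narrow bit-field without storing it back. */
int narrow_shift(void)
{
    struct narrow g;

    g.s = 0x80;
    return g.s << 1;        /* computed in int: 0x100, not 0 */
}

/* Hypothetical helper: shows that the promoted type is signed int. */
int narrow_minus_one_is_negative(void)
{
    struct narrow g;

    g.s = 0;
    return (g.s - 1) < 0;   /* int arithmetic gives -1, so this is 1 */
}
```

The open question in this thread is only about fields such as unsigned long long b:40, whose values do not fit in int or unsigned int.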

The important thing here is the implementation defined behaviour.

It is quite reasonable for GCC and Clang to have different behaviours. They should both document what behaviour has been implemented as an implementation is required to document how implementation defined behaviour has been implemented.

Basically, don't expect code that relies on implementation-defined behaviour to be portable.

This is one of the reasons why coding standards such as MISRA highlight such things.
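
One way to avoid depending on the compiler's choice is to say explicitly which width is wanted. The following is a minimal sketch, assuming the compiler accepts unsigned long long as a bit-field base type; the helper names shift_wide and shift_mod_40 are hypothetical.

```c
/* unsigned long long as a bit-field base type is an extension in C99;
 * this sketch assumes the compiler accepts it. */
struct foo {
    unsigned long long b : 40;
};

/* Hypothetical helper: convert first, so the shift is done at the full
 * width of unsigned long long. */
unsigned long long shift_wide(struct foo f)
{
    return (unsigned long long)f.b << 1;
}

/* Hypothetical helper: same shift, then an explicit reduction modulo 2^40. */
unsigned long long shift_mod_40(struct foo f)
{
    const unsigned long long mask = (1ULL << 40) - 1;

    return ((unsigned long long)f.b << 1) & mask;
}
```

With f.b set to 0x8000000000, shift_wide(f) should give 0x10000000000 and shift_mod_40(f) should give 0, because the cast and the mask leave nothing for the compiler to decide.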

Then I think the Bugzilla entry [1] you mentioned before should be updated? After all, it’s not a bug,
it’s undefined behavior. If so, would you mind updating it?

[1] https://bugs.llvm.org/show_bug.cgi?id=17299

Regards,
chenwj

> The important thing here is the implementation defined behaviour.
>
> It is quite reasonable for GCC and Clang to have different behaviours.
> They should both document what behaviour has been implemented as an
> implementation is required to document how implementation defined behaviour
> has been implemented.

It is compliant, and perhaps reasonable on a case-by-case basis; however,
Clang "tries" to be GCC-compatible. At the level of implementation-defined
(as opposed to unspecified or undefined) behaviour, I would expect that
Clang would follow what GCC does.

In any case, the choice of whether to perform promotion on the bit-field
types with "base type" and width such that the wording indicates that
promotion does not occur is not implementation-defined behaviour.

What is implementation-defined is whether the use of, e.g., "unsigned long
long" as the "base type" of a bit-field is a constraint violation.

> Hi Hubert and Liu Hao,
>
> Thanks for your guidance and for making the question clearer.
> When the value is not stored back into the bit-field member, integer
> promotion is involved.
> The integer promotion rule for bit-fields is defined in C99 6.3.1.1/2.
> However, the promotion rule for implementation-defined bit-field types
> (wider than int, such as an unsigned long long bit-field) is not clear.

It is unclear as to the intent of the committee; however, the wording has:
"All other types are unchanged by the integer promotions."

Agreed, there is a default rule not to promote other types.
But it is unclear whether the committee intends long long bit-fields
not to be promoted.
Hence the discussion in DR #315.
Thanks for bringing me here ^^