Question about the old C back-end

Hello all,

When going through the internals of the old C back-end, I see that the CBE encapsulates arrays into a struct. The source code has the following comment to explain this behaviour.

      // Arrays are wrapped in structs to allow them to have normal
      // value semantics (avoiding the array "decay").

For example, the CBE translates:
   @a = common global [10 x i32] zeroinitializer, align 16

into:
   struct { unsigned int array[10]; } a;

However, the reason for this behaviour is not completely clear to me. Can anyone give me further explanation of what is meant by 'array decay' and why it is not possible (or easy) to generate normal C-style arrays?

Cheers,
  Roel

If I understand this correctly (I don't know what kind of code CBE
generates), the comment says that CBE generates code that wants to
treat arrays as values. But in C, arrays don't have this property
(partly because an array in most contexts "decays" to a pointer to its
first element):

  int a[10];
  int b[10];
  a = b; // error
  int *c = a; // ok

But this can be sidestepped if one wraps the array into a struct.
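
A minimal sketch of that sidestep, assuming a hypothetical wrapper type named int_array10 (the name is illustrative and not taken from the CBE). Assigning one struct to another copies every element, which a bare array cannot do:

```c
#include <assert.h>

/* Hypothetical wrapper type: the name int_array10 is illustrative only. */
struct int_array10 { int array[10]; };

/* Copies b into a by plain assignment, then changes b, and returns the
   element held by the copy to show that a and b are independent. */
int copy_and_read(int index) {
    struct int_array10 a = {{0}};
    struct int_array10 b = {{0}};

    assert(index >= 0 && index < 10);

    b.array[index] = 42;
    a = b;               /* fine: struct assignment copies all ten elements */
    b.array[index] = 7;  /* a is a separate copy, so it is unaffected */
    return a.array[index];
}
```

Calling copy_and_read(3) hands back the value stored before the assignment, because a holds its own copy of the ten elements.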

Dmitri

Hi Roel,

An IR function might return an array, or have a parameter of array type. E.g. it
might be:
   declare [4 x i8] @foo([8 x i8] %x)
If you tried to turn that into the C
   char[4] foo(char[8] x);
then (1) the compiler would reject it, and (2) even if it didn't reject it, it
would probably just return a pointer to the first element of the array rather
than the array elements themselves, in keeping with C's usual confusion between
arrays and pointers. Wrapping the array in a struct gets around these problems.
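
The difference can be sketched in C as follows. The type names bytes8 and bytes4 and all function names are hypothetical, chosen only to mirror [8 x i8] and [4 x i8]; this is an illustration of the language rule, not CBE output:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper types standing in for [8 x i8] and [4 x i8]. */
struct bytes8 { unsigned char array[8]; };
struct bytes4 { unsigned char array[4]; };

/* Declared with an array parameter, but C adjusts it to unsigned char *,
   so the write goes through to the caller's storage. */
void clobber_array(unsigned char x[8]) {
    x[0] = 0xFF;
}

/* The struct is passed by value, so only the callee's copy is modified. */
void clobber_struct(struct bytes8 x) {
    x.array[0] = 0xFF;
}

/* Returning the wrapper returns the four elements themselves, not a pointer. */
struct bytes4 first_half(struct bytes8 x) {
    struct bytes4 r;
    for (size_t i = 0; i < 4; ++i)
        r.array[i] = x.array[i];
    return r;
}

/* Small self-check exercising the three functions above. */
void check_decay(void) {
    unsigned char raw[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    struct bytes8 wrapped = {{1, 2, 3, 4, 5, 6, 7, 8}};

    clobber_array(raw);
    assert(raw[0] == 0xFF);         /* caller's array was modified */

    clobber_struct(wrapped);
    assert(wrapped.array[0] == 1);  /* caller's struct is untouched */

    struct bytes4 r = first_half(wrapped);
    assert(r.array[0] == 1 && r.array[3] == 4);
}
```

Only the struct versions behave like the IR signature, where the array is an ordinary value on both the parameter and the return side.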

Ciao, Duncan.

Hi Duncan,

OK, that clarifies things a bit, although it seems that the current implementation of the struct wrappers doesn't avoid this problem either.

   declare [4 x i8] @foo([8 x i8] %x)

Currently gets translated into the following:

   struct { unsigned char array[4]; } foo(struct { unsigned char array[8]; } );

Which is not usable C, as a struct defined inside a function declaration is a new anonymous type that no other declaration can refer to. I guess that I will have to make sure that there is a typedef for these arrays and that the typedef is used instead of printing the struct again...
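
A rough sketch of what the typedef-based output could look like. The typedef names l_array4_uchar and l_array8_uchar and the body of foo are assumptions made for illustration; they are not what the CBE actually emits:

```c
#include <assert.h>

/* Hypothetical typedef names; the CBE's real naming scheme is not shown here. */
typedef struct { unsigned char array[4]; } l_array4_uchar;
typedef struct { unsigned char array[8]; } l_array8_uchar;

/* The declaration uses the typedef names, so callers can name the same types. */
l_array4_uchar foo(l_array8_uchar x);

/* Placeholder body (an assumption): returns the first four elements of x. */
l_array4_uchar foo(l_array8_uchar x) {
    l_array4_uchar r;
    for (int i = 0; i < 4; ++i)
        r.array[i] = x.array[i];
    return r;
}

/* Small self-check: a caller can now build an argument and use the result. */
void check_foo(void) {
    l_array8_uchar in = {{10, 20, 30, 40, 50, 60, 70, 80}};
    l_array4_uchar out = foo(in);
    assert(out.array[0] == 10 && out.array[3] == 40);
}
```

Emitting each array wrapper once as a typedef and referring to it by name everywhere else keeps the declaration and its callers on the same type.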

On a second note, what kind of code would I need to feed clang to actually produce the example IR?

Compiling the following example with clang:

   typedef struct {int array[4];} array_t;

   array_t f(array_t b) {
       return b;
   }

results in:

   %struct.array_t = type { [4 x i32] }

   define void @f(%struct.array_t* noalias sret %agg.result, %struct.array_t* byval align 4 %b) nounwind {
   entry:
     %0 = bitcast %struct.array_t* %agg.result to i8*
     %1 = bitcast %struct.array_t* %b to i8*
     call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 16, i32 4, i1 false)
     ret void
   }

   declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind

This does not return the struct by value but moves the return value into the argument list (the sret parameter). Could you show me an example clang input that does produce the function declaration from your example?

Cheers,
  Roel

Hi Roel,

Yes, the current struct wrappers no longer avoid this problem: Chris broke it
when he introduced his new type system. It used to work. This breakage was one
of the straws that broke the camel's back and led to the CBE being removed.

As for getting clang to produce the example IR: why not just write the IR you
want to test directly, without bothering with clang? That said, try an array of
char rather than an array of int.
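
For reference, the char variant of the clang input would look something like the sketch below. It is the same shape as the int example from earlier in the thread with only the element type changed. What IR clang emits for it depends on the target ABI and has not been verified here; check_f is a hypothetical helper added only as a sanity check:

```c
#include <assert.h>

/* Same shape as the int example, but with char elements. */
typedef struct { char array[4]; } array_t;

array_t f(array_t b) {
    return b;
}

/* Sanity check that f hands back a copy of its argument. */
void check_f(void) {
    array_t in = {{'a', 'b', 'c', 'd'}};
    array_t out = f(in);
    assert(out.array[0] == 'a' && out.array[3] == 'd');
}
```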

Ciao, Duncan.