Global variable-length array

Question about "Pascal-style" arrays as mentioned in the reference guide.

Suppose I have a global variable which points to a constant, variable length array.

The question is, how can I assign an array of type "{ i32, [5 x float]}" to a global of type "{ i32, [0 x float]}"? From my experimentation, it appears you can't bitcast or call GEP on a constant aggregate - the only way I know of to get the address of a constant is to create a global, but I need the address to create the global.

The only way that I can think of to do it is to have two globals - an anonymous global which is of the same type as the actual array constant, and a second global which is assigned the bitcast of the GEP of the first global. However, this seems overly complicated - is there an easier way to do it?

-- Talin

Do you mean something like this:

struct foo {
   int x;
   char c[];
};

struct foo F = { 4, {3, 2, 1, 0 }};

struct foo *G = &F;

$ llvm-gcc t.c -S -o - -emit-llvm

  %struct.foo = type { i32, [0 x i8] }
@F = global { i32, [4 x i8] } { i32 4, [4 x i8] c"\03\02\01\00" } ; <{ i32, [4 x i8] }*> [#uses=1]
@G = global %struct.foo* bitcast ({ i32, [4 x i8] }* @F to %struct.foo*)

?

-Chris
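To make the effect of that bitcast concrete, here is a minimal sketch of code reading the array through G. It assumes GCC or Clang, since statically initializing a flexible array member is a GNU extension, and `sum_elements` is a hypothetical helper that is not part of the original example.

```c
#include <stddef.h>

struct foo {
    int x;     /* element count, Pascal-style */
    char c[];  /* flexible array member; becomes [0 x i8] in the IR */
};

/* GNU extension: the initializer gives F four bytes of storage for c,
   so its concrete IR type is { i32, [4 x i8] }. */
struct foo F = { 4, { 3, 2, 1, 0 } };

/* G only knows the abstract type, hence the bitcast in the emitted IR. */
struct foo *G = &F;

/* Hypothetical helper: sums the elements, trusting the stored count. */
int sum_elements(const struct foo *p) {
    int total = 0;
    for (int i = 0; i < p->x; i++)
        total += p->c[i];
    return total;
}
```

Calling `sum_elements(G)` walks all four elements even though the type seen through G says nothing about the array's length; the count in `x` carries that information at run time.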

Suppose I have a global variable which points to a constant, variable length array.

This is somewhat contradictory. Do you want a 'vla *g;' in C? If so, your global will be of type %vla** (you'll load a pointer to the array from the global). Its initializer would be a bitcast of a global which has some type similar to %vla. If not, and you indeed want a global of type %vla*, read on…

The question is, how can I assign an array of type “{ i32, [5 x float]}” to a global of type “{ i32, [0 x float]}”?

You can’t, but declarations and definitions can have differing types. For instance:

$ cat a.ll
@g = external global {i32, [0 x float]}

define {i32, [0 x float]}* @f() {
entry:
ret {i32, [0 x float]}* @g
}
$ cat b.ll
@g = constant {i32, [5 x float]} zeroinitializer

If this is for separate compilation, the linker will automatically resolve the type conflict. It does so by replacing all uses of the declaration with bitcasts of the definition. Using the same example:

$ llvm-as a.ll; llvm-as b.ll; llvm-link a.bc b.bc | llvm-dis
; ModuleID = ''
@g = constant { i32, [5 x float] } zeroinitializer ; <{ i32, [5 x float] }*> [#uses=1]

define { i32, [0 x float] }* @f() {
entry:
ret { i32, [0 x float] }* bitcast ({ i32, [5 x float] }* @g to { i32, [0 x float] }*)
}

Internally to your front end, you could do the same thing: Declare the global early with its abstract type, and later when the concrete storage type is known, use ConstantExpr::getBitCast, replaceAllUsesWith, and takeName to do the same thing the linker did above.

— Gordon

Chris Lattner wrote:

Do you mean something like this:

struct foo {
   int x;
   char c[];
};

struct foo F = { 4, {3, 2, 1, 0 }};

struct foo *G = &F;

$ llvm-gcc t.c -S -o - -emit-llvm

  %struct.foo = type { i32, [0 x i8] }
@F = global { i32, [4 x i8] } { i32 4, [4 x i8] c"\03\02\01\00" } ; <{ i32, [4 x i8] }*> [#uses=1]
@G = global %struct.foo* bitcast ({ i32, [4 x i8] }* @F to %struct.foo*)

?
  

Sort of. The example above presumes C semantics and type restrictions - I'm more interested in what's possible in the IR language.

Let me explain a bit more about the use case I am thinking of. Imagine I'm writing something like a Java or C# interpreter, where every object's first member is a pointer to a TypeInfoBlock (TIB). The TIB has a fixed-length header struct, followed by a variable-length vtable.

Part of the process of instantiating a new object consists of filling in the newly-allocated object's TIB pointer. For that reason, we want to declare the TIB as a global variable. Moreover, we'd like all TIBs to be the same type. One reason for this is that the TIB might be referred to by another module, one that doesn't know how large the vtable is. So when importing the definition of the TIB as a GlobalVariable, we'd like to be able to declare it as a standard TIB type, rather than the actual TIB initializer type, which may have an arbitrary number of vtable entries.

So in other words, we need the declared type of the TIB to be different from the type of its initializer. Since you can't put a bitcast in between the global and its initializer, the only way that I can think of to do this is to declare two globals, one for the externally visible type, and one for the initializer type.

As Gordon mentioned, you should define and initialize the global variable with its actual type, and any use of it should use the global variable bitcast to the pointer you want.

This is what my example was showing. The initializer for the G global is a bitcast version of F's address.

-Chris
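Applied to the TIB use case, the same pattern might look like the following C sketch. All names here (`struct tib`, `struct point_tib`, `call_method`, and the two methods) are hypothetical and chosen only for illustration. The pointer cast mirrors the IR bitcast; it relies on both structs sharing the same leading layout, which is common practice but not something strictly portable C promises.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(void);

/* Abstract type that every module sees: header plus an unsized vtable. */
struct tib {
    size_t vtable_len;
    method_fn vtable[];
};

/* Concrete storage type, known only to the module that defines the TIB. */
struct point_tib {
    size_t vtable_len;
    method_fn vtable[2];
};

static int point_get_x(void) { return 10; }
static int point_get_y(void) { return 20; }

/* The global is defined and initialized with its actual type. */
static const struct point_tib point_tib_storage = {
    2, { point_get_x, point_get_y }
};

/* Every use goes through a cast to the abstract pointer type,
   the C analogue of bitcasting the global in the IR. */
const struct tib *const point_tib = (const struct tib *)&point_tib_storage;

/* Dispatches through the vtable, checking the slot against the header. */
int call_method(const struct tib *t, size_t slot) {
    assert(slot < t->vtable_len);
    return t->vtable[slot]();
}
```

Code in another module would only need the declaration of `struct tib` and the `point_tib` pointer; the number of vtable entries stays a private detail of the defining module.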

All right, that solves my problem. Thanks :)

Gordon Henriksen wrote: