Clang and packing/unpacking structs used as arguments/return values

Hello list,

I am trying to interface some JIT’d code I generate (via the llvm-c API) with a module I’ve compiled alongside the jit, which contains certain runtime functions.

I generate runtime.bc with the same clang invocation as the rest of the application, except using -O0 and -emit-llvm. This is loaded at runtime, and I insert new functions into that module as I need them. For runtime functions that just use scalars, this works well.

I’m having problems with structs, though. My clang-generated module seems to define the struct as I’d expect, but functions that either take this struct as an argument or return it seem to fall apart.

I have:

typedef struct {
uint32_t value;
uint32_t carryOut;
uint32_t overflow;
} CarryOp_Result;

This was once upon a time defined at the top of the module as { i32, i32, i32 } as expected, but I’ve since changed things and now that’s not emitted in my runtime.bc anymore. I guess it’s not used/needed.

I also have a function in this runtime.bc module:
CarryOp_Result addWithCarry(uint32_t a, uint32_t b, uint32_t carryIn);

Which is defined to return { i64, i32 }. I suspect this is because clang knows the ABI of my host PC and bakes the calling convention into the IR (x86_64-apple-darwin11.4.2). I know enough not to expect to be able to use the same .bc on different arches for reasons such as this, at least :)

Because two struct fields are merged, it’s not trivial to get the one I’m interested in. Even if I do emit the required masks and shifts, what if it all changes tomorrow when I want this to work after recompiling everything for say, 32b Linux, or ARM with a different calling convention?
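For illustration, the masks and shifts mentioned above can be sketched in plain C. This is only a sketch under stated assumptions: the Coerced type and the coerce/coerced_* helpers are hypothetical names invented here, and the bit positions assume a little-endian target such as x86-64, where the { i64, i32 } form has the same memory image as the original 12-byte struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t value;
    uint32_t carryOut;
    uint32_t overflow;
} CarryOp_Result;

/* Hypothetical C mirror of the coerced { i64, i32 } form (not a Clang type). */
typedef struct {
    uint64_t lo; /* on little-endian: value in bits 0..31, carryOut in bits 32..63 */
    uint32_t hi; /* overflow */
} Coerced;

/* Reinterpret the 12-byte memory image of the struct as { i64, i32 }. */
static Coerced coerce(CarryOp_Result r) {
    Coerced c = {0, 0};
    memcpy(&c, &r, sizeof r);
    return c;
}

/* Field extraction by mask and shift; valid only under the assumptions above. */
static uint32_t coerced_value(Coerced c)    { return (uint32_t)(c.lo & 0xFFFFFFFFu); }
static uint32_t coerced_carryOut(Coerced c) { return (uint32_t)(c.lo >> 32); }
static uint32_t coerced_overflow(Coerced c) { return c.hi; }
```

On a big-endian target, or under an ABI that coerces the struct differently, these positions would no longer hold, which is exactly the portability worry raised above.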

So I had the bright idea to add some accessors to the runtime, which I could feed with the LLVMValue I have, regardless of the actual representation. This way for a given compile of the runtime everything would be self-consistent and “just work”, neatly sidestepping struct ABI packing issues …
uint32_t CarryOpValue(CarryOp_Result v) { return v.value; }

… or so I thought, anyway. This actually generates:
define i32 @CarryOpValue(i64 %v.coerce0, i32 %v.coerce1) #0 { … }

which takes two arguments, where I expected a single { i64, i32 } argument.

So now, even for my jit to emit
LLVMValueRef sum = CarryOpValue(addWithCarry(a, b, c));

The call site needs to know to unpack the single LLVMValue (representing the struct) into two arguments in order to pass them to the accessor. So I still need to poke around in the innards of the struct, and I still need to know how the compiler felt like packing the struct on a given platform.

Is there a way I can have Clang simply use {i32, i32, i32} for that struct in a way that GEP/ExtractValue with indices 0, 1, 2 reliably works as expected?

Alternatively, could I cast between {i64, i32} and {i32, i32, i32} in a reliable way, would that be safe?
Or any other solution?

If worse comes to worst, I can end up writing addWithCarry either by hand or with the builder API to do exactly what I want, but that’s annoying enough to be the reason I went down the “runtime-module full of helpers” path in the first place and have clang do the annoying work for me!

Thanks in advance,
DavidM

> Which is defined to return { i64, i32 }. I suspect this is because clang
> knows the ABI of my host PC and bakes the calling convention into the IR
> (x86_64-apple-darwin11.4.2). I know enough not to expect to be able to use
> the same .bc on different arches for reasons such as this, at least :)

Yup.

> Because two struct fields are merged, it's not trivial to get the one I'm
> interested in. Even if I do emit the required masks and shifts, what if it
> all changes tomorrow when I want this to work after recompiling everything
> for say, 32b Linux, or ARM with a different calling convention?

You'd have to re-compile the whole thing for a different arch. This is
a known issue, and to some extent it has been designed that way.

Implementing all of that in the back-ends would be too complex (or at
least too large a change, now) to do effectively. Maybe with some of
the refactoring going on in the SelectionDAG, etc., we might get that
streamlined, but I wouldn't hold my breath.

> Is there a way I can have Clang simply use {i32, i32, i32} for that struct
> in a way that GEP/ExtractValue with indices 0, 1, 2 reliably works as
> expected?

Nope. You should talk to the NaCl guys: they have the same problem,
and AFAIK they solved it by keeping several architecture-specific
bitcode files around.

> Alternatively, could I cast between {i64, i32} and {i32, i32, i32} in a
> reliable way, would that be safe?

You'd have to do that on every function call, so you'd have to track
all PCS-specific lowering on all modules that might interact with that
(or other) functions.

cheers,
--renato
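One commonly used way to sidestep per-call coercion, not proposed in the thread itself, is to keep the struct out of by-value signatures and pass it through a pointer instead. A minimal sketch follows, with loud assumptions: addWithCarryOut and carryOpValuePtr are hypothetical names, the carry/overflow semantics (unsigned carry out, signed overflow, carryIn being 0 or 1) are a guess at what the original addWithCarry computes, and the struct's field layout is still chosen by the target, even though three uint32_t fields normally map to { i32, i32, i32 }.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t value;
    uint32_t carryOut;
    uint32_t overflow;
} CarryOp_Result;

/* Hypothetical variant: the result is written through a pointer, so the
 * function signature carries only scalars and one pointer. */
void addWithCarryOut(uint32_t a, uint32_t b, uint32_t carryIn,
                     CarryOp_Result *out) {
    /* Assumed semantics: carryIn is 0 or 1. */
    uint64_t wide = (uint64_t)a + (uint64_t)b + (uint64_t)carryIn;
    int64_t swide = (int64_t)(int32_t)a + (int64_t)(int32_t)b + (int64_t)carryIn;

    out->value = (uint32_t)wide;
    out->carryOut = (uint32_t)(wide >> 32);                      /* unsigned carry */
    out->overflow = (swide != (int64_t)(int32_t)out->value);     /* signed overflow */
}

/* Hypothetical accessor that also takes a pointer rather than a struct value. */
uint32_t carryOpValuePtr(const CarryOp_Result *r) { return r->value; }
```

With signatures like these, the JIT side would allocate the struct itself, pass its address, and read fields back afterwards, rather than reconstructing how a by-value struct was split across registers.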

We've talked in the past about having some ABI builders, factoring out the code in Clang so that you'd have a generic library that could construct LLVM types from C types, construct calls to LLVM functions from C types, and extract IR values that directly correspond to C types after calls.

As far as I know, no one has started doing this, but with GSoC season approaching it would make a very useful student project. I'd happily sign up to mentor it if we had a volunteer...

David

I think his needs are slightly different. What we talked about was to
have a PCSBuilder or to add PCS-aware function lowering in the
IRBuilder, but in his case, if I got it right, he has to interact with
already-built IR from his own functions, which happens to have been
compiled for one target and lowered a bit too much.

I wouldn't advise keeping the IR at a higher level *after* emission,
i.e. it's not a good idea to store half-lowered IR in bitcode files, or
we'll end up creating a new form of IR, which will open a can of
worms.

cheers,
--renato

As I understand it, his problem is that he has a set of runtime functions compiled with clang and a JIT compiler that needs to interface with them. I'd imagine the builder having an API that would be usable for this case, something like this:

ABIBuilder B(ABI::x86_64);

PCSCStructType CarryOpResultTy = B.CreateStructTy(B.getUInt32Ty(), B.getUInt32Ty(), B.getUInt32Ty());

ABIFunctionType AddFnTy = B.getFunctionType(CarryOpResultTy, B.getUInt32Ty(), B.getUInt32Ty(), B.getUInt32Ty());
// Here, a, b, and carryIn are all i32. (If they were structure types you'd need a B.PackStruct() or similar call.)
ABIReturn AddResult = B.CreateCall(AddFn, AddFnTy, a, b, carryIn);

Value *value = B.ExtractValue(AddResult, 0);
Value *carryOut = B.ExtractValue(AddResult, 1);
Value *overflow = B.ExtractValue(AddResult, 2);

The resulting IR would be a bit messy, but we already have passes that can clean it up. Clang already generates some quite nasty IR in these cases and gets it cleaned up correctly. The author of the JIT, once it used the ABIBuilder for types that needed to interoperate with the platform ABI, would only need to pass a different initialiser value to the ABIBuilder's constructor.

It would be really great to have this functionality in LLVM, as currently every single front-end author has to do it independently if they want to be able to interoperate with C (and, by C, I mean any language that uses the target platform ABI, which is defined in terms of C types). I'd be very happy to mentor a GSoC student who wanted to implement this...

Of course, in this specific example, using the overflow-checked arithmetic builtins would make more sense than using runtime library functions, but the concept still stands.

David
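As a rough illustration of the overflow-checked builtins mentioned above, here is a sketch of an add-with-carry written with __builtin_add_overflow, a Clang/GCC extension. The semantics are an assumption, since the thread never shows the body of addWithCarry: carryOut is taken to be the unsigned carry, overflow the signed overflow, and carryIn is assumed to be 0 or 1. The function is named addWithCarryBuiltin here to make clear it is not the original.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t value;
    uint32_t carryOut;
    uint32_t overflow;
} CarryOp_Result;

/* Hypothetical implementation; assumes carryIn is 0 or 1. */
static CarryOp_Result addWithCarryBuiltin(uint32_t a, uint32_t b, uint32_t carryIn) {
    CarryOp_Result r;
    uint32_t partial, sum;
    int32_t spartial, ssum;

    /* Unsigned chain: at most one of the two additions can carry. */
    bool c1 = __builtin_add_overflow(a, b, &partial);
    bool c2 = __builtin_add_overflow(partial, carryIn, &sum);

    /* Signed chain: two overflows in a row cancel out (wrap down, then
     * back up), so the combined overflow flag is the XOR of the two. */
    bool v1 = __builtin_add_overflow((int32_t)a, (int32_t)b, &spartial);
    bool v2 = __builtin_add_overflow(spartial, (int32_t)carryIn, &ssum);

    r.value = sum;
    r.carryOut = (uint32_t)(c1 || c2);
    r.overflow = (uint32_t)(v1 != v2);
    return r;
}
```

A JIT would more likely emit the corresponding LLVM overflow intrinsics directly instead of calling a C helper; the C version only shows the arithmetic involved.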