But then, why refuse aggregates as input or output of a call? What is
the rationale?
Because LLVM has no notion of aggregates as "values" that can be
passed around as atomic units. This is a very important design point,
and it has many useful consequences.
I see. You explained one of them in a message on the XL mailing list, which I think is worth repeating here:
This doesn't fit naturally with the way that LLVM does things: In
LLVM, each instruction can produce at most one value. This means that
a pointer to the instruction is as good as a pointer to the value,
which dramatically simplifies the IR and everything that consumes or
produces it.
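To illustrate (a minimal sketch in current IR syntax, with a made-up @example function, not taken from the original exchange): each instruction in a function body defines at most one SSA value, so the name of the value and the instruction that defines it are interchangeable:
define i32 @example(i32 %a, i32 %b) {
  %sum = add i32 %a, %b      ; this instruction defines exactly one value, %sum
  %twice = mul i32 %sum, 2   ; a use of %sum refers directly to the add instruction above
  ret i32 %twice
}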
An additional constraint you did not mention is that all the values must be first-class. But what counts as "first class" actually depends on the hardware and ABI. An i64, for instance, is first class on 64-bit CPUs, but not on 32-bit CPUs. Is the following legal on a 32-bit target?
declare i64 @foo(i128, i256)
The "getaggregatevalue" is a localized hack to work
around this for the few cases that return multiple values.
As a matter of fact, what annoys me the most about the getaggregatevalue proposal is precisely that it does not seem very localized to me. What about:
%Agg = call {int, float} %foo()
%intpart = getaggregatevalue {int, float} %Agg, uint 0
[insert 200 instructions here]
%floatpart = getaggregatevalue {int, float} %Agg, uint 1
What about a downstream IR manipulation turning that into:
%Agg = call {int, float} %foo()
%intpart = getaggregatevalue {int, float} %Agg, uint 0
br label %somewhere
somewhere:
%floatpart = getaggregatevalue {int, float} %Agg, uint 1
I am afraid that the hack would not remain localized for long;
i.e. you would probably need extra machinery to keep the call and the getaggregatevalue close together.
Unfortunately, this wouldn't solve the problem that you think it
does. For example, let's assume that LLVM allowed you to pass and
return structs by value. Even with this, LLVM would not be able to
directly implement all ABIs "naturally". For example, some ABIs
specify that a _Complex double should be returned in two FP registers,
but that a struct with two doubles in it should be returned in memory.
Even today, that must be special-cased, i.e. the IR already needs to differ between the two cases. As I understand it, the following is already legal, since vectors are first class:
declare <2 x double> @builtin_complex_add (<2 x double>, <2 x double>)
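As a sketch (again in current syntax, and assuming nothing beyond ordinary vector arithmetic), a body for it could be as simple as a single vector add, since complex addition is component-wise:
define <2 x double> @builtin_complex_add(<2 x double> %a, <2 x double> %b) {
  %sum = fadd <2 x double> %a, %b   ; real and imaginary parts add independently
  ret <2 x double> %sum
}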
That <2 x double> signature would cover the built-in complex type. The user-defined complex-in-struct type could be one of the following, depending on the ABI:
declare void @user_complex_add (double, double, double, double, {double, double} *)
declare void @user_complex_add ({double, double} *, double, double, double, double)
declare void @user_complex_add ({double, double} *, {double, double} *, {double, double} *)
My proposal would not invalidate any of these, but it would also allow the following, which would immediately be expanded into the appropriate one of the above, depending on the target calling conventions:
declare {double, double} @user_complex_add({double, double}, {double, double})
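To make the expansion concrete (a hypothetical sketch, not from the original thread): assuming the target picks the first of the three variants above, and writing %a.re, %a.im, %b.re and %b.im for the scalar components the front end would produce, a call site could be lowered to something like:
  %res = alloca {double, double}   ; hidden result slot introduced by the lowering
  call void @user_complex_add(double %a.re, double %a.im, double %b.re, double %b.im, {double, double}* %res)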
It's possible that you want to allow some parameter attributes, i.e. be able to distinguish:
declare sret {double, double} @user_complex_add({double, double}, {double, double})
declare inreg {double, double} @user_complex_add({double, double}, {double, double})
By the time you lower to LLVM, all you have is {double, double}. In
fact, there is no way, in general, to retain all the high-level
information in LLVM without flavoring the LLVM IR with target info.
Agreed.
Anyway, for the moment, I will generate what LLVM accepts as input.
Thanks
Christophe