This is my current understanding of the current situation and the
proposed solution.
Currently llvm-gcc compiles
What you described is not "the C way" of doing things, but the ABI way. How arguments are passed should not be source language dependent, but calling convention dependent. I think that if other source languages require other calling conventions, either new calling conventions should be added to LLVM, or the frontends for those languages should emulate their particular requirements using tricks like your third proposal (and e.g. by turning parameter types implicitly into pointer types).
Jonas
Hi Rafael,
> 2) add a "byref" mark in the pointer argument.
I think you mean "bycopy" or "byval" here.
> 3) Have llvm-gcc create a copy before calling the function.
Don't forget that the function may be called by code that
was not compiled by LLVM. That's why we have to pay attention
to the ABI! Solution (3) supposes we have control over both
callers and callees, but we need to handle the situation in
which we are calling code compiled by a different compiler,
or code compiled by a different compiler is calling us.
Ciao,
Duncan.
> 2) add a "byref" mark in the pointer argument.
> I think you mean "bycopy" or "byval" here.
Yes, good catch.
> 3) Have llvm-gcc create a copy before calling the function.
> Don't forget that the function may be called by code that
> was not compiled by LLVM. That's why we have to pay attention
> to the ABI! Solution (3) supposes we have control over both
> callers and callees, but we need to handle the situation in
> which we are calling code compiled by a different compiler,
> or code compiled by a different compiler is calling us.
I now notice that I made a mistake similar to the one in the current
implementation. I assumed that passing a structure with an
implicit pointer is equivalent to passing it with an explicit one.
That is not the case. On x86_64, for example, the explicit pointer is
passed in rdi, and the implicit one is computed directly from the
stack pointer.
I would like to hide as much of the ABI from LLVM as possible, but
that would be a much bigger change than I thought.
So, I think that the only question left before I try to implement
Chris's solution is why it is better to have a "byval" attribute
instead of adding support for struct in arguments? That is, why
define void @f(%struct.cpp_num* %num byval)
is better than
define void @f(%struct cpp_num %num)
Just curious
> Ciao,
> Duncan.
Thanks a lot,
Rafael
> 2) add a "byref" mark in the pointer argument.
> I think you mean "bycopy" or "byval" here.
> Yes, good catch.
Yep, this is the right way to go.
> I would like to hide as much of the ABI from LLVM as possible, but
> that would be a much bigger change than I thought.
Yep, the ultimate goal is to capture enough information that the code generator can implement the ABI correctly, without cluttering up the IR with lots of stuff. Passing the pointer to the argument and telling the code generator "this argument was really passed by value" is enough for many things, and we can refine it to capture other hard things in time.
> So, I think that the only question left before I try to implement
> Chris's solution is why it is better to have a "byval" attribute
> instead of adding support for struct in arguments? That is, why
> define void @f(%struct.cpp_num* %num byval)
> is better than
> define void @f(%struct cpp_num %num)
> Just curious
Good question! There are two answers:
1. LLVM has no support for aggregates as values. SSA values are currently
allowed only to have 'first class' types (see the LLVM Language
Reference Manual).
This means that there is no way to define a value that could be passed
into a function.
We would have to do a significant extension to allow non-scalar values,
and that brings up a host of problems. In particular, SSA depends on
the notion that the SSA values are efficiently copyable (e.g. phi node
lowering turns phis into copies). If SSA values were aggregates, we'd
have to have phi nodes lowerable into memcpy's, etc. In the historical
development of SSA form, many early compilers tried to rewrite
everything into SSA form, including memory and aggregates, and had
significant problems with this or ran into significant complications.
LLVM sidesteps this problem by not trying to do this, but it does lead
to other minor complications (like this one).
2. There are two problems that need to be solved here. One is capturing
whether the struct was passed by value (which we can easily do with the
attribute); the other is deciding (in the codegen) what to do with it.
In particular, some ABIs require that "_Complex float" be passed by
value *differently* than a struct with two floats in it. Even if LLVM
supported passing aggregates by value, this problem would still be
unsolved. Since the solution to the two problems is the same (function
attributes), solving them both with the same mechanism is attractive.
Another alternative would be to try to encode the full source level
type system into the LLVM IR, but that is intractable for many reasons
:), and wouldn't solve the by-value copying problem.
-Chris