Exact meaning of byval

Hi,

After working with LLVM for a while, I'm still a little confused about the
meaning of the 'byval' attribute. From the LangRef:

"This indicates that the pointer parameter should really be passed by value to
the function. The attribute implies that a hidden copy of the pointee is made
between the caller and the callee, so the callee is unable to modify the value
in the callee. This attribute is only valid on llvm pointer arguments. It is
generally used to pass structs and arrays by value, but is also valid on
scalars (even though this is silly)."

I'm particularly confused by the "between the caller and the callee" part. The
way I see this, the responsibility for the copying should be with either the
caller or the callee, not somewhere in between. In particular, I think byval
could either mean:
  a) The callee is not allowed to modify the argument. If the original code
  modifies the argument, the callee should make a copy and modify that
  instead.
  b) The caller will always pass in a copy of the object, which the callee is
  free to modify and will be thrown away afterwards.

In both cases, it seems that a byval argument must always be a valid pointer.

From the code, I suspect that option b is the case. I would think that option
a is the better option, since it can prevent copies when the callee doesn't
even modify the value (but perhaps this stems from the C ABI or something?)

Gr.

Matthijs

Matthijs Kooijman wrote:

> I'm particularly confused by the "between the caller and the callee" part. The
> way I see this, the responsibility for the copying should be with either the
> caller or the callee, not somewhere in between. In particular, I think byval
> could either mean:
> a) The callee is not allowed to modify the argument. If the original code
> modifies the argument, the callee should make a copy and modify that
> instead.
> b) The caller will always pass in a copy of the object, which the callee is
> free to modify and will be thrown away afterwards.
  

Or it could mean:

c) Either a) or b) where the choice is determined by the target when lowering calls and arguments. This allows the different targets to do different things with this attribute to satisfy their ABI.

(I don't know LLVM well enough yet to comment on which of these is right)

> In both cases, it seems that a byval argument must always be a valid pointer.
>
> From the code, I suspect that option b is the case. I would think that option
> a is the better option, since it can prevent copies when the callee doesn't
> even modify the value (but perhaps this stems from the C ABI or something?)
  

Conversely option a) allows the caller to not make a copy when there are no further uses of the object in the function.

Richard

Hi All,

> I'm particularly confused by the "between the caller and the callee" part. The
> way I see this, the responsibility for the copying should be with either the
> caller or the callee, not somewhere in between. In particular, I think byval
> could either mean:
> a) The callee is not allowed to modify the argument. If the original code
> modifies the argument, the callee should make a copy and modify that
> instead.
> b) The caller will always pass in a copy of the object, which the callee is
> free to modify and will be thrown away afterwards.
>
> Or it could mean:
>
> c) Either a) or b) where the choice is determined by the target when
> lowering calls and arguments. This allows the different targets to do
> different things with this attribute to satisfy their ABI.

Any definitive comment on this one?

The main problem I see with option c) is that different languages want
different semantics, so it's not really target ABI dependent but language ABI
dependent. That would require the frontend to explicitly add the copying,
because byval couldn't be reliably used for non-internal functions. Does that
make sense?

Also, I think that an implicit requirement of the byval attribute is that the
pointer it is attached to must always be valid. I.e., a load from it is
guaranteed to be valid (which is what argpromotion seems to assume). I'm not
sure whether the current frontends fully comply with this, though...

Gr.

Matthijs

Hi, byval means that the parameter setup code for performing a call
copies the struct onto the stack, in an appropriate position relative
to the other call parameters.

> I'm particularly confused by the "between the caller and the callee" part. The
> way I see this, the responsibility for the copying should be with either the
> caller or the callee,

It is with the caller. It would be nice to do an explicit memcpy in the caller,
but the problem is that the value has to end up in a very precise place on the
stack and there is no way of representing this in the LLVM IR.

> not somewhere in between. In particular, I think byval
> could either mean:
> a) The callee is not allowed to modify the argument. If the original code
> modifies the argument, the callee should make a copy and modify that
> instead.

It is the same as passing anything else (for example an integer) by value:
the callee can modify the value if it likes, but this doesn't cause any
changes to the value the caller has. That's because, just like for an
integer, a copy is passed to the callee (in a register or on the stack;
for byval arguments it's always on the stack).
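
To make the comparison with an integer concrete, here is a minimal C sketch.
The struct and function names are hypothetical. Whether a frontend expresses
such a parameter in LLVM IR as a pointer carrying byval depends on the target
ABI, so treat that mapping as an assumption; the sketch only shows the
source-level behaviour being described.

```c
#include <assert.h>

/* Hypothetical aggregate, large enough that many ABIs pass it in memory. */
struct Big { int data[16]; };

/* Takes the struct by value and modifies its own copy. */
static int bump_first(struct Big b) {
    b.data[0] += 1;
    return b.data[0];
}

/* The same thing for a plain integer. */
static int bump_int(int n) {
    n += 1;
    return n;
}

/* Returns 0 if the caller's values are untouched after both calls. */
static int caller_sees_changes(void) {
    struct Big big = { { 41 } };
    int n = 41;

    int r1 = bump_first(big);   /* callee works on a copy of big */
    int r2 = bump_int(n);       /* callee works on a copy of n */

    assert(r1 == 42 && r2 == 42);
    return big.data[0] != 41 || n != 41;
}
```

In both calls the callee's modification is visible only in its return value;
the caller's `big` and `n` keep their original contents.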

> b) The caller will always pass in a copy of the object, which the callee is
> free to modify and will be thrown away afterwards.

This is how it is done.

> In both cases, it seems that a byval argument must always be a valid pointer.

Correct. I thought this was stated in the LangRef?

> From the code, I suspect that option b is the case. I would think that option
> a is the better option, since it can prevent copies when the callee doesn't
> even modify the value (but perhaps this stems from the C ABI or something?)

If the callee doesn't modify the argument, then it might be possible to drop
the byval attribute (you also have to worry about aliasing etc, but some IPO
pass could do it).

Ciao,

Duncan.

In that case, if you did want the callee to copy the argument, am I right in
thinking that the frontend would have to pass a pointer and add an explicit
memcpy in the callee?

Richard

Hi Richard,

> In that case if you did want the callee to copy the argument am I right
> in thinking that the Frontend would have to pass a pointer and add an
> explicit memcpy in the callee?

I'm not sure what you are asking. Of course caller or callee can
always allocate some temporary on the stack and memcpy to it, then
use that copy from then on. The point of byval is that the copy is
made where the ABI mandates it for by-value call parameters. If
you don't need to be ABI compatible then there is no point using
byval: it would be better to use an explicit temporary and memcpy.
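
As a rough illustration of the "explicit temporary and memcpy" alternative,
here is a C sketch in which the caller makes the copy itself and passes an
ordinary pointer. All names are hypothetical, and this is only one way a
frontend might arrange it when ABI compatibility is not required.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical aggregate, for illustration only. */
struct Point { int x; int y; };

/* The callee takes an ordinary pointer and is free to modify the pointee. */
static int shift_and_sum(struct Point *p) {
    p->x += 10;
    return p->x + p->y;
}

/* The caller allocates a temporary, copies into it, and passes the copy.
   The temporary is simply discarded when this function returns. */
static int call_with_explicit_copy(const struct Point *original) {
    struct Point tmp;
    memcpy(&tmp, original, sizeof tmp);
    return shift_and_sum(&tmp);
}

/* Returns the caller's x after the call, to show it was not modified. */
static int original_x_after_call(void) {
    struct Point pt = { 1, 2 };
    int sum = call_with_explicit_copy(&pt);
    assert(sum == 13);
    return pt.x;
}
```

Here the copy lives wherever the compiler places `tmp`, which is exactly the
freedom that byval gives up in order to put the copy where the ABI expects it.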

Ciao,

Duncan.

> I'm not sure what you are asking. Of course caller or callee can
> always allocate some temporary on the stack and memcpy to it, then
> use that copy from then on. The point of byval is that the copy is
> made where the ABI mandates it for by-value call parameters. If
> you don't need to be ABI compatible then there is no point using
> byval: it would be better to use an explicit temporary and memcpy.

I think the point is that there could be an ABI where some argument is
guaranteed not to be modified. I.e., like having a const * argument to a
function, but in such a way that the function body can actually modify the
value pointed to, in which case the frontend must modify the code to work on a
copy of the value instead of the actual pointed-to value. Something like that?

Gr.

Matthijs

I was referring to the best way to satisfy an ABI which requires that
structures / unions are passed as a read-only pointer, with the callee making
a copy if it needs to modify the value.
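
A minimal C sketch of what code for such an ABI might look like, assuming the
frontend inserts the copy itself rather than relying on byval. The struct, the
function names, and the `boost` parameter are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical aggregate, for illustration only. */
struct Config { int level; int flags; };

/* Under the assumed ABI the callee receives a read-only pointer. It copies
   the object only on the path that needs to modify it. */
static int effective_level(const struct Config *cfg, int boost) {
    if (!boost)
        return cfg->level;               /* read-only path: no copy needed */

    struct Config local;
    memcpy(&local, cfg, sizeof local);   /* callee-side copy before writing */
    local.level += boost;
    return local.level;
}

/* Returns the caller's level after both calls, to show it is unchanged. */
static int caller_level_after_calls(void) {
    struct Config cfg = { 3, 0 };
    int plain = effective_level(&cfg, 0);
    int boosted = effective_level(&cfg, 2);
    assert(plain == 3 && boosted == 5);
    return cfg.level;
}
```

The caller passes `&cfg` directly in both calls, so no copy is made at all on
the path where the callee only reads.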

Richard