Difference between call site attributes and declaration attributes

bfbachmann · January 2, 2025, 7:01pm

I recently ran into issues using the byval attribute. I was adding it to my functions that take arguments by value, but as pointers. I noticed that the optimizer would totally butcher my code if I only added the byval attribute to the function declaration and not at the call site. Specifically, it would simply remove any code that initialized data that I was passing in my byval arguments, so it would just pass a pointer to uninitialized memory from a previous alloca. Adding byval to the relevant arguments at the call site seemed to fix the problem.

This is very confusing to me as none of the documentation I’ve read says anything about call site attributes (to the point where I didn’t even think you could set attributes at the call site). Why do I need to set attributes at the call site if they’re already on the function declaration? Am I using this attribute wrong? Is this covered somewhere in the docs?

For additional context, I also posted about this on r/llvm with specific code samples.

Thanks so much for your help!

akorobeynikov · January 2, 2025, 8:39pm

This is right here:

‘function args’: argument list whose types match the function signature argument types and parameter attributes. All arguments must be of first class type. If the function signature indicates the function accepts a variable number of arguments, the extra arguments can be specified.

The IR Verifier is supposed to catch these kinds of discrepancies, so it would make sense to run it on your IR output.

bfbachmann · January 2, 2025, 9:04pm

Thank you!

nikic · January 2, 2025, 9:28pm

This is quite peculiar. You are the third person to hit this case recently, the two others being "opt -O3 removes memove initialiser" (llvm/llvm-project issue #118235) and "DSE removes store to an alloca that is passed to a byval parameter" (llvm/llvm-project issue #120696).

The background here is that for indirect calls like call void %foo(), where %foo is an argument/instruction rather than a global, the function declaration is not known, so the only place where ABI-affecting attributes can be specified is the call-site. Anything that affects the ABI (including the function type and the ABI attributes) needs to be the same at the call-site and the function definition, to ensure that the caller passes arguments the same way as the callee receives them. (This is a first approximation; I’m glossing over some details here.)
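To make the rule above concrete, here is a minimal sketch of a direct and an indirect call that both repeat the ABI attributes at the call site. The names %struct.Pair, @consume, @direct_caller and @indirect_caller are hypothetical and not taken from the thread; the alignment of 8 is likewise just an assumption for illustration.

```llvm
; Hypothetical aggregate type used only for this example.
%struct.Pair = type { i64, i64 }

; The declaration carries the ABI attributes...
declare void @consume(ptr byval(%struct.Pair) align 8)

define void @direct_caller() {
entry:
  %tmp = alloca %struct.Pair, align 8
  store i64 1, ptr %tmp, align 8
  %second = getelementptr inbounds %struct.Pair, ptr %tmp, i32 0, i32 1
  store i64 2, ptr %second, align 8
  ; ...and the direct call site repeats them so both sides agree.
  call void @consume(ptr byval(%struct.Pair) align 8 %tmp)
  ret void
}

define void @indirect_caller(ptr %callee, ptr %arg) {
entry:
  ; For an indirect call there is no declaration to consult, so the
  ; call site is the only place the ABI attributes can appear.
  call void %callee(ptr byval(%struct.Pair) align 8 %arg)
  ret void
}
```

In a front-end that builds IR through the API rather than as text, this would presumably mean adding the same parameter attributes to the call instruction as to the function.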

A mismatch is still valid IR, just undefined behavior at runtime, which is why the IR verifier does not report this. This is something the IR linter (-passes=lint) should report, but currently doesn’t.

The reason why you mostly get away with only placing the attributes on the function declaration is that LLVM usually inherits attributes from the declaration to the call-site for optimization purposes, which also inherits the ABI attributes. This isn’t guaranteed though, and you hit one of the cases where it does not happen.
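For contrast, a sketch of the mismatched shape described in the original post might look like the following. Again, %struct.Pair, @consume and @caller are hypothetical names. Per the explanation above, this passes the verifier but is undefined behavior at runtime, and the original poster observed the initializing store being removed in a case like this.

```llvm
; Hypothetical aggregate type used only for this example.
%struct.Pair = type { i64, i64 }

; byval appears on the declaration only.
declare void @consume(ptr byval(%struct.Pair) align 8)

define void @caller() {
entry:
  %tmp = alloca %struct.Pair, align 8
  ; This store initializes the memory the callee is expected to copy.
  store i64 1, ptr %tmp, align 8
  ; Mismatch: the call site omits byval, so it does not agree with the
  ; declaration, and nothing guarantees the attribute is inherited.
  call void @consume(ptr %tmp)
  ret void
}
```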

All this doesn’t appear to be well-documented in LangRef. We should improve that.

rnk · January 7, 2025, 12:01am

The confusion here is very similar in spirit to the (admittedly extremely old) FAQ entry about mismatched calling conventions becoming unreachable:

Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into “unreachable”? Why not make the verifier reject it?
This is a common problem run into by authors of front-ends that are using custom calling conventions: you need to make sure to set the right calling convention on both the function and on each call to the function.

ABI argument attributes are sort of a calling convention extension mechanism, so you need to set them on both the call site and function declaration for the same reasons.
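As a small illustration of the calling-convention analogy, the sketch below sets the same convention on both the function and the call. The function names are hypothetical, and fastcc is just one example of a non-default convention.

```llvm
; The callee is defined with a non-default calling convention.
define fastcc void @callee(i32 %x) {
entry:
  ret void
}

define void @caller() {
entry:
  ; The call names the same convention. Leaving out fastcc here would
  ; be the mismatch that the FAQ entry describes.
  call fastcc void @callee(i32 1)
  ret void
}
```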

As Nikita mentioned, LLVM will sometimes look through direct call sites to see some ABI attributes, but it really shouldn’t. I believe Arthur attempted to stop looking through direct call sites, but it breaks a ton of instrumentation passes, which then need to start adding sext annotations and similar attributes.

As for documenting this, I think the paragraphs on parameter attributes could use some work and communicate this.

nikic · January 7, 2025, 1:18pm

I’ve created two PRs to improve this a bit.

LangRef: "[LangRef] Add some documentation for ABI / call-site attributes" (llvm/llvm-project pull request #121930)
Lint: "[Lint] Lint mismatch in ABI attributes" (llvm/llvm-project pull request #121929)

bfbachmann · January 8, 2025, 6:57pm

Thank you @nikic. You’re awesome!
