RFC: liveoncall parameter attribute

TLDR - I have a runtime which expects to be able to inspect certain arguments to a function even if those arguments aren't used within the callee itself. DeadArgumentElimination doesn't respect this today. I want to add a parameter attribute that marks an argument to a call as live even if the value is known to be unused in the callee.
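
For concreteness, here is a minimal sketch of the situation in LLVM IR. The function names are made up for illustration, the IR uses the current opaque `ptr` spelling, and the `liveoncall` spelling in the last comment is only the attribute proposed in this RFC, not something LLVM accepts today.

```llvm
; Before DeadArgumentElimination: %state is passed to @callee but never read there.
define internal i32 @callee(i32 %x, ptr %state) {
entry:
  %r = add i32 %x, 1
  ret i32 %r
}

define i32 @caller(i32 %x, ptr %state) {
entry:
  %r = call i32 @callee(i32 %x, ptr %state)
  ret i32 %r
}

; After the pass (conceptually), the dead parameter and the matching call
; operand are both gone, so a runtime that inspects the call's arguments
; no longer finds %state:
;   define internal i32 @callee(i32 %x) { ... }
;   %r = call i32 @callee(i32 %x)
;
; With the proposed (hypothetical) attribute, the call would instead be
; written so that the operand has to be preserved:
;   %r = call i32 @callee(i32 %x, ptr liveoncall %state)
```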

My use case

Hey Philip

I have no problem with this but I'm also not experienced enough with patch points to give a formal LGTM or anything like that.

However, I do wonder about another pass impacting calls. I can't remember its name right now but it's basically IPO SROA. It would be able to take an argument you've tagged here and, if it's a struct, split it in two.

Can you imagine ever creating a call in your runtime where that would be a problem?

If you can't then no worries and I can't think of anything better named than liveoncall. However, if you can then how about something simple like noopt on the argument which just turns off any optimization on that argument?

Alternatively, the intrinsic equivalent to this is sideeffects, but that might be a bit overkill as it could mean just about anything. I'd prefer to keep this quite specific.

Cheers
Pete

Hi Philip,

Without knowing more, using an intrinsic for this seems like a better way to go: the intrinsic call will keep the value alive, and you can special-case the behavior for inlining in the inliner, which is apparently the only place where this is problematic. I suspect that this will have a lot less impact on the existing compiler, be much less likely to break going forward, and also compose trivially with other existing argument attributes.

What are the disadvantages of going with a new intrinsic?

-Chris

> Hey Philip
>
> I have no problem with this but I'm also not experienced enough with patch points to give a formal LGTM or anything like that.
>
> However, I do wonder about another pass impacting calls. I can't remember its name right now but it's basically IPO SROA. It would be able to take an argument you've tagged here and, if it's a struct, split it in two.
>
> Can you imagine ever creating a call in your runtime where that would be a problem?

In my specific case, not really. We break apart all structs into their component pieces for ABI reasons. But your point is a good one and is definitely worth considering in terms of general usage in LLVM.

The challenge here is that the splitting you describe might not be an optimization. I'm not sure, but I think there are calling conventions already in tree that require exactly this type of splitting. I think this is currently done in the frontend (clang), but if we wanted to change that, it would be problematic.

I think we'd need to restrict this to ABI primitive types, but that's not unreasonable to do. Definitely something to document though.

The intrinsic I was thinking of would be something along the lines of:
    declare void @llvm.live_on_call(<type> %val) readnone

One challenge would be that I'd want the intrinsic to have readnone memory semantics, but such an intrinsic with a void return would be trivially dropped. We already somewhat hack around this for things like assumes, so I could probably extend the logic.

Places that would need to be extended include:
- InlineCostAnalysis - discount the cost of the call
- InlineFunction - remove the call after inlining
- AliasAnalysis/MDA/etc. (per above comment)
- Verifier - must be in entry block
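
A rough sketch of how the intrinsic form might be used inside a callee, assuming the hypothetical declaration above. The `.p0` overload suffix and the function name are assumptions for illustration, and since no such intrinsic exists in LLVM this fragment is not expected to pass through existing tools unchanged.

```llvm
; Hypothetical intrinsic from this thread; it does not exist in LLVM.
; The proposal gives it readnone memory semantics.
declare void @llvm.live_on_call.p0(ptr)

define i32 @callee(i32 %x, ptr %state) {
entry:
  ; Counts as a use of %state, so the argument is no longer dead.
  ; Per the list above, the verifier would require this call to sit in the
  ; entry block, and the inliner would drop it once @callee is inlined.
  call void @llvm.live_on_call.p0(ptr %state)
  %r = add i32 %x, 1
  ret i32 %r
}
```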

You're right, that doesn't actually sound that bad. It's a bit more code, but not *that* much more. Semantically, it makes a bit more sense to me as a parameter attribute - it really is a requirement of the *call*, not of the *callee* - but I could see doing the intrinsic instead. It's not that confusing.

As it happens, my immediate motivation to implement this has disappeared. It turned out the symptom I was investigating when I came up with this proposal was merely a hint of a larger problem that became apparent once we started really looking at what dead argument elimination was doing in the original case. We use a limited form of function interposition which replaces the implementation of the callee, and dead arg elimination is unsound in this context. (That is, just because we analyzed what the method did as if it were compiled doesn't mean it's safe to run that method in the interpreter. That unused pointer argument might actually be accessed, and it had better be valid and dereferenceable even if the value loaded is provably irrelevant to the final result.)

Any solution I've come up with to the underlying problem solves this one somewhat by accident, so I'm probably going to just set this aside for the moment and come back to it in the future if I need it again.
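
To illustrate the interposition hazard described above, here is a hedged sketch in LLVM IR. The names are invented, and the rewritten call reflects my understanding of what dead argument elimination may do for an externally visible function with an exact definition (older releases would use `undef` where this uses `poison`).

```llvm
; Compiled definition: %unused is never read.
define i32 @method(i32 %x, ptr %unused) {
entry:
  %r = mul i32 %x, 2
  ret i32 %r
}

define i32 @caller(i32 %x, ptr %p) {
entry:
  ; Before the pass, the caller passes a valid pointer:
  ;   %r = call i32 @method(i32 %x, ptr %p)
  ; After the pass (sketch), the operand carries no meaningful value:
  %r = call i32 @method(i32 %x, ptr poison)
  ret i32 %r
}
```

If the runtime then swaps @method for an interpreted implementation that does touch its second argument, that implementation receives a pointer it cannot safely dereference, even though the compiled body never needed it.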

Philip