How to prevent optimizing away a call + its arguments

Hi llvm-dev,

I have a C function:

__attribute__((__visibility__("default")))
__attribute__((used))
__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
  asm volatile("" :::);
}

(the purpose is that this function will be used dynamically at runtime, perhaps by interposing the function, or via the debugger)

I really thought this would not get optimized out, but I've realized (the hard way) that LLVM will happily optimize a call to this function and replace all arguments with undef, because it figures out that they're not really needed.

I'm going to fix this by passing the arguments explicitly as inputs to the asm, but is that expected? Is there a more reasonable way (an attribute) of telling the compiler that it should not assume anything about the body of the function (in particular, not assume that it does nothing) and should not optimize out the arguments?

Thanks,
Kuba

Hi Kuba,

Try:

__attribute__((optnone))

See https://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone

Actually, it should be enough to use:

__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
  asm volatile("":::"memory");
}

Creating a real barrier is important.

Joerg

optnone should work, but really noinline should probably work as well. (Chandler: can you confirm whether it is reasonable to model noinline as “no interprocedural analysis across this function boundary”? For example, FunctionAttrs would do the same thing for noinline as it does for optnone, i.e. not derive any new attributes. That would allow the function to be optimized internally (unlike optnone) but not allow interprocedural analysis inside the function to be used in callers (unlike optnone).)

noinline should not, and does not, mean “do not do IPO”; we will still do IPCP. The easiest way to defeat IPO is to use __attribute__((weak)), as it makes isDefinitionExact false: https://godbolt.org/g/VVBcgF
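
For illustration, here is a minimal sketch of the weak-attribute approach, assuming a GCC- or Clang-compatible compiler. The names debug_hook_weak and call_weak_hook are hypothetical; they are not taken from the thread or from the godbolt link.

```c
#include <stddef.h>

/* Hypothetical hook name (an assumption for this sketch).
 * weak: the definition may be replaced at link time, so the optimizer is
 * expected not to rely on this particular body when optimizing callers.
 * __asm__ is the spelling of asm that also works under strict -std modes. */
__attribute__((weak, noinline))
void debug_hook_weak(int arg1, void *arg2) {
  (void)arg1;
  (void)arg2;
  __asm__ volatile("" ::: "memory");
}

/* Hypothetical caller: hands real values to the hook, then returns the
 * local that the hook was given a pointer to. */
int call_weak_hook(int value) {
  int local = value;
  debug_hook_weak(value, &local);
  return local;
}
```

Calling call_weak_hook(7) runs the hook and returns 7. Whether the arguments survive at the call site is best confirmed by inspecting the generated IR or assembly for your compiler version.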

IMHO, no. The only semantic of noinline is that the inliner pass(es) shouldn’t do anything with that function. Inhibiting IPO seems like it could be a valid use case, but that would need a different attribute (e.g. noipo).

I don't think it is reasonable to expect "noinline" to mean "must not do
IPA". There are different reasons for using "noinline": ensuring a stack
frame, forcing outlining of "cold" code etc. Many of those reasons are
perfectly fine to still allow IPA. Debug hooks fall into two categories:
making sure that the call happens (noinline should allow that) and
making sure that the debugger can actually do something at this point
(noinline should not have to allow that).

Joerg

noinline (& in fact, even optnone) doesn’t make sure the call happens - various forms of IPA can cause a call to go away without actually inlining.

(The simplest example, which even the inliner got wrong, and which I fixed recently, which is why any of this comes to mind and I have any context on it: the inliner removed a call to an optnone+readnone function without consulting the inliner heuristic (this was in the alwaysinliner), because it assumed the operation was so cheap that no inliner heuristic would ever disagree, basically ;) )

But some other optimization could/would still remove a noinline+readnone function because it’s a trivially dead instruction (assuming the result isn’t used). So noinline doesn’t preserve the call - because some IPA can, in some cases, be as powerful as inlining-ish.
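
A small sketch of the situation described above, assuming a GCC- or Clang-compatible compiler. Both function names are hypothetical. The callee has no side effects and its result is discarded, so an optimizer is free to delete the call even though the callee is never inlined.

```c
/* Hypothetical helper: reads and writes no memory, so an optimizer can
 * treat it as side-effect free (what LLVM would call readnone). */
__attribute__((noinline))
static int square_no_side_effects(int x) {
  return x * x;
}

/* The result is thrown away. noinline only forbids inlining; it does not
 * stop the call from being removed as a trivially dead instruction. */
int caller_ignoring_result(int v) {
  (void)square_no_side_effects(v);
  return v;
}
```

The observable behavior is the same whether or not the call survives, which is exactly why the compiler is allowed to drop it.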

I agree, but I still don't think it's `noinline`'s job to prevent this
from happening. It sounds weird (and is probably a POLA violation) to have
`noinline` prevent interprocedural constant propagation.
As for `optnone`, I'm surprised it is not powerful enough to prevent this
from happening, modulo bugs of course. Do you have other examples?

There were bugs. I fixed them. :) Specifically, it was a combination of two things: FunctionAttrs proving readnone on an optnone function (fixed), and the alwaysinliner killing trivially dead calls, so that any call to a readnone function, even without alwaysinline, could be ‘inlined’ (removed) by the alwaysinliner (also fixed).

Actually, it should be enough to use:

__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
  asm volatile("":::"memory");
}

Creating a real barrier is important.

This doesn't work – the call still gets turned into please_do_not_optimize_me_away(undef, undef).

__attribute__((optnone))

optnone works, but I'm actually surprised by this. I would expect that it would only affect the generated code of that function...

Is it guaranteed to work? Or is my safest bet still to use:

__attribute__((noinline))
void please_do_not_optimize_me_away(int arg1, void *arg2) {
  asm volatile("" :: "r" (arg1), "r" (arg2) : "memory");
}

(The other benefit compared to optnone is that this will actually generate a nice empty function. Using optnone generates code that stores the arguments to the stack.)

Kuba

Modulo bugs, yes - optnone should have the same behavior as if you put the function definition in another file and compiled that file with -O0.
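
As a rough sketch of how the optnone variant might be written portably: optnone is a Clang attribute, so the example below guards it with __has_attribute and falls back to plain noinline elsewhere. The macro HOOK_OPTNONE and the function names are hypothetical.

```c
/* HOOK_OPTNONE is a hypothetical helper macro for this sketch. It expands
 * to the Clang optnone attribute when the compiler reports support for it
 * and to nothing otherwise. */
#if defined(__has_attribute)
#  if __has_attribute(optnone)
#    define HOOK_OPTNONE __attribute__((optnone))
#  endif
#endif
#ifndef HOOK_OPTNONE
#  define HOOK_OPTNONE
#endif

/* Hypothetical hook: under Clang this is meant to be compiled as if at -O0,
 * per the explanation above. Without optnone only noinline applies. */
HOOK_OPTNONE __attribute__((noinline))
void debug_hook_optnone(int arg1, void *arg2) {
  (void)arg1;
  (void)arg2;
}

/* Hypothetical caller that passes real values to the hook. */
int call_optnone_hook(int value) {
  int local = value;
  debug_hook_optnone(value, &local);
  return local;
}
```

On compilers without optnone the fallback gives none of the guarantees discussed here, so this form only makes sense when Clang is the target compiler.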

It looks like what you are trying to do here is define a weak function. Marking the function as weak should have the desired effect, though the last time I looked, Apple platforms only had limited support for weak functions...

- Matthias

I'm not saying it does that. But I am saying that is why someone might
want to use it. That's why I gave the example with the memory clobber --
that is known to work for both GCC and Clang and fits here in the sense
that (1) it can't be duplicated (2) it contains a side effect. Now
whether this is the semantic we want to have for noinline is a different
question.

Joerg

If you also want it to preserve the arguments (that wasn't clear to me),
just add them as arguments to the asm statement?

Joerg
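
To make the last suggestion concrete, here is a small sketch of a hook that names its arguments as asm inputs, together with a call site. It assumes a GCC- or Clang-compatible compiler; debug_hook_inputs and call_hook_with_inputs are hypothetical names.

```c
/* Hypothetical hook. The empty asm lists both arguments as register
 * inputs, so the function body visibly uses them, and the "memory"
 * clobber makes the asm act as a compiler-level memory barrier. */
__attribute__((noinline))
void debug_hook_inputs(int arg1, void *arg2) {
  __asm__ volatile("" : : "r"(arg1), "r"(arg2) : "memory");
}

/* Hypothetical caller: fills a small buffer, passes it to the hook,
 * then returns the sum of the two elements. */
int call_hook_with_inputs(int value) {
  int buffer[2] = { value, value + 1 };
  debug_hook_inputs(value, buffer);
  return buffer[0] + buffer[1];
}
```

Because the asm emits no instructions, the hook should still compile down to an essentially empty function, which was the stated advantage over optnone. Checking the emitted call site is still advisable.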