Revisiting/refining the definition of optnone with interprocedural transformations

There seems to be a bunch of confusion and probably some
conflation/ambiguity about whether we're talking about IR constructs
or the C attributes.

Johannes - I assume your claim is restricted mostly to the IR? That
having optnone not require or imply noinline improves orthogonality of
features and that there are reasonable use cases where one might want
optnone while allowing inlining (of the optnone function) or optnone
while disallowing inlining (of the optnone function)

Yes, IR level is my concern. I'd be in favor of matching it in C
but I don't care enough to go through the discussions.

My latest use case: `s/optnone//g` should not result in a verifier
error when you just want to try out an optimization on a single function.

Paul - I think you're mostly thinking about/interested in the specific
source level/end user use case that motivated the initial
implementation of optnone. Where, I tend to agree - inlining an
optnone function is not advantageous to the user. Though it's possible
Johannes's argument could be generalized from IR to C and still
apply: orthogonal features are more powerful and the user can always
compose them together to get what they want. (good chance they're
using attributes behind macros for ease of use anyway - they're a bit
verbose to write by hand all the time)

I'd give the user the capability building blocks and let them work
with that, but again, this is not my main concern right now.

There's also the -O0 use of optnone these days (clang puts optnone on
all functions when compiling with -O0 - the intent being to treat such
functions as though they were compiled in a separate object file
without optimizations (that's me projecting what I /think/ the mental
model should be) - which, similarly, I think will probably want to
keep the current behavior (no ipa/inlining and no optimization -
however that's phrased).

Essentially the high level use cases of optnone all look like "imagine
if I compiled this in a separate object file without LTO" - apparently
even noipa+optnone isn't enough for that, maybe (I need to test that
more, based on a comment from Johannes earlier that inexact
definitions don't stop the inliner... )? Sounds like maybe it's more
like what I'm thinking of is "what if this function had weak linkage"
(ie: could be replaced by a totally different one)?

Yes, weak should do the trick. That said, if you want "separate
object file without LTO", go with `noinline` + `noipa`. This will
make the call edges optimization barriers. If you want to also not
optimize the function, add `optnone` as required.
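For illustration only (not from the thread): a minimal C sketch of how these building blocks might be spelled at the source level, assuming a GCC- or Clang-compatible compiler. The macro names `CALL_BARRIER` and `UNOPTIMIZED` and the function `checksum` are hypothetical. Support for `noipa` and `optnone` is probed with `__has_attribute` because not every compiler accepts both. Since Clang's source-level `optnone` is described in this thread as implying `noinline`, the sketch shows how the attributes compose, not the proposed IR semantics.

```c
#include <assert.h>

/* Fallback so the probes below also compile where __has_attribute is missing. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical macro: make calls to the function an optimization barrier. */
#if __has_attribute(noipa)
#define CALL_BARRIER __attribute__((noinline, noipa))
#else
#define CALL_BARRIER __attribute__((noinline))
#endif

/* Hypothetical macro: additionally leave the function body unoptimized. */
#if __has_attribute(optnone)
#define UNOPTIMIZED __attribute__((optnone))
#else
#define UNOPTIMIZED /* expands to nothing where optnone is unavailable */
#endif

/* Callers cannot inline this function, and its body stays unoptimized
   on compilers that understand optnone. */
CALL_BARRIER UNOPTIMIZED
int checksum(const unsigned char *data, int len) {
    assert(len >= 0);
    int sum = 0;
    for (int i = 0; i < len; ++i)
        sum += data[i];
    return sum;
}
```

Keeping the two macros separate mirrors the "building blocks" argument: a user who only wants the call barrier can drop `UNOPTIMIZED` without touching the rest.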

FWIW, if all functions are `optnone`, `noinline` and `noipa` are
not needed, that is the -O0 case. More specifically, if your caller
is `optnone`, `noinline` and `noipa` are not needed (for that caller).

~ Johannes

> There seems to be a bunch of confusion and probably some
> conflation/ambiguity about whether we're talking about IR constructs
> or the C attributes.
>
> Johannes - I assume your claim is restricted mostly to the IR? That
> having optnone not require or imply noinline improves orthogonality of
> features and that there are reasonable use cases where one might want
> optnone while allowing inlining (of the optnone function) or optnone
> while disallowing inlining (of the optnone function)

Yes, IR level is my concern. I'd be in favor of matching it in C
but I don't care enough to go through the discussions.

My latest use case: `s/optnone//g` should not result in a verifier
error when you just want to try out an optimization on a single function.

I guess you meant adding optnone, rather than removing it? (removing
optnone shouldn't cause any verifier errors, should it?)

> Paul - I think you're mostly thinking about/interested in the specific
> source level/end user use case that motivated the initial
> implementation of optnone. Where, I tend to agree - inlining an
> optnone function is not advantageous to the user. Though it's possible
> Johannes's argument could be generalized from IR to C and still
> apply: orthogonal features are more powerful and the user can always
> compose them together to get what they want. (good chance they're
> using attributes behind macros for ease of use anyway - they're a bit
> verbose to write by hand all the time)

I'd give the user the capability building blocks and let them work
with that, but again, this is not my main concern right now.

> There's also the -O0 use of optnone these days (clang puts optnone on
> all functions when compiling with -O0 - the intent being to treat such
> functions as though they were compiled in a separate object file
> without optimizations (that's me projecting what I /think/ the mental
> model should be) - which, similarly, I think will probably want to
> keep the current behavior (no ipa/inlining and no optimization -
> however that's phrased).
>
> Essentially the high level use cases of optnone all look like "imagine
> if I compiled this in a separate object file without LTO" - apparently
> even noipa+optnone isn't enough for that, maybe (I need to test that
> more, based on a comment from Johannes earlier that inexact
> definitions don't stop the inliner... )? Sounds like maybe it's more
> like what I'm thinking of is "what if this function had weak linkage"
> (ie: could be replaced by a totally different one)?

Yes, weak should do the trick. That said, if you want "separate
object file without LTO", go with `noinline` + `noipa`. This will
make the call edges optimization barriers. If you want to also not
optimize the function, add `optnone` as required.

Starts to feel like a long list to get what seems like one holistic
concept ("treat it as though it were a separate non-LTO object file" /
"treat it as though this function had weak linkage"), but it's
probably OK/a fine thing to do (not like there's a high cost to having
multiple attributes since the attribute lists are shared, etc) - just
psychologically for me, there seems to be one core concept and
stitching it together from several attributes makes me worry that
there are gaps (as there have been/what's motivating this discussion -
though it certainly sounds like there will be fewer gaps after this
work, for sure).

FWIW, if all functions are `optnone`, `noinline` and `noipa` are
not needed, that is the -O0 case. More specifically, if your caller
is `optnone`, `noinline` and `noipa` are not needed (for that caller).

Right - though LTO is the case that motivated adding optnone for -O0,
so it would be respected even under LTO - so in that case we'd want
noinline and noipa, by the sounds of it.

- Dave

There seems to be a bunch of confusion and probably some
conflation/ambiguity about whether we're talking about IR constructs
or the C attributes.

Johannes - I assume your claim is restricted mostly to the IR? That
having optnone not require or imply noinline improves orthogonality of
features and that there are reasonable use cases where one might want
optnone while allowing inlining (of the optnone function) or optnone
while disallowing inlining (of the optnone function)

Yes, IR level is my concern. I'd be in favor of matching it in C
but I don't care enough to go through the discussions.

My latest use case: `s/optnone//g` should not result in a verifier
error when you just want to try out an optimization on a single function.

I guess you meant adding optnone, rather than removing it? (removing
optnone shouldn't cause any verifier errors, should it?)

Yes, right, my bad. Removing `noinline` or adding `optnone` can
get you in trouble.
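For illustration only (not from the thread): a toy C model of the constraint being discussed, namely that `optnone` is currently only accepted together with `noinline`. The flag names and the functions `attrs_are_valid` and `demo_edits` are hypothetical and are not LLVM's verifier code; the sketch only shows why adding `optnone` or removing `noinline` can produce an invalid set while removing `optnone` cannot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags standing in for IR function attributes. */
enum {
    ATTR_OPTNONE  = 1u << 0,
    ATTR_NOINLINE = 1u << 1
};

/* Toy model of the current rule: optnone is only valid with noinline. */
bool attrs_are_valid(unsigned attrs) {
    if ((attrs & ATTR_OPTNONE) && !(attrs & ATTR_NOINLINE))
        return false;
    return true;
}

/* Apply single-attribute edits to attribute sets that start out valid. */
void demo_edits(void) {
    unsigned both = ATTR_OPTNONE | ATTR_NOINLINE;
    unsigned none = 0u;

    assert(attrs_are_valid(both));
    assert(attrs_are_valid(none));

    assert(!attrs_are_valid(both & ~(unsigned)ATTR_NOINLINE)); /* removing noinline */
    assert(!attrs_are_valid(none | ATTR_OPTNONE));             /* adding optnone */
    assert(attrs_are_valid(both & ~(unsigned)ATTR_OPTNONE));   /* removing optnone */
}
```

Under the proposal to decouple the two attributes, the model would reduce to `attrs_are_valid` always returning true for these two flags.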

Paul - I think you're mostly thinking about/interested in the specific
source level/end user use case that motivated the initial
implementation of optnone. Where, I tend to agree - inlining an
optnone function is not advantageous to the user. Though it's possible
Johannes's argument could be generalized from IR to C and still
apply: orthogonal features are more powerful and the user can always
compose them together to get what they want. (good chance they're
using attributes behind macros for ease of use anyway - they're a bit
verbose to write by hand all the time)

I'd give the user the capability building blocks and let them work
with that, but again, this is not my main concern right now.

There's also the -O0 use of optnone these days (clang puts optnone on
all functions when compiling with -O0 - the intent being to treat such
functions as though they were compiled in a separate object file
without optimizations (that's me projecting what I /think/ the mental
model should be) - which, similarly, I think will probably want to
keep the current behavior (no ipa/inlining and no optimization -
however that's phrased).

Essentially the high level use cases of optnone all look like "imagine
if I compiled this in a separate object file without LTO" - apparently
even noipa+optnone isn't enough for that, maybe (I need to test that
more, based on a comment from Johannes earlier that inexact
definitions don't stop the inliner... )? Sounds like maybe it's more
like what I'm thinking of is "what if this function had weak linkage"
(ie: could be replaced by a totally different one)?

Yes, weak should do the trick. That said, if you want "separate
object file without LTO", go with `noinline` + `noipa`. This will
make the call edges optimization barriers. If you want to also not
optimize the function, add `optnone` as required.

Starts to feel like a long list to get what seems like one holistic
concept ("treat it as though it were a separate non-LTO object file" /
"treat it as though this function had weak linkage"), but it's
probably OK/a fine thing to do (not like there's a high cost to having
multiple attributes since the attribute lists are shared, etc) - just
psychologically for me, there seems to be one core concept and
stitching it together from several attributes makes me worry that
there are gaps (as there have been/what's motivating this discussion -
though it certainly sounds like there will be fewer gaps after this
work, for sure).

If you want "separate non-LTO" behavior by design, put it in a
different file and don't compile that file with LTO ;)

Inside a single TU there is no "separate non-LTO" idea, it is
a single TU after all. To get the same effect we build it from
blocks that have a meaning in the single TU case. Sure, there
might be gaps left, that usually means there is something missing
in the single TU case as well so a new attribute is in order.

FWIW, if all functions are `optnone`, `noinline` and `noipa` are
not needed, that is the -O0 case. More specifically, if your caller
is `optnone`, `noinline` and `noipa` are not needed (for that caller).

Right - though LTO is the case that motivated adding optnone for -O0,
so it would be respected even under LTO - so in that case we'd want
noinline and noipa, by the sounds of it.

If you run only one file with -O0 and then LTO it with other files
that do not have -O0, you probably want to add `noinline` + `noipa`
at the "entry points" that you want to debug. I think -O0 could
reasonably add all three attributes anyway, if you say O0 they all
make sense (to me). If you want more control, you need to seed them
manually.

~ Johannes

There seems to be a bunch of confusion and probably some
conflation/ambiguity about whether we're talking about IR constructs
or the C attributes.

Johannes - I assume your claim is restricted mostly to the IR? That
having optnone not require or imply noinline improves orthogonality of
features and that there are reasonable use cases where one might want
optnone while allowing inlining (of the optnone function) or optnone
while disallowing inlining (of the optnone function)

Even at the IR level, I'd argue that inlining a function marked optnone
is violating the contract that the function should not be optimized;
because once it's inlined somewhere else, there's no control over the
optimization applied to the inlined instance.

Not to say we couldn't redefine the IR optnone that way, but it really
feels wrong to have 'optnone' actually mean 'optsometimes'.

Paul - I think you're mostly thinking about/interested in the specific
source level/end user use case that motivated the initial
implementation of optnone. Where, I tend to agree - inlining an
optnone function is not advantageous to the user. Though it's possible
Johannes's argument could be generalized from IR to C and still
apply: orthogonal features are more powerful and the user can always
compose them together to get what they want. (good chance they're
using attributes behind macros for ease of use anyway - they're a bit
verbose to write by hand all the time)

I'm obviously finding it hard to imagine a real use-case for that...
I mean, sure you can lay out cases and say in a rather theoretical
way, here's this interesting thing that happens when you do this.
Interesting things are interesting, but are they practical/useful?
Any non-speculative, real-world applications? The YAGNI principle
applies here.

(I believe the original inspiration was an MSVC feature, actually.)

As long as the existing Clang __attribute__((optnone)) semantics don't
change (i.e., continue to imply noinline) it won't affect my users;
but I would *really* not want to change something like that on them,
without a bona fide use-case that could be readily explained.

Here's a real-world case that might help explain my resistance.
Sony has a downstream feature that allows suppressing debug-info for
inlined functions; the argument is that these are generally small,
easily verifiable, and debugging sessions that keep popping down into
them are annoying and distracting from looking at the real problem.

Our initial implementation depended on whether the function was
actually inlined. For one thing, it was easy to identify inlined
scopes, and just not emit them. However, this was a terrible user
experience, because whether step-in did or didn't happen was dependent
on how the optimizer happened to feel that day. Programmers had no
control over their debugging experience.

We changed this so that programmers could tell, by looking at their
source code, whether debug info would be suppressed. In effect it's
a command-line option that implicitly adds 'nodebug' to a given set
of cases (methods defined in-class, 'inline' keyword).

So, anything that smacks of "you get different things depending on
whether the compiler decided to inline your function" just makes me
twitch.

And that's what the "optnone doesn't mean noinline" proposal does.

There's also the -O0 use of optnone these days (clang puts optnone on
all functions when compiling with -O0 - the intent being to treat such
functions as though they were compiled in a separate object file
without optimizations (that's me projecting what I /think/ the mental
model should be) - which, similarly, I think will probably want to
keep the current behavior (no ipa/inlining and no optimization -
however that's phrased).

The -O0 case was so that you can mix -O0 with LTO and have it stick.
Your as-if seems like a reasonable model for it.

Thanks,
--paulr

There seems to be a bunch of confusion and probably some
conflation/ambiguity about whether we're talking about IR constructs
or the C attributes.

Johannes - I assume your claim is restricted mostly to the IR? That
having optnone not require or imply noinline improves orthogonality of
features and that there are reasonable use cases where one might want
optnone while allowing inlining (of the optnone function) or optnone
while disallowing inlining (of the optnone function)

Even at the IR level, I'd argue that inlining a function marked optnone
is violating the contract that the function should not be optimized;
because once it's inlined somewhere else, there's no control over the
optimization applied to the inlined instance.

Not to say we couldn't redefine the IR optnone that way, but it really
feels wrong to have 'optnone' actually mean 'optsometimes'.

It does not. Take `noinline` as an example. A `noinline` function
is not inlined, so far so good. Now a caller of a `noinline`
function might be inlined all over the place anyway.
What I try to say is that function attributes apply to the function,
not to the rest of the world. If you want to say: do not optimize this
code ever, not here nor anywhere else, use `optnone` + `noinline`. If
you want the function symbol to contain unoptimized code so
you can debug it, use `optnone`.

Let's make my context sensitive debugging example more concrete:

    static void f1(int x) { ... }
    static void f2(int x) { ...; f1(x); ... }
    static void f3(int x) { ...; f2(x); ... }
    static void f4(int x) { ...; f3(x); ... }

    static void broken() { f4(B); }
    static void working() {
      for (int i = 0; i < 1<<20; ++i)
        f4(A);
    }

    void entry() {
      working();
      broken();
    }

So, let's assume we crash somewhere in f1 when we reach it from
broken but not from working. To debug we want to avoid optimizing
f1-4 and broken. To do that we can add `optnone` to all 5 functions
and `noinline` to broken. The effect will be that we have untouched
code in the call chain we want to debug while we potentially/probably
have reasonably fast code in the context of working which allows us
to actually run this fast.

Right now, you can get that effect if you use `__attribute__((flatten))`
on working, however it reverses the problem. You are required to "mark"
all context that should be fast, not the ones you want to debug. Both
can be useful (IMHO).
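For illustration only (not from the thread): a compilable C variant of the example above with the suggested attributes attached, assuming a GCC- or Clang-compatible compiler. The macros `DBG_NOOPT` and `DBG_NOINLINE`, the function bodies, the counter `calls_to_f1`, and the argument values 1 and 7 (standing in for `A` and `B`) are all hypothetical. Clang's source-level `optnone` is described in this thread as implying `noinline`, so with today's compilers this only shows where the annotations would go; the faster inlined copies in `working` are what the proposal would enable.

```c
#include <assert.h>

#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical helper macros; optnone is not available on every compiler. */
#if __has_attribute(optnone)
#define DBG_NOOPT __attribute__((optnone))
#else
#define DBG_NOOPT
#endif
#define DBG_NOINLINE __attribute__((noinline))

static int calls_to_f1 = 0;

/* The call chain we want to step through unoptimized. */
DBG_NOOPT static int f1(int x) { ++calls_to_f1; return x + 1; }
DBG_NOOPT static int f2(int x) { return f1(x) * 2; }
DBG_NOOPT static int f3(int x) { return f2(x) - 3; }
DBG_NOOPT static int f4(int x) { return f3(x) + 4; }

/* The context under investigation: unoptimized and never inlined. */
DBG_NOOPT DBG_NOINLINE static int broken(void) { return f4(7); }

/* The hot context: left unannotated so the optimizer is free to act on it. */
static long working(void) {
    long sum = 0;
    for (int i = 0; i < 1 << 20; ++i)
        sum += f4(1);
    return sum;
}

long entry(void) {
    calls_to_f1 = 0;
    long fast = working();
    int slow = broken();
    /* f1 is reached once per loop iteration plus once from broken(). */
    assert(calls_to_f1 == (1 << 20) + 1);
    return fast + slow;
}
```

The annotations mark only the path being debugged, which is the reverse of the `flatten` approach where every fast context has to be marked instead.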

Paul - I think you're mostly thinking about/interested in the specific
source level/end user use case that motivated the initial
implementation of optnone. Where, I tend to agree - inlining an
optnone function is not advantageous to the user. Though it's possible
Johannes's argument could be generalized from IR to C and still
apply: orthogonal features are more powerful and the user can always
compose them together to get what they want. (good chance they're
using attributes behind macros for ease of use anyway - they're a bit
verbose to write by hand all the time)

I'm obviously finding it hard to imagine a real use-case for that...
I mean, sure you can lay out cases and say in a rather theoretical
way, here's this interesting thing that happens when you do this.
Interesting things are interesting, but are they practical/useful?
Any non-speculative, real-world applications? The YAGNI principle
applies here.

What about the above? I can totally imagine something like this.

(I believe the original inspiration was an MSVC feature, actually.)

As long as the existing Clang __attribute__((optnone)) semantics don't
change (i.e., continue to imply noinline) it won't affect my users;
but I would *really* not want to change something like that on them,
without a bona fide use-case that could be readily explained.

Here's a real-world case that might help explain my resistance.
Sony has a downstream feature that allows suppressing debug-info for
inlined functions; the argument is that these are generally small,
easily verifiable, and debugging sessions that keep popping down into
them are annoying and distracting from looking at the real problem.

Our initial implementation depended on whether the function was
actually inlined. For one thing, it was easy to identify inlined
scopes, and just not emit them. However, this was a terrible user
experience, because whether step-in did or didn't happen was dependent
on how the optimizer happened to feel that day. Programmers had no
control over their debugging experience.

We changed this so that programmers could tell, by looking at their
source code, whether debug info would be suppressed. In effect it's
a command-line option that implicitly adds 'nodebug' to a given set
of cases (methods defined in-class, 'inline' keyword).

So, anything that smacks of "you get different things depending on
whether the compiler decided to inline your function" just makes me
twitch.

And that's what the "optnone doesn't mean noinline" proposal does.

Let's take a step back for a second and assume we would have
always said `optnone` + `noinline` gives you exactly what you
get right now with `optnone`. I think we can explain that to
people, we can say, `optnone` will prevent optimization "inside
this symbol" and `noinline` will prevent the code from being copied
into another symbol. Every use case you have could be served by
adding these two attributes instead of the one you do now. Everyone
would be as happy as they are, all the benefits would be exactly
the same, no behavior change if you use the two together. That said,
it would open up the door for context sensitive debugging. Now you
can argue nobody will ever want to debug only a certain path through
their program, but I find that position requires a justification more
than the opposite which assumes people will find a way to benefit from
it.

There's also the -O0 use of optnone these days (clang puts optnone on
all functions when compiling with -O0 - the intent being to treat such
functions as though they were compiled in a separate object file
without optimizations (that's me projecting what I /think/ the mental
model should be) - which, similarly, I think will probably want to
keep the current behavior (no ipa/inlining and no optimization -
however that's phrased).

The -O0 case was so that you can mix -O0 with LTO and have it stick.
Your as-if seems like a reasonable model for it.

I totally think -O0 can imply all three attributes, optnone, noipa, noinline.
That is not an issue as far as I'm concerned.

~ Johannes

> Not to say we couldn't redefine the IR optnone that way, but it really
> feels wrong to have 'optnone' actually mean 'optsometimes'.

It does not.

A point of phrasing: Please do not tell me how I feel.
I say it feels wrong, and your denial does not help the conversation.

The word "none" means "none." It does not mean "sometimes." Can we
agree on that much?

Redefining a term "xyz-none" to mean "xyz-sometimes" feels wrong.
If you want an attribute that means "optsometimes" then it should
be a new attribute, with a name that reflects its actual semantics.
I am not opposed to that, but my understanding is that we have been
arguing about the definition of the existing attribute.

Take `noinline` as an example. A `noinline` function
is not inlined, so far so good.

And if the compiler decides it is useful to make copies/clones of
the function, those aren't inlined either. The copies retain their
original attributes and semantics. (Perhaps the compiler can make
copies to take advantage of argument propagation, or some such. I
do not think this proposition is unreasonable.)

Now a caller of a `noinline`
function might be inlined all over the place anyway.
What I try to say is that function attributes apply to the function,
not to the rest of the world. If you want to say: do not optimize this
code ever, not here nor anywhere else, use `optnone` + `noinline`. If
you want the function symbol to contain unoptimized code so
you can debug it, use `optnone`.

You are making a severe distinction between the copy of the function
that happens not to be inlined, and the copies that have been inlined,
such that the inlined copies have lost their original properties.
But just as the copies of the `noinline` function retain `noinline`
and are not inlined, I argue that the `optnone` function copies ought
to retain `optnone` and not be optimized.

LLVM does not have a way to not-optimize part of a function, so we
achieve the goal by not inlining `optnone` functions.

I dispute that the inlined copies of an `optnone` function should
lose that much of their original characteristics, and the rest of
the disagreement follows from there. But I have a suggestion to
offer below.

Let's take a step back for a second and assume we would have
always said `optnone` + `noinline` gives you exactly what you
get right now with `optnone`. I think we can explain that to
people, we can say, `optnone` will prevent optimization "inside
this symbol" and `noinline` will prevent the code from being copied
into another symbol. Every use case you have could be served by
adding these two attributes instead of the one you do now. Everyone
would be as happy as they are, all the benefits would be exactly
the same, no behavior change if you use the two together. That said,
it would open up the door for context sensitive debugging. Now you
can argue nobody will ever want to debug only a certain path through
their program, but I find that position requires a justification more
than the opposite which assumes people will find a way to benefit from
it.

I don't think "inside this symbol" is meaningful to most programmers.
They see methods/functions, and the internal operation of compilers
(e.g., making copies of functions) are relatively mysterious. I say
this as someone who has spent many decades helping programmers use my
compilers.

I don't dispute that you can invent a scenario where it could be useful;
I reserve the right to be unpersuaded that it would occur often enough
that people would think of and make use of the feature.

I totally think -O0 can imply all three attributes, optnone, noipa,
noinline.

I totally think -O0 can imply { opt-sometimes, noipa, noinline }; and
this combination can be an upgrade path away from the existing optnone.

Can we proceed on that basis?

Thanks,
--paulr

I'll just say,

If you want "separate non-LTO" behavior by design, put it in a
different file and don't compile that file with LTO ;)

that is impractical in many build systems, and an excessive amount
of work if you want to selectively build some code with -O0
because you're debugging something at the moment.
--paulr

Most of this is a pretty academic discussion - and probably more
heat/angst/difficulty than is needed right now, as much as I do care
about both perspectives (orthogonality of features V usability for the
common case).

I'm going to add noipa, and I'm going to wire it up to optnone in
clang. It's possible one of two things happen there: Either we wire up
noipa the same way noinline is (optnone /requires/ noinline now, and
so it'd /require/ noipa) or we change LLVM IR to remove that
constraint/tie between optnone and noinline, and add noipa in that way
too. (the third option of having optnone require one but not both of
these attributes isn't a state I'd want to get in) - though clang -O0
and clang __attribute__((optnone)) would both still lower to
optnone+noinline+noipa regardless of whether LLVM enforces the
connection between them or not.

- Dave

Because I put my constructive suggestion at the end of a long
email, I'll repeat it with more clarity here:

`optnone` is what it is.

Define a new `opt-sometimes` that means what Johannes suggests.
(A better name is more than welcome!)
Clang's __attribute__((optnone)) can migrate to meaning
{ opt-sometimes, noipa, noinline }.

`optnone` can be retired, existing only in the bitcode upgrader
which replaces it with { opt-sometimes, noipa, noinline }.

Changing Clang's attribute to mean something else can be put off
to another day.
--paulr

Because I put my constructive suggestion at the end of a long
email, I'll repeat it with more clarity here:

`optnone` is what it is.

Define a new `opt-sometimes` that means what Johannes suggests.
(A better name is more than welcome!)

The point is, I never suggested to change the meaning of
`optnone` (in IR). I argue to change the requirement for
it to always go with `noinline`. `optnone` itself is not
changed, you get exactly the same behavior you got before,
and `noinline` is also not changed. They are simply not
required to go together.

If you look at the uses of `optnone` in LLVM, you will not
find the passes to look for `noinline` as well, nor do they
need to. The decision to (not) act is based on `optnone`,
which is not changed at all.

~ Johannes

I'll just say,

If you want "separate non-LTO" behavior by design, put it in a
different file and don't compile that file with LTO ;)

that is impractical in many build systems, and an excessive amount
of work if you want to selectively build some code with -O0
because you're debugging something at the moment.

Fair, it was also not my suggestion how to approach this, that
came in the following paragraph.

Generally, I'd prefer if we do not pick sentences that end with
a smiley out of context, there is little gain in that.

~ Johannes

Generally, in conversations that are already a bit heated (by
confusion and otherwise) comments like this come off to me as further
inflammatory (whereas humor in other situations can improve social
bonds/connection) - belittling an argument that's trying to be made in
good faith.

- Dave

Not to say we couldn't redefine the IR optnone that way, but it really
feels wrong to have 'optnone' actually mean 'optsometimes'.

It does not.

A point of phrasing: Please do not tell me how I feel.
I say it feels wrong, and your denial does not help the conversation.

I feel you are interpreting my words in a way that makes them
sound worse than I would imagine outside observers do interpret
them, especially as they come with context and not as standalone
as it looks in your reply.

That said, I do not wish to tell you how you feel, should feel,
or anything else in that direction for that matter. If my words
come across as such, apologies. I will try to work on that.

The word "none" means "none." It does not mean "sometimes." Can we
agree on that much?

We can. Unsure why you would imagine I do not know the meaning of
"none" or "sometimes" for that matter. I think we can establish I
know basic words to avoid these kinds of questions in the future :)

Redefining a term "xyz-none" to mean "xyz-sometimes" feels wrong.

Agreed. I do not believe I'm proposing to do that.

If you want an attribute that means "optsometimes" then it should
be a new attribute, with a name that reflects its actual semantics.

Agreed. I am always in favor of attributes that have a single
specific meaning and a suitable name. I don't have a use case
for "optsometimes" just yet but generally speaking I'm all for
composable attributes that do not conflate ideas.

I am not opposed to that, but my understanding is that we have been
arguing about the definition of the existing attribute.

I don't think we do, especially since I do not think I want to
change the definition of `optnone`, at least the part that
all use cases in LLVM that I'm aware of are looking at. So, passes
would still be skipped if a function is `optnone` as the description
in the lang ref says. What would be different is that you have the
option, not the obligation, to pair it with `noinline`. If you do,
you get the `noinline` effect. If you don't, you don't. The `optnone`
effect stays the same either way.

Take `noinline` as an example. A `noinline` function
is not inlined, so far so good.

And if the compiler decides it is useful to make copies/clones of
the function, those aren't inlined either. The copies retain their
original attributes and semantics. (Perhaps the compiler can make
copies to take advantage of argument propagation, or some such. I
do not think this proposition is unreasonable.)

Now a caller of a `noinline`
function might be inlined all over the place anyway.
What I try to say is that function attributes apply to the function,
not to the rest of the world. If you want to say: do not optimize this
code ever, not here nor anywhere else, use `optnone` + `noinline`. If
you want the function symbol to contain unoptimized code so
you can debug it, use `optnone`.

You are making a severe distinction between the copy of the function
that happens not to be inlined, and the copies that have been inlined,
such that the inlined copies have lost their original properties.
But just as the copies of the `noinline` function retain `noinline`
and are not inlined, I argue that the `optnone` function copies ought
to retain `optnone` and not be optimized.

If you want `optnone` functions to not be copied/inlined, use
`noinline`. We have an attribute for that and we literally require
it right now to get the effect you want. It is not `optnone` that
prevents copies which are then optimized, it is `noinline`. I do
not propose to change that one bit.

LLVM does not have a way to not-optimize part of a function, so we
achieve the goal by not inlining `optnone` functions.

Agreed.

I dispute that the inlined copies of an `optnone` function should
lose that much of their original characteristics, and the rest of
the disagreement follows from there. But I have a suggestion to
offer below.

Let's take a step back for a second and assume we would have
always said `optnone` + `noinline` gives you exactly what you
get right now with `optnone`. I think we can explain that to
people, we can say, `optnone` will prevent optimization "inside
this symbol" and `noinline` will prevent the code from being copied
into another symbol. Every use case you have could be served by
adding these two attributes instead of the one you do now. Everyone
would be as happy as they are, all the benefits would be exactly
the same, no behavior change if you use the two together. That said,
it would open up the door for context sensitive debugging. Now you
can argue nobody will ever want to debug only a certain path through
their program, but I find that position requires a justification more
than the opposite which assumes people will find a way to benefit from
it.

I don't think "inside this symbol" is meaningful to most programmers.
They see methods/functions, and the internal operations of compilers
(e.g., making copies of functions) are relatively mysterious. I say
this as someone who has spent many decades helping programmers use my
compilers.

When I say symbol I mean function/method. So "inside this function
or method" is what I tried to say. People can deal with that concept.

I don't dispute that you can invent a scenario where it could be useful;
I reserve the right to be unpersuaded that it would occur often enough
that people would think of and make use of the feature.

I don't claim people will jump on it, nor can I predict how many will
use it at all. What I'm saying is that it can be useful and that is by itself
a good enough reason (for me) to expose the functionality. Time, and
users, will tell us if they use it or not. Furthermore, I did say that
the C level can be untouched if we really want to, not that I'm in favor
of that but it is certainly a harder sell. The IR level change I am
advocating for is just a verifier condition, nothing else, it doesn't
even leak into the user space. I'm not sure why this is so controversial.
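
A minimal sketch of the verifier condition being discussed, written as a
toy model in C rather than as LLVM code. The names `fn_attrs`,
`verify_current`, `verify_proposed` and `check_verifier_rules` are
hypothetical; the only thing taken from the thread is that `optnone` is
currently required to come with `noinline`, and that the proposal drops
that requirement while leaving both attributes otherwise unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified attribute record. Not LLVM's data structure. */
struct fn_attrs {
    bool optnone;
    bool noinline;
};

/* Rule as described in the thread for the current IR:
   optnone is only accepted when noinline is present too. */
bool verify_current(struct fn_attrs a) {
    return !a.optnone || a.noinline;
}

/* Proposed rule: the pairing is optional, so every combination
   of these two attributes is accepted. */
bool verify_proposed(struct fn_attrs a) {
    (void)a;
    return true;
}

/* Walks through the two combinations that involve optnone. */
void check_verifier_rules(void) {
    struct fn_attrs only_optnone = { .optnone = true, .noinline = false };
    struct fn_attrs both         = { .optnone = true, .noinline = true };

    assert(!verify_current(only_optnone)); /* rejected today */
    assert(verify_current(both));          /* accepted today */
    assert(verify_proposed(only_optnone)); /* accepted if decoupled */
    assert(verify_proposed(both));         /* pairing still allowed */
}
```

In this model the pair `optnone` + `noinline` passes under both rules, which
is the sense in which existing users would see no behavior change.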

I totally think -O0 can imply all three attributes, optnone, noipa,
noinline.

I totally think -O0 can imply { opt-sometimes, noipa, noinline }; and
this combination can be an upgrade path away from the existing optnone.

Can we proceed on that basis?

I don't know what `optsometimes` is nor how it differentiates itself from
`optnone`. I'm also unsure why you would not go with `optnone`, `noinline`,
and `noipa` for -O0; isn't that exactly what you wanted to have all along?

~ Johannes

Hi Johannes,

I've taken some time to try to understand your viewpoint,
and I will give some more of the history as I remember it;
hopefully that will help. And a suggestion at the end.

The point is, I never suggested to change the meaning of
`optnone` (in IR). I argue to change the requirement for
it to always go with `noinline`. `optnone` itself is not
changed, you get exactly the same behavior you got before,
and `noinline` is also not changed. They are simply not
required to go together.

Okay.

I can understand looking at the as-implemented handling of
'optnone' and taking that as the intended meaning. So from that
perspective, lacking the history, it does look to you like you are
not suggesting a change in its meaning.

Actually, removing the requirement to tie them together *is*
something that I would consider a semantic change, will require an
update to the LangRef and verifier, and so on. This would be more
obvious if we had originally done either of two things with the
same net semantic effect as we have now:
- make optnone *imply* noinline (like 'naked' does IIUC) instead
  of *requiring* it.
- make the inliner check for optnone on the callee, instead of
  simply tying the two attributes together.

I hope this helps explain why I see optnone+noinline as simply
two parts of one unified feature.

History:

The meaning of 'optnone' was always intended as, don't optimize,
in as many ways as we can manage. I'd rather have not had the
coupling with noinline, but it was the best way forward at the
time to achieve the effect we needed. I spent too many months of
my life getting this accepted at all...
Maybe my definition of optnone in the LangRef is inadequate; but
I am quite sure I understand the original intent.

Which includes this:

When it comes to the interaction of optnone and inlining, there
were of course four cases to consider: caller and callee, and each
might or might not have optnone.

1) caller - N, callee - N
2) caller - N, callee - Y
3) caller - Y, callee - N
4) caller - Y, callee - Y

1) normal case. Inlining and other opts are okay.
2) callee is optnone, so we didn't want it inlined and optimized.
3) caller is optnone, so no inlining happens.
4) caller is optnone, so no inlining happens.

Cases 3/4 are handled by checking for optnone in the inliner pass,
just like a hundred other passes do. This is boilerplate and was
added in bulk to all those passes. Having it all be boilerplate
was important in getting the reviews accepted.

Case 2 was handled by coupling optnone to noinline. As I said,
this was a practical thing done at the time, not because we ever
intended the semantics of optnone to mean anything else. It got
the overall feature accepted, which was the key thing for Sony.

So, what we implemented achieved the effect we wanted, even if
that implementation didn't mean the new attribute *by itself* had
exactly all the effects we wanted.

Yes, decoupling optnone (as implemented) from noinline would allow
case 2 to inline bar into foo, and optimize the result, if that
somehow seems desirable. But, that result is very much against
the original intent of optnone, and is why I have been giving you
such a hard time about this.
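
The four cases can be written down as a small decision function. This is a
toy model in C under stated assumptions, not the inliner's actual logic:
real inlining also depends on cost analysis and other attributes, and the
names `fn`, `may_inline` and `check_four_cases` are hypothetical. It only
encodes what is described above: the inliner skips `optnone` callers, and
an `optnone` callee is kept out of line because it carries `noinline`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified function record. */
struct fn {
    bool optnone;
    bool noinline;
};

/* May `callee` be inlined into `caller` in this toy model? */
bool may_inline(struct fn caller, struct fn callee) {
    if (caller.optnone)   /* cases 3 and 4: inliner skips optnone callers */
        return false;
    if (callee.noinline)  /* case 2 today: optnone arrives with noinline */
        return false;
    return true;          /* case 1 */
}

void check_four_cases(void) {
    struct fn plain     = { false, false };
    struct fn coupled   = { true,  true  }; /* optnone as accepted today */
    struct fn decoupled = { true,  false }; /* only valid if decoupled */

    assert(may_inline(plain, plain));       /* case 1 */
    assert(!may_inline(plain, coupled));    /* case 2 */
    assert(!may_inline(coupled, plain));    /* case 3 */
    assert(!may_inline(coupled, coupled));  /* case 4 */
    assert(may_inline(plain, decoupled));   /* case 2 after decoupling */
}
```

The last assertion is the outcome under dispute: once the callee may carry
`optnone` without `noinline`, nothing in this model stops case 2 from
inlining.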

I will continue to insist that something called 'optnone' cannot
properly fail to apply to all instances, inlined or not. But if
we're willing to rename the attribute, that problem is solved.

I'm assuming there would be resistance to doing either of the two
things I mentioned at the top (make optnone *imply* noinline, or
modify the inliner to check for optnone on the callee) as that
doesn't seem to be the direction people are moving in.

So the suggestion is:
- reword the definition of optnone to be something that would be
  better named "nolocalopt";
- ideally, actually rename the attribute (because I still say
  that "none" does not mean "unless we've inlined it");
- and yes, if you must, decouple it from noinline and remove that
  paragraph from the LangRef description.

Clang will still pass both IR attributes, so end users won't see
any feature regressions. David can add 'noipa' and we can make
Clang pass that as well.
--paulr

Works for me.

~ Johannes

I see that the clang attribute 'optnone' patch rC205255 (in 2014) added
both the IR 'optnone' and 'noinline' attributes.

If the clang attribute 'optnone' (for debugging purposes) is to be renamed,
I humbly suggest we may consider implementing __attribute__((optimize("O0")))
(limited to "O0" only; other values are not accepted).

The GCC manual's "Common Function Attributes" page (in Using the GNU
Compiler Collection (GCC)) says "The optimize attribute should be used
for debugging purposes only. It is not suitable in production code."
which matches our debugging-only purposes.

-O0 code already emits 'optnone' and 'noinline' for non-alwaysinline
functions, so we may not need a new attribute.

I don't have any plans to rename or add clang attributes - though that
one could be added as an alias for whatever the optnone clang
attribute does.
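
Until such an alias exists, a user-side macro can approximate it. The sketch
below is an assumption-laden illustration, not something proposed in the
thread: `DEBUG_NO_OPT`, `checksum` and `check_checksum` are hypothetical
names, and it assumes the compiler provides `__has_attribute`. It selects
Clang's `optnone` spelling when available and falls back to GCC's
`optimize("O0")` otherwise.

```c
#include <assert.h>

/* Hypothetical convenience macro, not part of Clang or GCC. */
#if defined(__has_attribute)
#  if __has_attribute(optnone)
#    define DEBUG_NO_OPT __attribute__((optnone))
#  elif __has_attribute(optimize)
#    define DEBUG_NO_OPT __attribute__((optimize("O0")))
#  endif
#endif
#ifndef DEBUG_NO_OPT
#  define DEBUG_NO_OPT /* unknown compiler: expands to nothing */
#endif

/* A function the user wants to keep unoptimized for debugging. */
DEBUG_NO_OPT int checksum(const unsigned char *buf, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += buf[i];
    return sum;
}

/* Usage: the attribute changes code generation, not the result. */
void check_checksum(void) {
    const unsigned char data[] = { 1, 2, 3 };
    assert(checksum(data, 3) == 6);
}
```

As noted above, the Clang attribute currently produces both the IR `optnone`
and `noinline` attributes, so the macro inherits whatever that pairing means.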

I mentioned that somewhere else, maybe not on the list though, but we are looking
into the option to select the optimization level per function. Basically, remove
the limit to O0 in `__attribute__((optimize("OX")))`. I'll start a new thread once
we are closer, where we will explain why exposing it to the user is only one use case.
Long story short, `optimize("O0")` would be nice and we could use it for debugging
as suggested ;)

~ Johannes

Renaming or adding a clang attribute should be proposed on its own
thread on cfe-dev, as it will not have the proper visibility buried
on llvm-dev at the end of a long thread like this one.
--paulr