[RFC] A proposal for #pragma optnone

Hi cfe-dev!

Following the talk by Greg Bedwell at the EuroLLVM conference, which reiterated the need for a range-based solution to selectively disable optimizations (see http://llvm.org/devmtg/2014-04/PDFs/Talks/GBedwell_PS4CPUToolchain_EuroLLVM2014_distribution.pdf from slide 80), we are now proposing a range-based pragma that decorates function definitions in the range with the ‘optnone’ attribute.

Proposals like this one have not received much attention in the past, but we are still keen to work with the community on this.

The attached HTML file describes our intentions for the “#pragma clang optnone” feature. There is no implementation yet, but we think the feature can be easily implemented on top of the existing ‘optnone’ infrastructure. We would however like to fix things like the syntax and the semantics before we start working on it.

We would greatly appreciate any feedback on the spec.

Cheers,
Dario Domizioli
SN Systems - Sony Computer Entertainment Group

Range-based-optnone-spec.html (7.68 KB)

Why do you want to invent a similar, but only slightly different, mechanism to disable optimizations? Why not adopt something that is already familiar to a user community?

Alternatively - if you insist yet another pragma is necessary - could an "alias" be made so that some of the same mechanics can be leveraged to implement the duplicate behavior?

My personal opinion is that the exact wording of both the GCC and MSVC pragmas is more "clean" than this possibly confusing double negative.

Not only that, but we won't be creating new unsupported behaviour on
every other compiler.

--renato

Why do you want to invent a similar, but only slightly different, mechanism to disable optimizations?

The short answer is that our underlying infrastructure in Clang/LLVM is different and our mechanism has to be slightly different.

Essentially, ‘optnone’ does not in fact implement the full feature that “#pragma optimize” is supposed to implement.
A way to control optimization levels per-function has been proposed several times in the past (see for example http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-April/061527.html ), but the community has never reached a consensus on the idea; I think last time the problem was that there were design issues with the PassManager in LLVM which at the time was being redesigned (but as far as I know even the redesigned version does not allow per-function control on optimizations). The only consensus that was reached was to implement a function attribute to disable local optimizations for this specific use case - i.e. ‘optnone’.
Therefore, providing a pragma like “#pragma GCC optimize” that deals with only one case (i.e. optimization level zero) would cause confusion. Ideally, if we want to have such a pragma we should implement the full feature, but there is no consensus on that at the moment, so we have to use a different approach.

In terms of what our users would do, they already abstract this functionality behind macros, as explained in the slides I referenced earlier. The important thing for them is that the feature is range-based rather than attribute-based, so they can #define the actual syntax with macros that can be used in the same way as with the other compilers.
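
A minimal sketch of the kind of macro layer described above. The macro names `OPT_OFF_BEGIN` and `OPT_OFF_END` are hypothetical (they are not taken from the slides); the GCC and MSVC branches use those compilers' existing pragmas, and the Clang branch is left empty because the range-based pragma is only being proposed here.

```cpp
#include <cassert>

// Hypothetical macro names; only the pragma spellings come from GCC and MSVC.
#if defined(__clang__)
  // The range-based pragma is only a proposal, so expand to nothing.
  #define OPT_OFF_BEGIN
  #define OPT_OFF_END
#elif defined(__GNUC__)
  #define OPT_OFF_BEGIN _Pragma("GCC push_options") _Pragma("GCC optimize (\"O0\")")
  #define OPT_OFF_END   _Pragma("GCC pop_options")
#elif defined(_MSC_VER)
  #define OPT_OFF_BEGIN __pragma(optimize("", off))
  #define OPT_OFF_END   __pragma(optimize("", on))
#else
  #define OPT_OFF_BEGIN
  #define OPT_OFF_END
#endif

OPT_OFF_BEGIN
// Functions defined in this range are compiled without optimization on
// compilers where the macros expand to a real pragma.
int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) total += i;
    return total;
}
OPT_OFF_END

// Defined outside the range: uses the command-line optimization level.
int twice(int x) { return 2 * x; }

// Results are the same either way; only code generation differs.
void check_macro_example() {
    assert(sum_to(4) == 10);   // 1 + 2 + 3 + 4
    assert(twice(21) == 42);
}
```

With an attribute-only mechanism there is nothing sensible to put in the Clang branch, which is why the range-based form matters to these users.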

I agree that the best solution would be to implement the full #pragma optimize feature. We just don’t want to restart the full debate (with the risk of not reaching consensus again), and we’d rather settle for the low-hanging fruit in the short term.

As we note in the spec, if we ever implement the full #pragma optimize in clang/LLVM then #pragma optnone will just become obsolete and deprecated. So this proposal does not hinder any future work on the full #pragma optimize feature.

Cheers,
Dario Domizioli
SN Systems - Sony Computer Entertainment Group

This all derives from my personal opinion and biases - I'd rather see a subset of an existing pragma supported and then build on that incrementally rather than yet another pragma... I see multiple downsides and no benefit to this approach compared to reusing what's already being "abused".

I don't want to kick old threads or block your proposal - I just hope you adjust it to take advantage of existing syntax.

1) Fewer things to document and in theory easier for users to take advantage of
2) Less divergent from other popular compilers
3) Allows a path for incrementally improving as consensus can be reached or blockers (PassManager) can be "fixed"

The short answer is that our underlying infrastructure in Clang/LLVM is
different and our mechanism has to be slightly different.

Different implementation, same user interface. We shouldn't pass the
pain of our broken infrastructure to the user, but fix it or work
around it until we have fixed our infrastructure.

I'd rather have an incomplete pragma that breaks on anything but zero
than have a new pragma that will have to be ifdef'd on any other
compiler. So, in a nutshell:

#pragma GCC optimize 0
--> ok
#pragma GCC optimize 1|2|3
--> pre-processor warning not implemented (and ignored), not enabled
unless -Wall
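
As a rough illustration of that rule, here is a toy decision function. It is a hypothetical model, not Clang code; the names `PragmaAction` and `handle_gcc_optimize` are invented for the example.

```cpp
#include <cassert>

// Hypothetical model of the rule sketched above; not Clang's actual code.
enum class PragmaAction {
    ApplyOptnone,   // level 0: mark the following definitions 'optnone'
    WarnAndIgnore   // levels 1-3: diagnose as unimplemented, change nothing
};

PragmaAction handle_gcc_optimize(int level) {
    return level == 0 ? PragmaAction::ApplyOptnone
                      : PragmaAction::WarnAndIgnore;
}

void check_rule() {
    assert(handle_gcc_optimize(0) == PragmaAction::ApplyOptnone);
    assert(handle_gcc_optimize(2) == PragmaAction::WarnAndIgnore);
}
```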

A way to control optimization levels per-function has been proposed several
times in the past (see for example
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-April/061527.html ), but the
community has never reached a consensus on the idea; I think last time the
problem was that there were design issues with the PassManager in LLVM which
at the time was being redesigned (but as far as I know even the redesigned
version does not allow per-function control on optimizations).

You are correct. I have implemented a similar mechanism (for loop
optimizations) in the old pass manager that has been propagated by the
new one, but no generic solution exists.

IIRC, the main issue was with inlining. With no optimization we can
easily say "don't inline", but with optimize 2 vs -O 3 (or
vice-versa), there's no easy answer. What will the O2 inlined on O3
function be? Do we suppress *all* inlining of forced-opt functions? If
we do, we won't be doing O2 on it any more, but a subset. It's
complex.

FWIW, GCC produces inconsistent results for all combinations of pragma
vs. -O options...

The only
consensus that was reached was to implement a function attribute to disable
local optimizations for this specific use case - i.e. 'optnone'.

Yes, that was the easy one. There is another interesting case:
heavily optimising a specific function, i.e. "#pragma GCC
optimize 3" on -O1. Anything else will be madness anyway, and GCC
doesn't seem to care much either.

What we could do is to define one behaviour for inlining (say add
neverinline) on all optimization levels. Then we turn a problem of
definition to one of optimisation.

However, we do have to change the pass manager...

Therefore, providing a pragma like "#pragma GCC optimize" that however deals
with only one case (i.e. optimization level zero) would cause confusion.
Ideally if we want to have such a pragma we should implement the full feature,
but there is no consensus on that at the moment, so we have to use a
different approach.

#pragma optnone also causes confusion (it's unknown) and adds incompatibility.

I agree that the best solution would be to implement the full #pragma
optimize feature. We just don't want to restart the full debate (with the
risk of not reaching consensus again), and we'd rather settle for the
low-hanging fruit in the short term.

The cost of doing this is too high. You'll be moving from "our
problem" to "our users' problem" by forcing them to ifdef their code
forever.

As we note in the spec, if we ever implement the full #pragma optimize in
clang/LLVM then #pragma optnone will just become obsolete and deprecated. So
this proposal does not hinder any future work on the full #pragma optimize
feature.

Nor does adding "#pragma GCC optimize 0" from the start. With the
added bonus that Clang already ignores "optimize N".

cheers,
--renato

Setting aside the reasonable concerns over naming...

Proposals like this one have not received much attention in the past, but
we are still keen to work with the community on this.

This doesn't seem accurate.

When the optnone stuff was first discussed, the use of a pragma *was*
discussed, and there were arguments against it because the semantics are
highly confusing: it only has effect on the function definitions which are
started after the pragma. This is confusing as you might start the pragma
*inside* a function definition. Such a pragma might even have semantic
impact by disabling optimizations within the body of lambda, but *not*
within any surrounding expressions.

Personally, I find the semantics of such a pragma extremely confusing. I
would never advocate the use of such a pragma, instead I would strongly
advocate *against* its use in literally all circumstances. It is hard to
support including it in Clang given that.

On the flip side, we have a function attribute which has a reasonable
semantic model and addresses the use case originally posited.

So I don't think that this is something which has been left unattended. I
think it was attended, and in the discussion that led to optnone, the
approach was not pursued and instead a different one was.
-Chandler

Chandler, while I actually agree with you in principle, I think you’re ignoring an important factor. The pragma approach is already widely deployed. If we don’t support pragma usage, there’s no real migration path for these applications. Adding support for this usage does not have to imply an endorsement of the style. In fact, our documentation could explicitly suggest migration to per-function attributes. (In my view, it should.)

Philip

I agree. Strongly against the use of virtually any pragmas, but still
support for legacy code. Implementation wise, if possible make it an
alias to existing (or future-proof) similar features, but trying to
get as close as possible to the "original" semantics (by copying the
compilers that do implement to a reasonable distance). However, any
such change should be very isolated, easy to remove and not introduce
incompatible or irreparable changes to the code base or the language
reference.

AFAICS, all pragmas should be transformed into function attributes,
annotation, metadata, etc. This way, the problem stays isolated in the
front-end.

But this is *not* an encouragement to use, just a migration path. This
is why I'm against *creating* a new pragma.

cheers,
--renato

I'm not ignoring it in this case, but this is a special and unusual case,
so it is somewhat surprising.

The fact of the matter is that existing usage of this pragma outside of
Clang and LLVM is somewhat irrelevant. There were only two proposed uses of
this pragma when it was brought up:

1) Working around a miscompile in the compiler.
2) Debugging a single function in unoptimized form while the rest of the
program was optimized.

In the first use case, there is no reason to care about existing
deployments as those are targeted at fixing a different compiler's
miscompiles.

In the second case, there is no widespread checked in usage of this pattern
because it is only used for the sake of debugging sessions.

Again, this was discussed previously. The variant on #2 which came up was a
header or macro that enabled the pragma for one TU at a time in a somewhat
programmatic way. However, the response was that it would be much better to
do this at the build system level, and there was never any real argument
against that.

In general, the use cases for this *particular* pragma seem vanishingly
small. It does not establish an ABI, provide a semantic contract, or
control diagnostics. It is not something that would be expected to be
checked into a codebase long term and work across implementations (see #1).
So I don't think the legacy application concerns apply in this case.

From: cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu]
On Behalf Of Chandler Carruth

Setting aside the reasonable concerns over naming...

> Proposals like this one have not received much attention in
> the past, but we are still keen to work with the community
> on this.

This doesn't seem accurate.

When the optnone stuff was first discussed, the use of a pragma
*was* discussed, and there were arguments against it because
the semantics are highly confusing: it only has effect on the
function definitions which are started after the pragma. This is
confusing as you might start the pragma *inside* a function
definition. Such a pragma might even have semantic impact by
disabling optimizations within the body of lambda, but *not*
within any surrounding expressions.

You seem to be objecting to a different proposal.

If you read Dario's proposal, you saw that putting the pragma
inside a function body is specifically prohibited. The effect
applies to entire functions, not piecemeal within a function.

The pragma implements "apply attribute optnone to all function
definitions from here on" until the pragma gets turned off again.
How is that confusing?
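
Read that way, the range form is shorthand for writing the attribute on each definition. A minimal sketch, assuming a hypothetical `NO_OPT` helper macro; the pragma itself appears only as placeholder comments, since its spelling was still under discussion and nothing was implemented.

```cpp
#include <cassert>

// Hypothetical helper macro: 'optnone' is a Clang attribute, so expand to
// nothing on other compilers.
#if defined(__clang__)
  #define NO_OPT __attribute__((optnone))
#else
  #define NO_OPT
#endif

// What the user writes with the attribute: one annotation per definition.
NO_OPT int first(int x)  { return x + 1; }
NO_OPT int second(int x) { return x * 2; }

// What the range form would mean (proposed, not implemented):
//
//   <pragma: optimizations off>
//   int first(int x)  { return x + 1; }   // implicitly optnone
//   int second(int x) { return x * 2; }   // implicitly optnone
//   <pragma: optimizations on>
//   int third(int x)  { return x - 1; }   // optimized as usual

int third(int x) { return x - 1; }

// Behaviour is unchanged either way; only code generation differs.
void check_attribute_example() {
    assert(first(1) == 2);
    assert(second(3) == 6);
    assert(third(5) == 4);
}
```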

Personally, I find the semantics of such a pragma extremely
confusing. I would never advocate the use of such a pragma,
instead I would strongly advocate *against* its use in
literally all circumstances. It is hard to support including
it in Clang given that.

I wouldn't expect you to support something you find so confusing,
no. :-) Fortunately we are proposing something else.

On the flip side, we have a function attribute which has a
reasonable semantic model and addresses the use case originally
posited.

It addresses certain use cases, but not others.

One significant use case is bisecting on a buggy function. This
is obviously tedious and painful if you're having to add and
remove function attributes from piles of functions, in order to
iterate down to the problem function. Doing the same iteration
where you just move one pragma around is obviously faster and
way more convenient. Our users are very grudgingly using the
attribute for this process (because they have little choice)
but are clamoring for a pragma.

So I don't think that this is something which has been left unattended.
I think it was attended,

I personally remember ridiculous numbers of unanswered pings,
but whatever. Water under the bridge.

and in the discussion that led to optnone, the
approach was not pursued and instead a different one was.
-Chandler

Because we wanted something that was minimally helpful to our users
to be acceptable at all, upstream. Given your adamant objections at
the time, we backed off from the pragma and went with the attribute,
a solution you said you could support. However, our users insist on
something simpler, i.e. a pragma that works like the pragmas that are
supported by a number of other widely used compilers. It's a whole
lot easier to bisect when you're only moving one line around than
to be adding and removing attributes from piles of functions.

So, it's syntactic sugar on top of the internal implementation
mechanism, which we're not changing.
--paulr

Others may be rejecting something else, but I'm clear. This stinks - Let's not invent a wholly new pragma

From: "C. Bergström" [mailto:cbergstrom@pathscale.com]
>> From: cfe-dev-bounces@cs.uiuc.edu [mailto:cfe-dev-bounces@cs.uiuc.edu]
>> On Behalf Of Chandler Carruth
>>
>> Setting aside the reasonable concerns over naming...
>>
>>> Proposals like this one have not received much attention in
>>> the past, but we are still keen to work with the community
>>> on this.
>> This doesn't seem accurate.
>>
>> When the optnone stuff was first discussed, the use of a pragma
>> *was* discussed, and there were arguments against it because
>> the semantics are highly confusing: it only has effect on the
>> function definitions which are started after the pragma. This is
>> confusing as you might start the pragma *inside* a function
>> definition. Such a pragma might even have semantic impact by
>> disabling optimizations within the body of lambda, but *not*
>> within any surrounding expressions.
> You seem to be objecting to a different proposal.
Others may be rejecting something else, but I'm clear. This stinks -

Chandler was setting aside the naming part, and objecting to other
aspects of the proposal that were not actually in the proposal.
As well as generally rejecting the notion of pragma at all, but
hopefully he'll come around.

Let's not invent a wholly new pragma

We stayed away from the existing pragmas because what we can (and
want) to do is much less than what the existing pragmas do (in
other compilers). If people think it's better to use exactly the
existing syntax, and fail to support all of what they do, that's
okay with us (if I'm remembering the internal discussions correctly).

------------
My voice counts for nothing around here, but I would however +1

#pragma optimize level=0
or even
#pragma OPTIMIZE OFF
---------
Is there any reason that wouldn't be sufficient for your needs?
-------
If you make a patch for the above I'll review it. It would also give the
chance for anyone who is strongly opposed to really come up with some
exceptional argument against it.

Dario owns this one but I'm sure he appreciates the offer!
(I'm responding here because he's in the UK and I wanted to provide
some responses during US working hours.)
--paulr

We stayed away from the existing pragmas because what we can (and
want) to do is much less than what the existing pragmas do (in
other compilers).

I think here's where you're mistaken. It's not. As Chandler said,
there are only two valid cases: disable all opts or force higher opt.
Both can be used for either debugging or working around miscompiles.

I've done some tests on GCC with pragmas from 0 to 3 and -O0 to -O3
and the results are inconsistent. The generated code is different for
"pragma N" on almost all -O levels, so no one will expect consistency.

If people think it's better to use exactly the
existing syntax, and fail to support all of what they do, that's
okay with us (if I'm remembering the internal discussions correctly).

That's what I expected, yes.

The same reason why we didn't invent "#pragma vectorize ...", because
there were others (even if disparate) that did the same thing. The
consensus was to implement each specific case from different bundles
(OMP, Cilk, etc) instead of creating a list of new pragmas, and I
think this is what has to be done in this case, too.

General rule of new pragmas: don't.

General rule of existing pragmas: mirroring attributes or annotation
with the same semantics.

cheers,
--renato

From: "Paul Robinson" <Paul_Robinson@playstation.sony.com>
To: "Chandler Carruth" <chandlerc@google.com>, "Dario Domizioli" <dario.domizioli@gmail.com>
Cc: "clang-dev Developers" <cfe-dev@cs.uiuc.edu>
Sent: Monday, April 28, 2014 4:51:15 PM
Subject: Re: [cfe-dev] [RFC] A proposal for #pragma optnone

> From: cfe-dev-bounces@cs.uiuc.edu
> [mailto:cfe-dev-bounces@cs.uiuc.edu]
> On Behalf Of Chandler Carruth
>
> Setting aside the reasonable concerns over naming...
>
> > Proposals like this one have not received much attention in
> > the past, but we are still keen to work with the community
> > on this.
>
> This doesn't seem accurate.
>
> When the optnone stuff was first discussed, the use of a pragma
> *was* discussed, and there were arguments against it because
> the semantics are highly confusing: it only has effect on the
> function definitions which are started after the pragma. This is
> confusing as you might start the pragma *inside* a function
> definition. Such a pragma might even have semantic impact by
> disabling optimizations within the body of lambda, but *not*
> within any surrounding expressions.

You seem to be objecting to a different proposal.

If you read Dario's proposal, you saw that putting the pragma
inside a function body is specifically prohibited. The effect
applies to entire functions, not piecemeal within a function.

As Chandler points out, you would want to put them inside function bodies to catch lambdas. However, we should issue a warning in this case.

The pragma implements "apply attribute optnone to all function
definitions from here on" until the pragma gets turned off again.
How is that confusing?

>
> Personally, I find the semantics of such a pragma extremely
> confusing. I would never advocate the use of such a pragma,
> instead I would strongly advocate *against* its use in
> literally all circumstances. It is hard to support including
> it in Clang given that.

I wouldn't expect you to support something you find so confusing,
no. :-) Fortunately we are proposing something else.

>
> On the flip side, we have a function attribute which has a
> reasonable semantic model and addresses the use case originally
> posited.

It addresses certain use cases, but not others.

One significant use case is bisecting on a buggy function. This
is obviously tedious and painful if you're having to add and
remove function attributes from piles of functions, in order to
iterate down to the problem function. Doing the same iteration
where you just move one pragma around is obviously faster and
way more convenient. Our users are very grudgingly using the
attribute for this process (because they have little choice)
but are clamoring for a pragma.

I'd like to say that I find this use case compelling, and I'm now in favor of this functionality.

-Hal

I've done some tests on GCC with pragmas from 0 to 3 and -O0 to -O3
and the results are inconsistent. The generated code is different for
"pragma N" on almost all -O levels, so no one will expect consistency.

This is interesting. :-)
OK, so we can be slightly different as long as in general we cover the
"disable/enable optimizations" use case in some form.

> If people think it's better to use exactly the
> existing syntax, and fail to support all of what they do, that's
> okay with us (if I'm remembering the internal discussions correctly).

That's what I expected, yes.

The same reason why we didn't invent "#pragma vectorize ...", because
there were others (even if disparate) that did the same thing. The
consensus was to implement each specific case from different bundles
(OMP, Cilk, etc) instead of creating a list of new pragmas, and I
think this is what has to be done in this case, too.

General rule of new pragmas: don't.

General rule of existing pragmas: mirroring attributes or annotation
with the same semantics.

I see. Yes, I now remember reading the #pragma vectorize discussion some
time ago.
My concern is that we still have to make clear to the user that the full
feature is not supported, but we could do it in the documentation and/or
with diagnostics about unsupported parts of the feature. So I would be OK
with using GCC's or MSVC's syntax, and implementing the "disable
optimizations" case by adding 'optnone' to the function definitions in the
range covered by the pragma.

For what we can do, GCC's syntax would require us to implement a push/pop
stack as well, because if we only support the "0" level then code like this
would not work:
    #pragma GCC optimize "0"
    // unoptimized code
    #pragma GCC optimize "2"   // <-- this would not restore -O2
And it would have to be rewritten as:
    #pragma GCC push_options
    #pragma GCC optimize "0"
    // unoptimized code
    #pragma GCC pop_options
Users will #define those lines into a macro anyway, but it might still be a
difference in style.
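
The difference between the two snippets can be modelled with a small state tracker. This is a toy sketch, not Clang code: the class `OptimizeState` and its methods are hypothetical, and it assumes only level "0" is supported, as described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: tracks whether definitions are currently inside an
// "optimizations disabled" region, supporting only level "0" and push/pop.
class OptimizeState {
public:
    void push() { saved_.push_back(disabled_); }

    // Returns false if there is nothing to pop (an unbalanced pop_options).
    bool pop() {
        if (saved_.empty()) return false;
        disabled_ = saved_.back();
        saved_.pop_back();
        return true;
    }

    // Returns false when the level is unsupported and the pragma is ignored.
    bool optimize(const std::string& level) {
        if (level != "0") return false;   // e.g. "2" does not restore -O2
        disabled_ = true;
        return true;
    }

    bool disabled() const { return disabled_; }

private:
    bool disabled_ = false;
    std::vector<bool> saved_;
};

// First snippet: optimize "0" followed by optimize "2".
bool without_push_pop() {
    OptimizeState s;
    s.optimize("0");
    s.optimize("2");          // ignored, so the region is never closed
    return s.disabled();
}

// Second snippet: push_options / optimize "0" / pop_options.
bool with_push_pop() {
    OptimizeState s;
    s.push();
    s.optimize("0");
    s.pop();                  // restores the state saved by push()
    return s.disabled();
}

void check_push_pop_model() {
    assert(without_push_pop());   // still disabled at the end
    assert(!with_push_pop());     // back to the command-line level
}
```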

If we use the MSVC syntax, instead, we can just do it with on/off
semantics, which is our main use case. However the MSVC syntax is slightly
more convoluted:
    #pragma optimize ("", off)
    // unoptimized code
    #pragma optimize ("", on)
And we would have to emit an "unsupported" diagnostic if that string
argument is not empty.
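
A toy sketch of those on/off semantics, including the diagnostic for a non-empty optimization list. It is a hypothetical model rather than Clang code; `MsvcStyleState` and `Diag` are names invented for the example.

```cpp
#include <cassert>
#include <string>

// Toy model, not Clang code.
enum class Diag { None, UnsupportedOptimizationList };

struct MsvcStyleState {
    bool disabled = false;

    // Models: #pragma optimize ("<list>", on|off)
    Diag handle(const std::string& list, bool on) {
        if (!list.empty())
            return Diag::UnsupportedOptimizationList;  // pragma is ignored
        disabled = !on;
        return Diag::None;
    }
};

void check_msvc_style_model() {
    MsvcStyleState s;
    assert(s.handle("", false) == Diag::None);   // #pragma optimize ("", off)
    assert(s.disabled);
    // A non-empty list is diagnosed and leaves the state untouched.
    assert(s.handle("g", true) == Diag::UnsupportedOptimizationList);
    assert(s.disabled);
    assert(s.handle("", true) == Diag::None);    // #pragma optimize ("", on)
    assert(!s.disabled);
}
```

No stack is needed in this model, which matches the point above that the MSVC form is slightly easier to implement.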

So from our point of view, the MSVC syntax is slightly easier to implement,
but we have no strong opinion. How does the community feel about the GCC
syntax vs. the MSVC syntax?
Maybe the clang-cl people would actually like a contribution towards
supporting more MSVC-style code?

As a final note, I would reiterate that the use case for this feature is
not as rare as people might think.
The point is that for our users (i.e. computer game programmers):
a) their codebase is multi-platform, and they would like all platforms to
support the same features, and
b) debugging sessions are happening much more often than with other kinds
of software projects.
Debugging games is the bread and butter of game programmers; essentially,
they do it every day.
There is therefore very strong pressure for us to implement this feature,
and we still think this feature might also be beneficial (or at least
neutral) in general to the wider community of users.

Cheers,
    Dario Domizioli
    SN Systems - Sony Computer Entertainment Group

My concern is that we still have to make clear to the user that the full
feature is not supported, but we could do it in the documentation and/or
with diagnostics about unsupported parts of the feature. So I would be OK
with using GCC's or MSVC's syntax, and implementing the "disable
optimizations" case by adding 'optnone' to the function definitions in the
range covered by the pragma.

Yes. Docs are ok. A warning that is only enabled on -Wall (and could
be disabled individually) is also ok. Both would be better.

So from our point of view, the MSVC syntax is slightly easier to implement,
but we have no strong opinion. How does the community feel about the GCC
syntax vs. the MSVC syntax?

I'd argue that the GCC syntax is always preferred over MSVC in any
general case, just because GCC's reach is orders of magnitude higher
than anything else, and also because LLVM has historically followed
GCC's trail with regards to supporting legacy code. MSVC support has
only recently been added.

But realistically, it depends on which kind of users will use this
apart from your toolchain. If this is used more among MSVC users,
then do the MSVC style first. I don't have a strong opinion on that.

But whatever you do, it should be very easy to connect the other style
with your low-level implementation (function attributes).

As a final note, I would reiterate that the use case for this feature is not
as rare as people might think.

I understand the importance and this is why we want this to happen,
but we have to be extremely careful on what we add. I'm sure you know
very well how long bad decisions tend to stay in public APIs... ;-)

cheers,
--renato

We can easily skip that semantics for now. GCC doesn't allow it:

error: #pragma GCC optimize is not allowed inside functions

cheers,
-renato

From: "Renato Golin" <renato.golin@linaro.org>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Paul Robinson" <Paul_Robinson@playstation.sony.com>, "clang-dev Developers" <cfe-dev@cs.uiuc.edu>
Sent: Tuesday, April 29, 2014 8:39:44 AM
Subject: Re: [cfe-dev] [RFC] A proposal for #pragma optnone

> As Chandler points out, you would want to put them inside function
> bodies to catch lambdas. However, we should issue a warning in
> this case.

We can easily skip that semantics for now. GCC doesn't allow it:

error: #pragma GCC optimize is not allowed inside functions

But why? I don't find this gcc comparison a compelling argument. It seems easier to support it, and more useful, than not supporting it. Adding a warning does not seem more difficult than adding the error. Regardless of what gcc does, Clang/LLVM must have a design process that keeps modern C++ in mind, and modern C++ includes lambda functions.

-Hal

C++11 constructs should use annotations, not pragmas. We should only
implement legacy pragmas for legacy behaviour. Any new pragmas, or new
behaviour that we add on legacy pragmas will diverge from the original
intention and require ifdefs on user code.

cheers,
-renato