RFC: Supported Optimizations attribute

Hi folks,

please check out our RFC: Supported Optimizations attribute

https://docs.google.com/document/d/1s0n-JVaSNML1Z9SCZVg2bgisxswIJAK2Tp9DahucW10/edit?usp=sharing

Pasting it here for the record:

RFC: supported_optimizations attribute

Piotr Padlewski - piotr.padlewski@gmail.com
Krzysztof Pszeniczny - kpszeniczny@google.com
December 2018

Introduction

Sometimes a class of possible optimizations requires very precise use of metadata and/or intrinsics, at least to achieve a clean implementation thereof.

Our primary example is devirtualization[RFC: Devirtualization v2], which requires that the LLVM IR be annotated with certain intrinsic calls, viz. launder.invariant.group and strip.invariant.group, in certain places, e.g. when the dynamic type of an object changes.

This currently makes it impossible to merge IR created with this kind of optimization in mind with IR created without it, as doing so can lead to miscompilation when the optimizer relies on the absence of intrinsics and/or metadata to make certain assumptions about the code in question. For now, in the case of devirtualization, we simply refuse to merge such pieces of IR. This predominantly affects cross-language link-time optimization.

Solution

We propose a fine-grained, function-level solution to this problem, originally suggested by John McCall: mark each function with a list of such optimization requirements it complies with, by means of an attribute for now called supported_optimizations. The exact naming of this attribute is of course subject to bikeshedding, so we kindly ask the community to suggest a better name if one can be found.

When performing inter-procedural optimizations (mostly inlining), the list of optimizations supported by both functions should be considered to be the intersection of their previous lists of supported optimizations. This effectively prevents any optimization that relies on precise annotations from firing when those annotations can no longer be guaranteed.

More precisely, when performing an IPO other than inlining, the pass must treat the functions in question as if their supported_optimizations lists were set to the intersection of their original ones. In general this does not require actually modifying those lists.

In the case of inlining, the list of optimizations supported by the callee is left intact, but the list of optimizations supported by the caller must be set to said intersection.
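For illustration, a hypothetical before/after sketch of this rule (the optimization names "opt.a" and "opt.b" are made up, and the multi-entry spelling of the attribute is only a guess at the eventual syntax):

define void @caller(i8* %p) supported_optimizations = "opt.a", "opt.b" {
  call void @callee(i8* %p)
  ret void
}

define void @callee(i8* %p) supported_optimizations = "opt.a" {
  ret void
}

; After inlining @callee into @caller, the caller keeps only the
; intersection of the two lists, while @callee itself is unchanged:
;
; define void @caller(i8* %p) supported_optimizations = "opt.a" {
;   ret void
; }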

invariant.group

The !invariant.group metadata should be equipped with an argument referring to the supported_optimization it is related to. In the existing case, it will be set to “clang.cxx.devirtualization”. If this supported_optimization is not present in the list of optimizations supported by the enclosing function, the !invariant.group metadata has no effect and cannot be relied upon by the optimizer. Of course, in this case the metadata in question is not useful anymore and can be safely removed. We kindly thank Richard Smith for suggesting this approach.

As a transitional measure and for backwards compatibility reasons, any !invariant.group metadata with an empty argument (i.e. as before this RFC), shall not be subject to the above restrictions and shall remain applicable even when there is no supported_optimizations list provided for the enclosing function.

Example:

define void @with_devirtualization(i8* %p) supported_optimizations = "clang.cxx.devirtualization" {
  %v1 = load i8, i8* %p, !invariant.group !0
  %p2 = call i8* @llvm.launder.invariant.group.p0i8(i8* %p)
  call void @without_devirtualization(i8* %p, i8* %p2)
  %v2 = load i8, i8* %p2, !invariant.group !0
  ret void
}

define void @without_devirtualization(i8* %p, i8* %p2) {
  %b = icmp eq i8* %p, %p2
  call void @llvm.assume(i1 %b)
  ret void
}

!0 = !{!"clang.cxx.devirtualization"}
declare void @llvm.assume(i1)
declare i8* @llvm.launder.invariant.group.p0i8(i8*)

After inlining:

define void @with_devirtualization(i8* %p) {
  ; !invariant.group is inactive now, so it can be stripped.
  %v1 = load i8, i8* %p, !invariant.group !0
  %p2 = call i8* @llvm.launder.invariant.group.p0i8(i8* %p)
  %b = icmp eq i8* %p, %p2
  call void @llvm.assume(i1 %b)
  %v2 = load i8, i8* %p2, !invariant.group !0
  ret void
}

Possible extensions

Instead of indiscriminately taking the intersection of supported optimizations’ lists, we may imagine that some such optimizations may be able to provide a conservative set of annotations to a function lacking them. By doing so, we may retain at least some level of information available to the optimizer, by preserving the annotations already present in the optimization-compliant function.

For example, devirtualization may conservatively pass every load and function argument through a strip and every store and return value through a launder, which is a way of conservatively making an arbitrary function compliant with the requirements of this particular optimization.
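As a purely illustrative sketch of that conservative rewrite (the function name is invented, and this transformation is not part of the initial proposal), every incoming pointer is stripped and every escaping pointer is laundered:

define i8* @made_compliant(i8* %p) supported_optimizations = "clang.cxx.devirtualization" {
  ; strip incoming pointers so no stale invariant.group information is
  ; assumed about them...
  %p.stripped = call i8* @llvm.strip.invariant.group.p0i8(i8* %p)
  %v = load i8, i8* %p.stripped
  ; ...and launder pointers that escape (here, the return value) so callers
  ; cannot carry invariant.group facts across this function.
  %ret = call i8* @llvm.launder.invariant.group.p0i8(i8* %p.stripped)
  ret i8* %ret
}

declare i8* @llvm.strip.invariant.group.p0i8(i8*)
declare i8* @llvm.launder.invariant.group.p0i8(i8*)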

Bikeshedding

Other possible names:

  • compliant_with

  • brittle_representations (John McCall’s original suggestion)

Replying on llvm-dev, since I mainly want to brainstorm the IR attribute name…

This attribute is really a promise that the IR in this function obeys certain invariants. In this case, it’s that every virtual pointer load uses a particular pattern to support a C++-specific optimization. That suggests a name something along the lines of “lso_invariants” (language-specific optimization = LSO) or “opt_invariants_preserved”. What do you think of those names or that direction?

Thank you for pushing this forward.

We propose a fine-grained, function-level solution to this problem,
originally suggested by John McCall: mark each function with a list of such
optimization requirements it complies

compiles

invariant.group

In this proposal, it seems like you're assuming that only the invariant.group
metadata needs to be parameterized with a supported_optimization. Should you
also decorate all the other structures used as part of this optimization, e.g.
the calls to llvm.strip.invariant.group and llvm.launder.invariant.group?
This would also allow these intrinsics to be more explicit about exactly which
family of invariant.group metadata they're supposed to be laundering/stripping,
and correspondingly it would allow LLVM to remove these intrinsics when it
strips invariant.group optimization information.

As a transitional measure and for backwards compatibility reasons, any
!invariant.group metadata with an empty argument (i.e. as before this RFC),
shall not be subject to the above restrictions and shall remain applicable
even when there is no supported_optimizations list provided for the
enclosing function.

I don't think it's important to support this transitionally. AFAIK, there are
no existing, non-experimental optimizations using invariant.group where it's
crucial that we continue to perform the optimization when linking existing .bc
files. We should just insist that invariant.group has a tag and otherwise
strip it during deserialization.

Possible extensions

Instead of indiscriminately taking the intersection of supported
optimizations' lists, we may imagine that some of such optimizations may be
able to provide a conservative set of annotations to a function lacking
them. By doing so, we may retain at least some level of information
available to the optimizer, by preserving the annotations already present
in the optimization-compliant function.

For example, devirtualization may conservatively pass every load and
function argument through a strip and every store and return value through
a launder, which is a way of conservatively making an arbitrary function
compliant with the requirements of this particular optimization.

Sure, that's a theoretical future enhancement that we could provide.

John.

I think we should have some bounds on how "badly" a
supported_optimizations tag on a function can affect its semantics.
For instance, can a "supported_optimizations" invariant be that "the
CFG is always structured"? Or (exaggerating to illustrate the point)
"the function has an equal number of loads and stores"?

-- Sanjoy

...I have no idea what I was thinking here, "complies" is obviously the right word.
Apologies.

John.

Piotr’s proposal unfortunately doesn’t give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them “brittle”, but I can understand why
nobody wants to call their optimization that, so let’s call them
“good-faith optimizations” instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that’s what makes them structural rules. We don’t want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won’t be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

  • other transforms will preserve the apparent semantics of the function and
  • other transforms will maintain the standard structural rules of LLVM IR.

For example, if you need arbitrary side-effects to not be re-ordered past
some point in the function, you should put an instruction there that
inhibits that kind of code motion, probably by making it look like a call
to an unknown function. No transform can reorder arbitrary side-effects
across such a call because doing so might change the apparent semantics of
the program. Similarly, if you need a direct use-def link between two
points in the function, you should make the first point define a token
that’ll be used by the second point. No transform will break or abstract
that association because that would violate the basic structural rules of IR.
If there’s a structural rule that can’t be expressed with the current tools
of IR, we can talk about how to fix that — although tokens are such a big
hammer that I’d be somewhat surprised to hear that something really can’t
be expressed at all (but much less surprised to hear that the token rules
are too strong and we’d like other transforms to have more power to change
the code).
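As a concrete sketch of those two techniques (the intrinsic names are invented purely for illustration; a real good-faith optimization would define its own, so this exact IR does not exist in LLVM today):

; A barrier point: declared with no attributes, so transforms that know
; nothing about it must assume it reads and writes arbitrary memory and will
; neither speculate it nor move side-effecting code across it.
declare void @llvm.myopt.barrier()

; A use-def link: the first point defines a token that the second consumes,
; so no transform may break or abstract the association between them.
declare token @llvm.myopt.region.begin()
declare void @llvm.myopt.region.end(token)

define void @example(i8* %p) {
  %tok = call token @llvm.myopt.region.begin()
  store i8 0, i8* %p
  call void @llvm.myopt.barrier()          ; side effects stay on their side of this call
  call void @llvm.myopt.region.end(token %tok)
  ret void
}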

So the defining property of a good-faith optimization is that:

  • there are rules which participating functions are expected to follow on
    pain of undefined behavior but which LLVM IR doesn’t require every function
    to follow, and
  • those rules will be preserved by any transform that doesn’t move code
    between functions and which preserves the apparent function semantics
    and maintains the standard structural rules of LLVM IR.

I think that answers your question about what sort of rules we can expect
supported_optimizations to guarantee.

John.

I think we should have some bounds on how "badly" a
supported_optimizations tag on a function can affect its semantics.
For instance, can a "supported_optimizations" invariant be that "the
CFG is always structured"? Or (exaggerating to illustrate the point)
"the function has an equal number of loads and stores"?

Piotr's proposal unfortunately doesn't give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them "brittle", but I can understand why
nobody wants to call their optimization that, so let's call them
"good-faith optimizations" instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that's what makes them structural rules. We don't want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won't be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

  * other transforms will preserve the apparent semantics of the function and
  * other transforms will maintain the standard structural rules of LLVM IR.

For example, if you need arbitrary side-effects to not be re-ordered past
some point in the function, you should put an instruction there that
inhibits that kind of code motion, probably by making it look like a call
to an unknown function. No transform can reorder arbitrary side-effects
across such a call because doing so might change the apparent semantics of
the program. Similarly, if you need a direct use-def link between two
points in the function, you should make the first point define a token
that'll be used by the second point. No transform will break or abstract
that association because that would violate the basic structural rules of IR.
If there's a structural rule that can't be expressed with the current tools
of IR, we can talk about how to fix that --- although tokens are such a big
hammer that I'd be somewhat surprised to hear that something really can't
be expressed at all (but much less surprised to hear that the token rules
are too strong and we'd like other transforms to have more power to change
the code).

So the defining property of a good-faith optimization is that:
- there are rules which participating functions are expected to follow on
pain of undefined behavior but which LLVM IR doesn't require every function
to follow, and
- those rules will be preserved by any transform that doesn't move code
between functions and which preserves the apparent function semantics
and maintains the standard structural rules of LLVM IR.

I like these conditions, in terms of bounding what this means.

We might need to be more explicit about what "moves code" means, as we've run into fuzzy lines before around inter-procedural optimizations (e.g., return-value propagation, using the 'returned' attribute, and variants). Would it work to say "move code with potential side effects"?

Thanks again,

Hal

I think that answers your question about what sort of rules we can expect
supported_optimizations to guarantee.

John.

Piotr's proposal unfortunately doesn't give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them "brittle", but I can understand why
nobody wants to call their optimization that, so let's call them
"good-faith optimizations" instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that's what makes them structural rules. We don't want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won't be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

other transforms will preserve the apparent semantics of the function and
other transforms will maintain the standard structural rules of LLVM IR.

Ok. Just to make sure we're on the same page, if this was all there
is we would not need this attribute right? All LLVM optimizations do
need to preserve semantics and structural properties anyway?

So the defining property of a good-faith optimization is that:
- there are rules which participating functions are expected to follow on
pain of undefined behavior but which LLVM IR doesn't require every function
to follow, and
- those rules will be preserved by any transform that doesn't move code
between functions and which preserves the apparent function semantics
and maintains the standard structural rules of LLVM IR.

In other words, certain things are UB in functions tagged with
supported_optimizations that are not UB otherwise? This breaks code
hoisting transformations right? I.e.
isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
is in a function with a non-empty supported_optimizations?

-- Sanjoy

Piotr’s proposal unfortunately doesn’t give us a good name for the class
of optimizations that require being listed in supported_optimizations.
In earlier discussions I called them “brittle”, but I can understand why
nobody wants to call their optimization that, so let’s call them
“good-faith optimizations” instead since they rely on the good faith of
all the participating code.

Every optimization has to know how to maintain the structural rules of
LLVM IR; that’s what makes them structural rules. We don’t want the set of
structural rules to substantially change because such-and-such good-faith
optimization is in effect because that would require arbitrary transforms
to check the supported_optimizations list before they knew which rules to
follow. Instead, the burden is on the optimization designer to pick IR
constructs that won’t be messed up by an arbitrary transform with no special
knowledge of the optimization. The only thing the optimization designer
can rely on is this:

other transforms will preserve the apparent semantics of the function and
other transforms will maintain the standard structural rules of LLVM IR.

Ok. Just to make sure we’re on the same page, if this was all there
is we would not need this attribute right? All LLVM optimizations do
need to preserve semantics and structural properties anyway?

We need this attribute because interprocedural optimizations otherwise
break good-faith optimizations, so yes, my summary here is missing some
qualification (that I included in the next paragraph, but with a slightly
different spin). So let me restate this.

The designer of a good-faith optimization can rely on this:

  • other transforms will preserve the apparent semantics of the function,
  • other transforms will maintain the standard structural rules of LLVM IR, and
  • interprocedural transforms will honor supported_optimizations as mentioned in Piotr’s proposal — and, in particular, will intersect the supported_optimizations list whenever moving code into a function.

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

So the defining property of a good-faith optimization is that:

  • there are rules which participating functions are expected to follow on
    pain of undefined behavior but which LLVM IR doesn’t require every function
    to follow, and
  • those rules will be preserved by any transform that doesn’t move code
    between functions and which preserves the apparent function semantics
    and maintains the standard structural rules of LLVM IR.

In other words, certain things are UB in functions tagged with
supported_optimizations that are not UB otherwise? This breaks code
hoisting transformations right? I.e.
isSafeToSpeculativelyExecute(Inst) will have to return false if Inst
is in a function with a non-empty supported_optimizations?

Good question. I would consider that to be an unacceptable intrusion:
intraprocedural transforms should never have to be aware of
supported_optimizations (unless they’re implementing a good-faith
optimization, of course) and interprocedural transforms should only have
to be aware of supported_optimizations in the narrow sense outlined
by Piotr. If something about the optimization’s representation in IR
is unsafe to speculate, it should be made impossible to speculate for
standard semantic/structural reasons, like having apparently arbitrary
side-effects.

I think the right way to formalize this is to say that, while the
good-faith optimization may impose additional UB rules on the function,
it must guarantee that transforms that are well-behaved as described
above — i.e. that preserve standard structure and semantics and which,
if interprocedural, appropriately honor supported_optimizations — will
never introduce new UB.

John.

I like these conditions, in terms of bounding what this means.

We might need to be more explicit about what “moves code” means, as we’ve run into fuzzy lines before around inter-procedural optimizations (e.g., return-value propagation, using the ‘returned’ attribute, and variants). Would it work to say “move code with potential side effects”?

That’s a really good point. Applying this to our current use-case, I’m pretty
sure that both return-value propagation and the returned attribute can break
Piotr’s devirtualization optimization, so they need to intersect the caller’s
list or clear it, respectively. IIRC it used to be the case that returned
was mostly honored in the backend for liveness optimizations, i.e. long after
IR constructs like supported_optimizations have stopped being relevant, so
being very pessimistic about it is fine; but I don’t know if that’s still true.

John.

I see. Would it be correct to say that a good faith optimization `O`
assumes that a certain construct(s), `FOO`, does not occur in the
*trace* of the program within the function tagged with
supported_optimizations={O}? Also `FOO` has to be side effecting so
LLVM cannot speculate it into existence (e.g. `FOO` can't be
"getelementptr with a certain property").

-- Sanjoy

Peanut gallery says: I don’t fully understand the use-case or the domain, but this description sounds unworkable to me.

AIUI, there are two players here: the “brittle optimization” (which relies on some invariant), and the “transform” (which has the power to break that invariant).

The only two mathematically workable scenarios are:
(A) The brittle optimization’s invariant is “impossible to [break] for standard semantic/structural reasons.” Therefore no transform ever needs to know anything about it. The result is a “robust” optimization, and no need for the supported-optimizations flagset.
(B) The brittle optimization’s invariant is, in fact, brittle. Any transform that doesn’t explicitly preserve the invariant can and will break the invariant. Therefore, every transform must have its own whitelist of “invariants I know I don’t break.” Any flag in the supported-optimizations flagset which is not whitelisted by a given transform must be cleared when that transform is applied to the code. (Because, by definition, a transform that doesn’t explicitly preserve the brittle invariant must be assumed to break it.)

my $.02,
–Arthur

Peanut gallery says: I don't fully understand the use-case or the domain, but this description sounds unworkable to me.

AIUI, there are two players here: the "brittle optimization" (which relies on some invariant), and the "transform" (which has the power to break that invariant).

The only two mathematically workable scenarios are:
(A) The brittle optimization's invariant is "impossible to [break] for standard semantic/structural reasons." Therefore no transform ever needs to know anything about it. The result is a "robust" optimization, and no need for the supported-optimizations flagset.
(B) The brittle optimization's invariant is, in fact, brittle. Any transform that doesn't explicitly preserve the invariant can and will break the invariant. Therefore, every transform must have its own whitelist of "invariants I know I don't break." Any flag in the supported-optimizations flagset which is not whitelisted by a given transform must be cleared when that transform is applied to the code. (Because, by definition, a transform that doesn't explicitly preserve the brittle invariant must be assumed to break it.)

I think the overarching idea here is that good faith opts will only
rely on invariants that can only be broken by inter-procedural
optimizations. So intra-procedural opts should not need any changes.

In a sense this proposal makes function call edges semantically
special; function calls are no longer yet another kind of control
flow, but they can act as bridges between LLVM IR with different
semantics. Collapsing a function call edge is now a semantically
interesting operation.

-- Sanjoy

I see. Would it be correct to say that a good faith optimization O
assumes that a certain construct(s), FOO, does not occur in the
trace of the program within the function tagged with
supported_optimizations={O}?

I’m not sure that’s a useful way of thinking about it. Usually the assumption
will be something more like “Thing 1 does not occur after Event A but before any
occurrence of Event B”, and we can clearly mark Events A and B in the IR, but
there might be dozens of different code patterns which all qualify as Thing 1.
This would have to be a good-faith optimization because a non-cooperative
function might not mark Events A and B in the IR, so inlining that into a
cooperative function could lead to miscompilation.

Also FOO has to be side effecting so
LLVM cannot speculate it into existence (e.g. FOO can’t be
“getelementptr with a certain property”).

I don’t think “side effecting” is always the right rule, but yes:

  • If something about the sequence shouldn’t be speculated, the optimization
    should arrange it so that transforms following standard rules can’t
    speculate it. For example, if you have a language rule that guarantees
    that a certain global variable is constant past a certain point in the
    program, you can mark that point with an intrinsic, but that intrinsic
    needs to be made non-speculable, and the easiest way to achieve that is
    to ensure that it claims to have arbitrary side-effects so that code
    which doesn’t specifically know about this intrinsic won’t speculate it
    (see the sketch after this list).

  • In general, all the intrinsic calls, metadata, and so on that participate
    in the optimization should be marked in some way that’s specific to the
    optimization. Formally, this requires that arbitrary transforms that aren’t
    aware of the optimization can’t just sua sponte add optimization-tagged
    constructs to a function, as opposed to e.g. cloning or moving them from
    somewhere else, but that seems like a pretty safe assumption. :)
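A minimal sketch of the first point (the intrinsic name is invented and does not exist in LLVM):

; Hypothetical marker meaning "the global this pointer refers to is constant
; from here on". Declared with no memory or speculation attributes, generic
; transforms must treat it as having arbitrary side effects, so they will
; neither speculate it nor hoist other side effects across it.
declare void @llvm.myopt.global.frozen(i8*)

; By contrast, declaring it like this would let generic transforms speculate
; and freely move it, which is exactly what must be avoided:
;   declare void @llvm.myopt.global.frozen(i8*) readnone speculatable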

John.

Note that this has always been true in some ways. For example, naively
cloning call sites during inlining can change exception semantics; to
make this work correctly you have to reconcile the EH contexts of the
inner and outer calls, which is not always possible if the personalities
differ. Or to pick an even grosser example, you can’t just clone a call
to llvm.returnaddress during inlining.

Note also that IPO between functions with the same set of supported
optimizations also doesn’t involve any change in semantics.

John.

Skimming along, apologies if I’m repeating something which already got said.

If I understand this correctly, the basic problem we’re trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function. The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds. There’s an ambiguity about what is allowed to be assumed about code outside that universe.

I think it’s important to note that we have a precedent of something similar to this in TBAA. TBAA information coming from different modules has the same base problem. We solve it by using the “root” of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn’t be used to solve this problem?

I think the way you’re framing this is dangerous. We absolutely cannot allow any annotation of this form to weaken the semantics of the existing IR. We can and should impose a criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred. We can then design rules for how to preserve our new information as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

Skimming along, apologies if I’m repeating something which already got said.

If I understand this correctly, the basic problem we’re trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function. The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds. There’s an ambiguity about what is allowed to be assumed about code outside that universe.

I think it’s important to note that we have a precedent of something similar to this in TBAA. TBAA information coming from different modules has the same base problem. We solve it by using the “root” of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn’t be used to solve this problem?

TBAA is conservative in two ways:

  • It allows two accesses to alias if they have TBAA nodes with different roots.
  • It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn’t generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don’t know whether it actually doesn’t
transition or whether it was compiled without the transitions being explicitly
marked. There’s no way to extend the TBAA idea to make that work.
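For reference, a small sketch of the existing TBAA behavior described above (the root and type names are arbitrary): the two loads carry tags hanging off different roots, so TBAA conservatively allows them to alias.

define void @f(i32* %p, i32* %q) {
  %a = load i32, i32* %p, !tbaa !1    ; tag rooted at "module A root"
  %b = load i32, i32* %q, !tbaa !5    ; tag rooted at "module B root"
  ret void
}

!0 = !{!"module A root"}
!1 = !{!2, !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !0, i64 0}

!4 = !{!"module B root"}
!5 = !{!6, !6, i64 0}
!6 = !{!"int", !7, i64 0}
!7 = !{!"omnipotent char", !4, i64 0}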

Note that IPO is generally permitted to partially inline or outline code,
and so good-faith optimizations that e.g. require two instructions to be moved
in tandem or not at all must use tokens to establish that unbreakable
relationship.

I think the way you’re framing this is dangerous. We absolutely cannot allow any annotation of this form to weaken the semantics of the existing IR. We can and should impose a criterion that any extension of this variety strictly adds information to the IR which might not have been previously inferred. We can then design rules for how to preserve our new information as long as possible, but framing this in terms of disallowed transformations is really a non-starter.

That’s exactly what I was trying to convey here. Authors of good-faith
optimizations need to design their representations so that transformations
that know nothing about their optimizations but merely preserve semantics
and well-formed IR structure will not break their representations. The only
transforms that need to know about the existence of good-faith optimizations
are interprocedural optimizations; furthermore, those optimizations don’t
need to know about any good-faith optimizations specifically, they just need
to understand how to correctly update the supported_optimizations list.
That is a very small burden on IPO that enables an interesting class of
language-specific optimizations.

John.

Skimming along, apologies if I’m repeating something which already got said.

If I understand this correctly, the basic problem we’re trying to solve is to use a local hint (the invariant.group) to make a global assumption about other code which might exist elsewhere outside the function. The attribute proposed can basically be phrased as describing a universe of functions within which our desired global property holds. There’s an ambiguity about what is allowed to be assumed about code outside that universe.

I think it’s important to note that we have a precedent of something similar to this in TBAA. TBAA information coming from different modules has the same base problem. We solve it by using the “root” of the TBAA tree as a scope descriptor, and essentially making two TBAA nodes from distinct roots incomparable.

Can someone explain concisely why a similar scheme couldn’t be used to solve this problem?

TBAA is conservative in two ways:

  • It allows two accesses to alias if they have TBAA nodes with different roots.
  • It allows two accesses to alias if only one of them has a TBAA node.

The second is what doesn’t generalize: there are optimizations where you need to
rely on transition points being explicitly identified. Looking at a function
with no identified transition points, you don’t know whether it actually doesn’t
transition or whether it was compiled without the transitions being explicitly
marked. There’s no way to extend the TBAA idea to make that work.

The other reason why a similar scheme doesn’t work for !invariant.group is that we rely on calls to launder/strip being present for some constructs to preserve
information about the invariance of an object (like in the example from the RFC).

An additional bit to the discussion: I think that the “null-pointer-is-valid” attribute should also be handled in a similar manner - if one function calls
another one marked with “null-pointer-is-valid”, then after inlining the caller should also be marked with “null-pointer-is-valid”.
I assume this attribute was added to handle kernel or embedded code correctly, and probably there has been no use case for LTO between
modules with and without this attribute (yet).
This, unfortunately, can’t be solved with our supported_optimizations attribute, as null pointer dereference is considered to be UB by default.
We would need something like “unsupported_optimizations”, but I think this shows that this is not the last time a similar problem will arise.
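A sketch of the behavior being argued for here (the attribute is spelled as the string attribute it was at the time; whether the inliner already handles this is not addressed):

define void @callee(i32* %p) "null-pointer-is-valid"="true" {
  ; here the optimizer may not treat a null %p as giving rise to UB
  %v = load i32, i32* %p
  ret void
}

define void @caller(i32* %p) {
  call void @callee(i32* %p)
  ret void
}

; After inlining @callee, the load must not end up in a function where null
; is assumed invalid, so the caller would have to inherit the attribute:
;
; define void @caller(i32* %p) "null-pointer-is-valid"="true" { ... }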

The other reason why a similar scheme doesn't work for !invariant.group is that we rely on calls to launder/strip being present for some constructs to preserve
information about the invariance of an object (like in the example from the RFC).

I'm really not sure I buy this. You're effectively saying that you have two points which need to share a common root: 1) the "transition point" and 2) the invariant.group marker. If it were possible to mark the transition point with a metadata node - is it? - the exact same rules as used for TBAA would work just fine.

p.s. It would help me a lot if you'd spell out specific examples of transition points. I checked the RFC and don't see them.