invariant.load metadata semantics

I’m working on enhancing EarlyCSE to use MemorySSA and have come across the following issue due to differences in EarlyCSE and MemorySSA’s handling of !invariant.load. EarlyCSE will not currently optimize the following code by replacing %x2 with %x and removing the second load:

B1:

%x = load %p

clobber()

B2: // dominated by B1

%x2 = load %p !invariant.load

Sanjoy (who added the !invariant.load support to EarlyCSE) and I discussed this, and I believe we are both in agreement that this optimization should be legal. I’d like to make sure there is agreement on this and possibly clarify the LangRef wording on !invariant.load to make the legality of this transformation more clear.

Sanjoy suggested the following:

Instead of “The existence of the !invariant.load metadata on the instruction tells the optimizer and code generator that the address operand to this load points to memory which can be assumed unchanged.” we say “It is undefined behavior to invariant_load from a location that has been changed since it became dereferenceable”. In the current langref, I find “The existence” somewhat confusing, since it seems to imply that adding dead code can change the behavior of the program.

I don’t want to specify the semantics in a way that:

int* ptr = …
int k0 = *ptr; // normal load
clobber();
int k1 = *ptr; // normal load

has a different meaning than

int* ptr = …
int k0 = *ptr; // normal load
clobber();
int k1 = *ptr; // normal load
if () {
int k2 = *ptr; // !invariant load
}

That is, adding dead code should not change the behavior of the
program – the code guarded by () should be able to have
any amount of junk without breaking the program, since it does not
actually execute.

Does this seem like a clearer wording of the intended semantics?

From: "Geoff Berry via llvm-dev" <llvm-dev@lists.llvm.org>
To: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, August 25, 2016 2:23:01 PM
Subject: [llvm-dev] invariant.load metadata semantics

I'm working on enhancing EarlyCSE to use MemorySSA and have come
across the following issue due to differences in EarlyCSE and
MemorySSA's handling of !invariant.load. EarlyCSE will *not*
currently optimize the following code by replacing %x2 with %x and
removing the second load:

> B1:

> %x = load %p

> clobber()

> ...

> B2: // dominated by B1

> %x2 = load %p !invariant.load

Sanjoy (who added the !invariant.load support to EarlyCSE) and I
discussed this, and I believe we are both in agreement that this
optimization should be legal. I'd like to make sure there is
agreement on this and possibly clarify the LangRef wording on
!invariant.load to make the legality of this transformation more
clear.

Sanjoy suggested the following:

> Instead of "The existence of the !invariant.load metadata on the
> instruction tells the optimizer and code generator that the address
> operand to this load points to memory which can be assumed
> unchanged." we say "It is undefined behavior to invariant_load from
> a location that has been changed since it became dereferenceable".
> In the current langref, I find "The existence" somewhat confusing,
> since it seems to imply that adding dead code can change the
> behavior of the program.

> I don't want to specify the semantics in a way that:

> int* ptr = ...

> int k0 = *ptr; // normal load

> clobber();

> int k1 = *ptr; // normal load

> has a different meaning than

> int* ptr = ...

> int k0 = *ptr; // normal load

> clobber();

> int k1 = *ptr; // normal load

> if (<always false>) {

> int k2 = *ptr; // !invariant load

> }

> That is, adding dead code should not change the behavior of the

> program -- the code guarded by (<always false>) should be able to
> have

> any amount of junk without breaking the program, since it does not

> actually execute.

I agree.

Regarding the proposed text, I find the "since it became dereferenceable" phrase ambiguous. Further, I think we can say something stronger: Storing into a location previously loaded using a load tagged with !invariant.load is undefined behavior.

-Hal

Hi Hal,

> I agree.
>
> Regarding the proposed text, I find the "since it became
> dereferenceable" phrase ambiguous. Further, I think we can say something
> stronger: Storing into a location previously loaded using a load tagged
> with !invariant.load is undefined behavior.

That prevents doing the optimization Geoff suggested:

   int k = *ptr;
   clobber();
   int k2 = *ptr; // invariant

==>

   int k = *ptr;
   clobber();
   int k2 = k;

since clobber(), given what you said, could have legitimately changed
the contents of ptr.

-- Sanjoy

From: "Hal Finkel via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Geoff Berry" <gberry@codeaurora.org>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, August 25, 2016 3:05:48 PM
Subject: Re: [llvm-dev] invariant.load metadata semantics

> From: "Geoff Berry via llvm-dev" <llvm-dev@lists.llvm.org>

> To: "llvm-dev" <llvm-dev@lists.llvm.org>

> Sent: Thursday, August 25, 2016 2:23:01 PM

> Subject: [llvm-dev] invariant.load metadata semantics

> I'm working on enhancing EarlyCSE to use MemorySSA and have come
> across the following issue due to differences in EarlyCSE and
> MemorySSA's handling of !invariant.load. EarlyCSE will *not*
> currently optimize the following code by replacing %x2 with %x and
> removing the second load:

> > B1:
>

> > %x = load %p
>

> > clobber()
>

> > ...
>

> > B2: // dominated by B1
>

> > %x2 = load %p !invariant.load
>

> Sanjoy (who added the !invariant.load support to EarlyCSE) and I
> discussed this, and I believe we are both in agreement that this
> optimization should be legal. I'd like to make sure there is
> agreement on this and possibly clarify the LangRef wording on
> !invariant.load to make the legality of this transformation more
> clear.

> Sanjoy suggested the following:

> > Instead of "The existence of the !invariant.load metadata on the
> > instruction tells the optimizer and code generator that the
> > address
> > operand to this load points to memory which can be assumed
> > unchanged." we say "It is undefined behavior to invariant_load
> > from
> > a location that has been changed since it became
> > dereferenceable".
> > In the current langref, I find "The existence" somewhat
> > confusing,
> > since it seems to imply that adding dead code can change the
> > behavior of the program.
>

> > I don't want to specify the semantics in a way that:
>

> > int* ptr = ...
>

> > int k0 = *ptr; // normal load
>

> > clobber();
>

> > int k1 = *ptr; // normal load
>

> > has a different meaning than
>

> > int* ptr = ...
>

> > int k0 = *ptr; // normal load
>

> > clobber();
>

> > int k1 = *ptr; // normal load
>

> > if (<always false>) {
>

> > int k2 = *ptr; // !invariant load
>

> > }
>

> > That is, adding dead code should not change the behavior of the
>

> > program -- the code guarded by (<always false>) should be able to
> > have
>

> > any amount of junk without breaking the program, since it does
> > not
>

> > actually execute.
>

I agree.

Regarding the proposed text, I find the "since it became
dereferenceable" phrase ambiguous. Further, I think we can say
something stronger: Storing into a location previously loaded using
a load tagged with !invariant.load is undefined behavior.

Alternatively, we might phrase this as: The optimizer may assume that all values loaded from a location, where any of the loads are tagged with !invariant.load, are identical.

This has the benefit of covering the fact that no outside entity (i.e. the operating system) changes the value, and that we can change it, but only to the same value it had before (if we'll later be able to observe the difference).

-Hal

From: "Sanjoy Das" <sanjoy@playingwithpointers.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Geoff Berry" <gberry@codeaurora.org>, "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, August 25, 2016 3:11:27 PM
Subject: Re: [llvm-dev] invariant.load metadata semantics

Hi Hal,

> I agree.
>
> Regarding the proposed text, I find the "since it became
> dereferenceable" phrase ambiguous. Further, I think we can say
> something
> stronger: Storing into a location previously loaded using a load
> tagged
> with !invariant.load is undefined behavior.

That prevents doing the optimization Geoff suggested:

   int k = *ptr;
   clobber();
   int k2 = *ptr; // invariant

==>

   int k = *ptr;
   clobber();
   int k2 = k;

since clobber(), given what you said, could have legitimately changed
the contents of ptr.

I agree this is undesirable; see my second suggestion :wink:

-Hal

From: "Hal Finkel" <hfinkel@anl.gov>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>, "Geoff Berry"
<gberry@codeaurora.org>
Sent: Thursday, August 25, 2016 3:12:08 PM
Subject: Re: [llvm-dev] invariant.load metadata semantics

> From: "Hal Finkel via llvm-dev" <llvm-dev@lists.llvm.org>

> To: "Geoff Berry" <gberry@codeaurora.org>

> Cc: "llvm-dev" <llvm-dev@lists.llvm.org>

> Sent: Thursday, August 25, 2016 3:05:48 PM

> Subject: Re: [llvm-dev] invariant.load metadata semantics

> > From: "Geoff Berry via llvm-dev" <llvm-dev@lists.llvm.org>
>

> > To: "llvm-dev" <llvm-dev@lists.llvm.org>
>

> > Sent: Thursday, August 25, 2016 2:23:01 PM
>

> > Subject: [llvm-dev] invariant.load metadata semantics
>

> > I'm working on enhancing EarlyCSE to use MemorySSA and have come
> > across the following issue due to differences in EarlyCSE and
> > MemorySSA's handling of !invariant.load. EarlyCSE will *not*
> > currently optimize the following code by replacing %x2 with %x
> > and
> > removing the second load:
>

> > > B1:
> >
>

> > > %x = load %p
> >
>

> > > clobber()
> >
>

> > > ...
> >
>

> > > B2: // dominated by B1
> >
>

> > > %x2 = load %p !invariant.load
> >
>

> > Sanjoy (who added the !invariant.load support to EarlyCSE) and I
> > discussed this, and I believe we are both in agreement that this
> > optimization should be legal. I'd like to make sure there is
> > agreement on this and possibly clarify the LangRef wording on
> > !invariant.load to make the legality of this transformation more
> > clear.
>

> > Sanjoy suggested the following:
>

> > > Instead of "The existence of the !invariant.load metadata on
> > > the
> > > instruction tells the optimizer and code generator that the
> > > address
> > > operand to this load points to memory which can be assumed
> > > unchanged." we say "It is undefined behavior to invariant_load
> > > from
> > > a location that has been changed since it became
> > > dereferenceable".
> > > In the current langref, I find "The existence" somewhat
> > > confusing,
> > > since it seems to imply that adding dead code can change the
> > > behavior of the program.
> >
>

> > > I don't want to specify the semantics in a way that:
> >
>

> > > int* ptr = ...
> >
>

> > > int k0 = *ptr; // normal load
> >
>

> > > clobber();
> >
>

> > > int k1 = *ptr; // normal load
> >
>

> > > has a different meaning than
> >
>

> > > int* ptr = ...
> >
>

> > > int k0 = *ptr; // normal load
> >
>

> > > clobber();
> >
>

> > > int k1 = *ptr; // normal load
> >
>

> > > if (<always false>) {
> >
>

> > > int k2 = *ptr; // !invariant load
> >
>

> > > }
> >
>

> > > That is, adding dead code should not change the behavior of the
> >
>

> > > program -- the code guarded by (<always false>) should be able
> > > to
> > > have
> >
>

> > > any amount of junk without breaking the program, since it does
> > > not
> >
>

> > > actually execute.
> >
>

> I agree.

> Regarding the proposed text, I find the "since it became
> dereferenceable" phrase ambiguous. Further, I think we can say
> something stronger: Storing into a location previously loaded using
> a load tagged with !invariant.load is undefined behavior.

Alternatively, we might phrase this as: The optimizer may assume that
all values loaded from a location, where any of the loads are tagged
with !invariant.load, are identical.

This has the benefit of covering the fact that no outside entity
(i.e. the operating system) changes the value, and that we can
change it, but only to the same value it had before (if we'll later
be able to observe the difference).

Some questions: Do we allow stores to these locations at all? Only if the value is the same? Must any change be observable to be a problem? Do atomic loads of invariant locations really need to be atomic?

-Hal

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Hal Finkel via llvm-dev
Subject: Re: [llvm-dev] invariant.load metadata semantics

Alternatively, we might phrase this as: The optimizer may assume that all values loaded
from a location, where any of the loads are tagged with !invariant.load, are identical.

This would seem to limit the usefulness of the invariant attribute. I would expect invariant to indicate that loads reachable from the one marked invariant are guaranteed to read the same value, but that prior ones are not. This would allow updates to be made to the location of interest up to the point of declared invariance, but not after.

This has the benefit of covering the fact that no outside entity (i.e. the operating system)
changes the value, and that we can change it, but only to the same value it had before (if
we'll later be able to observe the difference).

Some questions: Do we allow stores to these locations at all? Only if the value is the same?
Must any change be observable to be a problem?

An alternative semantic is that a store would remove the invariant attribute for subsequent loads.

Do atomic loads of invariant locations really need to be atomic?

Ordering would still seem to be important, unless invariant applies function-wide with no reachability concerns.

- Chuck

> From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Hal Finkel via llvm-dev
> Subject: Re: [llvm-dev] invariant.load metadata semantics

> Alternatively, we might phrase this as: The optimizer may assume that
all values loaded
> from a location, where any of the loads are tagged with !invariant.load,
are identical.

This would seem to limit the usefulness of the invariant attribute. I
would expect invariant to indicate that loads reachable from the one marked
invariant are guaranteed to read the same value, but that prior ones are
not.

Define "reachable".

Do you mean there is any path from the load to the new load?

If so, what about loop backedges?

:slight_smile:

This is one of the reasons they are explicitly marked.

This would allow updates to be made to the location of interest up to the
point of declared invariance, but not after.

This has the same problem.

for loop:

  store a
  load a !invariant.load

Legal or not if we can prove the loop iterates >1 time?
What does it mean?

From: "Charles R Caldarale" <Chuck.Caldarale@unisys.com>
To: "Hal Finkel" <hfinkel@anl.gov>, llvm-dev@lists.llvm.org
Sent: Thursday, August 25, 2016 3:46:17 PM
Subject: RE: [llvm-dev] invariant.load metadata semantics

> From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf
> Of Hal Finkel via llvm-dev
> Subject: Re: [llvm-dev] invariant.load metadata semantics

> Alternatively, we might phrase this as: The optimizer may assume
> that all values loaded
> from a location, where any of the loads are tagged with
> !invariant.load, are identical.

This would seem to limit the usefulness of the invariant attribute.
I would expect invariant to indicate that loads reachable from the
one marked invariant are guaranteed to read the same value, but that
prior ones are not. This would allow updates to be made to the
location of interest up to the point of declared invariance, but not
after.

> This has the benefit of covering the fact that no outside entity
> (i.e. the operating system)
> changes the value, and that we can change it, but only to the same
> value it had before (if
> we'll later be able to observe the difference).

> Some questions: Do we allow stores to these locations at all? Only
> if the value is the same?
> Must any change be observable to be a problem?

An alternative semantic is that a store would remove the invariant
attribute for subsequent loads.

I don't see how this would be useful. If we could prove the lack of intervening stores, then we'd not need the metadata.

-Hal

From: "Charles R Caldarale" <Chuck.Caldarale@unisys.com>
To: "Hal Finkel" <hfinkel@anl.gov>, llvm-dev@lists.llvm.org
Sent: Thursday, August 25, 2016 3:46:17 PM
Subject: RE: [llvm-dev] invariant.load metadata semantics

> From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf
> Of Hal Finkel via llvm-dev
> Subject: Re: [llvm-dev] invariant.load metadata semantics

> Alternatively, we might phrase this as: The optimizer may assume
> that all values loaded
> from a location, where any of the loads are tagged with
> !invariant.load, are identical.

This would seem to limit the usefulness of the invariant attribute.
I would expect invariant to indicate that loads reachable from the
one marked invariant are guaranteed to read the same value, but that
prior ones are not. This would allow updates to be made to the
location of interest up to the point of declared invariance, but not
after.

No, this is a different use case, and we invariant.group/invariant.group.barrier and invariant.start/invariant.end for that.

-Hal

Hi Hal,

> Alternatively, we might phrase this as: The optimizer may assume that
> all values loaded from a location, where any of the loads are tagged
> with !invariant.load, are identical.

I'm okay if we s/any of the loads/any of the loads that execute at
runtime/. Otherwise we end up in the same "dead code changes
semantics" soup.

> This has the benefit of covering the fact that no outside entity (i.e.
> the operating system) changes the value, and that we can change it, but
> only to the same value it had before (if we'll later be able to observe
> the difference).

Not sure if we gain much b allowing "changing it to the same value"
(except perhaps "invariant_load(ptr); *ptr = 40" is not provably UB),
but I don't mind it either.

-- Sanjoy

From: Daniel Berlin [mailto:dberlin@dberlin.org]
Subject: Re: [llvm-dev] invariant.load metadata semantics

> This would seem to limit the usefulness of the invariant attribute. I would expect invariant
> to indicate that loads reachable from the one marked invariant are guaranteed to read the same
> value, but that prior ones are not.

Define "reachable".

Do you mean there is any path from the load to the new load?

Yes.

If so, what about loop backedges?

Depends on the details of the desired semantic definition. The conservative approach would be to drop the invariance; could also declare the situation to be UB and perhaps catch it in the verifier.

> This would allow updates to be made to the location of interest up to the point of declared
> invariance, but not after.

This has the same problem.

for loop:
store a
load a !invariant.load

Legal or not if we can prove the loop iterates >1 time?
What does it mean?

See above. My (slight) preference is to treat it as UB, but that might be a bit strong, considering this is metadata.

- Chuck

From: Hal Finkel [mailto:hfinkel@anl.gov]
Subject: Re: [llvm-dev] invariant.load metadata semantics

> I would expect invariant to indicate that loads reachable from the
> one marked invariant are guaranteed to read the same value, but that
> prior ones are not. This would allow updates to be made to the
> location of interest up to the point of declared invariance, but not
> after.

No, this is a different use case, and we invariant.group/invariant.group.barrier and
invariant.start/invariant.end for that.

Good point; I'd forgotten about those.

- Chuck

Hi Hal,

> Some questions: Do we allow stores to these locations at all? Only if

I'd vote for disallowing stores to these locations, but if "stores
allowed only if the value is the same" is helpful in some situation
then I don't have specific reasons why that would be problematic.

> the value is the same? Must any change be observable to be a problem? Do

Not sure what you mean by "Must any change be observable".

> atomic loads of invariant locations really need to be atomic?

It depends on the answer to "Do we allow stores to these locations at
all?". If we don't allow stores to these locations at all then atomic
loads are not required, since we can't have racing stores to that
location.

However, syntactically, I'd be tempted to allow invariant loads to be
atomic; and maybe have a later pass strip out the atomic bit if the
semantics we decide allow that.

-- Sanjoy

Hi Charles,

>> From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Hal Finkel via llvm-dev
>> Subject: Re: [llvm-dev] invariant.load metadata semantics
>
>> Alternatively, we might phrase this as: The optimizer may assume that all values loaded
>> from a location, where any of the loads are tagged with !invariant.load, are identical.
>

> This would seem to limit the usefulness of the invariant attribute. I would expect invariant to indicate that loads reachable from the one marked invariant are guaranteed to read the same value, but that prior ones are not. This would allow updates to be made to the location of interest up to the point of declared invariance, but not after.

That is a very defensible point, but depending on your use case you
consider these "initial" writes part of making the memory
dereferenceable. For instance, for us (JVM) loading array lengths
satisfy the requirements of invariant loads, and we bake in the length
of the array as part of the array's allocation routine. LLVM never
"sees" a store the the length field of an array.

However, I can imagine more complex use cases for which the above
approach won't work.

-- Sanjoy

To be specific, does this imply "if we see a store, we can assume it is of
the same value the load already has"?

IE if i have:

func()
{
load a, invariant.load !1
<use of a>
store a, 5
}

Can i assume that a in <use of a> has the value 5?

From: "Sanjoy Das" <sanjoy@playingwithpointers.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Thursday, August 25, 2016 6:09:14 PM
Subject: Re: [llvm-dev] invariant.load metadata semantics

Hi Hal,

> Some questions: Do we allow stores to these locations at all? Only
> if

I'd vote for disallowing stores to these locations, but if "stores
allowed only if the value is the same" is helpful in some situation
then I don't have specific reasons why that would be problematic.

I would as well; and from what I understand, this is consistent with current use cases.

> the value is the same? Must any change be observable to be a
> problem? Do

Not sure what you mean by "Must any change be observable".

If we allow stores, if we have an invariant load from some location, and then a store to that location (value arbitrary), is that a problem if the IR being analyzed never loads from it again? I don't just mean dead stores: just because the IR in question does not load from it again, that does not mean that "the system" doesn't.

> atomic loads of invariant locations really need to be atomic?

It depends on the answer to "Do we allow stores to these locations at
all?". If we don't allow stores to these locations at all then
atomic
loads are not required, since we can't have racing stores to that
location.

However, syntactically, I'd be tempted to allow invariant loads to be
atomic; and maybe have a later pass strip out the atomic bit if the
semantics we decide allow that.

I agree. We should allow atomic loads to be marked as !invariant.load. We might, if we decide on semantics that allow it, canonicalize by demoting to a non-atomic load.

-Hal

I disagree. It’s much easier to write fronted and use invariant load when you can emit stores that store the same value.
For example with invariant.group used on vptrs optimizer can assume the 2 loads load the same vtable, even when you can write code in c++ that is valid that store vptr after calling virtual function (like placement new(this) with the same type).

Hi Daniel,

Daniel Berlin wrote:
> To be specific, does this imply "if we see a store, we can assume it is
> of the same value the load already has"?
>
> IE if i have:
>
> func()
> {
> load a, invariant.load !1
> <use of a>
> store a, 5
> }
>
> Can i assume that a in <use of a> has the value 5?

Yes, that is what I had in mind. Moreover, in

func()
{
   int k = load a, invariant.load !1
   print(k);
   store a, 5
}

k can be optimized to 5, in a form of "time traveling store
forwarding" :). The store (at least if non-atomic and non-volatile)
is also trivially dead.

But I'd rather not allow stores at all.

-- Sanjoy

Hi Piotr,

Piotr Padlewski wrote:
> I disagree. It's much easier to write fronted and use invariant load
> when you can emit stores that store the same value.
> For example with invariant.group used on vptrs optimizer can assume the
> 2 loads load the same vtable, even when you can write code in c++ that
> is valid that store vptr after calling virtual function (like placement
> new(this) with the same type).

That's a good point -- I didn't have this use case in mind.

-- Sanjoy