Status of llvm.invariant.{start|end}

Hi,

From LangRef, these intrinsics seem really useful for letting LLVM
know about certain higher-level immutability guarantees, e.g. for
objects that are not allowed to be mutated after construction.
However, they don't seem to work[1], and a quick code search suggests
that not a single optimization pass currently uses them
for store-to-load forwarding; only a very few use them to eliminate
stores. The linked issue is marked as resolved-later and mentions
that they "probably have to be redesigned before they work out right".
What has to be redesigned to make them work, and is there a better way
that works currently to mark an object as immutable after a certain
point or in a certain region?

Yichao Yu

[1] https://bugs.llvm.org/show_bug.cgi?id=5441

Hi Yichao,

We at Azul have been using invariant.start for marking objects as immutable after a certain point.
Also, upstream changes to teach relevant optimizations about invariant.start and end were added
last year.

With respect to store to load forwarding, this is handled in GVN. I think the test cases in test/Transforms/GVN/invariant.start.ll
handle what you’re looking for.

Hope this helps,
Anna

Hmm, I'm pretty sure I checked that.
It seems that none of the test cases in there actually require
invariant.start for store-to-load forwarding? (They need
`invariant.start|end` to not be marked as modifying memory, but
should all work without the intrinsics.) AFAICT the simple case in the
issue I linked still doesn't work:

declare void @g(i8*)

declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture) #0

define i8 @f() {
  %a = alloca i8
  store i8 0, i8* %a
  %i = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %a)
  call void @g(i8* %a)
  %r = load i8, i8* %a
  ret i8 %r
}

attributes #0 = { argmemonly nounwind }
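
For contrast, here is a minimal sketch of the kind of case described above as already working. The function name `@simple` is made up for illustration, and this is not claimed to be a verbatim copy of anything in test/Transforms/GVN/invariant.start.ll. With no call between the store and the load, forwarding only requires that the intrinsic itself is not treated as a write to `%ptr`, so the same result would be expected with the intrinsic call deleted.

```llvm
declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture) #0

; Illustrative only: nothing between the store and the load can write %ptr,
; so the load can be replaced by 42 whether or not invariant.start is present.
define i8 @simple() {
entry:
  %ptr = alloca i8
  store i8 42, i8* %ptr
  %inv = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %ptr)
  %ret = load i8, i8* %ptr
  ret i8 %ret
}

attributes #0 = { argmemonly nounwind }
```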

On a related note, can this be marked as inaccessiblemem_or_argmemonly
plus readonly on the argument? As far as not messing with alias
analysis goes, this works very well without LLVM knowing anything
about such a function. Of course, store-to-load forwarding still
needs to be handled separately, but it seems that this could avoid some of
the special cases for these intrinsics.
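
To make the attribute suggestion concrete, here is a sketch of an ordinary function declared that way. The name `@hypothetical.immutable.marker` is invented for this example and is not an LLVM intrinsic. The idea is that generic alias analysis can tell from the attributes alone that the call does not write through `%a`, so no special-casing is needed for the marker itself.

```llvm
; Hypothetical front-end-defined marker; the name is an assumption.
; readonly on the pointer argument plus inaccessiblemem_or_argmemonly on the
; function together say: this call never writes memory visible through %a.
declare void @hypothetical.immutable.marker(i8* nocapture readonly) #0

define i8 @f_marker() {
  %a = alloca i8
  store i8 0, i8* %a
  call void @hypothetical.immutable.marker(i8* %a)
  %r = load i8, i8* %a          ; the marker call does not clobber this load
  ret i8 %r
}

attributes #0 = { inaccessiblemem_or_argmemonly nounwind }
```

This only keeps the marker from blocking optimizations. It says nothing about later calls such as `@g`, which is the part that still needs separate handling.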

Yes, you’re right. We don’t forward across the call even in the presence of invariant.start.

By definition, invariant.start represents constant physical memory, so I would think this is a legal transform
to do. Of course, this can lead to miscompiles if g is a special function that can modify %a in some way,
but those are things the front end needs to identify.

Anna

Just to clarify: If g can modify %a in some way, the front end needs to identify it and avoid adding
invariant.start.

Once the invariant.start has been added by the front end, this is a perfectly legal transform to do
based on the LLVM spec.

In fact we should be implementing this in LLVM, patches welcome :)
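
As a sketch of what that transform would produce for the `@f` example from earlier in the thread (the function is renamed `@f_after` here for illustration), the load is replaced by the stored value. This is the desired output under the reading of the spec given above, not something LLVM is reported to do in this thread.

```llvm
declare void @g(i8*)
declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture) #0

; Expected result if the store were forwarded across the call to @g:
; the memory at %a is invariant once invariant.start has executed, so the
; load must observe the stored 0 and can be folded away.
define i8 @f_after() {
  %a = alloca i8
  store i8 0, i8* %a
  %i = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %a)
  call void @g(i8* %a)
  ret i8 0
}

attributes #0 = { argmemonly nounwind }
```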

> Just to clarify: If g can modify %a in some way, the front end needs to
> identify it and avoid adding invariant.start.

OK cool, that's what I thought too.

> Once the invariant.start has been added by the front end, this is a perfectly
> legal transform to do based on the LLVM spec.
>
> In fact we should be implementing this in LLVM, patches welcome :)

Which is kind of what I was asking about in the original post. The
issue suggests that something needs rework, and I'm not sure what
exactly it is referring to.

I am pretty interested in this optimization, so I would have submitted
a patch if I knew how. I'm not really sure where this should be
implemented so that all optimization passes can make use of it (AA?)
or how the information should be translated (I wasn't able to find
anything similar to this in AA).

I think what we need is AA along with dominance information. Regarding an analysis pass that can be
used in various transformation passes, perhaps MemorySSA can help with this sort of information?
CC’ed folks who may be able to help here.

In the example you have, when checking for clobbering between the call to g and the load of a,
we know that the call to g does not clobber the load if the invariant.start (which uses a) dominates the call to g
and the invariant.start has no uses (that is, no invariant.end consumes its result). If the invariant.start has uses,
we’ll need to check the dominance information of each use w.r.t. the call to g as well. However, this sort of analysis
does not scale well.
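
A sketch of why the uses matter: in the variant below (the function name `@f_with_end` is made up), the result of invariant.start is consumed by an invariant.end that executes before the call to g. The invariant region is already closed when g runs, so g may write `%a` and the load cannot be replaced by the stored value.

```llvm
declare void @g(i8*)
declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)
declare void @llvm.invariant.end.p0i8({}*, i64, i8* nocapture)

define i8 @f_with_end() {
  %a = alloca i8
  store i8 0, i8* %a
  %i = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %a)
  ; %i has a use: the region ends here, before the call to @g.
  call void @llvm.invariant.end.p0i8({}* %i, i64 1, i8* %a)
  call void @g(i8* %a)          ; may legally modify %a
  %r = load i8, i8* %a          ; must stay a real load
  ret i8 %r
}
```

If the invariant.end came after the load instead, the call to g would sit inside the region and forwarding would be legal again, which is the per-use dominance check described above.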

AFAIK, the only transformation where we actually use the knowledge of invariant.start is within LICM for hoisting loads.
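
For illustration, this is the shape of loop where that knowledge would matter. It is a sketch with made-up names (`@sum`, `%p`, `%n`), and it makes no claim about which LLVM versions actually hoist the load. The call to g would normally be assumed to clobber `%p` on every iteration, but the invariant.start in the entry block dominates the loop and has no uses.

```llvm
declare void @g(i8*)
declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)

define i32 @sum(i8* %p, i32 %n) {
entry:
  %inv = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %p)
  br label %loop

loop:
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
  %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  call void @g(i8* %p)
  %v = load i8, i8* %p          ; same value every iteration: hoisting candidate
  %v.ext = zext i8 %v to i32
  %acc.next = add i32 %acc, %v.ext
  %i.next = add i32 %i, 1
  %cont = icmp ult i32 %i.next, %n
  br i1 %cont, label %loop, label %exit

exit:
  ret i32 %acc.next
}
```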

Anna