[RFC] Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

TL;DR: A pointer stored in memory is not necessarily captured, let's add a way to express this.

Phab: ⚙ D93189 Introduce the `!nocapture` metadata and "nocapture_use" operand bundle

--- Commit Message / Rational ---

Runtime functions, as well as regular functions, might require a pointer
to be passed in memory even though the memory is simply a means to pass
(multiple) arguments. That is, the indirection through memory is only
used on the call edge and not otherwise relevant. However, such pointers
are currently assumed to escape as soon as they are stored in memory
even if the callee only reloads them and use them in a "non-escaping" way.
Generally, storing a pointer might not cause it to escape if all "uses of
  the memory" it is stored to all have the "nocapture" property.

To allow optimizations in the presence of pointers stored to memory we
introduce two new IR extensions. `!nocapture` metadata on stores and
"nocapture_use" operand bundles for call(base) instructions. The former
ensures that the store can be ignored for the purpose of escape
analysis. The latter indicates that a call is using a pointer value
but not capturing it. This is important as the call might still read
or write the pointer and since the passing of the pointer through
memory is not considered "capturing" with the "nocapture" metadata,
we need to otherwise indicate the potential read/write.

As an example use case where we can deduce `!nocapture` metadata,
consider the following code:

struct Payload {
   int *a;
   double *b;
};

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                     void *(*start_routine) (void *), void *arg);

int use(double);

void fn(void *v) {
   Payload *p = (Payload*)(v);
   // Load the pointers from the payload and then dereference them,
   // this will not capture the pointers.
   int *a = p->a;
   double *b = p->b;
   *a = use(*b);
}

void foo(int *a, double *b) {
   Payload p = {a, b};
   pthread_create(..., &fn, &p);
}

Given the usage of the payload struct in `fn` we can conclude neither
`a` nor `b` in are captured in `foo`, however we could not express this
fact "locally" before. That is, we can deduce and annotate it for the
arguments `a` and `b` but only since there is no other use (later on).
Similarly, if the callee would not be known, we were not able to
describe the "nocapture" behavior of the API.

A follow up patch will introduce `!nocapture` metadata to stores
generated during OpenMP lowering. This will, among other things, fix
PR48475. I generally expect us to find more APIs that could benefit from
the annotation in addition to the deduction we can do if we see the callee.

Hey Johannes,

For the !nocapture/nocapture_use optimizations, what is the intended semantics in the presence of interprocedural analysis, such as inlining? It sounds like the !nocapture is tied to the lifetime of the alloca, but the last use of the value loaded is always, necessarily, going to come after the last use of the alloca (to load the value), and thus seems insufficient to cover the lifetime of the uses of that stored pointer.

Hi Jameson,

thanks for the input.

Hey Johannes,

From my recollection of past conversations with you, in the example with
`foo`, shouldn't there be a `pthread_join` call also represented in the
documentation, so that the lifetime of the child call is shown to not
escape the scope of the parent stack frame?

Absolutely, or we would need to allocate `p` on the heap. The
underlying idea is not impacted by this though.

For the !nocapture/nocapture_use optimizations, what is the intended
semantics in the presence of interprocedural analysis, such as inlining? It
sounds like the !nocapture is tied to the lifetime of the alloca, but the
last use of the value loaded is always, necessarily, going to come after
the last use of the alloca (to load the value), and thus seems insufficient
to cover the lifetime of the uses of that stored pointer.

Good point. I think we have a conservatively correct inliner:
Compiler Explorer

That is, if you see operand bundles, don't inline. This is, in general,
necessary. Not to say we cannot teach the inliner how to handle some
of them. The most important use cases for the new features would be to
bridge API boundaries through which we cannot inline though. If we have
the definition of the callee we could inline, or (internalize and) do
argument promotion. However, if we have runtime calls we can't. Things
like the OpenMP reduction API take all the pointers that will be reduced
into, thereby capturing them. We cannot inline or promote the passed
struct without static linking of the library, which is often not a (good)
option.

Does this make sense?

~ Johannes

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all. So, as a workaround you introduce a use which implicitly reads and writes. But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.

Artur

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.

Correct.

  So, as a workaround you introduce a use which implicitly reads and writes.

Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.

Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .

Note that we do not inline a call with an "unkown" operand bundle, so there is no fear we
accidentally produce such a situation as you pointed out. A "proper" version of the example
would be:

a = new
store a, ptr, !nocapture
call foo(ptr, a) !nocapture_use(a)

void foo(arg_ptr. arg_a) {
   a' = load arg_ptr
   v1 = load arg_a
  ...
}

which should be OK.

Does that make sense?

~ Johannes

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.

Correct.

So, as a workaround you introduce a use which implicitly reads and writes.

Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.

Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .

My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?
b) Does the example above violate this statement somehow?

Basically, what am I doing wrong that I get a miscompile on this example?

Artur

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.

Correct.

  So, as a workaround you introduce a use which implicitly reads and writes.

Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.

Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .

My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).

Basically, what am I doing wrong that I get a miscompile on this example?

You don't get the clobber because you did not explicitly mark the use of `%a` in `%a'`.

WDYT?

~ Johannes

Hi Artur,

I’m a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.

Correct.

So, as a workaround you introduce a use which implicitly reads and writes.

Correct, for now. We could add “readonly”/“writeonly” etc. later on.

But this might be a more general problem. For example:

a = new …
store a, ptr, !nocapture
a’ = load ptr
; Now you have 2 pointers to the same object (a’ and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a’
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.

Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .

My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"!nocapture metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using !nocapture, there are no uses and “all uses” are explicitly
marked. If you do load it, you should make sure the use is “explicitly marked otherwise”
because you do not get a “dependence edge” from the store %a %ptr !nocapture to %a' = load %ptr.
In your example, that explicit use is missing. So you load ptr but that instruction is
not annotated with an explicit use of a. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: %a' = load %ptr ["nocapture_use"(%a)],
which would correspond to the intended use call @f(%ptr) ["nocapture_use"(%a)]. That way you would
be able to communicate %a through memory (here %ptr) without causing it to be captured.
(This assumes you ensured %a' is not captured.)

I think we could require !nocapture to be used with “nocapture_use” but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.

So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.

Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?

a = new …
store a, ptr, !nocapture
a’ = load ptr
s = select c, a’, some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).

The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
https://www.youtube.com/watch?v=WHiU2-h_kRM
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?

Artur

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

  So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.

Not necessarily an alias but a use of the original pointer. More below.

Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).

Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?

You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it: https://reviews.llvm.org/D93189?id=311479 I'll clean it up and put it on phab again soon.

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway. If we would want to use `!nocapture`
more fine-grained we could try to come up with a way to tie it to a "nocapture_use", but that would
certainly make it more complicated.

~ Johannes

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, see https://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.

Not necessarily an alias but a use of the original pointer. More below.

In general case we might have a pointer which is not the original pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get

Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).

Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?

You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread: https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it: https://reviews.llvm.org/D93189?id=311479 I'll clean it up and put it on phab again soon.

In general we want to have a way to express the results of some more powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example, your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.

The unfortunately property of this approach is that we can’t simply compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms to do store load forwarding).

Artur

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

  So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.

Not necessarily an alias but a use of the original pointer. More below.

In general case we might have a pointer which is not the original pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get

Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).

Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?

You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean it up and put it on phab again soon.

In general we want to have a way to express the results of some more powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example, your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are independent of the
surrounding. This is crucial because it avoids the scaling problem that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As part of the Fortran
alias discussions and the AA monthly working group, we have seen how problematic large
amounts of `noalias` metadata can be. You basically start listing edges in your
alias/dependence graph at some point. We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.

The unfortunately property of this approach is that we can’t simply compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis, which I agree. That seems
to be not a problem on it's own. Do you have an alternative proposal that would subsume this
approach?

Partially related: Maybe you want to give a short presentation in our next AA call about your
analysis and the ideas you have in this space:

~ Johannes

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I’m a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add “readonly”/“writeonly” etc. later on.

But this might be a more general problem. For example:

a = new …
store a, ptr, !nocapture
a’ = load ptr
; Now you have 2 pointers to the same object (a’ and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a’
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"!nocapture metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using !nocapture, there are no uses and “all uses” are explicitly
marked. If you do load it, you should make sure the use is “explicitly marked otherwise”
because you do not get a “dependence edge” from the store %a %ptr !nocapture to %a' = load %ptr.
In your example, that explicit use is missing. So you load ptr but that instruction is
not annotated with an explicit use of a. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: %a' = load %ptr ["nocapture_use"(%a)],
which would correspond to the intended use call @f(%ptr) ["nocapture_use"(%a)]. That way you would
be able to communicate %a through memory (here %ptr) without causing it to be captured.
(This assumes you ensured %a' is not captured.)

I think we could require !nocapture to be used with “nocapture_use” but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.

Not necessarily an alias but a use of the original pointer. More below.

In general case we might have a pointer which is not the original pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new …
store a, ptr, !nocapture
a’ = load ptr
s = select c, a’, some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new …
store a, ptr, !nocapture
a’ = load ptr
gep = gep a’, offset
v1 = load get

Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new …
store a, ptr, !nocapture
a’ = load ptr
s = select c, a’, some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).

Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?

You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I’ll clean it up and put it on phab again soon.

In general we want to have a way to express the results of some more powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example, your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias scopes that is not
“indirection through metadata”. !nocapture + "nocapture_use" are independent of the
surrounding. This is crucial because it avoids the scaling problem that comes with the
bidirectional approach of noalias.scopes + !noalias metadata. As part of the Fortran
alias discussions and the AA monthly working group, we have seen how problematic large
amounts of noalias metadata can be. You basically start listing edges in your
alias/dependence graph at some point.

It sounds like it’s a matter of representation. Conceptually in both cases we are explicitly providing external
aliasing facts. Note that with the proposed !nocapture + "nocapture_use” scheme every user of a !nocapture
stored pointer needs to be annotated.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach “nocapture_use” for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.

The unfortunately property of this approach is that we can’t simply compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis, which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal that would subsume this

approach?

To me the fact that you need to annonate every single user of a !nocapture stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current wording of the proposal doesn’t address these
issues.

How far can we get with an approach which is conservatively correct without explicit "nocapture_use”
annotations?

Currently there are 2 places where BasicAA uses isNonEscapingLocalObject:

If we were to support nocapture stores without explicit "nocapture_use” annotations we would need to
differentiate between isNonEscapingLocalObject with StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false) case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it’s more compilcated. The easiest thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don’t know if this is too limiting for your example.
But we can come up with some analysis here as well. What we need to know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require a new argument attribute, something
like nocapture_content.

My next comment is only somewhat related to "nocapture_use” issue. Your current proposal marks stores
as !nocapture. But I assume that the store can be marked as nocapture because the underlying object is
unescaped. Why don’t we mark this fact instead? If we have this fact then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well (it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed !nocapture metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for a conservatively correct analysis.

Partially related: Maybe you want to give a short presentation in our next AA call about your
analysis and the ideas you have in this space:

https://docs.google.com/document/d/1ybwEKDVtIbhIhK50qYtwKsL50K-NvB6LfuBsfepBZ9Y/edit?usp=sharing

I’ll plan to join.

Artur

Hi Artur,

I'm still having trouble reading your emails. As you see here https://lists.llvm.org/pipermail/llvm-dev/2021-January/147963.html
there is no visual indication of the quoted text versus the new one. I will try to answer you but I might simply overlook parts
of your last email. Apologies. Could you try to modify your email client to indent the quoted text, or something?

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

  So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).
Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?
You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean it up and put it on phab again soon.
In general we want to have a way to express the results of some more powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example, your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are independent of the
surrounding. This is crucial because it avoids the scaling problem that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As part of the Fortran
alias discussions and the AA monthly working group, we have seen how problematic large
amounts of `noalias` metadata can be. You basically start listing edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both cases we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` + `"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

Yes. But not every potential aliasing access. So we are talking about the number of users of the
memory it is stored in, which needs to be alias free or otherwise we could not annotate it in the
first place. For the main motivational example the memory has only a single use, a runtime call.
This can be a __kmpc_reduce (see PR48475), __tgt_target_data_update_mapper ...
If we deduce the property the single uses in pthread_create, a __kmpc_fork_call, ... become
interesting as well as cases where we have potentially multiple uses:

__attribute__((noinline)) void f1(struct S* s) { int* /* noescape */ p = s->p; ... }
__attribute__((noinline)) void f2(struct S* s) { int* /* noescape */ p = s->p; ... }
__attribute__((noinline)) void f3(struct S* s) { int* /* noescape */ p = s->p; ... }
void g() { struct S s; int x; s.p = &x; f1(&s); f2(&s); f2(&s); /* x is not captured here! */ ... }

The key difference is it doesn't depend on the surrounding, that is the number of other pointers
floating around. It is linear in the uses of the memory stored in, which is an OK price to
pay given that `noalias.scopes` + `!noalias` metadata can numerate edges in a graph that grows
quadratic with the number of pointers.

Note that this *does not* manifest aliasing facts but capture facts, which is arguably different and simpler.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis, which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal that would subsume this
approach?
To me the fact that you need to annonate every single user of a `!nocapture` stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current wording of the proposal doesn't address these
issues.

Hm, OK. So, you don't need to annotate each use, but the common usage would expect you do.
For DSL usages you might actually not. Assuming you do annotate both sides, which I can explain
better in the lang ref, this is no different from common techniques, e.g., `!noalias`. I'm not
sure why this would be any harder. In fact, you don't need to combine scopes, deal with the
issues of scope duplication, find all side-effect instructions to annotate that are unrelated,
... so arguably simpler. Lastly, you don't annotate the users of the memory it was stored in
with `!nocapture`, which is simply following def-use chains. One key point here is that we might
not see the users of the pointer at all, but just the users of the memory it was stored in.
In that case, we want to capture the "noescape" property we know from domain knowledge.

How far can we get with an approach which is conservatively correct without explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub
* ModRef for a isNonEscapingLocalObject and a call
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

If we were to support nocapture stores without explicit `"nocapture_use”` annotations we would need to
differentiate between isNonEscapingLocalObject with StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false) case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this is too limiting for your example.
But we can come up with some analysis here as well. What we need to know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require a new argument attribute, something
like nocapture_content.

I don't follow what this means. The way I understand this it seems more complicated and brittle
than my proposal. What I suggested should work if you outline part of the code, introduce int2ptr
+ ptr2int pairs in between the stores and use of the stored memory, etc. I am always worried that
"implicit" schemes will break as soon as uses get disturbed and are hard to track afterwards. That
is why I mark things explicitly:
The store would cause the `nocapture` property to be removed so we say `!nocapture`.
This break the implicit dependence on the memory use calls which we need to make explicit.

My next comment is only somewhat related to `"nocapture_use”` issue. Your current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as nocapture because the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well (it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture` metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for a conservatively correct analysis.

I'll try to capture some thoughts about your proposals:

1) Having `nocapture_memory` on the declaration of storage to indicate store into it won't escape.

We could look to the pointer operand at the store side to find the attribute, which is doable I guess.
We give up on field-sensitivity. We need a declaration to anchor the attribute at, which inherently
comes with a scope. Inlining will become an issue then as the scope is expanded; ref all the
`noalias.scope` intrinsic stuff we need to do to keep `noalias` scopes alive.

2) Having `nocaptue_content` on the usage of a pointer in a call.

We give up on field-sensitivity again. This would not help you in the non-call situations. We would
need to look from to store to all uses of the pointer stored into every time we ask ourselves if the
memory escaped. This seems doable, albeit more involved for each query and can be disturbed.

I think I could live with both approaches for the runtime library case, some deduction cases will also
work. I am doubtful it is better though.

~ Johannes

Let me try a different mail client. One question below for now, will respond to the rest later.

Hi Artur,

I'm still having trouble reading your emails. As you see here https://lists.llvm.org/pipermail/llvm-dev/2021-January/147963.html
there is no visual indication of the quoted text versus the new one. I will try to answer you but I might simply overlook parts
of your last email. Apologies. Could you try to modify your email client to indent the quoted text, or something?

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because without it BasicAA would assume that the pointer is not accessed by the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build things with the annotation
that are nonsensical, though, that is nothing new. Especially if you employ the annotations
alone you might not find a good use case, seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to mark the store as !nocapture.

It looks like you assume a more restrictive definition for !nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that the pointer
stored is not captured in the sense that all uses of the pointer are explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer from the location
you stored it using `!nocapture`, there are no uses and "all uses" are explicitly
marked. If you do load it, you should make sure the use is "explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr !nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but that instruction is
not annotated with an explicit use of `a`. Now, this could actually be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load %ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr) ["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use" but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the original pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if this is a load in the same function as
the nocapture store).
Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot be violated,
as stated right now. Using the annotation changes the program semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf
2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect from the nocapture property?
You can mark the pointers in the caller nocapture, with all the benefits that might bring.
For a more direct example, see this thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to pass an unknown number, we
do it with an array of void* and a size. The runtime will only load the pointers and not cause
them to escape, though it looks like they could. The same problem appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy. Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be) linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean it up and put it on phab again soon.
In general we want to have a way to express the results of some more powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example, your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are independent of the
surrounding. This is crucial because it avoids the scaling problem that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As part of the Fortran
alias discussions and the AA monthly working group, we have seen how problematic large
amounts of `noalias` metadata can be. You basically start listing edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both cases we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` + `"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

Yes. But not every potential aliasing access. So we are talking about the number of users of the
memory it is stored in, which needs to be alias free or otherwise we could not annotate it in the
first place. For the main motivational example the memory has only a single use, a runtime call.
This can be a __kmpc_reduce (see PR48475), __tgt_target_data_update_mapper ...
If we deduce the property the single uses in pthread_create, a __kmpc_fork_call, ... become
interesting as well as cases where we have potentially multiple uses:

__attribute__((noinline)) void f1(struct S* s) { int* /* noescape */ p = s->p; ... }
__attribute__((noinline)) void f2(struct S* s) { int* /* noescape */ p = s->p; ... }
__attribute__((noinline)) void f3(struct S* s) { int* /* noescape */ p = s->p; ... }
void g() { struct S s; int x; s.p = &x; f1(&s); f2(&s); f2(&s); /* x is not captured here! */ ... }

The key difference is it doesn't depend on the surrounding, that is the number of other pointers
floating around. It is linear in the uses of the memory stored in, which is an OK price to
pay given that `noalias.scopes` + `!noalias` metadata can numerate edges in a graph that grows
quadratic with the number of pointers.

Note that this *does not* manifest aliasing facts but capture facts, which is arguably different and simpler.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape. As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis, which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal that would subsume this
approach?
To me the fact that you need to annonate every single user of a `!nocapture` stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current wording of the proposal doesn't address these
issues.

Hm, OK. So, you don't need to annotate each use, but the common usage would expect you do.
For DSL usages you might actually not. Assuming you do annotate both sides, which I can explain
better in the lang ref, this is no different from common techniques, e.g., `!noalias`. I'm not
sure why this would be any harder. In fact, you don't need to combine scopes, deal with the
issues of scope duplication, find all side-effect instructions to annotate that are unrelated,
... so arguably simpler. Lastly, you don't annotate the users of the memory it was stored in
with `!nocapture`, which is simply following def-use chains. One key point here is that we might
not see the users of the pointer at all, but just the users of the memory it was stored in.
In that case, we want to capture the "noescape" property we know from domain knowledge.

How far can we get with an approach which is conservatively correct without explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub
* ModRef for a isNonEscapingLocalObject and a call
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

If we were to support nocapture stores without explicit `"nocapture_use”` annotations we would need to
differentiate between isNonEscapingLocalObject with StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false) case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this is too limiting for your example.
But we can come up with some analysis here as well. What we need to know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require a new argument attribute, something
like nocapture_content.

I don't follow what this means. The way I understand this it seems more complicated and brittle
than my proposal. What I suggested should work if you outline part of the code, introduce int2ptr
+ ptr2int pairs in between the stores and use of the stored memory, etc. I am always worried that
"implicit" schemes will break as soon as uses get disturbed and are hard to track afterwards. That
is why I mark things explicitly:
  The store would cause the `nocapture` property to be removed so we say `!nocapture`.
  This break the implicit dependence on the memory use calls which we need to make explicit.

My next comment is only somewhat related to `"nocapture_use”` issue. Your current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as nocapture because the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well (it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture` metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for a conservatively correct analysis.

I'll try to capture some thoughts about your proposals:

1) Having `nocapture_memory` on the declaration of storage to indicate store into it won't escape.

We could look to the pointer operand at the store side to find the attribute, which is doable I guess.
We give up on field-sensitivity. We need a declaration to anchor the attribute at, which inherently
comes with a scope.

If this is a global property of the underlying object, why do we need the scopes?

Artur

We talked about this RFC at the AA call briefly. We talked about the
possibility for CaptureTracking to provide
the information about the aliases in addition to capture/nocapture
facts. Johannes is going to see whether this
can be leveraged for this proposal.

A couple of responses inline. Hope this mail client does a better job
with quotation.

Hi Artur,

I'm still having trouble reading your emails. As you see here
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147963.html
there is no visual indication of the quoted text versus the new one. I
will try to answer you but I might simply overlook parts
of your last email. Apologies. Could you try to modify your email
client to indent the quoted text, or something?

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other
problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because
without it BasicAA would assume that the pointer is not accessed by
the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and
writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which
BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was
clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build
things with the annotation
that are nonsensical, though, that is nothing new. Especially if you
employ the annotations
alone you might not find a good use case,
seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly
correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to
mark the store as !nocapture.

It looks like you assume a more restrictive definition for
!nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that
the pointer
stored is not captured in the sense that all uses of the pointer are
explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the
pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer
from the location
you stored it using `!nocapture`, there are no uses and "all uses"
are explicitly
marked. If you do load it, you should make sure the use is
"explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr
!nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but
that instruction is
not annotated with an explicit use of `a`. Now, this could actually
be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load
%ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr)
["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without
causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use"
but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of
them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was
previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the
original pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original
pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses
of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if
this is a load in the same function as
the nocapture store).
Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot
be violated,
as stated right now. Using the annotation changes the program
semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture
metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf

2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis
computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata
for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to
derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect
from the nocapture property?
You can mark the pointers in the caller nocapture, with all the
benefits that might bring.
For a more direct example, see this
thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to
pass an unknown number, we
do it with an array of void* and a size. The runtime will only load
the pointers and not cause
them to escape, though it looks like they could. The same problem
appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy.
Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to
escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be)
linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code
from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean
it up and put it on phab again soon.
In general we want to have a way to express the results of some more
powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In
your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using
attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example,
your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for
the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to
instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory
operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA
in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be
assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it
doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture
pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The
difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand
bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias
scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are
independent of the
surrounding. This is crucial because it avoids the scaling problem
that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As
part of the Fortran
alias discussions and the AA monthly working group, we have seen how
problematic large
amounts of `noalias` metadata can be. You basically start listing
edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both
cases we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` +
`"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

Yes. But not every potential aliasing access. So we are talking about
the number of users of the
memory it is stored in, which needs to be alias free or otherwise we
could not annotate it in the
first place. For the main motivational example the memory has only a
single use, a runtime call.
This can be a __kmpc_reduce (see PR48475),
__tgt_target_data_update_mapper ...
If we deduce the property the single uses in pthread_create, a
__kmpc_fork_call, ... become
interesting as well as cases where we have potentially multiple uses:

__attribute__((noinline)) void f1(struct S* s) { int* /* noescape */ p 
= s->p; ... }
__attribute__((noinline)) void f2(struct S* s) { int* /* noescape */ p 
= s->p; ... }
__attribute__((noinline)) void f3(struct S* s) { int* /* noescape */ p 
= s->p; ... }
void g() { struct S s; int x; s.p = &x; f1(&s); f2(&s); f2(&s); /* x 
is not captured here! */ ... }

The key difference is it doesn't depend on the surrounding, that is
the number of other pointers
floating around. It is linear in the uses of the memory stored in,
which is an OK price to
pay given that `noalias.scopes` + `!noalias` metadata can numerate
edges in a graph that grows
quadratic with the number of pointers.

Note that this *does not* manifest aliasing facts but capture facts,
which is arguably different and simpler.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use
case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape.
As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a
problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate
the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply
compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we
can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra
aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM
transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis,
which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal
that would subsume this
approach?
To me the fact that you need to annonate every single user of a
`!nocapture` stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this
scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current
wording of the proposal doesn't address these
issues.

Hm, OK. So, you don't need to annotate each use, but the common usage
would expect you do.
For DSL usages you might actually not. Assuming you do annotate both
sides, which I can explain
better in the lang ref, this is no different from common techniques,
e.g., `!noalias`. I'm not
sure why this would be any harder. In fact, you don't need to combine
scopes, deal with the
issues of scope duplication, find all side-effect instructions to
annotate that are unrelated,
... so arguably simpler. Lastly, you don't annotate the users of the
memory it was stored in
with `!nocapture`, which is simply following def-use chains. One key
point here is that we might
not see the users of the pointer at all, but just the users of the
memory it was stored in.
In that case, we want to capture the "noescape" property we know from
domain knowledge.

How far can we get with an approach which is conservatively correct
without explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses
isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

* ModRef for a isNonEscapingLocalObject and a call
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

If we were to support nocapture stores without explicit
`"nocapture_use”` annotations we would need to
differentiate between isNonEscapingLocalObject with
StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them
separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource
we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false)
case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest
thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this
is too limiting for your example.
But we can come up with some analysis here as well. What we need to
know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require
a new argument attribute, something
like nocapture_content.

I don't follow what this means. The way I understand this it seems
more complicated and brittle
than my proposal. What I suggested should work if you outline part of
the code, introduce int2ptr
+ ptr2int pairs in between the stores and use of the stored memory,
etc. I am always worried that
"implicit" schemes will break as soon as uses get disturbed and are
hard to track afterwards. That
is why I mark things explicitly:
The store would cause the `nocapture` property to be removed so we
say `!nocapture`.
This break the implicit dependence on the memory use calls which we
need to make explicit.

Indeed. A conservatively correct analysis will be more brittle in terms
of optimization potential.
On the other hand the benefit of this approach is it simplifies the
annotations we need to provide.
E.g. the current proposal requires the IR to be is a specific shape with
store load forwarding done
for nocapture stored pointers.

In general, this is the design space we have: how much do we rely on
explicit annotations vs how
much do we derive what we need from the code?

My next comment is only somewhat related to `"nocapture_use”` issue.
Your current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as
nocapture because the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact
then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively
correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well
(it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture`
metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might
be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the
underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for
a conservatively correct analysis.

I'll try to capture some thoughts about your proposals:

1) Having `nocapture_memory` on the declaration of storage to indicate
store into it won't escape.

We could look to the pointer operand at the store side to find the
attribute, which is doable I guess.
We give up on field-sensitivity. We need a declaration to anchor the
attribute at, which inherently
comes with a scope. Inlining will become an issue then as the scope is
expanded; ref all the
`noalias.scope` intrinsic stuff we need to do to keep `noalias` scopes
alive.

2) Having `nocaptue_content` on the usage of a pointer in a call.

We give up on field-sensitivity again. This would not help you in the
non-call situations. We would
need to look from to store to all uses of the pointer stored into
every time we ask ourselves if the
memory escaped. This seems doable, albeit more involved for each query
and can be disturbed.

I think I could live with both approaches for the runtime library
case, some deduction cases will also
work. I am doubtful it is better though.

My main argument here - these are useful primitives even outside of the
scope of this RFC.

There are users of PointerMayBeCaptured outside of AA, e.g. LICM. These
users will benefit from
nocapture_memory even if we don't use this fact for AA.

nocaptue_content for call arguments would be useful if we were to
implement a more sophisticated
capture tracking analysis.

BTW, in your motivational example, can the memory passed through
indirection be marked as globally
nocaptured as well?

Here is what I mean:
arg_storage = call new, !nocapture_memory
obj = call new, !nocapture_memory <-- is this object also never captured?
arg_storage.obj = obj
foo(arg_storage)

Artur

We talked about this RFC at the AA call briefly. We talked about the
possibility for CaptureTracking to provide
the information about the aliases in addition to capture/nocapture
facts. Johannes is going to see whether this
can be leveraged for this proposal.

I did discuss the "new" or "extended" encoding below.

A couple of responses inline. Hope this mail client does a better job
with quotation.

Quotation works for me now, thanks a lot.

Hi Artur,

I'm still having trouble reading your emails. As you see here
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147963.html
there is no visual indication of the quoted text versus the new one. I
will try to answer you but I might simply overlook parts
of your last email. Apologies. Could you try to modify your email
client to indent the quoted text, or something?

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other
problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because
without it BasicAA would assume that the pointer is not accessed by
the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and
writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which
BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was
clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build
things with the annotation
that are nonsensical, though, that is nothing new. Especially if you
employ the annotations
alone you might not find a good use case,
seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly
correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to
mark the store as !nocapture.

It looks like you assume a more restrictive definition for
!nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that
the pointer
stored is not captured in the sense that all uses of the pointer are
explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the
pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer
from the location
you stored it using `!nocapture`, there are no uses and "all uses"
are explicitly
marked. If you do load it, you should make sure the use is
"explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr
!nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but
that instruction is
not annotated with an explicit use of `a`. Now, this could actually
be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load
%ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr)
["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without
causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use"
but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of
them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was
previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the
original pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original
pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses
of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if
this is a load in the same function as
the nocapture store).
Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot
be violated,
as stated right now. Using the annotation changes the program
semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture
metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf

2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis
computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata
for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to
derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect
from the nocapture property?
You can mark the pointers in the caller nocapture, with all the
benefits that might bring.
For a more direct example, see this
thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html
Basically, we pass a bunch of pointers to an API. Since we need to
pass an unknown number, we
do it with an array of void* and a size. The runtime will only load
the pointers and not cause
them to escape, though it looks like they could. The same problem
appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy.
Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to
escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be)
linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code
from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean
it up and put it on phab again soon.
In general we want to have a way to express the results of some more
powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In
your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using
attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example,
your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for
the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to
instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory
operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA
in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be
assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it
doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture
pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The
difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand
bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias
scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are
independent of the
surrounding. This is crucial because it avoids the scaling problem
that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As
part of the Fortran
alias discussions and the AA monthly working group, we have seen how
problematic large
amounts of `noalias` metadata can be. You basically start listing
edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both
cases we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` +
`"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

Yes. But not every potential aliasing access. So we are talking about
the number of users of the
memory it is stored in, which needs to be alias free or otherwise we
could not annotate it in the
first place. For the main motivational example the memory has only a
single use, a runtime call.
This can be a __kmpc_reduce (see PR48475),
__tgt_target_data_update_mapper ...
If we deduce the property the single uses in pthread_create, a
__kmpc_fork_call, ... become
interesting as well as cases where we have potentially multiple uses:

__attribute__((noinline)) void f1(struct S* s) { int* /* noescape */ p
= s->p; ... }
__attribute__((noinline)) void f2(struct S* s) { int* /* noescape */ p
= s->p; ... }
__attribute__((noinline)) void f3(struct S* s) { int* /* noescape */ p
= s->p; ... }
void g() { struct S s; int x; s.p = &x; f1(&s); f2(&s); f2(&s); /* x
is not captured here! */ ... }

The key difference is it doesn't depend on the surrounding, that is
the number of other pointers
floating around. It is linear in the uses of the memory stored in,
which is an OK price to
pay given that `noalias.scopes` + `!noalias` metadata can numerate
edges in a graph that grows
quadratic with the number of pointers.

Note that this *does not* manifest aliasing facts but capture facts,
which is arguably different and simpler.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use
case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape.
As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a
problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate
the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply
compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we
can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra
aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM
transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis,
which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal
that would subsume this
approach?
To me the fact that you need to annonate every single user of a
`!nocapture` stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this
scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current
wording of the proposal doesn't address these
issues.

Hm, OK. So, you don't need to annotate each use, but the common usage
would expect you do.
For DSL usages you might actually not. Assuming you do annotate both
sides, which I can explain
better in the lang ref, this is no different from common techniques,
e.g., `!noalias`. I'm not
sure why this would be any harder. In fact, you don't need to combine
scopes, deal with the
issues of scope duplication, find all side-effect instructions to
annotate that are unrelated,
... so arguably simpler. Lastly, you don't annotate the users of the
memory it was stored in
with `!nocapture`, which is simply following def-use chains. One key
point here is that we might
not see the users of the pointer at all, but just the users of the
memory it was stored in.
In that case, we want to capture the "noescape" property we know from
domain knowledge.

How far can we get with an approach which is conservatively correct
without explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses
isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

* ModRef for a isNonEscapingLocalObject and a call
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

If we were to support nocapture stores without explicit
`"nocapture_use”` annotations we would need to
differentiate between isNonEscapingLocalObject with
StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them
separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource
we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false)
case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest
thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this
is too limiting for your example.
But we can come up with some analysis here as well. What we need to
know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require
a new argument attribute, something
like nocapture_content.

I don't follow what this means. The way I understand this it seems
more complicated and brittle
than my proposal. What I suggested should work if you outline part of
the code, introduce int2ptr
+ ptr2int pairs in between the stores and use of the stored memory,
etc. I am always worried that
"implicit" schemes will break as soon as uses get disturbed and are
hard to track afterwards. That
is why I mark things explicitly:
The store would cause the `nocapture` property to be removed so we
say `!nocapture`.
This break the implicit dependence on the memory use calls which we
need to make explicit.

Indeed. A conservatively correct analysis will be more brittle in terms
of optimization potential.
On the other hand the benefit of this approach is it simplifies the
annotations we need to provide.
E.g. the current proposal requires the IR to be is a specific shape with
store load forwarding done
for nocapture stored pointers.

In general, this is the design space we have: how much do we rely on
explicit annotations vs how
much do we derive what we need from the code?

Right. Given the capture tracker usage I expect any kind of
analysis at that point would be too costly. It could be opt-in
arguably, but an explicit representation approach which we then
only have to interpret (=cheap) is more suited for my domain
knowledge use case and works with the tracking analysis as well.

My next comment is only somewhat related to `"nocapture_use”` issue.
Your current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as
nocapture because the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact
then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively
correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well
(it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture`
metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might
be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the
underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for
a conservatively correct analysis.

I'll try to capture some thoughts about your proposals:

1) Having `nocapture_memory` on the declaration of storage to indicate
store into it won't escape.

We could look to the pointer operand at the store side to find the
attribute, which is doable I guess.
We give up on field-sensitivity. We need a declaration to anchor the
attribute at, which inherently
comes with a scope. Inlining will become an issue then as the scope is
expanded; ref all the
`noalias.scope` intrinsic stuff we need to do to keep `noalias` scopes
alive.

2) Having `nocaptue_content` on the usage of a pointer in a call.

We give up on field-sensitivity again. This would not help you in the
non-call situations. We would
need to look from to store to all uses of the pointer stored into
every time we ask ourselves if the
memory escaped. This seems doable, albeit more involved for each query
and can be disturbed.

I think I could live with both approaches for the runtime library
case, some deduction cases will also
work. I am doubtful it is better though.

My main argument here - these are useful primitives even outside of the
scope of this RFC.

There are users of PointerMayBeCaptured outside of AA, e.g. LICM. These
users will benefit from
nocapture_memory even if we don't use this fact for AA.

nocaptue_content for call arguments would be useful if we were to
implement a more sophisticated
capture tracking analysis.

BTW, in your motivational example, can the memory passed through
indirection be marked as globally
nocaptured as well?

Here is what I mean:
   arg_storage = call new, !nocapture_memory
   obj = call new, !nocapture_memory <-- is this object also never captured?
   arg_storage.obj = obj
   foo(arg_storage)

Yes, above `obj` is the pointer that is not captured by foo in my motivating example.
We can express that `arg_storage` is not captured with the argument attribute. The
problem I'm trying to solve is to ensure that `obj` is not assumed captured (by the
store or call) but at the same time the call retains its modref relationship with
`obj`. If we use `!nocapture` on the store alone, we don't see that `foo` accesses
`obj` anymore. That is why the `"nocapture_use"` exists. The alternatives we discussed:

Allow the user of the NoCaptureTracker to ignore `!nocapture` stores under the condition
the user follows the llvm::uses() of the storage the pointer ended up in. This would
replace the function of the "nocapture_use" set and work for non-call uses as well. The
drawback is that you now perform the capture analysis every time you query, so we are
not manifesting but recomputing a lot.

We could point from the store to the potential aliases of the pointer that is stored.
This is a similar idea as above but less burden is on the user of the NoCaptureTracker
now. This would match the "nocapture_maybe_returned" attribute the Attributor uses
internally in its NoCaptureTracker class [0]. The problem with this is my initial use
case actually. There is no "alias" in the function to tag if the storage is passed to
another function. Still, this would make sense for the store + load case and the "maybe
returned argument" case. So if we go with my initial proposal + this, we would get 3
things out of it. Recap: `!nocapture` on a store with operands that point to potential
reloads and "nocapture_use" operand bundle for calls.

store %ptr, %storage, !nocapture(%l0, %l1)
...
%l0 = load (%cond ? %storage : %_)
call @f(%l0)
...
call @g(%storage) ["nocapture_use"(%ptr)]
...
%q = call @h(nocapture_maybe_returned %ptr)
...
%l1 = load %storage
...

The NoCaptureTracker keeps a potential alias set as the Attributor does [0]. After
analyzing the use the store, %ptr is *not* assumed captured, %l0 and %l1 are added
to the potential alias set. %q is added to the set when the use in the @h call is
analyzed. The use in the "nocapture_use" operand bundle is not capturing.

WDYT? (Assuming we go with this approach, I'd still split it so my initial patch
can go first and an extension of `!nocapture` to take operands can come second,
together with the NoCaptureTracker logic).

~ Johannes

[0] llvm-project/AttributorAttributes.cpp at b2545b71d121ac913e56faff3b704f3957f941b7 · llvm/llvm-project · GitHub

We talked about this RFC at the AA call briefly. We talked about the
possibility for CaptureTracking to provide
the information about the aliases in addition to capture/nocapture
facts. Johannes is going to see whether this
can be leveraged for this proposal.

I did discuss the "new" or "extended" encoding below.

A couple of responses inline. Hope this mail client does a better job
with quotation.

Quotation works for me now, thanks a lot.

Hi Artur,

I'm still having trouble reading your emails. As you see here
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147963.html
there is no visual indication of the quoted text versus the new one. I
will try to answer you but I might simply overlook parts
of your last email. Apologies. Could you try to modify your email
client to indent the quoted text, or something?

The email formatting is somewhat broken. Might be my client.
I try to answer inline, if I missed something or there is any other
problem, let me know.

Hi Artur,

I'm a bit confused with nocapture_use. I guess you need this because
without it BasicAA would assume that the pointer is not accessed by
the call at all.
Correct.

So, as a workaround you introduce a use which implicitly reads and
writes.
Correct, for now. We could add "readonly"/"writeonly" etc. later on.

But this might be a more general problem. For example:

a = new ...
store a, ptr, !nocapture
a' = load ptr
; Now you have 2 pointers to the same object (a' and a ) which
BasicAA considers as no aliasing.
v1 = load a
store 5, a'
v2 = load a

We would happily replace v2 with v1 even though the memory was
clobbered by the store through a’.
Right. But that is not strictly speaking a problem. You can build
things with the annotation
that are nonsensical, though, that is nothing new. Especially if you
employ the annotations
alone you might not find a good use case,
seehttps://reviews.llvm.org/D93189#2485826 .
My concern here is that we miscompile a program which is seemingly
correct. None of the users
of pointer a escape the pointer. So, I assume it should be correct to
mark the store as !nocapture.

It looks like you assume a more restrictive definition for
!nocapture. The proposed lang ref says:
"``!nocapture`` metadata on the instruction tells the optimizer that
the pointer
stored is not captured in the sense that all uses of the pointer are
explicitly
marked otherwise”

a) What do you mean by "uses of the pointer” here? Is it uses of the
pointer value stored by the
Annotated instruction? Is it uses of the memory modified by the store?

It is uses of the stored pointer. So if you never load the pointer
from the location
you stored it using `!nocapture`, there are no uses and "all uses"
are explicitly
marked. If you do load it, you should make sure the use is
"explicitly marked otherwise"
because you do not get a "dependence edge" from the `store %a %ptr
!nocapture` to `%a' = load %ptr`.
In your example, that explicit use is missing. So you load `ptr` but
that instruction is
not annotated with an explicit use of `a`. Now, this could actually
be OK, depending on the use,
but unlikely what you want.

If we would have operand bundles on loads you could do: `%a' = load
%ptr ["nocapture_use"(%a)]`,
which would correspond to the intended use `call @f(%ptr)
["nocapture_use"(%a)]`. That way you would
be able to communicate `%a` through memory (here `%ptr`) without
causing it to be captured.
(This assumes you ensured `%a'` is not captured.)

I think we could require `!nocapture` to be used with "nocapture_use"
but I resisted so far
as it would be more complex.
On the other hand, it would make it clear that usage of only one of
them is, so far,
discouraged since it can easily lead to unexpected results.
So, basically every use of the value loaded from memory which was
previously stored to as !nocapture
needs to have an annotation indicating that this is an alias of the
original pointer.
Not necessarily an alias but a use of the original pointer. More below.
In general case we might have a pointer which is not the original
pointer but an alias for it.

E.g. a select between the original pointer and some other value:
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr
v1 = load s

Or a derived pointer based off the original pointer:
a = new ...
store a, ptr, !nocapture
a' = load ptr
gep = gep a', offset
v1 = load get
Do we need to annotate things like geps/bitcasts?

What if the use is a phi or a select?
a = new ...
store a, ptr, !nocapture
a' = load ptr
s = select c, a', some other ptr ; do we annotate the select?
v1 = load s ; or this load?

It looks like currently we don’t have the means to annotate the uses
of the loaded value. We might
need to prohibit loads of !nocapture-stored values altogether (if
this is a load in the same function as
the nocapture store).
Right, we could do that.

b) Does the example above violate this statement somehow?

So far, there is no violation per se possible. The semantics cannot
be violated,
as stated right now. Using the annotation changes the program
semantic, if that change is not
what you wanted, that is a problem but not a miscompile (IMHO).
The way I’m looking at this is I have a program without !nocapture
metadata and operand bundles
and I want to derive it using some analysis.
(We actually have an escape analysis capable of handling object graphs.
https://llvm.org/devmtg/2020-09/slides/Pilipenko-Falcon-EA-LLVM-Dev-Mtg.pdf

2020 LLVM Developers’ Meeting: A. Pilipenko “Control-flow Sensitive Escape Analysis in Falcon JIT” - YouTube
We are looking into ways we can communicate facts this analysis
computes with the rest of the
optimizer. But we were looking into noalias and alias.scope metadata
for such purpose.)
So, I don’t want to change the behavior using the metadata, I want to
derive it and for that I need
to understand the semantic it implies.

BTW, in your motivating example, what are the benefits you expect
from the nocapture property?
You can mark the pointers in the caller nocapture, with all the
benefits that might bring.
For a more direct example, see this
thread:https://lists.llvm.org/pipermail/cfe-dev/2020-December/067346.html

Basically, we pass a bunch of pointers to an API. Since we need to
pass an unknown number, we
do it with an array of void* and a size. The runtime will only load
the pointers and not cause
them to escape, though it looks like they could. The same problem
appears for memory latency hiding
when you do an OpenMP target map, basically a bulk device memcpy.
Since you do more than one, you
pass a bunch of pointers via a void* array, which causes them to
escape. This causes spurious aliases
that we know cannot exist, though the runtime is not (supposed to be)
linked statically and you need
domain knowledge to deal with this. I split the OpenMP usage code
from the patch but the initial version
still had it:https://reviews.llvm.org/D93189?id=311479 I'll clean
it up and put it on phab again soon.
In general we want to have a way to express the results of some more
powerful escape analysis in the IR
so CaptureTracking/BasicAA can take advantage of this analysis. In
your case this analysis is some domain
specific knowledge, in our case it is a downstream analysis pass.

We can easily express basic facts like capture/nocapture using
attributes and metadata. Things get more
difficult when we need to express aliasing properties. For example,
your case when the call modifies an
otherwise unescaped pointer. While the proposed solution works for
the motivational example, I’m afraid
it’s not really extensible for the general case.

First, we would need to have the ability to attach operand bundles to
instructions of any kind. Second, there
are open questions, like how do we treat uses which are not a memory
operations? How do we deal with
aliases? These things are not addressed in the current proposal.

The best idea I had so far regarding expressing the results of our EA
in the IR was to use noalias and
alias.scope metadata. Essentially every unescaped object can be
assigned to a separate scope and every
memory operation can be marked as noalias wrt the scopes we know it
doesn’t touch.

In fact, the scheme you propose where every user of a !nocapture
pointer is annotated with the original
pointer resembles the alias scopes and noalias metadata. The
difference is alias scopes and noalias use
Indirection through metadata while your proposal relies on operand
bundles to directly link the original
pointer.

Do you think you can apply something like this to you problem?

There is a fundamental difference with the proposal here and alias
scopes that is not
"indirection through metadata". `!nocapture` + `"nocapture_use"` are
independent of the
surrounding. This is crucial because it avoids the scaling problem
that comes with the
bidirectional approach of `noalias.scopes` + `!noalias` metadata. As
part of the Fortran
alias discussions and the AA monthly working group, we have seen how
problematic large
amounts of `noalias` metadata can be. You basically start listing
edges in your
alias/dependence graph at some point.
It sounds like it's a matter of representation. Conceptually in both
cases we are explicitly providing external
aliasing facts. Note that with the proposed `!nocapture` +
`"nocapture_use”` scheme every user of a `!nocapture`
stored pointer needs to be annotated.

Yes. But not every potential aliasing access. So we are talking about
the number of users of the
memory it is stored in, which needs to be alias free or otherwise we
could not annotate it in the
first place. For the main motivational example the memory has only a
single use, a runtime call.
This can be a __kmpc_reduce (see PR48475),
__tgt_target_data_update_mapper ...
If we deduce the property the single uses in pthread_create, a
__kmpc_fork_call, ... become
interesting as well as cases where we have potentially multiple uses:

__attribute__((noinline)) void f1(struct S* s) { int* /* noescape */ p
= s->p; ... }
__attribute__((noinline)) void f2(struct S* s) { int* /* noescape */ p
= s->p; ... }
__attribute__((noinline)) void f3(struct S* s) { int* /* noescape */ p
= s->p; ... }
void g() { struct S s; int x; s.p = &x; f1(&s); f2(&s); f2(&s); /* x
is not captured here! */ ... }

The key difference is it doesn't depend on the surrounding, that is
the number of other pointers
floating around. It is linear in the uses of the memory stored in,
which is an OK price to
pay given that `noalias.scopes` + `!noalias` metadata can numerate
edges in a graph that grows
quadratic with the number of pointers.

Note that this *does not* manifest aliasing facts but capture facts,
which is arguably different and simpler.

We are looking at alternative solutions but those
are so far unrelated to he use case that is discussed here.

Given an escape analysis you might be interested in the same use
case. A pointer is stored but you can
see from the uses of the stored-to location that it does not escape.
As you pointed out, if those uses
are non-calls we cannot attach "nocapture_use" for now, which is a
problem. On the other hand, if they
are in the same function, we could forward the pointer and eliminate
the load, or the memory is already
aliasing something and it is unclear how much we can deduce anyway.
The unfortunately property of this approach is that we can’t simply
compute the analysis and annotate the
IR with the results. First we need to shape the IR in the way so we
can attach the results of the analysis.
This kind of shaping (store load forwarding to eliminate extra
aliases) may require the results of this analysis
itself (this is assuming we want to reuse some of the existing LLVM
transforms to do store load forwarding).

You seem to argue this approach is not a good fit for your analysis,
which I agree. That seems
to be not a problem on its own. Do you have an alternative proposal
that would subsume this
approach?
To me the fact that you need to annonate every single user of a
`!nocapture` stored pointer looks like the most
fragile part of the proposal. It makes it hard to generalize this
scheme. It makes it easy to misuse this scheme,
both for the frontends and for the optimizer. Moreover, current
wording of the proposal doesn't address these
issues.

Hm, OK. So, you don't need to annotate each use, but the common usage
would expect you do.
For DSL usages you might actually not. Assuming you do annotate both
sides, which I can explain
better in the lang ref, this is no different from common techniques,
e.g., `!noalias`. I'm not
sure why this would be any harder. In fact, you don't need to combine
scopes, deal with the
issues of scope duplication, find all side-effect instructions to
annotate that are unrelated,
... so arguably simpler. Lastly, you don't annotate the users of the
memory it was stored in
with `!nocapture`, which is simply following def-use chains. One key
point here is that we might
not see the users of the pointer at all, but just the users of the
memory it was stored in.
In that case, we want to capture the "noescape" property we know from
domain knowledge.

How far can we get with an approach which is conservatively correct
without explicit `"nocapture_use”`
annotations?

Currently there are 2 places where BasicAA uses
isNonEscapingLocalObject:
* Aliasing between isNonEscapingLocalObject and isEscapeSource
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

* ModRef for a isNonEscapingLocalObject and a call
llvm-project/BasicAliasAnalysis.cpp at main · llvm/llvm-project · GitHub

If we were to support nocapture stores without explicit
`"nocapture_use”` annotations we would need to
differentiate between isNonEscapingLocalObject with
StoreCaptures==true and StoreCaptures==false.
These are two different properties and we would need to handle them
separately.

For the aliasing between isNonEscapingLocalObject and isEscapeSource
we would need to change
isEscapeSource for isNonEscapingLocalObject(StoreCaptures==false)
case. Specifically, if StoreCaptures==false
then isEscapeSource should not consider loads as escape sources.

For the ModRef query around calls it's more compilcated. The easiest
thing to do here is to not use
isNonEscapingLocalObject(StoreCaptures==false). I don't know if this
is too limiting for your example.
But we can come up with some analysis here as well. What we need to
know is the fact that a call doesn’t
capture pointers reachable through its arguments. This would require
a new argument attribute, something
like nocapture_content.

I don't follow what this means. The way I understand this it seems
more complicated and brittle
than my proposal. What I suggested should work if you outline part of
the code, introduce int2ptr
+ ptr2int pairs in between the stores and use of the stored memory,
etc. I am always worried that
"implicit" schemes will break as soon as uses get disturbed and are
hard to track afterwards. That
is why I mark things explicitly:
The store would cause the `nocapture` property to be removed so we
say `!nocapture`.
This break the implicit dependence on the memory use calls which we
need to make explicit.

Indeed. A conservatively correct analysis will be more brittle in terms
of optimization potential.
On the other hand the benefit of this approach is it simplifies the
annotations we need to provide.
E.g. the current proposal requires the IR to be is a specific shape with
store load forwarding done
for nocapture stored pointers.

In general, this is the design space we have: how much do we rely on
explicit annotations vs how
much do we derive what we need from the code?

Right. Given the capture tracker usage I expect any kind of
analysis at that point would be too costly. It could be opt-in
arguably, but an explicit representation approach which we then
only have to interpret (=cheap) is more suited for my domain
knowledge use case and works with the tracking analysis as well.

My next comment is only somewhat related to `"nocapture_use”` issue.
Your current proposal marks stores
as `!nocapture`. But I assume that the store can be marked as
nocapture because the underlying object is
unescaped. Why don't we mark this fact instead? If we have this fact
then any store to an unescaped memory
can be treated as nocapture (assuming we are doing conservatively
correct analysis for StoreCaptures==false).
This will give us a useful fact about the underlying object as well
(it can be used for things other than aliasing).

There is a caveat to this suggestion. The proposed `!nocapture`
metadata makes it possible to express
flow-sensitive facts (a store into the same underlying object might
be nocapture on one path, but capture on some
other). We would give up this flow-sensitivity if we make the
underlying object instead. On the other hand, global
property instead of a flow-sensitive propery will make it easier for
a conservatively correct analysis.

I'll try to capture some thoughts about your proposals:

1) Having `nocapture_memory` on the declaration of storage to indicate
store into it won't escape.

We could look to the pointer operand at the store side to find the
attribute, which is doable I guess.
We give up on field-sensitivity. We need a declaration to anchor the
attribute at, which inherently
comes with a scope. Inlining will become an issue then as the scope is
expanded; ref all the
`noalias.scope` intrinsic stuff we need to do to keep `noalias` scopes
alive.

2) Having `nocaptue_content` on the usage of a pointer in a call.

We give up on field-sensitivity again. This would not help you in the
non-call situations. We would
need to look from to store to all uses of the pointer stored into
every time we ask ourselves if the
memory escaped. This seems doable, albeit more involved for each query
and can be disturbed.

I think I could live with both approaches for the runtime library
case, some deduction cases will also
work. I am doubtful it is better though.

My main argument here - these are useful primitives even outside of the
scope of this RFC.

There are users of PointerMayBeCaptured outside of AA, e.g. LICM. These
users will benefit from
nocapture_memory even if we don't use this fact for AA.

nocaptue_content for call arguments would be useful if we were to
implement a more sophisticated
capture tracking analysis.

BTW, in your motivational example, can the memory passed through
indirection be marked as globally
nocaptured as well?

Here is what I mean:
arg_storage = call new, !nocapture_memory
obj = call new, !nocapture_memory <-- is this object also never captured?
arg_storage.obj = obj
foo(arg_storage)

Yes, above `obj` is the pointer that is not captured by foo in my motivating example.
We can express that `arg_storage` is not captured with the argument attribute. The
problem I'm trying to solve is to ensure that `obj` is not assumed captured (by the
store or call) but at the same time the call retains its modref relationship with
`obj`. If we use `!nocapture` on the store alone, we don't see that `foo` accesses
`obj` anymore. That is why the `"nocapture_use"` exists. The alternatives we discussed:

Allow the user of the NoCaptureTracker to ignore `!nocapture` stores under the condition
the user follows the llvm::uses() of the storage the pointer ended up in. This would
replace the function of the "nocapture_use" set and work for non-call uses as well. The
drawback is that you now perform the capture analysis every time you query, so we are
not manifesting but recomputing a lot.

We could point from the store to the potential aliases of the pointer that is stored.
This is a similar idea as above but less burden is on the user of the NoCaptureTracker
now. This would match the "nocapture_maybe_returned" attribute the Attributor uses
internally in its NoCaptureTracker class [0]. The problem with this is my initial use
case actually. There is no "alias" in the function to tag if the storage is passed to
another function. Still, this would make sense for the store + load case and the "maybe
returned argument" case. So if we go with my initial proposal + this, we would get 3
things out of it. Recap: `!nocapture` on a store with operands that point to potential
reloads and "nocapture_use" operand bundle for calls.

store %ptr, %storage, !nocapture(%l0, %l1)
...
%l0 = load (%cond ? %storage : %_)
call @f(%l0)
...
call @g(%storage) ["nocapture_use"(%ptr)]
...
%q = call @h(nocapture_maybe_returned %ptr)
...
%l1 = load %storage
...

The NoCaptureTracker keeps a potential alias set as the Attributor does [0]. After
analyzing the use the store, %ptr is *not* assumed captured, %l0 and %l1 are added
to the potential alias set. %q is added to the set when the use in the @h call is
analyzed. The use in the "nocapture_use" operand bundle is not capturing.

WDYT? (Assuming we go with this approach, I'd still split it so my initial patch
can go first and an extension of `!nocapture` to take operands can come second,
together with the NoCaptureTracker logic).

This is hard to implement, unsure why I did not realize that before. I basically
broke SSA -.-. However, if we don't mark the aliases or uses of %ptr explicitly
we are always going to have the same problem as with encoding some kind of
"nocapture globally" property, it breaks via outlining and such:

%a = alloca

; Outline the bitcast and we get a new alias of %a either via the return of the
; callee or through a pointer-pointer argument.
%p = bitcast %a

use(%p)

So my last attempt (for now) is to annotate the loads and the calls, here with
either operand bundles (which do not exist for non-calls yet) or new "tag" intrinsic.
Either way, you could follow potential aliases as described in the last mail above.

store %ptr, %storage, !nocapture
...
%l0 = load (%cond ? %storage : %_) ["potential_alias_of"(%ptr)]
call @f(%l0)
...
call @g(%storage) ["nocapture_use"(%ptr)]
...
%q = call @h(nocapture_maybe_returned %ptr)
...
%l1 = load %storage
call @llvm.tag.potential.alias(%l1, %ptr)
use(%l1)
...

~ Johannes