Memcpy nocapture?

My colleague Madhur and I have been discussing nocapture semantics. We don’t understand why memcpy has nocapture argument attributes.

For this discussion we need some definitions first. Informally, capturing is about inspecting the pointer, while escaping is about inspecting the contents, which is nicely summarised in [1]. This is also in line with LangRef [2]:

…, a pointer is captured by the call if it makes a copy of any part of the pointer that outlives the call.

With the definitions out of the way, let’s look at the memcpy signature and let’s focus on the source pointer because that’s probably easier:

declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) 

The source pointer has attributes noalias nocapture readonly. My question is: why does the source pointer have nocapture when memcpy is all about copying the contents? The lifetime of the source pointer does not exceed the lifetime of the caller, I think. So it could be argued that the nocapture is correct: the pointer isn’t captured, but it is escaping.

Where things get confusing, and perhaps problematic, is our interpretation of capture/nocapture, as also observed in [3] on the llvm-dev list, which is a follow-up reply to [1]:

It seems that we indeed assume that “capture” implies “escape” in LLVM (conservatively).

If capturing is interpreted as escaping, and thus if nocapture is interpreted as noescape, then having nocapture on memcpy is wrong? Something is not well defined here?

[1] [llvm-dev] [GSoC 2016] Capture Tracking Improvements - BackgroundInformation
[2] LLVM Language Reference Manual — LLVM 17.0.0git documentation
[3] [llvm-dev] [GSoC 2016] Capture Tracking Improvements - BackgroundInformation

See LLVM Language Reference Manual — LLVM 17.0.0git documentation for the definition of pointer capture in LangRef, which looks reasonably accurate to me.

The distinction between the terms “capture” and “escape” is new to me – as far as I know, we use these two terms completely interchangeably in LLVM.

memcpy does not capture either of its pointers, because memcpy both a) does not depend on the (bitwise) pointer address and b) does not store the pointer beyond the lifetime of the call. The fact that memcpy reads/writes through the pointer is not relevant for captures – this aspect is modeled by memory effect attributes.
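
To make the difference concrete, here is a minimal C++ sketch (not from LLVM itself; `copy_contents`, `stash_pointer`, and `g_stash` are hypothetical names). The first function only reads and writes through its pointers, as memcpy does; the second keeps a copy of the pointer value that outlives the call, which is the LangRef notion of a capture.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical global, used only to illustrate a capture.
static const int *g_stash = nullptr;

// Reads through src and writes through dst, but keeps no copy of either
// pointer value after returning: both parameters could be nocapture.
void copy_contents(int *dst, const int *src) {
    std::memcpy(dst, src, sizeof(int));
}

// Stores the pointer value itself in a global that outlives the call:
// the parameter is captured.
void stash_pointer(const int *p) {
    g_stash = p;
}

// Returns true if the copy moved the contents without retaining the
// pointer, and the stash retained the pointer.
bool demo_copy_vs_capture() {
    int src = 42, dst = 0;
    copy_contents(&dst, &src);
    assert(g_stash == nullptr);      // nothing was retained by the copy
    bool copied = (dst == 42);
    stash_pointer(&src);
    bool captured = (g_stash == &src);
    g_stash = nullptr;               // do not leave a dangling pointer behind
    return copied && captured;
}
```

The memory traffic in `copy_contents` would be described by readonly/writeonly-style attributes, independently of whether the pointer is captured.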

Hi @nikic
Thanks for your reply.

as far as I know, we use these two terms completely interchangeably in LLVM.

And that is what is confusing to us. LLVM’s LangRef provides a definition of “capture”, but what it describes is what the literature calls “escape”. The “Escape Analysis for Java” paper [2] defines escapement as follows.

“Let O be an object instance and M be a method invocation. O is said to escape M, denoted as Escapes(O, M), if the lifetime of O may exceed the lifetime of M.”

If we look at LangRef, it purely defines escapement but names it “capture”, which is debatable.

Looking at [1] above, Philip says,

““capture” - can anyone inspect the bits of this pointer?”

and provides the further illustrative example

“Illustrative examples: A function which returns the alignment of a pointer captures a pointer, but does not cause it to escape or become non-thread local”

This means “capture” covers broader scenarios and does NOT imply escapement: the function only reads the bit representation of the pointer but does NOT make a copy that outlives the function body. Hence, the act of capture here and escapement are different.

IMO, for C/C++ like languages, escapement implies capture but capture does NOT necessarily imply escapement (as the above example provides one such scenario)

Using “capture” and “escape” interchangeably in LLVM codebase and documentation is confusing as it fails to convey the right semantics if we follow the literature.

I think we should do the following:

  1. The term capture needs to be refined. The current definition of “capture” on langref page is really all about “escapement” as per the literature.
  2. It probably makes sense to rename “nocapture” attribute to “noescape”.

[1] [llvm-dev] [GSoC 2016] Capture Tracking Improvements - BackgroundInformation
[2] https://faculty.cc.gatech.edu/~harrold/6340/cs6340_fall2009/Readings/choi99escape.pdf

Just to be clear, in the way you are using the terms, “nocapture” refers to both capture and escape, not just one of them.

I think it would in principle be fine to split nocapture into two attributes, one for address capture and one for provenance escape.

However, this would not be terribly useful right now, because we don’t really have ways to inspect the address of a pointer without possibly also leaking its provenance. If you do a ptrtoint to get address, that will also leak the pointer provenance and the integer may later be used to access the pointer (in ways we cannot track).

For this reason, both notions end up being essentially the same in practice. Escape implies capture (because it may be captured once escaped) and capture implies escape (because our current ways of capturing imply provenance escape). As such, splitting these notions is unlikely to be useful unless we introduce and use ways of extracting the address of a pointer without leaking its provenance.
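
As an illustration of why the two notions collapse today, here is a small C++ sketch (the helper names are hypothetical, and the casts only approximate ptrtoint/inttoptr in LLVM IR). Once the address has been turned into an integer, any code holding that integer can rebuild a pointer and access the object.

```cpp
#include <cassert>
#include <cstdint>

// Exposes only the address as an integer (roughly a ptrtoint).
std::uintptr_t address_of(int *p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Code that only ever saw the integer can still reach the object
// (roughly an inttoptr followed by a store).
void write_through(std::uintptr_t addr, int value) {
    *reinterpret_cast<int *>(addr) = value;
}

// Returns true if the write through the rebuilt pointer hit x.
bool demo_roundtrip() {
    int x = 1;
    std::uintptr_t a = address_of(&x);
    assert(a != 0);          // a valid object never has a null address
    write_through(a, 7);
    return x == 7;
}
```

So a function that merely "inspects the address" via an integer cast has to be treated as if the pointer escaped.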

Ok, I think I got all of that.
But can I ask again just to double check what that means for memcpy that has nocapture? If we would have had an optimisation that assumes nocapture means noescape, could it make the wrong decision because of this:

If I know something is unescaped:

  • I can change the representation of the contents. (Even if the pointer value has been captured.)

Small correction here: We do have at least one way, which is using the pointer in an icmp. This will (in the general case) capture the address, but not leak provenance (I think – not entirely sure about our semantics here). So that’s a case where we could determine “noescape” without also having “nocapture”.

Overall, I think I’d be in favor of splitting the attributes, mainly in the interest of making it clear what optimizations actually depend on, but also allowing slightly more optimization power.
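
The icmp case mentioned above could look like the following in C++ (a sketch with a hypothetical function name; whether such a comparison leaks provenance in LLVM's semantics is left open above). The result depends on the address, but no pointer or integer copy of it leaves the function.

```cpp
#include <cassert>

// Reveals one bit of information about the addresses (an icmp in IR),
// but nothing that could later be used to access the pointees.
bool same_object(const int *a, const int *b) {
    return a == b;
}

// Returns true if the comparison behaves as expected for two locals.
bool demo_compare() {
    int x = 0, y = 0;
    return same_object(&x, &x) && !same_object(&x, &y);
}
```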

TBH, I don’t understand how the capture/escape distinction helps you with structure representation optimization at all. If a function call doesn’t escape or capture the pointer, but just accesses it in some unknown way, you already cannot perform the optimization.

For memcpy you don’t have any problems with captures/escapes, but you do still have a memory access, and a struct representation optimization would have to know how to rewrite memcpy’s appropriately. If the memcpy were just some random nocapture call, you would not be able to optimize it, because you don’t know how it is going to access the memory.

FWIW, such a function is nocapture_maybe_return (for some definition thereof). And while I don’t understand what your distinction between capture and escape is supposed to be, the function probably does both. I can use it to leak/copy all bits of a pointer if I so choose.

I’m also not sure what makes the first argument of memcpy special. Would you apply the same reasoning to a pointer argument of a load?

The Clang AST has a RecordDecl. You can query its linkage. If the linkage is internal, then the aggregate does not escape from the TU and you can optimize its layout.

Hi @jdoefert,

And while I don’t understand what your distinction between capture and escape is supposed to be, the function probably does both. I can use it to leak/copy all bits of a pointer if I so choose.

We can look at the example from Escape Analysis & Capture Tracking in LLVM

 int f(void* p) {
   return ((unsigned long)p & 15) == 0;
 }

Moreover, I do see, the below comments on https://lists.llvm.org/pipermail/llvm-dev/2016-June/100781.html:

  • “capture” - can anyone inspect the bits of this pointer?
    "…One small thing to watch out for: “capture” and “escape” are NOT the same thing. "

If we treat “capture” and “escape” interchangeably, it is confusing for the blog to say that “This function captures p but does not cause its value to escape.”

Is it possible in LLVM IR to have something captured but not escaped, if the two are different terms?

@Nikic
The pointer provenance dimension is new to me, and I am not sure which of the models is used or under consideration: PNVI plain, PNVI address-exposed (PNVI-ae), PNVI address-exposed user-disambiguation (PNVI-ae-udi), or provenance-via-integers (PVI).

That’s not true. A local-linkage function can capture the pointer via a global, any out parameter, or the return value. Even exceptions and unwinding can be used.

RecordDecl is a struct. How can the struct escape from the TU if it is internal linkage?

namespace {
struct AbsurdlyHuge { /// <- internal linkage
////
};
}

The blog post is not our lang ref. AFAIK we do not distinguish in the lang ref. If we want to, we need to argue why and define the terms. I’m still not sure why we wanted to in the first place.

Because it doesn’t matter if the outside knows the type decl. For all we care they can read it byte-wise. The type is local, for some definition of it, but any object of the type, for all we care, is unrelated to the type definition lifetime. Opaque pointers imply this, IMHO, nicely.

Nobody outside of the TU knows the Decl. If I would change the layout of the struct, it would not be observable from outside of the TU.

No. There is nothing preventing the user from defining it twice with internal linkage in different TUs and passing a pointer to the object via a void* through a shared global variable. You can also serialize the object via memcpy into a file and have a different process read it back. A changed layout will result in problems. Pointer escaping != internal object/type.
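
A single-file C++ sketch of the serialization scenario (all names are hypothetical; `RecordInTuA` and `RecordInTuB` stand in for the two same-named internal-linkage structs, since one file cannot hold both, and a static byte buffer stands in for the file or shared global):

```cpp
#include <cassert>
#include <cstring>

// Stand-ins for two identically declared internal-linkage structs that
// would live in different translation units.
struct RecordInTuA { int id; double weight; };
struct RecordInTuB { int id; double weight; };
static_assert(sizeof(RecordInTuA) == sizeof(RecordInTuB), "same layout assumed");

// Stand-in for a file or a shared buffer between the two sides.
static unsigned char g_channel[sizeof(RecordInTuA)];

// "TU A": writes the raw bytes of its object out.
void producer() {
    RecordInTuA r{3, 2.5};
    std::memcpy(g_channel, &r, sizeof r);
}

// "TU B": reads the bytes back, relying on the layout being unchanged.
double consumer() {
    RecordInTuB r;
    std::memcpy(&r, g_channel, sizeof r);
    assert(r.id == 3);   // only holds while both layouts agree
    return r.weight;
}
```

If an optimization reordered or repacked the fields of only one of the two structs, `consumer` would read the wrong bytes, even though no typed pointer ever left either side.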

I believe it is a mix of misunderstanding and disagreement:
file A:

namespace {
struct AbsurdlyLarge {};
}

file B:

namespace {
struct AbsurdlyLarge{};
}

How can you argue that they are the same type? Passing through a global void pointer sounds unsafe and akin to type punning.

I’m not arguing they are the same type. I’m not saying it’s not unsafe. And I’m not saying it’s not akin to type punning. I’m saying pretty much the opposite on all of them, but I’m also saying it’s perfectly legal behavior that has to be preserved. Capture tracking is designed to help us argue legality of transformations. If a pointer is (really) captured, you can’t change the layout, no matter the definition.

However, in this case Clang Sema will guarantee that no typed pointer will escape. void* is a different story.

I don’t disagree. I think the approaches hinge on escape analysis, and being able to tell that objects are accessed in a known way.

Yeah, it is not really about the memcpy function. The memcpy was more of an example and exercise to try and understand the semantics of capturing/escaping. I.e., the only thing we were trying to test is whether the nocapture for memcpy is wrong (which again seemed the case to me).

About random other nocapture calls, yes, that’s the thing I am really interested in. Ideally, functions with nocapture should guarantee that structure layout transformations are safe. I am just reading the literature and am not yet proposing anything, but I see that the structure layout optimisation papers define additional “practical checks” that ensure no other aliases are created and memory is accessed in a predictable way.