Type-based escape analysis

I am investigating whether it would be possible to implement a simple/practical/conservative form of type-based escape analysis at the LLVM IR level. It looks like the framework to do that is mostly there in LLVM, e.g.:

  • LTO and internalisation, which make functions/globals private, could help us decide whether pointers escape outside the current module.
  • CaptureTracking is available, and the PointerMayBeCaptured APIs provide a convenient interface to query the capture status of pointers (see the sketch right after this list).
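
For example, a small sketch of that query, written against roughly the LLVM 16/17 C++ API (the exact PointerMayBeCaptured signature has shifted between releases, so treat it as an assumption):

#include "llvm/Analysis/CaptureTracking.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Report, for every pointer argument of every defined function in the module,
// whether CaptureTracking thinks it may be captured.
static void reportCapturedPointerArgs(Module &M) {
  for (Function &F : M) {
    if (F.isDeclaration())
      continue;
    for (Argument &A : F.args()) {
      if (!A.getType()->isPointerTy())
        continue;
      // Conservative query: count captures through returns and through stores.
      bool Captured = PointerMayBeCaptured(&A, /*ReturnCaptures=*/true,
                                           /*StoreCaptures=*/true);
      errs() << F.getName() << " arg " << A.getArgNo()
             << (Captured ? ": may be captured\n" : ": not captured\n");
    }
  }
}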

But what seems to be missing is the ability to discover pointee information. I.e., we are not only interested in whether a pointer escapes, but also in which types escape. This seemed possible to determine before opaque pointers (not sure that was the best solution, though), but with opaque pointers some information is lost that we won’t be able to recover. A minimal example is a function with a pointer argument that is passed to another function:

define void @test(ptr noundef %a) {
entry:
  tail call void @use(ptr noundef %a)
  ret void
}
declare void @use(ptr noundef)

The type information for pointer %a cannot be derived from other instructions such as loads/stores, and no TBAA information is emitted that could help. So far I have mainly been sketching the context; this is essentially the same observation made earlier in this post.

The advice I keep reading is “don’t do this on IR”, which is pretty clear. DWARF and mangled names might be used too, but they are not always available in my case, so that pretty much leaves the Clang front-end option on the table. I am going to investigate whether I can emit additional TBAA information so that pointer types can be retrieved later in an IR analysis pass. TBAA is usually attached to instructions, so I will have to see how that works for functions and arguments; I am also interested in pointer casts and how that could work, so I have a number of things to investigate. I was curious whether someone has advice or could share ideas/experiences in this area.
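
The retrieval side at least looks straightforward. A minimal sketch of picking up the struct-path TBAA that Clang already attaches to memory accesses (C++ against the current LLVM API; the metadata layout handling is simplified and should be treated as an assumption):

#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

// Return the access-type name (e.g. "int", "any pointer") from the !tbaa
// access tag on a load/store, or an empty StringRef if there is none.
static StringRef tbaaAccessTypeName(const Instruction &I) {
  if (MDNode *Tag = I.getMetadata(LLVMContext::MD_tbaa)) {
    // Default struct-path access tags look like !{base, access, i64 offset};
    // the access-type node starts with an MDString name.
    if (Tag->getNumOperands() >= 2)
      if (auto *AccessTy = dyn_cast<MDNode>(Tag->getOperand(1)))
        if (AccessTy->getNumOperands() >= 1)
          if (auto *Name = dyn_cast<MDString>(AccessTy->getOperand(0)))
            return Name->getString();
  }
  return StringRef();
}

The open problem from the example above remains, of course: for an argument like %a there is simply no access instruction to hang such a tag on.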

Could you please explain in more detail what “type-based escape analysis” is and how it can be reconciled with the relevant standards?

Thanks for reading and replying, @nikic, and apologies if the terminology was confusing.

To analyse objects of aggregate types, a field-sensitive points-to analysis could be implemented conservatively with checks for casts from/to different types and for whether the address of a field is taken. That’s basically what I meant by “type-based” in “type-based escape analysis”. So I don’t think this would violate the standards, but please let me know if you have a different opinion or if I can explain more.

Your initial example is too minimal: there are no loads and stores. With a larger example that has memory accesses, and thus types, you can reason about the types.

Passing a pointer that is not accessed in the caller is not a very contrived example, and it only takes one such case for the approach to fall apart. Another example would be pointer casts, for which we don’t generate bitcasts anymore. So I thought it was fairly undisputed that some information is lost.

What do you mean by pointer casts? You can use the same pointer p to read i8 and write i32.
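
For instance (a small C++ illustration of that point; the read through unsigned char is always allowed, the int store only if the object really is an int, but at the IR level both accesses go through the same opaque ptr):

void touch(void *p) {
  unsigned char low = *static_cast<unsigned char *>(p); // a load of i8 in IR
  (void)low;
  *static_cast<int *>(p) = 42;                          // a store of i32 in IR
}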

BTW, I tried to implement “Efficient Flow-Sensitive Interprocedural Computation of Pointer-Induced Aliases and Side Effects” from Choi. It is precise but really costly, and I failed.

Hi @nikic and @tschuett,

Thanks for your reply. Our main concern is the pointee type information getting lost, as shown in the example above. There is no way we can deduce that a ptr is or was pointing to an int/struct/float etc. now that LLVM has shifted to opaque pointers.

We have a couple of options to access the pointee type:

  1. Extract it from the relevant load/store or other instructions, as advised at Opaque Pointers — LLVM 17.0.0git documentation (see the sketch right after this list).
  2. Access TBAA metadata.
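
A rough sketch of option 1 (a hypothetical helper, not an existing LLVM API; it simply walks the users of a pointer, which is what the opaque pointers guide advises):

#include "llvm/IR/Instructions.h"
#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"

using namespace llvm;

// Guess a pointee type for Ptr from the types actually accessed through it.
// Returns nullptr when no memory access is found, which is exactly the
// problematic case in the example above.
static Type *guessPointeeType(const Value *Ptr) {
  for (const User *U : Ptr->users()) {
    if (const auto *LI = dyn_cast<LoadInst>(U)) {
      if (LI->getPointerOperand() == Ptr)
        return LI->getType();                    // type loaded through Ptr
    } else if (const auto *SI = dyn_cast<StoreInst>(U)) {
      if (SI->getPointerOperand() == Ptr)
        return SI->getValueOperand()->getType(); // type stored through Ptr
    } else if (const auto *GEP = dyn_cast<GetElementPtrInst>(U)) {
      if (GEP->getPointerOperand() == Ptr)
        return GEP->getSourceElementType();      // type the GEP indexes into
    }
  }
  return nullptr;
}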

However, in the example above we have neither of those, and that is our primary concern.

TBAA metadata is not generated in all cases. We have seen that people have tried to force-generate it, but that did not go anywhere; I am referring to D36833 (“Support -fforce-tbaa command-line option”), which was later abandoned.

In short, there are cases where it is almost impossible to retrieve the pointee type.

We’d like to know the community’s opinion on this. Is there a reliable, robust and definitive way to access the pointee type in the world of opaque pointers?

As @sjoerdmeijer pointed out in the post, current capture tracking is the avenue where we would like to use this information, to build a custom capture tracking, or what we might call “Type-Based Escape Analysis”.

This is kind of the point of pointers. They point to memory, and maybe to things like functions. If you have a pointer that is never used, how do you want to reason about it?

Even in C-like languages:

unsigned *pointer;

If the pointer is never used, is it really an unsigned pointer?

If you want type-based escape analysis, you should probably talk about precision. In the first example, you cannot say anything about the types. If you have a function full of memory accesses and structs, you can say more about the escaped types. If you look at the whole module, maybe you can increase the precision.

Sorry, I still don’t get it. Could you maybe show a bit of sample code to illustrate?

Specifically, I don’t understand how you can do type-based escape analysis in a language that allows you to freely cast between pointer types. You have restrictions on accesses (where things like strict aliasing / TBAA apply), but the pointers themselves are freely castable.

In your IR example, if use() accepts Foo * then there is no guarantee that it isn’t just going to cast that to an entirely unrelated Bar * type and perform an access on that / capture it.

I’m probably completely misunderstanding what this is about.

Worse yet, you can cast a Foo * to an int, pass the int around, then later cast the int back to a Bar *.
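
As a small illustration of that round trip (hypothetical Foo/Bar types, and uintptr_t so the value survives losslessly), after which any purely type-based tracking of the pointer is gone:

#include <cstdint>

struct Foo { int a; };
struct Bar { float b; };

std::uintptr_t stash(Foo *f) {
  return reinterpret_cast<std::uintptr_t>(f); // the Foo* escapes as a plain integer
}

Bar *recover(std::uintptr_t v) {
  return reinterpret_cast<Bar *>(v);          // ...and comes back as an unrelated Bar*
}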

But if you do something like this, I think the compiler should assume that that pointer can point anywhere.

So the real question is what are the tracking boundaries where you give up trying to optimize.

My bad, let me start at the very beginning of this. I am reading the paper “Practical Structure Layout Optimization and Advice” and am wondering whether, e.g., the unused structure field elimination optimisation would be feasible to implement. That explains why I mentioned identifying structure types: for this transformation to be legal, the (structure) type is not allowed to escape. The paper defines a number of (practical) legality checks:

  • a cast to a type has been found,
  • a cast from a record type has been found,
  • the address of a field has been taken.

There are a few more, but the point of these practical tests is that, according to the paper, they implement a conservative field-sensitive points-to analysis, and these checks are not as expensive as whole-program alias analysis.
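
As a tiny illustration (hypothetical types, not taken from the paper) of what unused structure field elimination would do, and of the kind of cast the checks above are meant to catch:

struct S {
  int key;
  char pad[56]; // proven unused by whole-program analysis
  int value;
};

// If S never escapes and none of the checks above fire, every use of S could
// be rewritten against a shrunk layout:
struct S_opt {
  int key;
  int value;
};

// A single cast like this one would mark S as escaping and disqualify it:
long hash(S *s) { return *reinterpret_cast<long *>(s); }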

I hope this clarifies the context and motivation, and perhaps my earlier messages make more sense now: I am trying to figure out what is available to implement this (LTO could help with visibility, and PointerMayBeCaptured with capture/alias analysis), but not having pointee information available looks like a roadblock.

I can certainly provide more code examples if that helps, but will need to do that tomorrow.

I found another one:
Structure Layout Optimizations in the Open64 Compiler: Design, Implementation and Measurements

Gautam Chakrabarti, Fred Chow

Assuming there are no aggregate optimizations in LLVM.

  • Are aggregate optimizations desirable?
  • What is gcc doing?

What are the tools for optimizations:

  • escape analysis
  • PGO: what are the hot fields
  • LTO: Note that this field is never used.
  • ?

Data layout transformations are very tricky to do in LLVM; it ends up being too low-level for such transformations to be reliable. Many of the source languages to LLVM tend to assume that type punning is relentlessly common (in the case of Rust, type punning is strictly legal; in C and C++, it’s technically legal only under certain narrow circumstances, but breaking the ability to do it in general is likely to lead to lots of user complaints).

The LLVM type system has never been reliable enough, not even when typed pointers existed, to support the kind of legality checks based on type casts. You need to perform these checks at a level where type information is more reliable, and if you read the papers you cite carefully, you’ll notice that in both cases they seem to rely on the frontend for the type analysis, rather than the middle end or backend.

As for gcc, it used to have a struct field reorganization pass (which was based on their equivalent to LLVM IR), but it was removed because “They did not always work correctly, nor did they work with link-time optimization (LTO), hence were only applicable to programs consisting of a single translation unit.”

I would not recommend trying to implement data layout transformations in LLVM IR.

I read the Open64 article and they seem to rely on the equivalent of ThinLTO. They put the struct layouts into the summaries. There was no mention of the frontend being involved. They showed significant speedups. I wouldn’t drop the topic immediately. Rather, I want to discuss how to achieve struct layout optimizations in small incremental steps.

Can a struct escape from a TU?

Yep, I noticed that too (which is why I am posting here in the Clang Frontend group). Type information is gathered in the front-end (the legality checks are done there) and then this information is passed on.

I would not recommend trying to implement data layout transformations in LLVM IR.

So, I guess you mean that both front-end and middle-end are involved, but the actual transformation is performed on IR?

Yes, exactly. The literature that I have been reading so far has checks for that and marks a type as “invalid” when such a cast occurs. So it gives up at that point, is conservative, and objects of these types won’t be candidates for data layout optimisations.

Communicating this sort of type/legality information from the front-end to the middle-end seems quite a challenge on its own to me.

Clang can today already reorder the fields of an aggregate for security reasons.

Maybe we can teach Clang to optimize aggregates. It would work at a higher level of abstraction than LLVM IR. Clang Sema knows a lot about the C/C++ program, and escape analysis would be completely different than on LLVM IR. Splitting aggregates in Clang would be safer than in LLVM IR.

WDYT?

randstruct

I’d think that “clang optimizing aggregates” would work only within a compilation unit. Anytime you need to pass data between CUs you need an ABI agreement which makes struct reordering pretty tough. randstruct addresses this by using a consistent reordering scheme across CUs, specified by a seed value passed on the command line. Hard to envision anything like that working for a front-end based optimization, which ought to be driven by some kind of whole-program analysis.
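
For reference, a rough sketch of how randstruct is driven today (attribute and flag spellings as I understand them from recent Clang documentation, so treat the exact names as assumptions to verify):

// Every TU is compiled with the same seed, e.g.:
//   clang -frandomize-layout-seed=0xdeadbeef file.c
struct __attribute__((randomize_layout)) Creds {
  int uid;
  int gid;
  void *security;
};
// All TUs built with that seed agree on the shuffled layout, which is what
// keeps the ABI consistent across compilation units.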

namespace {
struct AbsurdlyLarge {};
}

If you have such an aggregate, then its layout is not observable in other TUs.