Proposal for safe-to-execute meta-data for heap accesses

Hi!

Previously, in the “Add a 'notrap' function attribute?” thread, we discussed a way to use meta-data to specify that a load does not trap under certain conditions. Andy, Hal, and I talked about this more in private, and I just wanted to summarize what I think we arrived at. First I’ll summarize the original !notrap meta-data, then mention why it’s hard to get right, and finally I’ll show the new safe-to-execute proposal.

PROBLEM

The problem is that without such meta-data, it’s difficult to hoist loads past branches. Here’s a motivating example:

  for (…)
    if (%c)
      %v = load %p

Let’s say that %c is *not* loop invariant, but %p is loop invariant and the loop never clobbers the location pointed to by %p. Let’s also assume that we already know that %c is usually true, so we’d like to hoist the load out of the loop. But in this scenario, any phase that wants to hoist the load must conservatively assume that %c guards whether the load is safe to execute. There is no other way for an LLVM phase to determine what makes any load safe-to-execute, so every branch on any path to that load is conservatively assumed to be a safety guard. Even if %c is known to be unrelated to whether or not %p is a valid pointer - something that arises trivially in many non-C languages - LLVM will still fail to hoist in this case.
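For reference, the transformation we would like to enable is a speculative hoist into the preheader, along these lines (a sketch; %v.hoisted is a name invented here):

  %v.hoisted = load %p   ; speculative: executes even on iterations where %c is false
  for (…)
    if (%c)
      %v = %v.hoisted

This is only legal if something guarantees the load cannot trap, which is exactly what the proposed meta-data is meant to convey.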

PREVIOUS PROPOSAL

Previously we had debated a meta-data format like:

  %v = load %p !notrap !{ %c }

The idea is that if there exists a program point P that is dominated by the branch on %c and where %c is provably true, then it is safe to emit the load at P. Ideally this would allow an LLVM frontend to express what constraints must be satisfied for the load to be safe-to-execute. For example, a Java frontend might use !{ %c } to describe a null check.

The downside of such meta-data is that !{ %c } is not an ordinary SSA use. You can’t reason about it the way you would ordinarily reason about uses. For example:

  if (%c)
    %v = load %p !notrap !{ %c }

If !{ %c } were an ordinary use then you could transform this to:

  if (%c)
    %v = load %p !notrap !{ 1 }

But then the world breaks: it would seem that the load is hoistable to anywhere. So, we would have to come up with some special new rules for what you can do to a !{ %c } use; likely this would involve conservatively erasing the meta-data if you did *anything* to %c.

NEW PROPOSAL

The solution is to introduce meta-data that is explicit about how the safe-to-execute condition ought to be evaluated. Instead of an SSA use, we can have meta-data that says:

  %v = load %p !notrap !{ @f, <args> }

where @f is a function in the current module and this function returns i1, and <args> is zero or more arguments to pass to @f. As with any meta-data, this doesn’t imply anything different if you wanted to just execute the code: executing the load doesn’t imply calling @f; indeed if you dropped the meta-data the execution of this would still be the same.

What the meta-data is telling you is two-fold:

  1) If you could prove that at some program point P, a call to @f with arguments <args> would return true, then it is safe to execute the load at P and doing so will definitely not trap.

  2) Wherever you see a load with !notrap !{ @f, <args> }, it means that someone has already proved that @f(<args>) is true at that point and execution is undefined (i.e. your computer can do improv) if @f(<args>) was false at that point.

Here are two examples of how you can use this to optimize things.

- Hoisting loads out of a loop for IR generated from a type-safe language that has NULL. Consider a loop like:

  if (%p == 0)
    call <something that throws a Null Pointer Exception>
  for (…) {
    if (%c) {
      %p2 = gep %p, ...
      %v = load %p2 !notrap !{ @isNotNull, %p }
    }
  }

And let’s assume that @isNotNull just does a null comparison on its pointer argument. In this code, it would now be possible to hoist the load out of the loop because we can indeed prove that at the loop pre-header, @isNotNull must return true because of the condition on %p.
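Under the proposal, a pass could then rewrite the loop as follows (a sketch; %v.hoisted is an invented name):

  if (%p == 0)
    call <something that throws a Null Pointer Exception>
  %p2 = gep %p, ...
  %v.hoisted = load %p2   ; @isNotNull(%p) is provably true at this point
  for (…) {
    if (%c)
      %v = %v.hoisted
  }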

- Control flow simplification from already-proven things:

  %p2 = gep %p, ...
  %v = load %p2 !notrap !{ @isNotNull, %p }
  if (%p == 0)
    some dead code

It is safe for a phase to conclude that the if statement cannot be taken because %p cannot be null after the load.

Thoughts?

-Filip

So, first a clarifying question:

Is the expectation that to utilize this metadata an optimization pass would have to inspect the body of @f and reason about its behavior given <args>?

If so, then I think this is pretty bad. If we ever want to parallelize function passes, then they can’t inspect the innards of other functions. So this would significantly constrain the utility here.

Also, this would create uses of the arguments that were “ephemeral” uses. It’s not clear how that is better than any of the other proposals to represent constraint systems in the IR via “ephemeral” uses.

So, first a clarifying question:

Is the expectation that to utilize this metadata an optimization pass would have to inspect the body of @f and reason about its behavior given <args>?

Yes.

If so, then I think this is pretty bad. If we ever want to parallelize function passes, then they can’t inspect the innards of other functions.

I must be missing something. Can’t you do some simple locking? Lock a function if it’s being transformed, or if you want to inspect it…

So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the function @f is meant to contain basically hand-written IR; it ought not be necessary to optimize it in order to make use of it for safe-to-execute. It’s also reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are used in this meta-data.
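To illustrate how small these frozen functions can be, a @isNotNull predicate might consist of nothing but a comparison (a hypothetical body; the exact signature is up to the frontend):

  define internal i1 @isNotNull(i8* %p) {
    %c = icmp ne i8* %p, null
    ret i1 %c
  }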

Also, this would create uses of the arguments that were “ephemeral” uses.

I think they’re ephemeral in a very different sense than with the previous !notrap; for example, here the uses continue to be meaningful even after replaceAllUsesWith.

NEW PROPOSAL

The solution is to introduce meta-data that is explicit about how the
safe-to-execute condition ought to be evaluated. Instead of an SSA
use, we can have meta-data that says:

%v = load %p !notrap !{ @f, <args> }

where @f is a function in the current module and this function
returns i1, and <args> is zero or more arguments to pass to @f. As
with any meta-data, this doesn’t imply anything different if you
wanted to just execute the code: executing the load doesn’t imply
calling @f; indeed if you dropped the meta-data the execution of
this would still be the same.

So, first a clarifying question:

Is the expectation that to utilize this metadata an optimization pass
would have to inspect the body of @f and reason about its behavior
given <args>?

Yes.

If so, then I think this is pretty bad. If we ever want to
parallelize function passes, then they can't inspect the innards of
other functions.

I must be missing something. Can't you do some simple locking? Lock a
function if it's being transformed, or if you want to inspect it...

I think we'd exclude these functions from being acted upon by the regular optimization passes. These functions would need to be special anyway: they are functions with a special internal linkage that should not be deleted as dead, even if 'unused' (likely we'd want them to survive in memory through CodeGen), but CodeGen itself should ignore the functions (no code for them should ever be generated).

So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the
function @f is meant to contain basically hand-written IR; it ought
not be necessary to optimize it in order to make use of it for
safe-to-execute. It's also reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are
used in this meta-data.

Also, this would create uses of the arguments that were "ephemeral"
uses.

I think they're ephemeral in a very different sense than the previous
!notrap; for example, here the uses continue to be meaningful even
after replaceAllUsesWith.

I think that, to Chandler's point, it would be the responsibility of the function creator to ensure that the special 'functions' do not need any non-constant values without other reasonable uses. It seems like this can be arranged for things like pointer alignment checks, pointer not-null assertions, pointer dereferenceability (indicated by a load in the function, I suppose), and simple value constraints. I think that covers most of the intended use cases.
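As a concrete instance of the alignment-check case, a hypothetical predicate taking only the pointer might look like this (the name and the 8-byte alignment are invented for illustration):

  define internal i1 @isAligned8(i8* %p) {
    %i = ptrtoint i8* %p to i64
    %low = and i64 %i, 7    ; the low three bits must be zero for 8-byte alignment
    %c = icmp eq i64 %low, 0
    ret i1 %c
  }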

-Hal

NEW PROPOSAL

The solution is to introduce meta-data that is explicit about how the
safe-to-execute condition ought to be evaluated. Instead of an SSA
use, we can have meta-data that says:

%v = load %p !notrap !{ @f, <args> }

where @f is a function in the current module and this function
returns i1, and <args> is zero or more arguments to pass to @f. As
with any meta-data, this doesn’t imply anything different if you
wanted to just execute the code: executing the load doesn’t imply
calling @f; indeed if you dropped the meta-data the execution of
this would still be the same.

So, first a clarifying question:

Is the expectation that to utilize this metadata an optimization pass
would have to inspect the body of @f and reason about its behavior
given <args>?

Yes.

If so, then I think this is pretty bad. If we ever want to
parallelize function passes, then they can’t inspect the innards of
other functions.

I must be missing something. Can’t you do some simple locking? Lock a
function if it’s being transformed, or if you want to inspect it…

I think we’d exclude these functions from being acted upon by the regular optimization passes. These functions would need to be special anyway: they are functions with a special internal linkage that should not be deleted as dead, even if ‘unused’ (likely we’d want them to survive in memory through CodeGen), but CodeGen itself should ignore the functions (no code for them should ever be generated).

So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the
function @f is meant to contain basically hand-written IR; it ought
not be necessary to optimize it in order to make use of it for
safe-to-execute. It’s also reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are
used in this meta-data.

Also, this would create uses of the arguments that were “ephemeral”
uses.

I think they’re ephemeral in a very different sense than the previous
!notrap; for example, here the uses continue to be meaningful even
after replaceAllUsesWith.

I think that, to Chandler’s point, it would be the responsibility of the function creator to ensure that the special ‘functions’ do not need any non-constant values without other reasonable uses. It seems like this can be arranged for things like pointer alignment checks, pointer not-null assertions, pointer dereferenceability (indicated by a load in the function, I suppose), and simple value constraints. I think that covers most of the intended use cases.

Yeah, I think it is reasonable to say that given anything like:

%v = load %p !notrap !{ @f, %a1, %a2, …, %an }

It must be the case that %a1…%an are reachable from %p, in the sense that ADCE could only kill %a1…%an if it also killed %p.

Here are some concrete examples:

  • Load that requires a pointer to be null-checked:

%p = getelementptr %object, ...
%v = load %p !notrap !{ @isNotNull, %object }

  • Load that requires a null check and an array bounds check:

%p = getelementptr %object, %index (possibly other things also depending on your array object model)
%v = load %p !notrap !{ @isNotNullAndInBounds, %object, %index }

  • Load that requires a tagged value encoding, requires the object to have a particular shape, and involves a doubly-indirected object model (for example, when implementing dynamic language heap accesses):

%object = inttoptr %value
%p1 = getelementptr %object, ...
%p2 = load %p1 !notrap !{ @isObject, %value }
%p3 = getelementptr %object, ...
%v = load %p3 !notrap !{ @hasShape, %object, 0xstuff } // 0xstuff will be a compile-time constant
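For the tagged-value case, a hypothetical @isObject predicate under an invented tagging scheme (low bits zero meaning heap pointer) could be:

  define internal i1 @isObject(i64 %value) {
    %tag = and i64 %value, 7   ; hypothetical: the low three bits hold the tag
    %c = icmp eq i64 %tag, 0   ; tag 0 means a heap object pointer
    ret i1 %c
  }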

I can’t think of any examples where you’d want to use safe-to-execute and where Hal’s rule won’t hold.

-Filip

NEW PROPOSAL

The solution is to introduce meta-data that is explicit about how the
safe-to-execute condition ought to be evaluated. Instead of an SSA
use, we can have meta-data that says:

%v = load %p !notrap !{ @f, <args> }

where @f is a function in the current module and this function
returns i1, and <args> is zero or more arguments to pass to @f. As
with any meta-data, this doesn’t imply anything different if you
wanted to just execute the code: executing the load doesn’t imply
calling @f; indeed if you dropped the meta-data the execution of
this would still be the same.

So, first a clarifying question:

Is the expectation that to utilize this metadata an optimization pass
would have to inspect the body of @f and reason about its behavior
given <args>?

Yes.

If so, then I think this is pretty bad. If we ever want to
parallelize function passes, then they can’t inspect the innards of
other functions.

I must be missing something. Can’t you do some simple locking? Lock a
function if it’s being transformed, or if you want to inspect it…

I think we’d exclude these functions from being acted upon by the regular optimization passes. These functions would need to be special anyway: they are functions with a special internal linkage that should not be deleted as dead, even if ‘unused’ (likely we’d want them to survive in memory through CodeGen), but CodeGen itself should ignore the functions (no code for them should ever be generated).

So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the
function @f is meant to contain basically hand-written IR; it ought
not be necessary to optimize it in order to make use of it for
safe-to-execute. It’s also reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are
used in this meta-data.

Also, this would create uses of the arguments that were “ephemeral”
uses.

I think they’re ephemeral in a very different sense than the previous
!notrap; for example, here the uses continue to be meaningful even
after replaceAllUsesWith.

I think that, to Chandler’s point, it would be the responsibility of the function creator to ensure that the special ‘functions’ do not need any non-constant values without other reasonable uses. It seems like this can be arranged for things like pointer alignment checks, pointer not-null assertions, pointer dereferenceability (indicated by a load in the function, I suppose), and simple value constraints. I think that covers most of the intended use cases.

Yeah, I think it is reasonable to say that given anything like:

%v = load %p !notrap !{ @f, %a1, %a2, …, %an }

It must be the case that %a1…%an are reachable from %p, in the sense that ADCE could only kill %a1…%an if it also killed %p.

Here are some concrete examples:

  • Load that requires a pointer to be null-checked:

%p = getelementptr %object, ...
%v = load %p !notrap !{ @isNotNull, %object }

  • Load that requires a null check and an array bounds check:

%p = getelementptr %object, %index (possibly other things also depending on your array object model)
%v = load %p !notrap !{ @isNotNullAndInBounds, %object, %index }

  • Load that requires a tagged value encoding, requires the object to have a particular shape, and involves a doubly-indirected object model (for example, when implementing dynamic language heap accesses):

%object = inttoptr %value
%p1 = getelementptr %object, ...
%p2 = load %p1 !notrap !{ @isObject, %value }
%p3 = getelementptr %object, ...

%v = load %p3 !notrap !{ @hasShape, %object, 0xstuff } // 0xstuff will be a compile-time constant

I can’t think of any examples where you’d want to use safe-to-execute and where Hal’s rule won’t hold.

This is way more awesome than any of our other proposals. All of these examples have a data dependence from the argument to the !notrap instruction. That makes it much less likely for the meta-data to become invalid. And not being tied to the CFG is great.

We still have to be careful with RAUW.

Say we copy a block containing the @Check argument ‘a’, splitting the control flow (maybe we tail-dup, peel, or unswitch):

call @Check(x)
if (p) {
a = p ? x : y
b = a + ofs
call @Check(a)
}
else {
a’ = p ? x : y
b’ = a’ + ofs
call @Check(a’)
}
b-merge = phi(b, b’)
= *b-merge !notrap @Check a

Then we simplify:

call @Check(x)

if (p) {
b = x + ofs
call @Check(x)
}
else {
b’ = y + ofs
call @Check(y)
}
b-merge = phi(b, b’)
= *b-merge !notrap @Check x

Now we can incorrectly hoist *b-merge above Check(y).

I think this is workable because, in general, I don’t think we should replace meta-data uses without checking dominance. We can either fail to update the !notrap use, which I think should be safe, or just drop it if we’re not sure.

-Andy

Is the expectation that to utilize this metadata an optimization pass
would have to inspect the body of @f and reason about its behavior given
<args>?

Yes.

If so, then I think this is pretty bad. If we ever want to parallelize
function passes, then they can't inspect the innards of other functions.

I must be missing something. Can't you do some simple locking? Lock a
function if it's being transformed, or if you want to inspect it...

I really, *really* don't like this.

I do *not* want parallelizing LLVM to require careful locking protocols to
be followed. Instead, I want the design to naturally arrange for different
threads to operate on different constructs and for the interconnecting
interfaces to be thread safe. The best system we have yet devised for this
is based around function passes not digging into the bodies of other
functions. Instead we rely on attributes to propagate information about the
body of another function to a caller.

So this would significantly constrain the utility here.

I think we can engineer around this problem. For example, the function @f
is meant to contain basically hand-written IR; it ought not be necessary to
optimize it in order to make use of it for safe-to-execute. It's also
reasonable to expect these to be small.

Hence you can imagine freezing a copy of those functions that are used in
this meta-data.

At this point, you are essentially proposing that these functions are a
similar but not quite the same IR... They will have the same concepts but
subtly different constraints or "expectations".

I'm not yet sure how I feel about this. It could work really well, or it
could end up looking a lot like ConstantExpr and being a pain for us going
forward. I'm going to keep thinking about this though and see if I can
contribute a more positive comment. =]

So, I’m relatively new to LLVM, but I’m not new to parallelizing a compiler - I’ve done it before. And when I did it, it (a) did use locking in a bunch of places, (b) wasn’t a big deal, and (c) reliably scaled to 8 cores (the max number of cores I had at the time - I was a grad student and it was, like, the last decade).

Is there any documented proposal that lays out the philosophy? I’d like to understand why locks are such a party pooper.

I kind of get what you’re aiming at, but I’m curious what other constraints are in play. When I last wrote a parallel compiler, I had the notion of predesignating functions that were candidates for cross-thread IR introspection, and freeze-drying their IR at a certain phase that preceded a global barrier. It just so happened that in my case, I did this for inlining candidates - which is different than what we’re proposing here but the same tricks apply.

Sort of. I’m only proposing that they get treated differently from the standpoint of the compilation pipeline. But, to clarify, the IR inside them still has the same semantics as LLVM IR.

It’s interesting that this is the second time that the thought of “special” functions has arisen in my LLVM JIT adventures. The other time was when I wanted to create a module that contained one function that I wanted to compile (i.e. it was the function carrying the IR that I actually wanted to JIT), but I wanted to pre-load that module with runtime functions that were inline candidates. I did not want the JIT to compile those functions except if they were inlined.

I bring this up not because I have any timetable for implementing this other concept, but because I find it interesting that LLVM’s “every function in a module is a thing that will get compiled and be part of the resulting object file” rule is a tad constraining for a bunch of things I want to do that don’t involve a C-like language.

Fair enough! I look forward to hearing more feedback.

-Filip

FYI, we (DeLesley Hutchins at Google and my group at CERT) have a new static analysis under development that will be able to express and enforce this kind of property (once the analysis is up and working). When we get to the point where it can handle the LLVM & Clang sources, implementers will be able to count on having code that actually follows the rules, too.

Dean Sutherland
dsutherland@cert.org

[SNIP]

Is the expectation that to utilize this metadata an optimization pass
would have to inspect the body of @f and reason about its behavior given
<args>?

Yes.

If so, then I think this is pretty bad. If we ever want to parallelize
function passes, then they can't inspect the innards of other functions.

I must be missing something. Can't you do some simple locking? Lock a
function if it's being transformed, or if you want to inspect it...

I really, *really* don't like this.

I do *not* want parallelizing LLVM

So, I'm relatively new to LLVM, but I'm not new to parallelizing a
compiler - I've done it before. And when I did it, it (a) did use locking
in a bunch of places, (b) wasn't a big deal, and (c) reliably scaled to 8
cores (the max number of cores I had at the time - I was a grad student and
it was, like, the last decade).

Because it will merely trade a data race for non-determinism. It is also
really, really slow.

Is there any documented proposal that lays out the philosophy? I'd like
to understand why locks are such a party pooper.

No concrete proposal.

The fundamental idea is that we'd like for passes to have relatively
constrained domains that they operate on so that synchronization is very
rare and the majority of work in the optimizer can proceed in parallel.
This applies mostly to function passes and CGSCC passes, which account for
most of the time in LLVM (codegen is a function pass).

Hence you can imagine freezing a copy of those functions that are used in

this meta-data.

At this point, you are essentially proposing that these functions are a
similar but not quite the same IR... They will have the same concepts but
subtly different constraints or "expectations".

Sort of. I'm only proposing that they get treated differently from the
standpoint of the compilation pipeline. But, to clarify, the IR inside
them still has the same semantics as LLVM IR.

It's interesting that this is the second time that the thought of
"special" functions has arisen in my LLVM JIT adventures. The other time
was when I wanted to create a module that contained one function that I
wanted to compile (i.e. it was a function that carried the IR that I
actually wanted to JIT) but I wanted to pre-load that module with runtime
function that were inline candidates. I did not want the JIT to compile
those functions except if they were inlined.

I bring this up not because I have any timetable for implementing this
other concept, but because I find it interesting that LLVM's "every
function in a module is a thing that will get compiled and be part of the
resulting object file" rule is a tad constraining for a bunch of things I
want to do that don't involve a C-like language.

I'm not yet sure how I feel about this. It could work really well, or it
could end up looking a lot like ConstantExpr and being a pain for us going
forward. I'm going to keep thinking about this though and see if I can
contribute a more positive comment. =]

Fair enough! I look forward to hearing more feedback.

Sorry, still haven't really had time to think more about this. I have
mentioned this thread to Nick, who should also chime in when he has some time.