RFC: Strong GC References in LLVM

(Note, this list is stuff in the Scalar directory only. Everything in
Vectorize would n eed it, etc)

And in case folks want more evidence that reasonable llvm developers need
something that just gives easy and right answers, see
https://reviews.llvm.org/D22547 from just yesterday :slight_smile:

To move this along, i will go build the analysis (I already have it mostly
done, actually). If someone updates our docs to make this stuff clear and
obvious, that would be wonderful :slight_smile:

The analysis currently computes, internally, for a given BB:

EarliestLoadHoistBarrier (used to see if you can move something out of a
block)
LatestLoadHoistBarrier (used to find the latest safe insertion point in a
block, if any)
EarliestStoreSinkBarrier (insertion)
LatestStoreSinkBarrier (movement)

(stores are maythrow dependent, loads are
isGuaranteedToTransferExecutionToSuccessor dependent)

I'm still coming up with the external API, the users all want to know
either:

1. Can i move a load up out of this block to a direct predecessor
2. Can i move a store down out of this block to a direct successor

GVN's current load PRE is complex to get "right" from it's current
standpoint, the APi that will provide the easiest way to fix it will be:

3. What is the latest insertion point in a block for a load, if it is safe
(IE the block does not end in an instruction you cannot move the load
before).

Nothing is doing global store sinking right now that i see, so nothing
needs the analogous store api.

I would argue that we should have this analysis only report the EBB information. Doing this in the form of the information you’ve mentioned is fine, but I would not want to see this become our deref analysis for instance. I think that needs to be a separate analysis which is well layered on top of this one. (i.e. once I encounter a hoist barrier, I need to then ask if a load is safe to speculate beyond that point as a separate question.)

Okay, so it sounds like it might actually be better to be even more low level, call it “ExtendedBBInfo” or something, and rename what it provides to be more clearly structural:

A. Inst * FirstIsGuaranteedToTransferExecutionToSuccessor(BB) (naming bikeshed open on this one :P)
B. Inst * LastIsGuaranteedToTransferExecutionToSuccessor(BB)
C. Inst *FirstMayThrow(BB)
D. Inst *LastMayThrow(BB)

Most things want to know if a given inst is before or after these.

Since we have to touch the entire set of instructions for a bb anyway, we could also provide dominates (like orderedbasicblock) to give you the answer to that question for free (otherwise everything has to rewalk the entire inst list again)

Rather than make it part of the API for this class, this would basically be making OrderedBasicBlock an interface, and this class happens to be able to provide a pointer to something satisfying that interface as well.

(IE getOrderedBasicBlock() on EBBInfo will return you something that has locallyDominates())

Hi Daniel,

Okay, so it sounds like it might actually be better to be even more low
level, call it "ExtendedBBInfo" or something, and rename what it provides to
be more clearly structural:

A. Inst * FirstIsGuaranteedToTransferExecutionToSuccessor(BB) (naming
bikeshed open on this one :P)
B. Inst * LastIsGuaranteedToTransferExecutionToSuccessor(BB)
C. Inst *FirstMayThrow(BB)
D. Inst *LastMayThrow(BB)

Why would something be separately interested in C over A or D over B?
IOW, if you're looking for a "side exit" then isn't a may-throw call
is just one form of NOT(IsGuaranteedToTransferExecutionToSuccessor)?

Btw, I think todays implementation of
IsGuaranteedToTransferExecutionToSuccessor is fairly conservative
(IIRC it also considers volatile loads / stores as potentially
non-terminating).

-- Sanjoy

Hi Daniel,

Okay, so it sounds like it might actually be better to be even more low
level, call it "ExtendedBBInfo" or something, and rename what it provides to
be more clearly structural:

A. Inst * FirstIsGuaranteedToTransferExecutionToSuccessor(BB) (naming
bikeshed open on this one :P)
B. Inst * LastIsGuaranteedToTransferExecutionToSuccessor(BB)
C. Inst *FirstMayThrow(BB)
D. Inst *LastMayThrow(BB)

Why would something be separately interested in C over A or D over B?
IOW, if you're looking for a "side exit" then isn't a may-throw call
is just one form of NOT(IsGuaranteedToTransferExecutionToSuccessor)?

Ah, never mind, we can DSE across non-terminating calls but can't do
that across maythrow:

*x = 40;
while(1); // behind a nounwind call
*x = 50;

==>

// *x = 40;
while(1); // behind a nounwind call
*x = 50;

is okay (though mildly questionable), but

*x = 40;
may_throw();
*x = 50;

==>

// *x = 40;
may_throw(); // the frame that catches the exception can inspect *x
*x = 50;

is not okay, so maythrow is not "just another form of non-terminating
action" as I just said.

I wonder if there are optimizations that are legal across maythrow
calls that don't work across may_never_terminate calls.

-- Sanjoy

It sounds like this new analysis could basically subsume the existing OrderedBasicBlock.

Thinking about update semantics, what types of updates need to be recorded to make this efficient?

  • Adding maythrow instructions is obvious required, figuring out how to update clearly requires OBB info.
  • Adding arithmetic instructions - only requires OBB update

I’m thinking that some form of lazy update might be the best path forward here. Have the results re-calculated via a linear scan of the block only once a) a change has happened and been marked and b) the request information is required. This would allow batch updates for instance. We could potentially try to incremental update the motion barriers, but that seems like potentially too much work.

Also, I’m now thinking that first and last isn’t enough information. Once I ask the speculation question, I think need to ask what the next ordering barrier is. I thinking code of the following variety:
for block I want to skip:
if not has ordering barrier:
continue
if can speculate over block
continue
barrier = last barrier in block
while can speculate over barrier:
barrier = previous barrier
return next(barrier) as insert pt
return first instruction in function entry

It sounds like this new analysis could basically subsume the existing
OrderedBasicBlock.

Thinking about update semantics, what types of updates need to be recorded
to make this efficient?
- Adding maythrow instructions is obvious required, figuring out how to
update clearly requires OBB info.

Yup.

- Adding arithmetic instructions - only requires OBB update

This is possible to do by skip numbering and using uint64.

If a given pass performs only insertafter calls, you can show that you can
add tons of instructions with O(1) updates by skip-numbering
Same with only insertbefore.
You can use the same numbering scheme for both, and just add/subtract 1 in
the right places until you run out of space. With int64 as the numbers,
it's "a lot of places" :slight_smile:

If you allow interspersing of insertbefore/insertafter, the numbering
scheme is more complex, and you can only insert max of log(n) instructions
between two instructions before you must renumber.

(this is just using previous inst number+next inst number/2 as the new
number)

Whether any of this is worth it, ¯\_(ツ)_/¯

I'll probably start without it and we can do it if it's too slow.

I'm thinking that some form of lazy update might be the best path forward
here. Have the results re-calculated via a linear scan of the block only
once a) a change has happened and been marked and b) the request
information is required. This would allow batch updates for instance. We
could potentially try to incremental update the motion barriers, but that
seems like potentially too much work.

Yup.

Also, I'm now thinking that first and last isn't enough information. Once
I ask the speculation question, I think need to ask what the *next*
ordering barrier is. I thinking code of the following variety:
for block I want to skip:
  if not has ordering barrier:
    continue
  if can speculate over block
    continue
  barrier = last barrier in block
  while can speculate over barrier:
    barrier = previous barrier
  return next(barrier) as insert pt
return first instruction in function entry

So, so far nothing tries to do that.

It's easy enough to do though, we can just maintain unordered maps of
inst, number, and ordered maps of number, inst (assuming we allow for
incremental update) per block.

Or at least, this is the "simple" way. You can do it in a better time query
time bound by unordered maps of inst, {number, next/last inst} (IE you give
us the current thing we gave you, we tell you the next or last thing from
it).
This is not sane to incrementally update though.