IR Liveness Analysis?

Is anyone aware of an existing framework for asking liveness questions about SSA values in the IR? I'm looking for something more precise than the trivial definition provided by SSA itself. I can write something myself (and will if need be), but it seemed like a generic enough problem that I was surprised I couldn't find something already in tree. Anyone know of something I've missed?

The problem I'm trying to solve is identifying pointers which might be live at a garbage collection safepoint. This in the context of transforming the IR to insert a statepoint intrinsic to explicitly represent a possible object relocation. We've been using a very straigh forward and, due to implementation simplicity, expensive estimation based on simple reachability*. This suffices for correctness, but is not ideal from a performance perspective. (To put it mildly.)

* For those curious, you can find the current implement by searching for "isLiveAtSafepoint" in https://github.com/AzulSystems/llvm-late-safepoint-placement/blob/master/lib/Transforms/Scalar/SafepointPlacementPass.cpp. It's a trivial reachability algorithm with special cases for inbound values into a phi and uses that are encountered along paths which contain the original definition. We rerun the search once per potentially live value. As you might imagine, that's a bit expensive. :slight_smile:

Philip

Philip,

To clarify, you want an analysis that answers this question: Are there are any uses of a value, V, that are dominated by an instruction, I. Is that right?

-Hal

Is anyone aware of an existing framework for asking liveness questions
about SSA values in the IR? I'm looking for something more precise
than the trivial definition provided by SSA itself. I can write
something myself (and will if need be), but it seemed like a generic
enough problem that I was surprised I couldn't find something already
in tree. Anyone know of something I've missed?

The problem I'm trying to solve is identifying pointers which might be
live at a garbage collection safepoint. This in the context of
transforming the IR to insert a statepoint intrinsic to explicitly
represent a possible object relocation. We've been using a very
straigh forward and, due to implementation simplicity, expensive
estimation based on simple reachability*. This suffices for
correctness, but is not ideal from a performance perspective. (To put
it mildly.)

* For those curious, you can find the current implement by searching
for "isLiveAtSafepoint" in
https://github.com/AzulSystems/llvm-late-safepoint-placement/blob/master/lib/Transforms/Scalar/SafepointPlacementPass.cpp.
It's a trivial reachability algorithm with special cases for inbound
values into a phi and uses that are encountered along paths which
contain the original definition. We rerun the search once per
potentially live value. As you might imagine, that's a bit expensive.
:slight_smile:

For those that don't have the repository locally: the actual implementation lives in lib/Analysis/CFG.cpp (function isPotentiallyReachableNotViaDef) currently.

https://github.com/AzulSystems/llvm-late-safepoint-placement/blob/master/lib/Analysis/CFG.cpp#L306

Not quite. In fact, that definition is fundamentally flawed as a measure of liveness. Consider:
BB1:
   def a
   br i1 undef, BB2, BB3
BB2:
   Instruction I //Is a live here?
   br BB3
BB3:
   use a

In the above, "use a" is not dominated by I, but is live in that there is a path from "def a" to "use a" which includes "I".

The question I want to answer is: "Given a value, V, and a interesting location, L, is there a possible path from L to any use of V which does not pass the definition of V?" An alternate formulation is "Is there a dynamic path from V to some use of V which includes L?" In particular, I want to be able to answer this question for large numbers of values, moderate numbers of locations, and do it efficiently.

I'm fine with a slightly conservative answer (i.e. "yes" when dynamically the answer is "no"), but not an incorrect answer ("no" when there is such a path at runtime). This is pretty much the textbook definition of variable liveness.

Philip

FWIW, "Computing Liveness Sets for SSA" by Boissinot is my favorite paper on liveness:
http://hal.inria.fr/docs/00/58/53/03/PDF/RR-7503.pdf

They describe a nice 2-pass data flow algorithm. I'm not aware that anyone has implemented this for LLVM. It works well if you can represent liveness as a bitset, but you'll probably be stuck with a DenseMap of pointer values which would not be efficient.

LLVM's own LiveVariables is a path-exploration that computes a set of blocks for each variable. You could actually add pointers to the safepoints as you go this way and don't need to store any liveness results. You just need a set of visited blocks as you traverse. The easiest thing to do is probably to port this from Machine IR to LLVM IR.

-Andy

We did some further work on the lines of the Boissinot paper. You can have a look at:

Efficient liveness computation using merge sets and DJ-graphs (in TACO, Jan 2012)
http://dl.acm.org/citation.cfm?id=2086706