Post-dominance analysis for multiple-exit functions

Many published analyses which build on post-dominance assume a
canonical single-dominator-tree form induced by unifying all exits
(and often adding a virtual edge from START to END). In contrast, it
seems that the current LLVM post-dominator analysis only operates in a
mode in which it generates a forest of post-dominator trees, with one
rooted at each exit node. The problem this can cause is that, by
nature, these trees omit some nodes which are not post-dominated by
any one exit. This invalidates analyses which build on a traversal of
the post-dominator tree.

(In my case, I'm implementing Pingali & Bilardi's optimal control
dependence algorithms: http://doi.acm.org/10.1145/256167.256217.)

Consider, for example, a simple function with a single branch and two
exits:

    define i32 @f(i32 %X) {
    entry:
      %0 = icmp eq i32 %X, 0
      br i1 %0, label %exit1, label %exit2
    exit1:
      ret i32 1
    exit2:
      ret i32 0
    }

The existing PostDominatorTree analysis will produce a forest of two
trees, each containing one node (exit1 and exit2, respectively).
Critically, entry will not be present in either of these trees,
because it is not post-dominated by either exit1 or exit2.

Traditional analyses seem to assume that a shared virtual END node is
added after all exit nodes, which would create a single tree like:

        [ END ]
        / | \
    entry exit1 exit2

containing all nodes in the original function.

Clearly it is possible to do this *destructively*, using the existing
return unification transform, but for many uses it would seem useful
to allow this as a non-destructive mode of operation of the post-
dominator tree analysis.

Am I missing something? Has this just not been an issue for previous
analyses? Is there guidance on what the right thing to do should be to
facilitate analyses which require complete post-dominance information?

I was hoping to submit a control dependence analysis for consideration
in the trunk at some point, since it seems to come up semi-frequently
on the list as something people want to use, but if it requires
significantly extending the existing post-dominator analysis, I would
hope to be able to do so both with a minimum of hackery, and in an
officially sanctioned style.

Thanks.

Many published analyses which build on post-dominance assume a
canonical single-dominator-tree form induced by unifying all exits
(and often adding a virtual edge from START to END). In contrast, it
seems that the current LLVM post-dominator analysis only operates in a
mode in which it generates a forest of post-dominator trees, with one
rooted at each exit node. The problem this can cause is that, by
nature, these trees omit some nodes which are not post-dominated by
any one exit. This invalidates analyses which build on a traversal of
the post-dominator tree.

(In my case, I'm implementing Pingali & Bilardi's optimal control
dependence algorithms: http://doi.acm.org/10.1145/256167.256217.)

Consider, for example, a simple function with a single branch and two
exits:

   define i32 @f(i32 %X) {
   entry:
     %0 = icmp eq i32 %X, 0
     br i1 %0, label %exit1, label %exit2
   exit1:
     ret i32 1
   exit2:
     ret i32 0
   }

The existing PostDominatorTree analysis will produce a forest of two
trees, each containing one node (exit1 and exit2, respectively).
Critically, entry will not be present in either of these trees,
because it is not post-dominated by either exit1 or exit2.

Traditional analyses seem to assume that a shared virtual END node is
added after all exit nodes, which would create a single tree like:

       [ END ]
       / | \
   entry exit1 exit2

containing all nodes in the original function.

Clearly it is possible to do this *destructively*, using the existing
return unification transform, but for many uses it would seem useful
to allow this as a non-destructive mode of operation of the post-
dominator tree analysis.

Am I missing something? Has this just not been an issue for previous
analyses? Is there guidance on what the right thing to do should be to
facilitate analyses which require complete post-dominance information?

Running opt -disable-output -analyze -postdomtree on your example gives

Inorder PostDominator Tree:
   [1] <<exit node>> {4294967295,4294967295}
     [2] %exit1 {4294967295,4294967295}
     [2] %entry {4294967295,4294967295}
     [2] %exit2 {4294967295,4294967295}

The PostDominatorTree pass uses an artificial "exit node" to
represent the post-dominator parent of all the exits. Is this
not what you're looking for?

I was hoping to submit a control dependence analysis for consideration
in the trunk at some point, since it seems to come up semi-frequently
on the list as something people want to use, but if it requires
significantly extending the existing post-dominator analysis, I would
hope to be able to do so both with a minimum of hackery, and in an
officially sanctioned style.

Sounds great,

Dan

The PostDominatorTree pass uses an artificial "exit node" to
represent the post-dominator parent of all the exits. Is this
not what you're looking for?

You are correct.

This falls under the rubric of "am I missing something?" -- I was missing this:

   /// getRootNode - This returns the entry node for the CFG of the function. If
   /// this tree represents the post-dominance relations for a function, however,
   /// this root may be a node with the block == NULL. This is the case when
   /// there are multiple exit nodes from a particular function. Consumers of
   /// post-dominance information must be capable of dealing with this
   /// possibility.

and improperly assuming that I had to use the getRoots() interface for sane results from a post-dominator tree, because I was caught up with:

   /// getRoots - Return the root blocks of the current CFG. This may include
   /// multiple blocks if we are computing post dominators. For forward
   /// dominators, this will always be a single block (the entry node).

The world is sane once again (and my control dependence analysis consistently works without requiring a prior MergeReturns pass). Thanks for indulging my oversight.