Path condition propagation

Hi all,

Consider this C code:

#include <stdbool.h>

bool func(int n1, int n2, bool b) {
    bool greater = n1 > n2;
    if (greater && b) {
        if (n1 == n2) {
            return false; // unreachable
        }
    }
    return true;
}

The line marked unreachable cannot be reached, however currently LLVM does not optimize it out. I believe this is because LLVM does not recognize that meeting path conditions like, for example, X && Y logically means that X is true and Y is true.

I'm interested in creating a patch to remedy this; is there a file or function I should look at?

Thanks in advance.

- CL

Hi all,

Consider this C code:

#include <stdbool.h>

bool func(int n1, int n2, bool b) {
    bool greater = n1 > n2;
    if (greater && b) {
        if (n1 == n2) {
            return false; // unreachable
        }
    }
    return true;
}

The line marked unreachable cannot be reached, however currently LLVM does
not optimize it out

???
Yes it does.
[dannyb@dannyb-macbookpro3 18:39:18] ~/sources/llvm
(git-svn)-[newgvn-predicates]- :frowning: $ clang -c -emit-llvm ~/greater.c -O1
[dannyb@dannyb-macbookpro3 18:39:22] ~/sources/llvm
(git-svn)-[newgvn-predicates]- :slight_smile: $ debug-build/bin/llvm-dis greater.bc
[dannyb@dannyb-macbookpro3 18:39:24] ~/sources/llvm
(git-svn)-[newgvn-predicates]- :slight_smile: $ cat greater.ll
; Function Attrs: norecurse nounwind readnone ssp uwtable
define zeroext i1 @func(i32, i32, i1 zeroext) #0 {
  ret i1 true
}

opt -simplifycfg -instcombine does the same thing to it if you use -O0 with
clang

I believe this is because LLVM does not recognize that meeting path

conditions like, for example, X && Y logically means that X is true and Y
is true.

Yes it does. See both GVN's propagateequality and
correlatedvaluepropagation, among other things :slight_smile:

In this case, simplifycfg +instcombine will do it

The new predicate support i'm building for GVN will also do it.

My apologies, you’re right; I was comparing against an older version of Clang that didn’t do this.

  • CL

Hi all,

Consider this C code:

#include <stdbool.h>

bool func(int n1, int n2, bool b) {
    bool greater = n1 > n2;
    if (greater && b) {
        if (n1 == n2) {
            return false; // unreachable
        }
    }
    return true;
}

The line marked unreachable cannot be reached, however currently LLVM
does not optimize it out

???
Yes it does.

It seems like we get this almost by accident though. I find that I need
`-mem2reg -instcombine -simplifycfg -instcombine` (on clang -O0 IR; the
`-mem2reg -instcombine` are just cleanup) and essentially it boils down to
simplifycfg merging everything into a single branch-free expression and
then instcombine algebraically merging the comparisons.

A small modification defeats LLVM's optimizer:

bool func(int n1, int n2, bool b) {
    bool greater = n1 > n2;
    if (greater && b) {
        foo();
        if (n1 == n2) {
            return false; // unreachable
        }
    }
    return true;
}

In this case, simplifycfg doesn't go wild merging everything into a single
branch-free expression and so we don't get it.

CorrelatedValuePropagation doesn't get this because its processCmp is quite
weak (it bails out if one operand isn't a constant). JumpThreading is the
only other pass that uses LazyValueInfo and it can't fold this since it
can't thread a jump around the side-effecting `foo()` call.

I'm not familiar with GVN but it doesn't seem to help for this modified
test case either.

Carlos, in answer to your original question, you may want to see if you can
make LLVM get this case by modifying processCmp in
lib/Transforms/Scalar/CorrelatedValuePropagation.cpp

-- Sean Silva

The current gvn equality propagation is not powerful enough to get this because it doesn’t try to infer values in predicates based on other predicates, so it never realizes a>b → a !=b in a useful way.

It otherwise would get this

The current gvn equality propagation is not powerful enough to get this
because it doesn't try to infer values in predicates based on other
predicates, so it never realizes a>b -> a !=b in a useful way.

It otherwise would get this

Sorry if this is a bit of a stupid question, but what exactly is the scope
of what a GVN pass does?

I've read various "textbook" definitions of GVN but they all seem to
revolve around proving that different expressions in the program are
equivalent. What you're talking about here seems to go outside that; e.g.
being more like what one would expect CorrelatedValuePropagation to do.

-- Sean Silva

The current gvn equality propagation is not powerful enough to get this
because it doesn't try to infer values in predicates based on other
predicates, so it never realizes a>b -> a !=b in a useful way.

It otherwise would get this

Sorry if this is a bit of a stupid question, but what exactly is the scope
of what a GVN pass does?

Depends on who you ask. The most general correct answer i can give is:
prove whatever things it can to be equivalent by computing their value.

If you could prove everything equivalent that is possible, you would,
subject to compile time :slight_smile:

Think of GVN as an analysis whose goal is to prove what things in the
domain of the program have equivalent values.

I've read various "textbook" definitions of GVN but they all seem to
revolve around proving that different expressions in the program are
equivalent.

Most textboox definitions ignore conditionals.
For example, "complete" value numbering would be discovery of all herbrand
equivalences.
This ignores conditionals and what equivalences they can be used to prove
(they treat conditionals as always non-deterministic).

Predicated GVN algorithms instead try to take predicates into account, in
order to infer the equality of more values. If you were, for example, to
drop the predicate support current GVN has, you'd heavily decrease it's
effectiveness.

Note that a lot of research papers have to be read very carefully to
discover what their domain is.

If you read, http://arxiv.org/pdf/1302.6325.pdf, they point out most of the
papers restrict the problem to discovering the equality of variables in the
program, and not discovering the equality of *expressions*.

What you're talking about here seems to go outside that; e.g. being more
like what one would expect CorrelatedValuePropagation to do.

You are just thinking about it in terms of existing program expressions.

Necula proves (here:
http://people.eecs.berkeley.edu/~necula/Papers/gvndet_sas04.pdf) that there
is a significant distinction to be made between computing all herbrand
equivalences and computing all herbrand equivalences among sub-expressions
of the program.

In this case, GVN would just be proving n1 == n2 is equivalent to the
expression false.
If it has some form of redundancy elimination built on top of it, it
usually then replaces it with that value.
It would then do nothing else.

It's possible to unify gvn with dead code elimination and constant
propagation (pretty easily in fact).

In such a formulation, it will discover that n1 == n2 is equivalent to
false, and thus that the block it leads to is unreachable.

Again, the new gvn itself does nothing with this info other than use it to
further prove things about values (IE for example, that only one edge of a
phi is live and thus the phi has the same value as that live edge's
incoming value).

Elimination can, of course, trivially just delete all the unreachable
blocks (or mark them unreachable if it wants).

Our current GVN already does predication, just not in a very
algorithmically complete way.

Doing it in a complete way would eliminate most of
correlatedValuePropagation. You could eliminate the rest of it by
integrating proper range analysis as you want. That's just "another source
of predicates".

In *practice*, what i care about is getting good enough results quickly,
and making GVN a real analysis used by other things (PRE, etc) while making
it fast.

(If i ever finish, i'll then likely move on to replacing LVI with some
extended SSA + range analysis :P)

Thanks for the explanation!

Where would I look to change the equality propagation?

  • CL

PropagateEquality in gvn.cpp

However, if you are going to do it, remember the goal is to make the code simpler and easier, not just pile on to the current mess to catch more cases :slight_smile:

That seems ominous; should I not bother?

  • CL

It’s going to be really hard to do something sane in the current infrastructure.
Its possible, but it would also be slow. You would have to go looking at uses of variables compared in predicates in PropagateEquality and if the uses appear in a comparison that is dominated by the true or false edge of the existing predicate, see if it tells you something about the dominated one.

It looks like there’s already something similar in PropagateEquality which eg X >= Y == true and replaces X < Y == false, which is somewhat similar - could I base an addition off of that?

  • CL

Sure

I have this patch to GVN.cpp, but it doesn’t seem to be working; any ideas?

diff --git a/lib/Transforms/Scalar/GVN.cpp b/lib/Transforms/Scalar/GVN.cpp
index a963b2f…97d7b4b 100644
— a/lib/Transforms/Scalar/GVN.cpp
+++ b/lib/Transforms/Scalar/GVN.cpp
@@ -2036,6 +2036,22 @@ bool GVN::propagateEquality(Value *LHS, Value *RHS, const BasicBlockEdge &Root,
if (RootDominatesEnd)
addToLeaderTable(Num, NotVal, Root.getEnd());

  • // If “A > B” or “A < B”, then propagate “(A == B) == false”.
  • if (ICmpInst *ICmp = dyn_cast(Cmp)) {
  • if (ICmp->isRelational() &&
  • ((isKnownTrue && Cmp->isFalseWhenEqual()) ||
  • (isKnownFalse && Cmp->isTrueWhenEqual()))) {
  • Worklist.push_back(
  • std::make_pair(CmpInst::Create(
  • Cmp->getOpcode(),
  • CmpInst::Predicate::ICMP_EQ,
  • A,
  • B
  • ), ConstantInt::getFalse(Cmp->getContext()))
  • );
  • }
  • }

I’ve submitted a working patch for review: http://reviews.llvm.org/D22076

  • CL