Tuning up constraint elimination?

constraint elimination is a super impressive pass but there are some holes in what it’s able to do, often apparently related to our canonicalization choices. for example at present it cannot prove that this assert is unreachable:

void f(long a, long b, long c) {
  if (a>b)
    if (b>c)
      assert(a>c);
}

the IR at the relevant point is:

define void @f(i64 noundef %a, i64 noundef %b, i64 noundef %c) {
entry:
  %cmp = icmp sle i64 %a, %b
  %cmp1 = icmp sle i64 %b, %c
  %or.cond.not11 = or i1 %cmp, %cmp1
  %cmp3 = icmp sgt i64 %a, %c
  %or.cond10 = or i1 %cmp3, %or.cond.not11
  br i1 %or.cond10, label %if.end6, label %if.else

if.else:                                          ; preds = %entry
  tail call void @__assert_fail(ptr noundef nonnull @.str, ptr noundef nonnull @.str.1, i32 noundef 6, ptr noundef nonnull @__PRETTY_FUNCTION__.f) #2
  unreachable

if.end6:                                          ; preds = %entry
  ret void
}

there’s more like this, that’s seemingly well within the purview of this pass, that we don’t optimize. I’ve been looking for these examples using a little script that generates a constraint system and then – if it can be optimized away, according to Alive – uses llvm-reduce to come up with a minimal driver for that particular missed optimization.

is anyone interested in working on this? I would be happy to help, particularly with doing an ongoing search for missed optimizations. if this sounds fun for someone, I can start posting these in our issue tracker.

(edit: I can’t prove that this work will matter, but my strong suspicion is that at a time when there’s a lot of interest in safe languages – that tend to generate a lot of unsatisfiable constraints of the kind that this pass can eliminate – it is in our community’s interests to thoroughly and predictably do this job, for the class of constraints that this pass is designed to attack)

13 Likes

@fhahn

1 Like

That would be super useful! Its effectiveness heavily depends on being able to decompose (canonical) LLVM IR expressions to constraints.

Finding such cases automatically would be great, I think there would be a number of people interested in fixing those. Filing issues would be a good start, I’d be happy to help people submit fixes.

2 Likes

ok, here’s one! I don’t want to load up the system with these so I’ll go slowly here and also I’ll try to keep these within the set of stuff that we pretty clearly want to support. also, some students in my advanced compilers class are looking at these so perhaps we’ll get some help! :slight_smile:

1 Like

I am willing to do this. I used to write a tool that extracts boolean expressions from an LLVM IR corpus and use Alive2 to check whether they are tautology or unsatisfiable: llvm-tools/deadcode.cpp at main · dtcxzyw/llvm-tools · GitHub
It has discovered some missed optimization opportunities: Issues · dtcxzyw/llvm-tools · GitHub

I think it is easy to extend the tool to support dominating conditions.

2 Likes

sounds good! I also have a tool that looks for missing stuff (also using Alive) so maybe we can both look for things here

I’d love to work on this! It seems like a good first issue. I’m one of John’s advanced compilers students.

3 Likes

looks like @dtcxzyw has signed up for it already but there are a bunch more where this came from!

Very excited to see more of those cases! I’d expect some of them to be quite good for people starting to contribute to LLVM

3 Likes

Update: I have updated my script to support dominating conditions and assumptions. Alive2 reported 778 redundant branches on ir snippets extracted from my real-world llvm ir corpus.

An example: [InstSimplify] Missed optimization: `X != Y` implies `X | Y != 0` · Issue #117436 · llvm/llvm-project · GitHub
I will file more interesting issues after deduplication :slight_smile:

2 Likes

ok! here’s another one, where we don’t eliminate a constraint, this time the root cause appears to be not adding constraints that come from icmp eq and icmp ne:

and here’s an orthogonal one where a canonicalization is obscuring the constraints that we need to be picking up:

I have checked all the missed-optimization reports generated by my tool.
Here are some interesting frequently used patterns:

For the complete list, please refer to Issues · dtcxzyw/llvm-tools · GitHub.

2 Likes