[RFC] Inject invariant conditions to loops to enable unswitching (and constraint elimination?)

I’m still thinking about range checks against values that are neither loop invariants nor induction variables. As discussed previously in [RFC] Alternative approach of dealing with implications from comparisons through POS analysis, we have a ConstraintElimination pass that potentially solves such issues. But it requires that we have a domination condition from which we can infer some non-trivial facts in loop.

But what if we are lacking such conditions in explicit form? We may know them from business logic, but they are not directly available to the compiler.

My motivation case comes from the following example from Java world, but I guess it’s not something uncommon to other languages:

int foo(int limit, int[] table, int[] indices) {
  for (int index : indices) {
    if (index >=u limit)
      throw new Exception ("Bad index");
    int x = table[index]; // range check
  }
}

In Java, when accessing table[index], we do a range check, so in fact the code looks like:

int foo(int limit, int[] table, int[] indices) {
  for (int index : indices) {
    if (index >=u limit)
      throw new Exception ("Bad index");
    if (index >=u table.length)
      throw new OutOfBoundsException();
    int x = table[index];
  }
}

So in fact, we check index twice: against limit and against table.length(). From business logic, we know that table.length is at least as big as limit, so in fact, the range check is not needed. The problem is that the compiler is completely missing this knowledge and cannot reason about it.

How do we solve this? The most straightforward way is the following: let’s just take existing code and version it by introducing a new condition limit <=u table.length

int foo(int limit, int[] table, int[] indices) {
  for (int index : indices) {
    if (index >=u limit)
      throw new Exception ("Bad index");
    if (limit <= table.length) {
      if (index >=u table.length)
        throw new OutOfBoundsException();
    } else {
      if (index >=u table.length)
        throw new OutOfBoundsException();
    }
    int x = table[index];
  }
}

In terms of IR, the proposed transform is the following: Compiler Explorer

So, we intoduced an extra check, doesn’t look very cool. Good news that the new check is loop-invariant. So we can do the following:

  • Use LICM to hoist it out
  • Unswitch the loop by this new condition
  • Use constraint-elimination to eliminate range check in one of the versions
  • Simplify CFG to see what happened

Result is like: Compiler Explorer

After all transforms we ended up with extra loop-invariant check, but removed the range check in one of loop versions.

Instead of using constraint-elimination, we could instead manually do its job. The transform is a bit smarter in this case: Compiler Explorer

But we can go without constraint-elimination: Compiler Explorer

I’ve described a particular case of this, but generalization might be the following: “we can inject new loop-invariant conditions if it helps to remove a non-loop-invariant condition”, whether it’s through leveraging constraint-elimination or not.

Off course there should be a profitability consideration, but the worst thing we can end up with is introducing one extra check above the loop and do nothing in the loop itself, which doesn’t seem fatal.

The biggest downside I see here is that we rely on loop-unswitch to actually do its job, otherwise what we did is strictly non-profitable. So this whole thing is coupled with unswitching. Can it be done as part of it maybe?

Currently I use LICM and it has a notion about poisons (I marked everything as noundef, otherwise it falls apart because of freezes).

I’ve prototyped this in a dummy pass, and it seems that it can handle the case. Will do studies how useful it is.
https://reviews.llvm.org/D135265

This indeed resembles loop unswitch a lot. We duplicate a loop and guard one of the copies with a loop invariant condition so as to eliminate a condition in this copy. It makes sense to try implementing this in loop-unswitch. This way you can reuse the existing loop unswitch cost model (as I said you are doing more or less the same thing, replicating a loop to eliminate a condition inside of the loop).

If you do it outside of loop-unswitch you would need to come up with some very similar but separate cost model (or maybe factor out the one in unswitch).

Sounds good to me in general, but I think it will be tricky to get the cost-model right. One thought I had was that perhaps the frontend could mark runtime checks as unlikely to help guide insertions for those?

Why do you think the cost model should be significantly different from non-trivial unswitch? We are introducing a branch before the loop and a copy of the loop in order to eliminate a condition inside of the loop.