A bug related with undef value when bootstrap MemorySSA.cpp

This is a bug found by internal compiler bootstrap, and possibly it is
the root cause of https://bugs.llvm.org/show_bug.cgi?id=33652 and
https://bugs.llvm.org/show_bug.cgi?id=33626.

Here is the testcase 1.c. The original code is at MemorySSA.cpp:586
for rL307764.

------------------------- 1.c ---------------------------
long a, c, d, e, f, m, cnt, i_hasval;
volatile long b;
void goo(long);
void hoo();
long ioo();

void foo(long hasval) {
  long i, k;
  if (hasval) {
    i = b;
  }

  k = 0;
  do {
    if (i_hasval != hasval)
      return;
    if (i_hasval) {
      m = m + 1;
      if (a == i) {
        c = a + d + e + f;
        return;
      }
    }
    c = a + b;
    i_hasval = 3;
    k++;
  } while (k < cnt);
}

void hoo() {
  foo(0);
}

Hi Wei,

[+CC the other undef folks]

This is the same bug as https://bugs.llvm.org/show_bug.cgi?id=31652.
The short answer here is that the loop unswitch transform and the GVN
transform are justified via conflicting specifications of undef (that
is, LoopUnswitch and GVN don't agree on the definition of undef). The
long(er) answer is in the bug.

Unfortunately, there is no real way to fix this in the IR today
(beyond hacking GVN / LoopUnswitch to back off on these transforms
enough to not trigger this case). We will need to specify and
implement a consistent definition of undef / poison to truly fix this,
one of which we've already proposed.

Thanks!
-- Sanjoy

Cool, thanks for debugging this issue and letting us know.

We have a few patches to fix this issue:
- Introduce freeze in IR: ⚙ D29011 [IR] Add Freeze instruction
- Lowering freeze: ⚙ D29014 [SelDag] Add FREEZE
- Fix loop unswitch: ⚙ D29015 [SimpleLoopUnswitch] Fix introduction of UB when hoisted condition may be undef or poison

Bonus patches to recover perf:
- Be less conservative in loop unswitching: ⚙ D29016 [LoopUnswitch] Do not freeze condition if hoisted branch is guaranteed to be reachable
- Instcombine support for freeze: ⚙ D29013 Add InstCombine/InstructionSimplify support for Freeze Instruction

It would be great if you could try out these patches to see if the bug goes away.

Thanks,
Nuno

Sanjoy and Nuno, thanks for pointing me to the existing bug and the
related discussion, and thanks for the great effort to try to solve it
fundamentally in IR.

I tried the patches and it compiled the small case and also
MemorySSA.cpp correctly.

I wish we can have the patches in ASAP but they are some big patches
and I guess it still takes some time to get in. The issue when
compiling MemorySSA.cpp can be easily exposed by other changes, and it
is very time consuming to triage and extract the issue, so we like to
have some temporary solution, like to make loop unswitching more
conservative. If it sounds ok, I can work on it. My plan is to enhance
isGuaranteedNotToBeUndefOrPoison and use it as a precondition in loop
unswitching.

Thanks,
Wei.

The issue blocks another optimization patch and Wei has spent huge amount of effort isolating the the bootstrap failure to this same problem. I agree with Wei that other developers may also get hit by the same issue and the cost of leaving this issue open for long can be very high to the community.

David

Hi,

The issue blocks another optimization patch and Wei has spent huge amount of
effort isolating the the bootstrap failure to this same problem. I agree
with Wei that other developers may also get hit by the same issue and the
cost of leaving this issue open for long can be very high to the community.

I can't speak for others, but I am fine with adding a workaround for
this. However, I suspect isGuaranteedNotToBeUndefOrPoison in
LoopUnswitch may regress other benchmarks.

Thanks!
-- Sanjoy

Any other thoughts on a more minimal fix?
Otherwise, it sounds like we can only try to find the fix that does the
least damage.
We can't just leave it broken given it's triggering even in llvm bootstraps
now :frowning:

I can't think of any good answers here.

The semantic we want is "branching on poison or undef is undefined
behavior" (which lets us do propagateEquality). Given that, we could
be clever about implementing isGuaranteedNotToBeUndefOrPoison; for
instance if we have:

do {
  if (a != b) {
    ...
  }
} while (...);

then we can use post-domination to prove that "a != b" is not undef
(otherwise the loop is guaranteed to have undefined behavior).

(I hate to say this), a more aggressive fix is to introduce a "freeze"
intrinsic that freezes only an i1. LoopUnswitch would wrap the loop
invariant condition in this after hoisting the branch. This would be
a poor man's version of the more invasive patches Juneyoung already
has developed.

-- Sanjoy

Hello, some of the patches had conflicts with LLVM head, so I updated them. If you experienced patch failure before then you can try it again.

I compiled your code (1.c) with LLVM r308173 with the 5 patches applied, and it generated assembly like this. Now it contains store to c(%rip).
It tries to store a(%rip) + b(%rip) to c(%rip). I wish this translation is now correct.

73 .globl hoo # -- Begin function hoo
74 .p2align 4, 0x90
75 .type hoo,@function
76 hoo: # @hoo
77 .cfi_startproc
78 # BB#0:
79 movq a(%rip), %rax
80 movq cnt(%rip), %rcx
81 cmpq $0, i_hasval(%rip)
82 sete %sil
83 xorl %edx, %edx
84 .p2align 4, 0x90
85 .LBB1_1: # =>This Inner Loop Header: Depth=1
86 testb $1, %sil
87 je .LBB1_3
88 # BB#2: # in Loop: Header=BB1_1 Depth=1
89 movq b(%rip), %rsi
90 addq %rax, %rsi
91 movq %rsi, c(%rip)
92 movq $3, i_hasval(%rip)
93 incq %rdx
94 xorl %esi, %esi
95 cmpq %rcx, %rdx
96 jl .LBB1_1
97 .LBB1_3:
98 retq

IMHO, enhancing isGuaranteedNotToBeUndefOrPoison and using it as a precondition in loop unswitching is
not enough. undef (and poison) value can be stored into memory, and also be passed by a function argument.
isGuaranteedNotToBeUndefOrPoison will virtually return false for all cases except the value is
some integer constant. Sanjoy’s suggestion might be helpful (if I understood correctly), but I’m
not sure how much it will be.

Best Regards,
Juneyoung Lee

I think everyone agrees pretty much everything short of “Fix undef” will not fix the problem for good.
I think we are more trying to hide it well enough that we get the months we need for various folks to work on the larger proposals.
Which sucks, but not sure we have a better answer, because i don’t think we are going to commit the freeze/etc patches tomorrow.

It seems MemorySSA.cpp is the only real code where we found the
problem happening. Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

It seems MemorySSA.cpp is the only real code where we found the
problem happening.

This is doubtful, ¸FWIW :slight_smile:

Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Why does this work?

Hi,

It seems MemorySSA.cpp is the only real code where we found the
problem happening.

This is doubtful, ¸FWIW :slight_smile:

Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Why does this work?

Fwiw, I don't think this is a good idea. If it can happen in
MemorySSA it can happen in other organic C++ source code too - think
about what you're going to tell other folks who run into this issue,
where such a workaround may be difficult to achieve or explain.

-- Sanjoy

It seems MemorySSA.cpp is the only real code where we found the
problem happening.

This is doubtful, ¸FWIW :slight_smile:

Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Why does this work?

Because condition "a == b" being loop unswitched described in the
problem above is generated from (*N == *O.N) below.

template <typename T, typename Walker>
struct generic_def_path_iterator
    ...
    bool operator==(const generic_def_path_iterator &O) const {
      if (N.hasValue() != O.N.hasValue())
        return false;
      return !N.hasValue() || *N == *O.N;
    }
}

The outer loop is generated by the find_if below. It is trying to
compare current iterator with the end of range and see if the
searching has been finished.

        // Find the node we started at. We can't search based on N->Last, since
        // we may have gone around a loop with a different MemoryLocation.
        auto Iter = find_if(def_path(Blocker->LastNode), [&](const DefPath &N) {
          return defPathIndex(N) < PriorPathsSize;
        });

And we use an empty iterator as the end of range. The
optional<ListIndex> N in struct generic_def_path_iterator contains
undef data field when it is initialized by None.

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Wei.

Well, I don't think there's an easy fix for this..
The only suggestion I can help to improve isGuaranteedNotToBeUndefOrPoison() is to tweak the pass manager to say if all inter-procedural stuff is done such that this function can consider all arguments and global as non-poisonous. That probably will improve the precision significantly.
I can't think of any easy hack otherwise.

Nuno

Hi,

It seems MemorySSA.cpp is the only real code where we found the
problem happening.

This is doubtful, ¸FWIW :slight_smile:

Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Why does this work?

Fwiw, I don't think this is a good idea. If it can happen in
MemorySSA it can happen in other organic C++ source code too - think
about what you're going to tell other folks who run into this issue,
where such a workaround may be difficult to achieve or explain.

-- Sanjoy

Talked with David offline and he suggested a simple workaround
solution. For the case in MemorySSA.cpp, we have "if (a == b)" with b
being defined by a phi, and the phi has an undef operand. We can
recognize the simple pattern and give up loop unswitch for it. The
idea is not to use isGuaranteedNotToBeUndefOrPoison to prove
correctness, but just to rule out some patterns where such error may
appear.

I admit the solution is far from ideal, but at least it is very simple
to implement, and serves to mitigate the immediate correctness issue
while avoiding performance regression. What do you think?

Thanks,
Wei.

At runtime, without the hoisting, the undef value will never be referenced, but the speculative hoisting of the comparison introduces the real undef reference. Another workaround is to disable such speculative LICM in the first place. If it is hoisted, it should really be like:

if (some_cond)
t1 = (a == i)
else
t2 = undef
t = phi(t1, t2)

In this case, the condition is loop variant, so it is not possible.

David

Hi,

It seems MemorySSA.cpp is the only real code where we found the
problem happening.

This is doubtful, ¸FWIW :slight_smile:

Is it possible to change the source of
MemorySSA.cpp to hide the problem and buy some time for now? Now we
use an empty generic_def_path_iterator with Optional<ListIndex> N
being initialized by None as the end of range. Can we initialize the
Optional var with a special value instead of None?

  iterator_range<def_path_iterator> def_path(ListIndex From) {
    return make_range(def_path_iterator(this, From), def_path_iterator());
  }

Why does this work?

Fwiw, I don't think this is a good idea. If it can happen in
MemorySSA it can happen in other organic C++ source code too - think
about what you're going to tell other folks who run into this issue,
where such a workaround may be difficult to achieve or explain.

-- Sanjoy

Talked with David offline and he suggested a simple workaround
solution. For the case in MemorySSA.cpp, we have "if (a == b)" with b
being defined by a phi, and the phi has an undef operand. We can
recognize the simple pattern and give up loop unswitch for it. The
idea is not to use isGuaranteedNotToBeUndefOrPoison to prove
correctness, but just to rule out some patterns where such error may
appear.

I admit the solution is far from ideal, but at least it is very simple
to implement, and serves to mitigate the immediate correctness issue
while avoiding performance regression. What do you think?

Thanks,
Wei.

I tried the idea and it worked for the simple case, but didn't work
when compiling MemorySSA.cpp. That is because for the following
pattern:

foo(c) {
  b = phi(c, undef)
  t = (a == b)
  loop:
    if (t)
  end loop
}

cfgsimplify will convert it to
foo(c) {
  b = select i1 cond i32 c, undef
  t = (a == b)
  loop:
    if (t)
  end loop
}

And instcombine can remove the select and generate:
foo(c) {
  t = (a == c)
  loop:
    if (t)
  end loop
}

Now c is a param so loop unswitch kicks in. However, the argument of
foo is undef too, so we run into the problem again.

Wei.

I doubt it is possible for us to try and make any fix which is predicated on eagerly treating undef in a particular way, refinement will always cause these problems to come about…

Given what I’ve seen in LLVM (and what I’ve learned from other compilers), we probably have two choices:

  1. Rework undef so that it is an instruction, not a constant. This will make it easy to unswitch, etc. because it would be stable. This approach is used by other production quality compilers and is workable. Would also result in a lot of churn.
  2. Keep undef as a constant, introduce freeze. Churn would mostly be restricted to correctness fixes instead of to core IR representation.

Note that option #1 is basically a short-hand for freeze(undef).

We don’t need to fix all the problems at the same time but I think we need to start somewhere. I don’t think there are any good shortcuts.

I agree; I don’t think there are any good shortcuts here. Sadly, I think we should follow our traditional policy: revert to green. Whatever commit actually triggers the self-hosting bug, revert it (or effectively revert it by disabling whatever it is by default). Then we should proceed with fixing the underlying problem with all due haste. -Hal