the nsw story, revisited

John,

Sanjoy,
Nuno,
David,
Thanks for the tip, below are the relevant posts from the archives.

I am suggesting something similar to Dan’s third option below (Tue Nov 29 2011
"the nsw story”, Dan Gohman), when hoisting an instruction with ‘nsw’ it looses
the ‘nsw’ attribute, but without saying “add-nsw is a fully side-effecting
instruction”.

This option was back then seen by Dan as too much effort, but the current
proposal requires at least the same amount of effort. To be specific the
current proposal requires adding “freeze” instead of removing “nsw”. It is
pretty much the same amount of editing work on the compiler sources either
way as far as add/sub/mul/shl are concerned.

The difference is that removing “nsw” from an operation is simpler and cleaner
than adding “freeze” to it, and in software engineering we are always striving
for the simplest and cleanest solution.

IIUC option three does not inhibit Dan’s original goal of promoting 32-bit
induction variables to 64-bit on an LP64 target. And I don’t agree with Dan’s
description of this being “suboptimal” for any other reason, the lost
information does not seem to be useable AFAICT, but if you disagree then simply
insert an “llvm.assume" in the spot where the operation is being hoisted from.

So I propose that we revisit this option. It was never fully explored in Dan’s
original email, nor in any subsequent emails, and now especially it seems to
deserve it.

Thoughts ?
Comments ?
Questions ?

Peter Lawrence.

2011: =========================================================================

I guess an important thing to realize is that poison flows throughout the data-flow graph; it's not something that you can control locally. Simply removing nsw from an operation doesn't mean that its result will be non-poisonous.
Right now there's no way to stop propagation of poison; freeze adds that functionality.

Imagine you want to hoist an 'add nsw', drop nsw, and then do loop unswitching (your suggestion):

f(a, b) {
  while (..) {
    if (a +nsw 1 > b) { S } else { T }
  }
}

=>

f(a, b) {
  // drop nsw
  if (a + 1 > b) {
     while (...) S;
  else
     while (...) T;
  }
}

Is this correct? No! Dropping nsw doesn't give any guarantee. To see why, imagine that function f is now inlined in g:
g() {
  f(poison, poison);
}

after inlining:
g() {
  if (poison) ...
}

We've just introduced a branch on poison when the loop condition is false. Since we've decided to make branch-on-poison UB, the transformation we did above is incorrect. Dropping nsw doesn't help. You need a way to stop poison from being propagated. LLVM currently doesn't have this mechanism.

Nuno

P.S.: Please fix your address book entry for my email address; a colleague of mine is receiving your emails not me.

Nuno,
          A very thought provoking example, but I don’t think I understand your argument yet

It seems you are saying the transformations are illegal because we’ve turned it into
“undefined behavior” based on the proposed definition of “branch on poison is undefined
behavior”, IE your argument seems to hinge on the definition of “branch on poison”
and not on anything related to “nsw"

Or to put it another way, changing the function f(a,b) from using “+nsw” to just plain “+”
doesn’t seem to change anything, we still introduce UB when there wasn’t any originally,
or am I missing something ?

so I return to my previous claim that this definition isn’t justified, because
it appears as though you've just provided an example for why it isn’t.

Thoughts ?
Comments ?
Questions ?

Peter Lawrence.