Hello,

I was wondering under which circumstances ScalarEvolution will propagate

the no wrap flags from binary operators. In particular I looked at

non-loop carried code, e.g., as in the following function:

int add(int x, int y) { return x + y; }

for which clang uses an "add nsw" instruction but ScalarEvolution does

not propagate this information. The -analyze output looks like this:

Classifying expressions for: @add

%add = add nsw i32 %x, %y

--> (%x + %y) U: full-set S: full-set

The piece of code that causes this (for me) unexpected result is located

in the ScalarEvolution::getNoWrapFlagsFromUB(...) function and looks

like this:

if (InnermostContainingLoop == nullptr ||

InnermostContainingLoop->getHeader() != BinOp->getParent())

return SCEV::FlagAnyWrap;

If the "InnermostContainingLoop" is a nullptr, thus if the binary

operator is not contained in a loop, all overflow annotations are simply

discarded.

I do not know if I am missing something or if simply nobody cared so

far. In any case, I would appreciate any documentation on the current

implementation as well as pointer to were it could be improved.

Cheers,

Johannes

P.S.

I have another problem with the way we currently discard the "nsw"

information for accesses ("gep inbounds" apparently only implies nuw).

I will probably start another thread for that one [after I read the

GEP handling in more detail].

[+CC Bjarke who wrote getNoWrapFlagsFromUB + related bits]

Hi Johannes,

One fundamental reason why we cannot reason about NoWrap flags in SCEV

for arithmetic outside of loops is to avoid issues like this:

if (cond) {

val[x +nsw y] = 42;

} else {

val[x + y] = 42;

}

Now *both* the "x +nsw y" and "x + y" will get mapped to the same SCEV

expression (this is by design [0]), and therefore we cannot mark *the*

SCEV expression for "x + y" with NSW, otherwise we may incorrectly

simplify the "val[x + y] = 42" branch.

(Btw, I'm sure you're aware of this, but "x +nsw y" by itself does not

guarantee no-wrap, it only does so if "x +nsw y" is used in a

side-effecting way.)

getNoWrapFlagsFromUB() avoids the problem above by adding NSW

(resp. NUW) to operations on add recurrences (i.e. an add-rec + 2,

say) that are guaranteed to execute at least once per loop iteration.

Thus we are guaranteed that the *arithmetic operation* (not //that//

specific instruction, but the arithmetic itself) does not overflow on

any iteration and thus can be safely marked as non-wrapping, since if

it didn't the program is guaranteed to be UB.

[0]: IOW, we don't "key" on the no-wrap tag, we just key on the

arithmetic bits of the operation and mutate SCEV expression in-place

when we discover NUW / NSW.

-- Sanjoy

Hey Sanjoy,

Thanks for the quick repsonse.

[+CC Bjarke who wrote getNoWrapFlagsFromUB + related bits]

Also thanks.

One fundamental reason why we cannot reason about NoWrap flags in SCEV

for arithmetic outside of loops is to avoid issues like this:

if (cond) {

val[x +nsw y] = 42;

} else {

val[x + y] = 42;

}

Now *both* the "x +nsw y" and "x + y" will get mapped to the same SCEV

expression (this is by design [0]), and therefore we cannot mark *the*

SCEV expression for "x + y" with NSW, otherwise we may incorrectly

simplify the "val[x + y] = 42" branch.

OK. I might have to look at the IR then since I am interested in the

particular instructions that make up the computation not the abstract

"arithmetic".

(Btw, I'm sure you're aware of this, but "x +nsw y" by itself does not

guarantee no-wrap, it only does so if "x +nsw y" is used in a

side-effecting way.)

I was not (in particular) but that makes sense. For my work the poison

value will (almost) always cause a side-effect to have undefined

behaviour so that should not be a problem.

getNoWrapFlagsFromUB() avoids the problem above by adding NSW

(resp. NUW) to operations on add recurrences (i.e. an add-rec + 2,

say) that are guaranteed to execute at least once per loop iteration.

Thus we are guaranteed that the *arithmetic operation* (not //that//

specific instruction, but the arithmetic itself) does not overflow on

any iteration and thus can be safely marked as non-wrapping, since if

it didn't the program is guaranteed to be UB.

Mh, I have to think about this a bit...

Is there any plan to use e.g., post-dominance information to

propagate wrapping flags? If x +nsw y post-dominates the entry block

[and is used in some side-effect instruction] the SCEV could be marked

as non-wrapping, couldn't it? How do you determine if the operation will

poison a side-effect instruction anyway?

[0]: IOW, we don't "key" on the no-wrap tag, we just key on the

arithmetic bits of the operation and mutate SCEV expression in-place

when we discover NUW / NSW.

-- Sanjoy

Cheers,

Johannes

Johannes Doerfert wrote:

> Is there any plan to use e.g., post-dominance information to

> propagate wrapping flags?

None that I'm aware of.

> If x +nsw y post-dominates the entry block

> [and is used in some side-effect instruction] the SCEV could be marked

> as non-wrapping, couldn't it?

Yes, but we have to be careful to account for exceptions, infinitely

looping functions, exit(0) etc. I.e. in

void foo() {

use(x + y);

exit(0);

use(x +nsw y);

}

the nsw in the later x+y cannot be transferred to the former x+y

(assume use is a side effecting use).

> How do you determine if the operation will

> poison a side-effect instruction anyway?

SCEV uses llvm::isKnownNotFullPoison for this.

-- Sanjoy

Hi Johannes,

Sanjoy has given you great information already.

Hi Bjarke,

Thanks for your response. I agree that Sanjoy's email already answered

my question. To be honest, I am not 100% sure about the loop code and

the consequences but that doesn't matter to much at the moment. What

matters is that I have made progress on explaining the mismatch between

the ScalarEvolution results and what I would have expected, or better,

liked. At the moment I think that the context in-sensitivity of

ScalarEvolution is what is preventing me from the results I would like

to see. Therefor, I have to look at the IR during Polly's context

sensitive analysis/modeling step. As our "context" includes information

about possibly "non-returning function calls", infinite loops, etc. I

think that should allow to get precise but also sound results.

Thank you both again,

Johannes

Replace context with flow in the last mail... sry.