Inferring nsw/nuw flags for increment/decrement based on relational comparisons

Matti_Niemenmaa2 · September 20, 2016, 7:05pm

Hi everyone,

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
https://llvm.org/bugs/show_bug.cgi?id=30428 ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

This is basically what I might be interested in implementing, or at
least mapping out an implementation approach for, for the sake of
eliminating some bounds checks in Rust that the Go compiler is
apparently able to eliminate:
https://github.com/rust-lang/rust/issues/35981

I'll quote my questions about the possible implementation below and try
to add some more context.

> DominatorTree and AssumptionCache do seem to be sufficient and already
> present. computeOverflowForSignedAdd and computeOverflowForUnsignedAdd
> look like the kind of places to extend with this analysis based on
> that info.

ValueTracking has access to DominatorTree and AssumptionCache
information, and provides the named two functions. InstCombine calls
them and sets nsw/nuw flags based on the result.

> Two questions about that approach though:
>
> 1. Going back to the predsimplify problems: performance. Walking up
> the CFG (even only up to the MaxDepth = 6 defined in
> ValueTracking.cpp) and doing dominance queries on certain icmps seems
> like a fairly expensive operation to perform on every add/sub
> instruction. At the minimum I'd imagine the icmps that we know to be
> true should be cached for the basic block. ...and this is more or less
> equivalent to the suggestion about llvm.assume generation that was
> shot down in the discussion you linked?

The link in question is the following:
http://lists.llvm.org/pipermail/llvm-dev/2015-January/080666.html

Perhaps these operations are cheaper than I think and such caching is
not needed? Alternatively they could be put behind -O3 i.e. the
ExpensiveCombines variable Sanjay pointed out.

> 2. InstCombiner::visitAdd only calls into ValueTracking for the
> unsigned case, i.e. computeOverflowForUnsignedAdd. There are no
> computeOverflowFor*Sub functions that InstCombiner::visitSub even
> could make use of. Instead, InstCombiner has its own
> WillNotOverflow{S,UnS}igned{Add,Sub} functions. The relationship
> between these and the computeOverflow* functions is unclear to me.
> They look like they overlap to an extent.

Is some refactoring warranted here or does this mean that the
InstCombiner functions and not the ValueTracking ones are the ones to
edit? It seems a bit strange that InstCombine has extra logic here since
ValueTracking is used from other locations as well and seems like the
right place to provide the ultimate truth. InstCombiner::visitAdd has a
nearby TODO implying performance concerns, is that what it's about?

— Matti

preames · September 26, 2016, 11:28pm

Hi everyone,

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
30428 – Infer nsw/nuw flags for increment/decrement if relational comparison succeeded ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly straight forward extension to the LazyValueInfo analysis. In particular, take a look at getValueFromICmpCondition. I'd be happy to help review a patch here once you've got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must be INT_MAX or smaller and X is less than that. We can tell this without needing to know anything about Y.

Fair warning, we're actively working through issues related to nsw/nuw inference causing overall regressions. I think we've got the key one identified and a patch is under review, but I suspect you'll stumble across the same thing.

Matti_Niemenmaa · September 27, 2016, 4:55pm

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
30428 – Infer nsw/nuw flags for increment/decrement if relational comparison succeeded ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly
straight forward extension to the LazyValueInfo analysis. In
particular, take a look at getValueFromICmpCondition. I'd be happy to
help review a patch here once you've got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must
be INT_MAX or smaller and X is less than that. We can tell this without
needing to know anything about Y.

Looks like a good idea, but I'm not sure how LazyValueInfo's interface would support this case. Did you mean synthesizing the INT_MAX constant and then checking specifically for "X s< INT_MAX" using LazyValueInfo::getPredicateAt? At a glance that seems like it would work, but it feels like an odd way of doing it.

Initially I was looking at LVI::getConstantRange but its "at the end of the specified block" interface seems too restrictive. The block containing the comparison may end in a conditional br and so surely LVI can't prove anything there. And the block containing the increment/decrement instruction may contain some later information that LVI can prove at the end of the block, but is not true at the instruction?

CorrelatedValuePropagation and JumpThreading appear to be the only transformation passes making use of LVI at the moment, and that's probably something we don't want to change. This kind of nsw/nuw flag inference doesn't really fit in either, but CVP is definitely the closer match and it should be possible to shoehorn it in there.

Fair warning, we're actively working through issues related to nsw/nuw
inference causing overall regressions. I think we've got the key one
identified and a patch is under review, but I suspect you'll stumble
across the same thing.

Interesting, so adding nsw/nuw flags is pessimizing the generated code? Can you provide any links?

preames · September 27, 2016, 11:06pm

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
30428 – Infer nsw/nuw flags for increment/decrement if relational comparison succeeded ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly
straight forward extension to the LazyValueInfo analysis. In
particular, take a look at getValueFromICmpCondition. I'd be happy to
help review a patch here once you've got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must
be INT_MAX or smaller and X is less than that. We can tell this without
needing to know anything about Y.

Looks like a good idea, but I'm not sure how LazyValueInfo's interface would support this case. Did you mean synthesizing the INT_MAX constant and then checking specifically for "X s< INT_MAX" using LazyValueInfo::getPredicateAt? At a glance that seems like it would work, but it feels like an odd way of doing it.

Initially I was looking at LVI::getConstantRange but its "at the end of the specified block" interface seems too restrictive. The block containing the comparison may end in a conditional br and so surely LVI can't prove anything there. And the block containing the increment/decrement instruction may contain some later information that LVI can prove at the end of the block, but is not true at the instruction?

CorrelatedValuePropagation and JumpThreading appear to be the only transformation passes making use of LVI at the moment, and that's probably something we don't want to change. This kind of nsw/nuw flag inference doesn't really fit in either, but CVP is definitely the closer match and it should be possible to shoehorn it in there.

Will respond to this later once I have a bit more time to put thoughts together.

Fair warning, we're actively working through issues related to nsw/nuw
inference causing overall regressions. I think we've got the key one
identified and a patch is under review, but I suspect you'll stumble
across the same thing.

Interesting, so adding nsw/nuw flags is pessimizing the generated code? Can you provide any links?

⚙ D24280 [IndVarSimplify] Wisely choose sext or zext when widening IV is the active review. It's been accepted and should go in shortly.

preames · September 28, 2016, 2:50pm

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
30428 – Infer nsw/nuw flags for increment/decrement if relational comparison succeeded ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly
straight forward extension to the LazyValueInfo analysis. In
particular, take a look at getValueFromICmpCondition. I'd be happy to
help review a patch here once you've got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must
be INT_MAX or smaller and X is less than that. We can tell this without
needing to know anything about Y.

Looks like a good idea, but I'm not sure how LazyValueInfo's interface would support this case. Did you mean synthesizing the INT_MAX constant and then checking specifically for "X s< INT_MAX" using LazyValueInfo::getPredicateAt? At a glance that seems like it would work, but it feels like an odd way of doing it.

Initially I was looking at LVI::getConstantRange but its "at the end of the specified block" interface seems too restrictive. The block containing the comparison may end in a conditional br and so surely LVI can't prove anything there. And the block containing the increment/decrement instruction may contain some later information that LVI can prove at the end of the block, but is not true at the instruction?

CorrelatedValuePropagation and JumpThreading appear to be the only transformation passes making use of LVI at the moment, and that's probably something we don't want to change. This kind of nsw/nuw flag inference doesn't really fit in either, but CVP is definitely the closer match and it should be possible to shoehorn it in there.

CVP actually already supports this case in ToT; it's just hidden behind an option while we address the regression previously mentioned. See -cvp-dont-process-adds and the relevant guarded code.

Your point about the end of block bit can be split into two pieces:
1) Asking a question about an add guarded by a condition in a *previous* block. This case LVI handles elegantly. It's pretty much it's reason for existence.
2) An add which is guarded by an assume or guard in the *same* block does look concerning. From a quick glance at the (off by default) functionality, it looks like there might be a bug here? (Artur, please write a test case to check and submit it.)

Matti_Niemenmaa · September 29, 2016, 4:57pm

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
30428 – Infer nsw/nuw flags for increment/decrement if relational comparison succeeded ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

> 1. If (X s< Y), then both X + 1 and Y - 1 are nsw.
> 2. If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly
straight forward extension to the LazyValueInfo analysis. In
particular, take a look at getValueFromICmpCondition. I'd be happy to
help review a patch here once you've got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must
be INT_MAX or smaller and X is less than that. We can tell this without
needing to know anything about Y.

Looks like a good idea, but I'm not sure how LazyValueInfo's interface
would support this case. Did you mean synthesizing the INT_MAX
constant and then checking specifically for "X s< INT_MAX" using
LazyValueInfo::getPredicateAt? At a glance that seems like it would
work, but it feels like an odd way of doing it.

Initially I was looking at LVI::getConstantRange but its "at the end
of the specified block" interface seems too restrictive. The block
containing the comparison may end in a conditional br and so surely
LVI can't prove anything there. And the block containing the
increment/decrement instruction may contain some later information
that LVI can prove at the end of the block, but is not true at the
instruction?

CorrelatedValuePropagation and JumpThreading appear to be the only
transformation passes making use of LVI at the moment, and that's
probably something we don't want to change. This kind of nsw/nuw flag
inference doesn't really fit in either, but CVP is definitely the
closer match and it should be possible to shoehorn it in there.

CVP actually already supports this case in ToT; it's just hidden behind
an option while we address the regression previously mentioned. See
-cvp-dont-process-adds and the relevant guarded code.

Well then, seems like I'm a few months late. I tried out passing -cvp-dont-process-adds=false and the optimization I initially described is performed exactly as I would expect it to be. I'll close Bug 30428.

The only missing thing is handling subtractions; this aspect of CVP is now restricted to additions. This makes a difference only for the nuw case, and only if a canonicalization change is made first: X - 1 is currently canonicalized to X + -1, which are equivalent in terms of nsw, but only the former could be correctly marked as nuw based on a previously true X u> Y (or Y u< X). (This also came up in my slightly related patch at D24700: ⚙ D24700 [InstCombine] optimize unsigned icmp of increment )

Your point about the end of block bit can be split into two pieces:
1) Asking a question about an add guarded by a condition in a *previous*
block. This case LVI handles elegantly. It's pretty much it's reason
for existence.
2) An add which is guarded by an assume or guard in the *same* block
does look concerning. From a quick glance at the (off by default)
functionality, it looks like there might be a bug here? (Artur, please
write a test case to check and submit it.)

It seems like this was noticed already at the original bug report at 28620 – [indvars] Redundant loop increment after IV widening — 'Hopefully "fixing" this will just require adjusting the documentation a bit.' Please do test it though.

Artur_Pilipenko · September 29, 2016, 5:20pm

I posted some questions related to implementing inference of nsw/nuw
flags based on known icmp results to Bug 30428 (
https://llvm.org/bugs/show_bug.cgi?id=30428 ) and it was recommended
that I engage a wider audience by coming here. The minimal context is
the following, please see the bug report for more detail:

If (X s< Y), then both X + 1 and Y - 1 are nsw.

If (X u< Y), then both X + 1 and Y - 1 are nuw.

If this is the only case you want to support, this sounds like a fairly
straight forward extension to the LazyValueInfo analysis. In
particular, take a look at getValueFromICmpCondition. I’d be happy to
help review a patch here once you’ve got something working.

The basic idea would be that (X s<Y ) implies X s< INT_MAX since Y must
be INT_MAX or smaller and X is less than that. We can tell this without
needing to know anything about Y.

Looks like a good idea, but I’m not sure how LazyValueInfo’s interface would support this case. Did you mean synthesizing the INT_MAX constant and then checking specifically for “X s< INT_MAX” using LazyValueInfo::getPredicateAt? At a glance that seems like it would work, but it feels like an odd way of doing it.

Initially I was looking at LVI::getConstantRange but its “at the end of the specified block” interface seems too restrictive. The block containing the comparison may end in a conditional br and so surely LVI can’t prove anything there. And the block containing the increment/decrement instruction may contain some later information that LVI can prove at the end of the block, but is not true at the instruction?

CorrelatedValuePropagation and JumpThreading appear to be the only transformation passes making use of LVI at the moment, and that’s probably something we don’t want to change. This kind of nsw/nuw flag inference doesn’t really fit in either, but CVP is definitely the closer match and it should be possible to shoehorn it in there.

CVP actually already supports this case in ToT; it’s just hidden behind an option while we address the regression previously mentioned. See -cvp-dont-process-adds and the relevant guarded code.

Your point about the end of block bit can be split into two pieces:

Asking a question about an add guarded by a condition in a previous block. This case LVI handles elegantly. It’s pretty much it’s reason for existence.

An add which is guarded by an assume or guard in the same block does look concerning. From a quick glance at the (off by default) functionality, it looks like there might be a bug here? (Artur, please write a test case to check and submit it.)

CVP doesn’t look at instructions which operands are in the same basic block at the instruction. The motivation is not to spend time for cases which can be handled without expensive LVI machinery.

This test checks that LVI uses information from guard intrinsics:
https://github.com/llvm-mirror/llvm/blob/master/test/Transforms/CorrelatedValuePropagation/guards.ll

Artur

Topic		Replies	Views
A question about everyone's favorite constructs: NSW and NUW LLVM Dev List Archives	11	106	January 21, 2014
Improving SCEV's behavior around IR level no-wrap flags LLVM Dev List Archives	13	107	September 24, 2016
Request for review: SimplifyIndVar simplify out of loop Beginners	0	63	June 6, 2024
[llvm] r303387 - [InstCombine] add more tests for xor-of-icmps; NFC LLVM Dev List Archives	8	78	July 21, 2017
Question about nsw and nuw flags LLVM Dev List Archives	2	155	January 13, 2011

Inferring nsw/nuw flags for increment/decrement based on relational comparisons

Related topics