Hi
As I understand live variable analysis will set the def/kill
properties of operands. In that case, is it still needed to set the
kill flags when possible during lowering?
thanks
dz
Hi
As I understand live variable analysis will set the def/kill
properties of operands. In that case, is it still needed to set the
kill flags when possible during lowering?
thanks
dz
Are any passes before register allocation using the kill flags?
LiveVariables is not run in the -O0 pipeline. The fast register allocator should work fine without kill flags.
/jakob
Hi all!
I've been tinkering with LLVM's code-base for a few days, hoping to start on one of the ideas mentioned in the "Open Projects" page (I was told 'Improving the current system'/'Miscellaneous Improvements'/5 would be a good start).
While I was at it, I also took a stab at finishing up one of the TODOs. I've attached the patch for review.
0.diff (1.92 KB)
While I was at it, I also took a stab at finishing up one of the TODOs. I've
attached the patch for review.
Comments inline.
For those of you following at home, this code is in
InstCombiner::WillNotOverflowSignedAdd(), and the first line of the
initial comment is:
// If one of the operands only has one non-zero bit, and if the
other operand
--- lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 126747)
+++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy)
@@ -77,9 +77,55 @@
// has a known-zero bit in a more significant place than it (not including the
// sign bit) the ripple may go up to and fill the zero, but won't change the
// sign. For example, (X & ~4) + 1.
-
- // TODO: Implement.
-
+
+ int32_t power;
+
+ {
+ int width = LHS->getType()->getScalarSizeInBits();
This should be an unsigned, like the result type of getScalarSizeInBits().
+ APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
+ mask.setAllBits();
An easier way to get an all-one mask would be to construct it as
'APInt mask(width, -1, true)'.
+ ComputeMaskedBits(LHS, mask, zeroes, ones);
+ zeroes.flipAllBits();
+
+ if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) {
It would probably be cleaner to assign 'power' outside the condition.
I don't think you're handling the case where the only set bit is the
sign bit correctly; in that case the value is the most negative number
possible, so you only need to check that the other side is
non-negative. See below for how to handle this.
This is also the only place where above 'zeroes' is used at the
moment. It might be better not change zeroes itself here so it can be
reused in the next block (see below). Use something like '~zeroes ==
ones'. Alternatively something like '(zeroes | ones).isAllOnesValue()'
would work too, since they won't share any bits.
+ int width = RHS->getType()->getScalarSizeInBits();
This is the same as the previous width since the types of the LHS and
RHS are required to be equal for instructions like add. Just delete
this line.
+ APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
These shadow (have the same name as) the ones before. It's better to
reuse the mask and rename the other two. The earlier ones should be
LHSKnownZero and LHSKnownOne and these should be RHSKnownZero and
RHSKnownOne. This is more consistent with the way they're named
elsewhere in LLVM.
+ mask.clearAllBits();
This is redundant here because it's already been initialized to zero
in the constructor. If you'd reuse the old mask variable as suggested
above you'd need this, but see below.
+
+ // Disregarding the sign bit
+ for (int i = (width - 2); i > power; i--)
+ mask.setBit(i);
I think this is equivalent to
if (power < width-2)
mask = APInt::getBitsSet(width, power+1, width-2);
else
mask.clearAllBits();
(This would mean the clearAllBits() above would again be redundant)
However, a nice way to handle the signbit-only case would be to wrap
that in an extra if as follows:
if (power == width - 1)
mask = APInt::getSignBit(width); // Alternatively: LHSKnownOne,
which should be equivalent.
else if // ... the code above ...
so that for signbit-only LHS the check below tests whether the RHS is
non-negative.
+ ComputeMaskedBits(RHS, mask, zeroes, ones);
+
+ // At least one 0
+ if (zeroes.countPopulation())
This should be 'if (RHSKnownZero.getBoolValue())' / 'if
(!!RHSKnownZero)' or similar. You don't need the actual number of set
bits here, you just want to know whether it's zero.
+ return true;
+ }
+ }
+
+ {
+ int width = RHS->getType()->getScalarSizeInBits();
This has already been calculated in the previous block. Reuse it.
+ APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
+ mask.setAllBits();
+ ComputeMaskedBits(RHS, mask, zeroes, ones);
If you calculated this beforehand and stored it in RHSKnownZero and
RHSKnownOne, you'd only need two ComputeMaskedBits calls instead of
four in this code. Inside the 'if's you can use e.g. (Mask &
RHSKnownZero) to get the values you're currently getting there.
+ zeroes.flipAllBits();
+
+ if ((power = ones.exactLogBase2()) != -1 && zeroes == ones) {
+ int width = LHS->getType()->getScalarSizeInBits();
+ APInt mask(width, 0), zeroes(width, 0), ones(width, 0);
+ mask.clearAllBits();
+
+ // Disregarding the sign bit
+ for (int i = (width - 2); i > power; i--)
+ mask.setBit(i);
+ ComputeMaskedBits(LHS, mask, zeroes, ones);
+
+ // At least one 0
+ if (zeroes.countPopulation())
+ return true;
+ }
+ }
Much of the same comments apply to this block as to the one above it
since it appears to be duplicated code with LHS and RHS switched.
Depending on how much code remains in the end, it might be better to
factor it out to a static function taking the changing values as
parameters.
+
return false;
}
You're missing tests for this new functionality. Add some to
test/Transforms/InstCombine/ after searching for
'WillNotOverflowSignedAdd' calls to see the patterns it's used for.
Hi!
I've attached a patch which takes care of the issues mentioned (and adds two tests).
ripple-bucket.diff (3.24 KB)
Hi!
I've attached a patch which takes care of the issues mentioned (and adds two
tests).
Index: test/Transforms/InstCombine/sext.ll
--- test/Transforms/InstCombine/sext.ll (revision 127153)
+++ test/Transforms/InstCombine/sext.ll (working copy)
@@ -126,3 +126,16 @@
; CHECK-NEXT: store <2 x i16>
; CHECK-NEXT: ret
}
+
+define i64 @test12(i32 %x) {
+ %a = and i32 %x, -5
+ %b = sext i32 %a to i64
+ %c = add i64 %b, 1
+ ret i64 %c
+
+; CHECK: @test12
+; CHECK-NEXT: and i32
+; CHECK-NEXT: add nsw
Why not check for 'i32' after the add? Isn't the entire point of this
patch to shrink the add in cases like this?
+; CHECK-NEXT: sext i32
+; CHECK-NEXT: ret i64
+}
Index: test/Transforms/InstCombine/add-sitofp.ll--- test/Transforms/InstCombine/add-sitofp.ll (revision 127153)
+++ test/Transforms/InstCombine/add-sitofp.ll (working copy)
@@ -1,4 +1,5 @@
; RUN: opt < %s -instcombine -S | grep {add nsw i32}
+; RUN: opt < %s -instcombine -S | grep sitofp | count 2
When adding to old tests like this one it's better to migrate them
from grep to FileCheck.
RUN: opt < %s -instcombine -S | FileCheck
define double @x(i32 %a, i32 %b) nounwind {
CHECK: @x
CHECK: add nsw i32
%m = lshr i32 %a, 24
@@ -7,3 +8,12 @@
%p = fadd double %o, 1.0
ret double %p
}
+
+define double @y(i32 %x, i32 %y) {
CHECK: @y
CHECK: <something>
Note that FileCheck is a nice way to perform more precise checking here :).
In particular, if I recall the transformation this is being used for
correctly, you want to check this does an integer addition instead of
a floating-point one?
Though in this case the 'add' later gets optimized to an 'or', so
maybe you should replace -4 to -5 to keep the test clearer (and be
able to check for the 'nsw' flag).
+ %p = and i32 %x, -4
+ %q = and i32 %y, 1
CHECK: add nsw i32
CHECK: sitofp
CHECK-NOT: sitofp
CHECK-NOT: fadd
(Those last two are to confirm you removed the other sitofp and the
floating-point add)
+ %a = sitofp i32 %p to double
+ %b = sitofp i32 %q to double
+ %result = fadd double %a, %b
+ ret double %result
+}
Index: lib/Transforms/InstCombine/InstCombineAddSub.cpp--- lib/Transforms/InstCombine/InstCombineAddSub.cpp (revision 127153)
+++ lib/Transforms/InstCombine/InstCombineAddSub.cpp (working copy)
@@ -57,6 +57,30 @@
}+static bool RippleBucketExists(APInt &thisKnownZero, APInt &thisKnownOne,
+ APInt &otherKnownZero, unsigned width) {
+ APInt mask;
+ // First try to to take care of the case
+ // (X & ~4) + (Y & 1)
+ int32_t power = (~thisKnownZero).exactLogBase2();
+ if (power == -1) {
+ if (~thisKnownZero == thisKnownOne)
+ power = thisKnownOne.exactLogBase2();
Why did you introduce the extra log here? This assignment is a no-op
when ~thisKnownZero == thisKnownOne...
+ if (power == -1)
... which means this check is redundant.
+ return false;
+ }
+
+ if (power == (width - 1))
Hmm.. I know I said 'width' should be unsigned, but now I get a
warning here for comparing a signed integer to an unsigned one (and
LLVM is supposed to compile without warnings).
So I guess changing width back to int isn't a disaster. (Who needs
integers with over 2 million bits anyway?)
Alternatively, since you know 'power' is non-negative here you can
safely cast it to unsigned for the comparison.
+ mask = APInt::getSignBit(width);
+ else
+ mask = APInt::getBitsSet(width, power + 1, width - 2);
You removed the 'if (power < width-2)' case from the code I mentioned
as equivalent to your loop.
Because of this, if power + 1 > width - 2 (or equivalently in this
case, if power == width - 2) this will create a "wrapped" bit set: the
upper bit and the lower width-2 bits will be set, leaving only bit
width-2 zero. That isn't what you want, right?
+
+ if ((mask & otherKnownZero).getBoolValue())
+ return true;
+
+ return false;
This can be more concisely written as just
return (mask & otherKnownZero).getBoolValue();
Hi!
Thanks for the feedback. Have attached a (smaller / simpler) patch which addresses the issues pointed out.
2.diff (3.13 KB)