Evaluate constant condition of if statement has no effect

Simple Example:

int main()
{
int x;
if (x || 1) {}
}

Clang can not evaluate this condition and will emit IR like this:

define signext i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%x = alloca i32, align 4
store i32 0, i32* %retval, align 4
%0 = load i32, i32* %x, align 4
%tobool = icmp ne i32 %0, 0
br i1 %tobool, label %if.then, label %lor.lhs.false

lor.lhs.false: ; preds = %entry
br i1 true, label %if.then, label %if.end

if.then: ; preds = %lor.lhs.false, %entry
br label %if.end

if.end: ; preds = %if.then, %lor.lhs.false
%1 = load i32, i32* %retval, align 4
ret i32 %1
}

However, when we swap the position of LHS and RHS(i.e. if (1 || x), Clang can recognize it:

define signext i32 @main() #0 {
entry:
%x = alloca i32, align 4
ret i32 0
}

I also find the root issue and propose one potential solution in comment 2 of this link: https://bugs.llvm.org/show_bug.cgi?id=34229

Any idea?

Thanks in advance.

There will always be some example of something that we generate less efficiently at -O0 than we could. Why is this example specifically worth optimizing?

John.

This is not specific example. If we don’t fix it, user will feel strange why LHS of if stmt condition could do constant folding but RHS couldn’t.

From the source code view’s point, this is our compiler bug.

In the ExprConstant.cpp:

case Job::BinOpKind: {

if (!VisitBinOpLHSOnly(Result, Bop, SuppressRHSDiags)) {
Queue.pop_back();
return;
}

job.Kind = Job::BinOpVisitedLHSKind;
enqueue(Bop->getRHS());
return;
}

case Job::BinOpVisitedLHSKind: {

Result.Failed = !VisitBinOp(job.LHSResult, RHS, Bop, Result.Val);
Queue.pop_back();
return;
}

If we can not evaluate LHS successfully, we will evaluate RHS (function VisitBinOp). In the function VisitBinOp:

if (E->isLogicalOp()) {
bool lhsResult, rhsResult;
bool LHSIsOK = HandleConversionToBool(LHSResult.Val, lhsResult);
bool RHSIsOK = HandleConversionToBool(RHSResult.Val, rhsResult);
if (LHSIsOK) {
if (RHSIsOK) {
if (E->getOpcode() == BO_LOr)
return Success(lhsResult || rhsResult, E, Result);
else
return Success(lhsResult && rhsResult, E, Result);
}
} else {
if (RHSIsOK) {
// We can’t evaluate the LHS; however, sometimes the result
// is determined by the RHS: X && 0 → 0, X || 1 → 1.
if (rhsResult == (E->getOpcode() == BO_LOr)) {
return Success(rhsResult, E, Result);
}
}
}

We can see that our previous implementation has consider this condition of RHS constant folding. But because of ignore the parameter Expr::SE_AllowSideEffects in the function ConstantFoldsToSimpleInteger:

bool CodeGenFunction::ConstantFoldsToSimpleInteger(const Expr *Cond,
llvm::APSInt &ResultInt,
bool AllowLabels) {
// FIXME: Rename and handle conversion of other evaluatable things
// to bool.
llvm::APSInt Int;
if (!Cond->EvaluateAsInt(Int, getContext(), Expr::SE_AllowSideEffects)) -------> allow side effect
return false; // Not foldable, not integer or not fully evaluatable.

We will not make the RHS constant folding happened.

bool DataRecursiveIntBinOpEvaluator::
VisitBinOpLHSOnly(EvalResult &LHSResult, const BinaryOperator *E,
bool &SuppressRHSDiags) {
….

if (E->isLogicalOp()) {
bool LHSAsBool;
E->getLHS()->dumpColor();
if (!LHSResult.Failed && HandleConversionToBool(LHSResult.Val, LHSAsBool)) {
// We were able to evaluate the LHS, see if we can get away with not
// evaluating the RHS: 0 && X → 0, 1 || X → 1
if (LHSAsBool == (E->getOpcode() == BO_LOr)) {
Success(LHSAsBool, E, LHSResult.Val);
return false; // Ignore RHS
}
} else {
LHSResult.Failed = true;

// Since we weren’t able to evaluate the left hand side, it
// might have had side effects.
if (!Info.noteSideEffect()) { --------> HERE!!!
return false;
}

The detail source code analysis could be find in the comment 2 of https://bugs.llvm.org/show_bug.cgi?id=34229

So, I think we should fix it.

发件人: John McCall
发送时间: 2017-08-23 01:30:45
收件人: Frozen
抄送: cfe-dev@lists.llvm.org
主题: Re: [cfe-dev] Evaluate constant condition of if statement has no effect

This is not specific example. If we don’t fix it, user will feel strange why LHS of if stmt condition could do constant folding but RHS couldn’t.

From the source code view’s point, this is our compiler bug.

In the ExprConstant.cpp:

case Job::BinOpKind: {

if (!VisitBinOpLHSOnly(Result, Bop, SuppressRHSDiags)) {
Queue.pop_back();
return;
}

job.Kind = Job::BinOpVisitedLHSKind;
enqueue(Bop->getRHS());
return;
}

case Job::BinOpVisitedLHSKind: {

Result.Failed = !VisitBinOp(job.LHSResult, RHS, Bop, Result.Val);
Queue.pop_back();
return;
}

If we can not evaluate LHS successfully, we will evaluate RHS (function VisitBinOp). In the function VisitBinOp:

if (E->isLogicalOp()) {
bool lhsResult, rhsResult;
bool LHSIsOK = HandleConversionToBool(LHSResult.Val, lhsResult);
bool RHSIsOK = HandleConversionToBool(RHSResult.Val, rhsResult);
if (LHSIsOK) {
if (RHSIsOK) {
if (E->getOpcode() == BO_LOr)
return Success(lhsResult || rhsResult, E, Result);
else
return Success(lhsResult && rhsResult, E, Result);
}
} else {
if (RHSIsOK) {
// We can’t evaluate the LHS; however, sometimes the result
// is determined by the RHS: X && 0 → 0, X || 1 → 1.
if (rhsResult == (E->getOpcode() == BO_LOr)) {
return Success(rhsResult, E, Result);
}
}
}

We can see that our previous implementation has consider this condition of RHS constant folding. But because of ignore the parameter Expr::SE_AllowSideEffects in the function ConstantFoldsToSimpleInteger:

bool CodeGenFunction::ConstantFoldsToSimpleInteger(const Expr *Cond,
llvm::APSInt &ResultInt,
bool AllowLabels) {
// FIXME: Rename and handle conversion of other evaluatable things
// to bool.
llvm::APSInt Int;
if (!Cond->EvaluateAsInt(Int, getContext(), Expr::SE_AllowSideEffects)) -------> allow side effect
return false; // Not foldable, not integer or not fully evaluatable.

We will not make the RHS constant folding happened.

bool DataRecursiveIntBinOpEvaluator::
VisitBinOpLHSOnly(EvalResult &LHSResult, const BinaryOperator *E,
bool &SuppressRHSDiags) {
….

if (E->isLogicalOp()) {
bool LHSAsBool;
E->getLHS()->dumpColor();
if (!LHSResult.Failed && HandleConversionToBool(LHSResult.Val, LHSAsBool)) {
// We were able to evaluate the LHS, see if we can get away with not
// evaluating the RHS: 0 && X → 0, 1 || X → 1
if (LHSAsBool == (E->getOpcode() == BO_LOr)) {
Success(LHSAsBool, E, LHSResult.Val);
return false; // Ignore RHS
}
} else {
LHSResult.Failed = true;

// Since we weren’t able to evaluate the left hand side, it
// might have had side effects.
if (!Info.noteSideEffect()) { --------> HERE!!!
return false;
}

The detail source code analysis could be find in the comment 2 of https://bugs.llvm.org/show_bug.cgi?id=34229

So, I think we should fix it.

Are you arguing that we should generally ignore side-effects when constant-evaluating an if condition?

John.

Not. Original implementation used the default parameter value, i.e. No Side Effects

/// EvaluateAsInt - Return true if this is a constant which we can fold and
/// convert to an integer, using any crazy technique that we want to.
bool EvaluateAsInt(llvm::APSInt &Result, const ASTContext &Ctx,
SideEffectsKind AllowSideEffects = SE_NoSideEffects) const;

But, now, we should allow side effects and pass the computation to the function Info.noteSideEffect(). As the comment of source code said:

bool DataRecursiveIntBinOpEvaluator::
VisitBinOpLHSOnly(EvalResult &LHSResult, const BinaryOperator *E,
bool &SuppressRHSDiags) {
if (E->isLogicalOp()) {
bool LHSAsBool;
if (!LHSResult.Failed && HandleConversionToBool(LHSResult.Val, LHSAsBool)) {
// We were able to evaluate the LHS, see if we can get away with not
// evaluating the RHS: 0 && X → 0, 1 || X → 1
if (LHSAsBool == (E->getOpcode() == BO_LOr)) {
Success(LHSAsBool, E, LHSResult.Val);
return false; // Ignore RHS
}
} else {
LHSResult.Failed = true;

// Since we weren’t able to evaluate the left hand side, it
// might have had side effects.
if (!Info.noteSideEffect())
return false;

// We can’t evaluate the LHS; however, sometimes the result
// is determined by the RHS: X && 0 → 0, X || 1 → 1.
// Don’t ignore RHS and suppress diagnostics from this arm.
SuppressRHSDiags = true;
}

return true;
}

So we might have had side effects when constant-evaluation in if stmt condition , so we should use Expr::SE_AllowSideEffects, then use Info.noteSideEffect() to compute the result.

在 2017-08-24 02:02:23,“John McCall” rjmccall@apple.com 写道:

Not. Original implementation used the default parameter value, i.e. No Side Effects

/// EvaluateAsInt - Return true if this is a constant which we can fold and
/// convert to an integer, using any crazy technique that we want to.
bool EvaluateAsInt(llvm::APSInt &Result, const ASTContext &Ctx,
SideEffectsKind AllowSideEffects = SE_NoSideEffects) const;

But, now, we should allow side effects and pass the computation to the function Info.noteSideEffect(). As the comment of source code said:

bool DataRecursiveIntBinOpEvaluator::
VisitBinOpLHSOnly(EvalResult &LHSResult, const BinaryOperator *E,
bool &SuppressRHSDiags) {
if (E->isLogicalOp()) {
bool LHSAsBool;
if (!LHSResult.Failed && HandleConversionToBool(LHSResult.Val, LHSAsBool)) {
// We were able to evaluate the LHS, see if we can get away with not
// evaluating the RHS: 0 && X → 0, 1 || X → 1
if (LHSAsBool == (E->getOpcode() == BO_LOr)) {
Success(LHSAsBool, E, LHSResult.Val);
return false; // Ignore RHS
}
} else {
LHSResult.Failed = true;

// Since we weren’t able to evaluate the left hand side, it
// might have had side effects.
if (!Info.noteSideEffect())
return false;

// We can’t evaluate the LHS; however, sometimes the result
// is determined by the RHS: X && 0 → 0, X || 1 → 1.
// Don’t ignore RHS and suppress diagnostics from this arm.
SuppressRHSDiags = true;
}

return true;
}

So we might have had side effects when constant-evaluation in if stmt condition , so we should use Expr::SE_AllowSideEffects, then use Info.noteSideEffect() to compute the result.

Info.noteSideEffect() is not given the expression to evaluate, nor does it call back to IR-generation in a way that would allow us to emit the side effect. A more promising approach would be to modify IRGen’s logic for emitting branches on conditions, which is already a special-case expression emitter due to the desire to directly branch on && and || instead of branching on a phi. That emitter has special-case logic for constant branches, but from the generated code it looks like we’re not applying that in all positions, just at the top level. Recognizing when we’re about to branch on a constant i1 would be sufficient to catch things like this and allow IRGen to reap the (small) benefits of emitting less silly code. I don’t think there’s much point in trying to recognize (X || true) especially, but we could do that, too, if you have a good argument.

John.

Firstly, I briefly clarify the logic of Info.noteSideEffect. Info.noteSideEffect will set EvalStatus.HasSideEffects = true; and make hasUnacceptableSideEffect function always return true so that our
EvaluateAsInt alway return false when LHS can not evaulate.

if (!EvaluateAsRValue(ExprResult, Ctx) || !ExprResult.Val.isInt() ||
hasUnacceptableSideEffect(ExprResult, AllowSideEffects))
return false;

static bool hasUnacceptableSideEffect(Expr::EvalStatus &Result,
Expr::SideEffectsKind SEK) {
return (SEK < Expr::SE_AllowSideEffects && Result.HasSideEffects) ||
(SEK < Expr::SE_AllowUndefinedBehavior && Result.HasUndefinedBehavior);
}

Then I want to express one significant point of it, that is debug. Let us compare it with GCC in the following example.

int main()
{
int x;
if (1 || x) {
printf(“LHS\n”);
}

if (x || 1) {
printf(“RHS\n”);
}
}

When we use GCC to debug it, we will get the following result:

(gdb) b main
Breakpoint 1 at 0x10000620: file a.c, line 5.
(gdb) r
Starting program: a.out

Breakpoint 1, main () at a.c:5
5 printf(“LHS\n”);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.ael7b.ppc64le
(gdb) step
LHS
9 printf(“RHS\n”);
(gdb) step
RHS
11 }

And if we use clang, we will get:

(gdb) b main
Breakpoint 1 at 0x10000614: file a.c, line 5.
(gdb) r
Starting program: a.out

Breakpoint 1, main () at a.c:5
5 printf(“LHS\n”);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.ael7b.ppc64le
(gdb) step
LHS
8 if (x || 1) {
(gdb) step
9 printf(“RHS\n”);
(gdb) step
RHS
11 }

We can see that we will stop at line 8 and can not stop at line 9 directly like GCC or LHS is constant. This is somehow not unified behaviour and make uses feel strange.

在 2017-08-24 13:59:06,“John McCall” rjmccall@apple.com 写道:

Firstly, I briefly clarify the logic of Info.noteSideEffect. Info.noteSideEffect will set EvalStatus.HasSideEffects = true; and make hasUnacceptableSideEffect function always return true so that our
EvaluateAsInt alway return false when LHS can not evaulate.

Yes, I understand the implementation.

Then I want to express one significant point of it, that is debug. Let us compare it with GCC in the following example.

int main()
{
int x;
if (1 || x) {
printf(“LHS\n”);
}

if (x || 1) {
printf(“RHS\n”);
}
}

When we use GCC to debug it, we will get the following result:

(gdb) b main
Breakpoint 1 at 0x10000620: file a.c, line 5.
(gdb) r
Starting program: a.out

Breakpoint 1, main () at a.c:5
5 printf(“LHS\n”);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.ael7b.ppc64le
(gdb) step
LHS
9 printf(“RHS\n”);
(gdb) step
RHS
11 }

And if we use clang, we will get:

(gdb) b main
Breakpoint 1 at 0x10000614: file a.c, line 5.
(gdb) r
Starting program: a.out

Breakpoint 1, main () at a.c:5
5 printf(“LHS\n”);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-78.ael7b.ppc64le
(gdb) step
LHS
8 if (x || 1) {
(gdb) step
9 printf(“RHS\n”);
(gdb) step
RHS
11 }

We can see that we will stop at line 8 and can not stop at line 9 directly like GCC or LHS is constant. This is somehow not unified behaviour and make uses feel strange.

Clang’s behavior seems correct. The general goal of stepping is to stop at every statement boundary.

John.

Clang’s behavior seems correct. The general goal of stepping is to stop at every statement boundary.

Which means, the ideal is to stop at both if statements. Neither compiler allows that. Clang’s behavior is closer to the ideal than GCC, even if it looks inconsistent. But I would not be inclined to complain about either behavior; they are idiosyncrasies of how each compiler chooses to handle these expressions at –O0.

–paulr

1 || x is a constant expression.
x || 1 is not.
This is a property of the underlying programming language – || short circuits. The user needs to understand that.