[RFC] Code style of initializer in if statement

jacquesguan · September 19, 2022, 6:24am

Since we support C++17 now, in some case we could use the new feature “initializer in if statement” to reduce some definitions’ scope and make code more clear.
For example, the former code

T a = getVal();
if (a.check()) {

}

could switch to:

if (T a = getVal(); a.check()) {

}

I think the latter one is more clear to read and the value a 's scope would be reduced to the if statement.

But there is another situation that might be arguable.

For example, if type T is nullable, we should check either value a is null first.

if (T a = getVal()) {
    if (a.check()) {

   }
}

Should we use the latter form:

if (T a = getVal(); a && a.check()) {

}

The latter one does decrease one level nested if statement, but might be dangerous to write and hard to read.

This is my revision ⚙ D134027 [mlir][NFC] Use initializers in if statement.;

I would like to discuss here that should we use the C++17 form or keep it as before?

clattner · September 20, 2022, 1:11am

I’ve tried adopting this in code, and it leads to some surprising confusion. TIL, that these two things are different:

if (T a = getVal()) {
    if (a.check()) {

and

if (T a = getVal(); a.check()) {

As you correctly say, you actually need if (T a = getVal(); a && a.check()) { which gets frequently forgotten.

Personally, I’m not sure this is worth the footgun. It is super surprising and easy to miss. Is the scope reduction actually that important?

-Chris

Mogball · September 20, 2022, 1:32am

I would prefer to “take what I can get” in terms of scope reduction. Nested ifs make code really hard to read sometimes.

With nullable types, I can see how falling in that trap can happen often, but this pattern is useful with non nullable types, e.g.

if (StringRef thing = getThing(); !thing.empty())

I personally found the C++14 behaviour of

if (T thing = getThing())

and implicitly doing a conversion to bool and checking it to be very confusing, because the code itself suggests it’s just an initializer. I prefer

if (T thing = getThing(); thing)

Which makes more sense, IMO. I feel like the problem has been that C++14 has baked this bad expectation in our heads which has made adoption of the C++17 feature tricky.

I really like writing

if (IntegerAttr value = getA(); value && value.getInt() == 4) {...}
if (IntegerAttr value = getB(); value && value.getInt() == 6) {...}

Instead of

IntegerAttr aValue = getA();
if (aValue && aValue.getInt() == 4) {...}
IntegerAttr bValue = getB();
if (bValue && bValue.getInt() == 6) {...}

clattner · September 20, 2022, 4:24am

Ok, that’s interesting. I don’t think that’s really the alternative though, I’ve seen those more idiomatically written as:

if (IntegerAttr aValue = getA()) {
  if (aValue.getInt() == 4) {...}
}

IOW, the real savings in the case where you want to “bind + test for nullability” is an extra level of indentation, not scoping.

River707 · September 20, 2022, 4:39am

This should really be moved out of the MLIR section, this feels more of a project wide discussion.

I strongly agree with @clattner here. I’ve found the if initializer easily error prone in the nullable case, we’ve already had several bugs from this. I also don’t really find it more readable in many cases, i.e. when it splits into two lines (examples take from the patch):

  // Current variant.
  if (auto sub = getLhs().getDefiningOp<SubIOp>())
    if (getRhs() == sub.getRhs())
      return sub.getLhs();

  // New variant.
  if (auto sub = getLhs().getDefiningOp<SubIOp>();
      sub && getRhs() == sub.getRhs())
    return sub.getLhs();

The only thing this saves is 2 spaces of indentation, as Chris mentions above.

Another example:

  // Current variant.
  auto pred = cmp.getPredicate();
  if (pred == arith::CmpIPredicate::eq || pred == arith::CmpIPredicate::ne) {

  // New variant.
  if (auto pred = cmp.getPredicate();
      pred == arith::CmpIPredicate::eq || pred == arith::CmpIPredicate::ne) {

In cases like this you aren’t saving anything, but make the code less readable by creating larger if bodies.

This was one feature of c++17 I was actually looking forward to, but in practice I’ve found quite bad. The only places I could really see this being usable is if the if (...) is one line and the initializer isn’t a nullable thing (which maybe limits it to just container find methods?).

– River

mehdi_amini · September 20, 2022, 5:32am

Done.

ftynse · September 20, 2022, 1:24pm

The nullable type footgun is particularly worrying with a lot of LLVM-style APIs that are full of nullable objects. Doing something like if (auto *foo = dyn_cast<Bar>(baz)) is quite common and will make it easy to trigger the gun. MLIR is even worse in this sense because it has nullable value-types that are not even pointers. Since nullability isn’t really a language concept, I don’t think we can have a compiler warning to guard against this case?

jrtc27 · September 20, 2022, 5:04pm

One advantage that’s sometimes extremely useful is being able to chain an else:

if (auto foo = ...; foo && foo->bar()) {
  ...
} else {
  ...
}

I know there have been cases in the past where I’ve really wanted C++17 so I could write that in LLVM.

rnk · September 20, 2022, 5:46pm

I’m pretty sure it’s legal to use the variable defined in if in the else block:

Apologies, that looks like that’s what you meant.

jrtc27 · September 20, 2022, 5:51pm

No, I just mean that it simplifies control flow so you don’t need to duplicate the contents of the else like:

if (auto foo = ...) {
  if (foo->bar()) {
    thing1
  } else {
    thing2
  }
} else {
  thing2
}

or introduce some new bool as:

bool baz;
if (auto foo = ...) {
  if (foo->bar()) {
    baz = true;
    thing1
  } else {
    baz = false;
  }
} else {
  baz = false;
}
if (!baz) {
  thing2
}

Mogball · September 20, 2022, 6:07pm

+1

zero9178 · September 21, 2022, 11:57am

Maybe we could solidify some more nuanced rules as to when it is appropiate to use them?
Something akin to:

Don’t use them with nullable types when nested ifs suffice. I.e. prefer:

if (auto thing = calcthing())
    if (thing.predicate())
        // thing

over

if (auto thing = calcthing(); thing && thing.predicate)
    // thing

Do use them to move otherwise redundant declarations into the if scope. I.e. prefer:

if (auto result = map.find(thing); result != map.end()) {
    // thing
}

over

auto result = map.find(thing);
if (result != map.end()) {
   // thing
}

Do use them to share else branches. I.e. prefer:

if (auto thing = calcthing(); thing && thing.predicate()) {
   // thing1
}  else {
  // thing2
}

over

if (auto thing = calcthing()) {
  if (thing.predicate()) {
    // thing1
  } else {
    // thing2
  }
} else {
  // thing2
}

The issue here is that both the last and first point are in conflict with each other in some situations. I’d personally be in favour of making the last rule here win over the first one generally speaking. Such cases might also be rather rare however.

Topic		Replies	Views
Before we go cleaning up LLVM+Clang of all Static Analyzer Warnings... LLVM Dev List Archives	8	86	August 18, 2016
[f18]How to check if a variable was initialised in f18. Flang	4	123	December 10, 2019
clang::CompoundLiteral::getInitializer() should use cast_or_null() (or I am missing something) Clang Frontend	2	82	December 17, 2008
Style question: NULL or 0? LLVM Dev List Archives	5	88	August 18, 2011
BUG: VarDecl::getSourceRange() incorrect if CallInit style used for builtin type. Clang Frontend	5	90	February 3, 2013

[RFC] Code style of initializer in if statement

Related topics