Planned change to IR semantics: constant expressions never have undefined behavior

See https://reviews.llvm.org/D63036 for the full patch and description. Essentially, the Constant::canTrap API is going away. To make this work, the semantics of division and remainder constant expressions are changing slightly, so division by zero produces poison, not undefined behavior. (The corresponding instructions still have undefined behavior.) This change should make writing and understanding IR optimizations easier.

-Eli

Is anyone interested and willing to make a more profound change to llvm’s constants? We should really remove all the trapping operators. The only reason they exist in the first place is to allow use of Constant::get() as a high level constant folding API. We could replace that, eliminate the trapping operators, and thus eliminate a ton of complexity and bugs…

-Chris

This is a good change. Thanks for working on it.

I don’t mean to steal your thunder, Eli, but I’d like to propose that we add constrained intrinsics for integer DIV (and maybe REM). Some architectures, like x86, can trap on integer div-by-zero. I’d like to have some way to access that behavior, even if the C/C++ Standards say it’s undefined behavior.

We have a few dusty deck codes in-house that depend on this, so I’m motivated to push our local changes upstream. That said, if no one else finds value in this, we can keep it local.

-Cam

Hi Chris,

If I’m understanding your proposal correctly, the constrained FP intrinsic work could make use of that code – in the future. Eventually we’ll want to optimize the constrained FP intrinsics and being able to constant fold non-trapping instructions is important.

-Cam

Right: in terms of primary constant folding entry points, the ConstExpr::get sort of methods have always been a mistake (and I’ve never gotten around to fixing them).

A better thing to do is to make ConstExpr’s have their own enum of “operations”, and make them correspond to the things that show up in a global variable initializer (e.g. relocatable expressions). There is no reason for constant expressions to mirror the instruction enum, and there is lots of harm caused by it.

-Chris

Do you really need a special way to represent this in IR? If you need a SIGFPE, the frontend can generate the equivalent of "if (x==0) raise(SIGFPE);", which is well-defined across all targets. If you really need a "real" trap, I guess we could expose the x86 idiv as an intrinsic.

-Eli

See https://reviews.llvm.org/D63036 for the full patch and description. Essentially, the Constant::canTrap API is going away. To make this work, the semantics of division and remainder constant expressions are changing slightly, so division by zero produces poison, not undefined behavior. (The corresponding instructions still have undefined behavior.) This change should make writing and understanding IR optimizations easier.

Is anyone interested and willing to make a more profound change to llvm’s constants? We should really remove all the trapping operators. The only reason they exist in the first place is to allow use of Constant::get() as a high level constant folding API. We could replace that, eliminate the trapping operators, and thus eliminate a ton of complexity and bugs…

Hi Chris,

If I’m understanding your proposal correctly, the constrained FP intrinsic work could make use of that code – in the future. Eventually we’ll want to optimize the constrained FP intrinsics and being able to constant fold non-trapping instructions is important.

Right: in terms of primary constant folding entry points, the ConstExpr::get sort of methods have always been a mistake (and I’ve never gotten around to fixing them).

Ok, I think I understand now. So we’d retain the ability to constant fold operations that may trap, possibly returning trap flags, through APFloat. If that’s correct, that would work for the constrained FP intrinsics project. No big deal.

A better thing to do is to make ConstExpr’s have their own enum of “operations”, and make them correspond to the things that show up in a global variable initializer (e.g. relocatable expressions). There is no reason for constant expressions to mirror the instruction enum, and there is lots of harm caused by it.

You may have lost me here, but that’s ok. If you’re suggesting not switching on the Instruction OpCodes in ConstantExpr::get(…) et al, to encapsulate the ConstantExpr class, then +1 from me.

See https://reviews.llvm.org/D63036<https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_D63036&d=DwMFaQ&c=slrrB7dE8n7gBJbeO0g-IQ&r=O_4M49EtSpZ_-BQYeigzGv0P4__noMcSu2RYEjS1vKs&m=HPH8LbjOYmP3G9p6HJYKLtOFGQQJ0aWeEc7QU46PkOw&s=Q-PK-STEJTkFgcTQeOjIR7RY4e_iIHz191Qk_8syQlA&e=> for the full patch and description. Essentially, the Constant::canTrap API is going away. To make this work, the semantics of division and remainder constant expressions are changing slightly, so division by zero produces poison, not undefined behavior. (The corresponding instructions still have undefined behavior.) This change should make writing and understanding IR optimizations easier.

Is anyone interested and willing to make a more profound change to llvm’s constants? We should really remove all the trapping operators. The only reason they exist in the first place is to allow use of Constant::get() as a high level constant folding API. We could replace that, eliminate the trapping operators, and thus eliminate a ton of complexity and bugs…

Hi Chris,

If I'm understanding your proposal correctly, the constrained FP intrinsic work could make use of that code -- in the future. Eventually we'll want to optimize the constrained FP intrinsics and being able to constant fold non-trapping instructions is important.

Right: in terms of primary constant folding entry points, the ConstExpr::get sort of methods have always been a mistake (and I’ve never gotten around to fixing them).

Ok, I think I understand now. So we'd retain the ability to constant fold operations that may trap, possibly returning trap flags, through APFloat. If that's correct, that would work for the constrained FP intrinsics project. No big deal.

Yes, this would be specifically about the entry points that produce a ConstantExpr.

A better thing to do is to make ConstExpr’s have their own enum of “operations”, and make them correspond to the things that show up in a global variable initializer (e.g. relocatable expressions). There is no reason for constant expressions to mirror the instruction enum, and there is lots of harm caused by it.

You may have lost me here, but that's ok. If you're suggesting not switching on the Instruction OpCodes in ConstantExpr::get(...) et al, to encapsulate the ConstantExpr class, then +1 from me.

The idea here is actually to eliminate the underlying expressions themselves, so it would no longer be legal to write “sdiv (i32 […], i32 […])” as a constant expression at all in IR.

The hardest part of this is probably figuring out how to handle AutoUpgrade; there aren’t that many transforms which actually try to create an sdiv constant expression, and it shouldn’t be too hard to make them fail gracefully.

-Eli

Exactly. I haven’t actually tried this, but in my imagination, it isn’t too bad: there are three cases for a constant expr containing a div:

  1. constant used by normal instruction: expand inline into the current block, and use the non constant value.
  2. phi node: expand into the appropriate predecessor.
  3. Global variable init or other thing: error, not supported by a backend anyway.

-Chris

A real trap would be preferred. I suspect that it would be difficult/expensive to set all the flags correctly (if that’s even possible). E.g.:

#include <stdio.h>
#include <signal.h>

void handle_sigfpe(int fpe)
{ printf(“Caught signal %d\n”, fpe); }

int main (void) {
signal(SIGFPE, handle_sigfpe);

#if TRAP
int x = 1/0;
#else
raise(SIGFPE);
#endif

return 0;
}

In this pathological case, the RF flag differs so we’ll see an infinite loop.

This sounds a lot like the approach we took to implicit null checks in LLVM. At the IR level, we represent them as explicit control flow. In the backend, we pattern match into a side table of which is used by the signal handler to redirect control flow to the appropriate handler. The mechanisms already in place could be easily reused for idiv.

Philip

This is fundamentally different than the problems we’re trying to handle with the constrained FP intrinsics. The constrained FP intrinsics were necessary because LLVM IR otherwise assumes that FP instructions do not have side effects. In the case of integer divide-by-zero, the optimizer generally recognizes this as a possibility and avoids introducing it (through speculation, for instance). But if we’re optimizing it away as undefined behavior I guess that’s another story.

I’m not against introducing a way to handle that case, but I would be against making it look like the constrained FP intrinsics.

-Andy

This is fundamentally different than the problems we’re trying to handle with the constrained FP intrinsics. The constrained FP intrinsics were necessary because LLVM IR otherwise assumes that FP instructions do not have side effects. In the case of integer divide-by-zero, the optimizer generally recognizes this as a possibility and avoids introducing it (through speculation, for instance). But if we’re optimizing it away as undefined behavior I guess that’s another story.

Eli and I (and others) had this conversation about 5 years ago. I haven’t dug through the C/C++ Standards (or drafts) since then, but I’ll assume that not much has changed (someone please correct me if I’m mistaken). It’s clear that the C/C++ Standards treat integer div-by-zero as undefined behavior. So the compiler can do whatever it wants with it (e.g self-immolation). And LLVM currently does whatever it wants. Here’s InstSimplify for example:

// X / 0 → undef
// X % 0 → undef
// We don’t need to preserve faults!
if (match(Op1, m_Zero()))
return UndefValue::get(Ty);

Since X86 (IA-32/IA-64 IIRC) has a protocol for handling integer div-by-zero, I’d really like the compiler to leave it alone and let the hardware do what it will. I understand that this affects a very small part of the community, so I don’t expect that to happen. But it would be nice if there was some way to access this behavior in an IEEE conformance mode.

I’m not against introducing a way to handle that case, but I would be against making it look like the constrained FP intrinsics.

It’s not really that much different. We’re trying to preserve a trapping side-effect, but it just happens to be integer and not FP. But yeah, I agree, it doesn’t have to be rolled into the constrained FP intrinsics project.

But it would be nice if there was some way to access this behavior in an IEEE conformance mode.

Bah, FP traps on the brain. Obviously not an IEEE-754 issue…