Undefined behavior in IntrinsicInst?

IntrinsicInst has all of its constructors deleted here:

class IntrinsicInst : public CallInst {
public:
  IntrinsicInst() = delete;
  IntrinsicInst(const IntrinsicInst &) = delete;
  IntrinsicInst &operator=(const IntrinsicInst &) = delete;

which means there can’t ever be instances of IntrinsicInst or any of its children. However, this class is actively being used in casts for intrinsic matching, for example for llvm.instrprof.increment.step here:

      if (auto *IPIS = dyn_cast<InstrProfIncrementInstStep>(&Instr)) {
        lowerIncrement(IPIS);
        MadeChange = true;
      }

Here, InstrProfIncrementInstStep is inherited from InstrProfIncrementInst, which inherits from InstrProfCntrInstBase, which inherits from InstrProfInstBase, which inherits from IntrinsicInst. AFAIU, this code is essentially static_casting the actual CallInst instance to InstrProfIncrementInstStep (a downcast), which according to cppreference, I believe, is UB:

static_cast<target-type>(expression)

  1. If target-type is a reference to some complete class D and expression is an lvalue of its non-virtual base B, or target-type is a pointer to some complete class D and expression is a prvalue pointer to its non-virtual base B, static_cast performs a downcast.
    <…>
    If the object expression refers or points to is actually a base class subobject of an object of type D, the result refers to the enclosing object of type D. Otherwise, the behavior is undefined[.]

I’ve also found a similar discussion on Operator utility class: Undefined behavior in Operator class?, however I don’t think the code was changed.

So it this actually UB and should something be done about this?

The classof implementations in intrinsics don’t violate classof contract, I guess, defined here as:

To be more precise, let classof be inside a class C. Then the contract for classof is “return true if the dynamic type of the argument is-a C”. As long as your implementation fulfills this contract, you can tweak and optimize it as much as you want.

But to prevent Operator (and IntrinsicInst) situation shouldn’t it be the opposite, like “return true only if the dynamic type of the argument is-a C” or even “return true if and only if the dynamic type of the argument is-a C”?

I agree that this is UB. I suspect it’s one of those things about core LLVM that you just don’t talk about in polite company :wink:

Using casts of this general form is obviously extremely valuable and any solution needs some kind of replacement that is humanly possible to migrate to. A tall order.

We can’t solve this by creating the instructions with “the right type” because the hierachy of relevant types isn’t a tree.

The only potential solution I came across would be to migrate towards some kind of “Ref” classes. In your example, InstrProfIncrementInstStep would become a lightweight class that only holds a pointer to the underlying CallInst. The dyn_cast would then no longer return a C++ pointer but simply a value of that class (or perhaps a std::optional<InstrProfIncrementInstStep>). Details may vary, but in any case making this ergonomic would basically require moving at least the entire Instruction hierarchy (but probably the entire Value hierarchy) over to this style. Which is technically possible but a pretty insane level of effort.

1 Like