Undefined behavior in Operator class?

If I understand correctly, Operator is basically a trivial temporary
object that holds some common functionality for both Instruction and
ConstantExpr.

It looks to me like Operator is doing something illegal though. For
example, in this member function of class Operator (in
include/llvm/Operator.h):

  /// getOpcode - Return the opcode for this Instruction or ConstantExpr.
  ///
  unsigned getOpcode() const {
    if (const Instruction *I = dyn_cast<Instruction>(this))
      return I->getOpcode();
    return cast<ConstantExpr>(this)->getOpcode();
  }

In this function, the `this` pointer is not pointing at an Operator
object. That's undefined behavior, right?

The problem that this Operator is trying to solve is similar to the
situation with DeclContext's inside Clang. The primary difference is
that while Operator isn't actualy a parent, DeclContext is a proper
parent (using multiple inheritance), and so these casts become legit
cross-casts for DeclContext. I'm not sure if multiple inheritance is
the right solution here, but we definitely don't want UB. Maybe it can
just store a User* to the Instruction/ConstantExpr?

Thoughts? Is this UB?

CC'ing Dan Gohman since it looks like he's the one that initially put
Operator in place.

-- Sean Silva

Yes.

-Eli

In this function, the `this` pointer is not pointing at an Operator
object. That's undefined behavior, right?

Yes.

Well then I guess the next step is how to fix it.

Maybe someone with a bit more experience with the Operator code and
how it is used could make a suggestion? (I just stumbled upon this
code yesterday while doing some refactoring, so I'm not intimately
familiar with how it is generally used).

I would really rather not use multiple inheritance for this. It looks
like a lot of what it does could be maybe be done through traits? Can
someone with more experience with the code weigh in?

-- Sean Silva

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Sean Silva
Subject: Re: [LLVMdev] Undefined behavior in Operator class?

> > In this function, the `this` pointer is not pointing at an Operator
> > object. That's undefined behavior, right?

> Yes.

Well then I guess the next step is how to fix it.

I tried removing the method and rebuilding. The resulting compilation errors were easily fixed by changing the few occurrences in:
  include/llvm/Support/PatternMatch.h
  lib/Analysis/BasicAliasAnalysis.cpp
  lib/Analysis/ScalarEvolution.cpp
  lib/Analysis/ValueTracking.cpp
  lib/CodeGen/ScheduleDAGInstrs.cpp
  lib/CodeGen/SelectionDAG/FastISel.cpp
to use the static variant of Operator::getOpcode(). (This was with 3.1; trunk may be different.)

I suppose one could also modify the non-static method to invoke the static one, passing "this" as the argument, but that just doesn't feel right...

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

I tried removing the method and rebuilding. The resulting compilation errors were easily fixed by changing the few occurrences in:
        include/llvm/Support/PatternMatch.h
        lib/Analysis/BasicAliasAnalysis.cpp
        lib/Analysis/ScalarEvolution.cpp
        lib/Analysis/ValueTracking.cpp
        lib/CodeGen/ScheduleDAGInstrs.cpp
        lib/CodeGen/SelectionDAG/FastISel.cpp
to use the static variant of Operator::getOpcode(). (This was with 3.1; trunk may be different.)

That seems like probably the best short term solution. It doesn't fix
the core problem though, which is that Instruction and ConstantExpr
are getting cast<>'d to Operator (or one of its subclasses), in an
illegal fashion.

I suppose one could also modify the non-static method to invoke the static one, passing "this" as the argument, but that just doesn't feel right...

That would still be UB.

-- Sean Silva

From: Sean Silva [mailto:silvas@purdue.edu]
Subject: Re: [LLVMdev] Undefined behavior in Operator class?

It doesn't fix the core problem though, which is that Instruction
and ConstantExpr are getting cast<>'d to Operator (or one of its
subclasses), in an illegal fashion.

This appears to be completely avoidable in nearly all of the situations, since the result of the dyn_cast<>() is used primarily to invoke the getOpcode() method. In a few places, it's used to call a method in the User class. Perhaps the dyn_cast to Operator could be replaced by one to User, avoiding any usage of an instance of Operator.

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.