Microsoft vtordisp field support for clang.

Hi all.

This patch provide support for vtordisp fields.
About this hidden fields you can read here vtordisp pragma | Microsoft Docs .
By default Microsoft compiler set /vd1 option and this behavior is default in my patch.
In future I can add /vd0 and /vd2 options to clang if somebody can tell me how can I do this.
Also this patch contain bug fix for virtual base layout in MS mode.

Please review this patch as soon as possible.

  - Dmitry.

vtordisp.patch (14.2 KB)

Ping...

This patch first of several patches.
I have been implemented basic support for MS RTTI.
But MS RTTI is huge patch, so I want divide it to several patches.
If community prefer one big path I will send one big path.

  - Dmitry.

Looks like it should be a LangOption. You should add a -cc1 option to
control it; if/when we add a VS-compatible driver, it would map /vd0 and /vd2
down to that.

I guess we'll also need to implement the pragma.

The information about this thing is very incomplete; it's not really clear
to me exactly how it's used. I guess we can worry about that later.

+ /// VtorDispOffsets - Offsets for adjusting 'this' pointer
+ /// in thunks (Microsoft-only).
+ VtorDispOffsetsMapTy VtorDispOffsets;

This patch provide support for vtordisp fields.
About this hidden fields you can read here vtordisp pragma | Microsoft Docs .
By default Microsoft compiler set /vd1 option and this behavior is default in my patch.
In future I can add /vd0 and /vd2 options to clang if somebody can tell me how can I do this.

Looks like it should be a LangOption. You should add a -cc1 option to
control it; if/when we add a VS-compatible driver, it would map /vd0 and /vd2
down to that.

I guess we'll also need to implement the pragma.

The information about this thing is very incomplete; it's not really clear
to me exactly how it's used. I guess we can worry about that later.

+ /// VtorDispOffsets - Offsets for adjusting 'this' pointer
+ /// in thunks (Microsoft-only).
+ VtorDispOffsetsMapTy VtorDispOffsets;
+

The values in this map are always just sizeof(int) less than the
corresponding VBaseOffset, right?

Yes, now I change VtorDispOffsetsMapTy to SmallVector<const CXXRecordDecl*, 4>.

  I guess we still need to record whether
a base actually requires this at all. I'd be more comfortable with storing this
as additional value information in VBaseOffsets; that will avoid bloating
the RecordLayout for the common case of a class with no virtual bases.

Are you suggest something like this?
struct VBaseInfo {
   CharUnits Offset;
   bool IsVtorDispPresent;
};
llvm::DenseMap<const CXXRecordDecl *, VBaseInfo> VBaseOffsets;

+ /// r4start
+ /// VtorDispOffsets - Offsets of hidden fields.
+ ASTRecordLayout::VtorDispOffsetsMapTy VtorDispOffsets;
+

Please don't tag code with your name.

Removed.

+ bool hasNewVirtualFunction(const CXXRecordDecl *RD,
+ bool OnlyMethods = false) const;

This parameter name makes no sense. All virtual functions are methods.

+ if (method->isVirtual()&& !method->size_overridden_methods()&&
+ !(OnlyMethods&& (method->getKind() == Decl::CXXConstructor ||
+ method->getKind() == Decl::CXXDestructor))) {

I see. Maybe you should call it something like IgnoreDestructor.

Yes, this name more clear.

Also, constructors can never be virtual.

+ if (!RD->hasUserDeclaredConstructor()&&
+ !RD->hasUserProvidedDefaultConstructor()&&

I believe this is redundant.

Removed.

+ // If class has primary base, then it can be primary base.
+ if (Context.getASTRecordLayout(Base).getPrimaryBase()) return true;

Please split the layout fixes unrelated to vtordisp as a separate commit.
Additionally, the next line calls getASTRecordLayout again; please just
re-use the result.

Ok.

+ CharUnits IntSize =
+ CharUnits::fromQuantity(Context.getTargetInfo().getIntWidth() / 8);

Is this really 'int'? Like, on Windows 64, is this still only 32-bits?

Yes, this field on Windows x64 is still int.

+bool RecordLayoutBuilder::needsVtorDisp(const CXXRecordDecl *RD,
+ const CXXRecordDecl *Base) const {

This entire algorithm is grossly inefficient. I believe that all you want is:
   1) Create a set of classes, and fill it with the all virtual bases.
   2) For each virtual method in the entire class hierarchy, remove all the
     classes declaring a method overridden by that method.
   3) A vtordisp field is required for every virtual base class which is no
     longer in the set.
In addition, you can avoid the full hierarchy walk by immediately
removing virtual bases which are known by a direct base class to
require a vtordisp field.

This algorithm has the side-effect of working when a function overrides
two different virtual functions from virtual bases at once.

Please tell me if I am not right.
You suggest remove this function with function like this "void RecordLayoutBuilder::FillVtorDispBases(const CXXRecordDecl *RD)".

This function in pseudo code:

Vbases = getAllVBasesForRD; // First step in your algorithm.
VBasesVtorDisp;

// Then goes step 2, but if we want traverse all virtual methods we must walk throw inheritance path.
foreach (VBase in Vbases) {
   CXXBasePath &Path = getPathFromRDToVBase;
   bool needsVtorDisp = false;

   foreach(Class in Path) {
     // Here we will have another 'for' from Class->method_begin() to Class->method_end(),
     // and check that Class overrides virtual method from VBase.
     if ((Class.hasCtorDeclared || Class.hasDtorDeclared) && Class.overridesVirtualFunctionOfBase) {
       needsVtorDisp = true;
       break;
     }
   }
   if (needsVtorDisp)
     VBasesVtorDisp.push_back(VBase);
}
If I am right then I am not sure that this algorithm is better.

  - Dmitry.

I guess we still need to record whether
a base actually requires this at all. I'd be more comfortable with storing this
as additional value information in VBaseOffsets; that will avoid bloating
the RecordLayout for the common case of a class with no virtual bases.

Are you suggest something like this?
struct VBaseInfo {
CharUnits Offset;
bool IsVtorDispPresent;
};
llvm::DenseMap<const CXXRecordDecl *, VBaseInfo> VBaseOffsets;

Yep, that's what I had in mind.

+ CharUnits IntSize =
+ CharUnits::fromQuantity(Context.getTargetInfo().getIntWidth() / 8);

Is this really 'int'? Like, on Windows 64, is this still only 32-bits?

Yes, this field on Windows x64 is still int.

Okay, that looks good, then.

+bool RecordLayoutBuilder::needsVtorDisp(const CXXRecordDecl *RD,
+ const CXXRecordDecl *Base) const {

This entire algorithm is grossly inefficient. I believe that all you want is:
  1) Create a set of classes, and fill it with the all virtual bases.
  2) For each virtual method in the entire class hierarchy, remove all the
    classes declaring a method overridden by that method.
  3) A vtordisp field is required for every virtual base class which is no
    longer in the set.
In addition, you can avoid the full hierarchy walk by immediately
removing virtual bases which are known by a direct base class to
require a vtordisp field.

This algorithm has the side-effect of working when a function overrides
two different virtual functions from virtual bases at once.

Please tell me if I am not right.
You suggest remove this function with function like this "void RecordLayoutBuilder::FillVtorDispBases(const CXXRecordDecl *RD)".

This function in pseudo code:

Vbases = getAllVBasesForRD; // First step in your algorithm.
VBasesVtorDisp;

// Then goes step 2, but if we want traverse all virtual methods we must walk throw inheritance path.
foreach (VBase in Vbases) {
CXXBasePath &Path = getPathFromRDToVBase;
bool needsVtorDisp = false;

foreach(Class in Path) {
   // Here we will have another 'for' from Class->method_begin() to Class->method_end(),
   // and check that Class overrides virtual method from VBase.
   if ((Class.hasCtorDeclared || Class.hasDtorDeclared) && Class.overridesVirtualFunctionOfBase) {
     needsVtorDisp = true;
     break;
   }
}
if (needsVtorDisp)
   VBasesVtorDisp.push_back(VBase);
}
If I am right then I am not sure that this algorithm is better.

I don't think this needs to be path-dependent, or at least not path-dependent in any way beyond what's encapsulated in the usual override rules. Either the most-derived class directly overrides a method from a virtual base class, or one of its base classes does. That is, I think the algorithm just needs to be something like this:

  var basesThatNeedVtorDisps : set<class>
  for each (base in mostDerivedClass.directBases)
    basesThatNeedVtorDisps.addAll(base.basesThatNeedVtorDisps)
  for each (method in mostDerivedClass.virtualMethods)
    for each (overriddenMethod in method.overriddenMethods)
      if (mostDerivedClass.virtualBases.contains(overriddenMethod.declaringClass))
        basesThatNeedVtorDisps.add(overriddenMethod.declaringClass);

The only tricky thing is that you might be overriding a method from a non-virtual base of a virtual base; I don't even know what that's supposed to do for these vtordisp fields, though. Could you check? The test case would be something like this:

  class A { virtual void foo(); }
  class B : public A { };
  class C : public virtual B { C(); virtual void foo(); };

John.

Also for this test case:

class A { virtual void foo(){} };
class B : public A { };
class C : public virtual B { C(){} virtual void foo(){} };
class D : public C { int qqq; };

   class D size(16):

Okay. I don't think this fundamentally changes the algorithm. I did think of
a case that might matter, though: a superclass might override a method
from a non-virtual base that is *also* a virtual base in some derived class.
That is to say, is there a vtordisp in 'C' in the following code?

class A { virtual void foo(); };
class B: public A { virtual void foo(); B(); };
class C : public B, public virtual A { C(); };

John.

Class C does not have vtordisp, because C does not override 'foo' and B inherit A non-virtual.

  - Dmitry.

Please see this patch.
I am not sure that new algorithm very efficient, but I think new algorithm is better than old one.
Also I refactor condition in 1472 line RecordLayoutBuilder.cpp.
Without this fix new test cases in ms_class_layout.cpp does not pass.

  - Dmitry.

vtordisp.patch (18.3 KB)

Okay. I was just noticing that 'C' installs a construction v-table for A, and
that v-table has to include a thunk for foo() that's really in exactly the
same situation as a thunk would be if B inherited from A virtually. I guess
Microsoft just gave up in this case.

John.

Please see this patch.

You don't need the store both the set of vbases with vtordisps and the
additional information in the virtual-base-offset map. I would prefer that
you rely only on the second; it has the distinct virtual of imposing no
extra overload to the layouts of classes without virtual bases.

I am not sure that new algorithm very efficient, but I think new algorithm is better than old one.

Why is this influenced by whether the class has a new virtual function?

Here's my suggested (untested) implementation:

void RecordLayoutBuilder::getRequiredVtorDisps(const CXXRecordDecl *RD, llvm::SmallPtrSet<const CXXRecordDecl*> &VBasesRequiringVtorDisp) {
  if (RD->vbases_empty()) return;

  // As an optimization, remember the set of virtual bases that we haven't decided yet.
  llvm::SmallPtrSet<const CXXRecordDecl *> UndecidedVBases;
  for (CXXRecordDecl::base_class_const_iterator I = RD->vbases_begin(),
       E = RD->vbases_end(); I != E; ++I) {
    UndecidedVBases.insert(*I);
  }

  // A virtual base requires a vtor disp if any of our base classes require it to have one.
  for (CXXRecordDecl::base_class_const_iterator I = RD->bases_begin(),
       E = RD->bases_end(); I != E; ++I) {
    const CXXRecordDecl *BaseDecl = I->getType()->getAsCXXRecordDecl();
    const ASTRecordLayout &BaseLayout = Context.getASTRecordLayout(BaseDecl);
    for (ASTRecordLayout::VBaseOffsetsMapTy::const_iterator
          VI = BaseLayout.VBaseOffsetsMap.begin(), VE = BaseLayout.VBaseOffsetsMap.end(); VI != VE; ++VI) {
      if (!VI->second.IsVtorDispPresent) continue;
      if (!UndecidedVBases.erase(VI->first)) continue;
      VBasesRequiringVtorDisp.insert(VI->first);
      if (UndecidedVBases.empty()) return;
    }
  }

  // Otherwise, a virtual base requires a vtor disp if any the non-destructor
  // virtual methods declared in this class directly override a method provided
  // by that virtual base.

  // Collect the set of bases directly overridden by any method in this class.
  // It's possible that some of these classes won't be virtual bases, or won't be
  // provided by virtual bases, or won't be virtual bases in the overridden
  // instance but are virtual bases elsewhere. Only the last matters for what
  // we're doing, and we can ignore those: if we don't directly override
  // a method provided by a virtual copy of a base class, but we do directly
  // override a method provided by a non-virtual copy of that base class,
  // then we must indirectly override the method provided by the virtual base,
  // and so we should already have collected it in the loop above.
  llvm::SmallPtrSet<const CXXRecordDecl*> OverriddenBases;
  for (CXXRecordDecl::method_iterator M = RD->method_begin(),
       E = RD->method_end(); M != E; ++M) {
    if (!M->isVirtual() || isa<CXXDestructorDecl>(M)) continue;
    for (CXXMethodDecl::method_iterator I = M->begin_overridden_methods(),
          E = M->end_overridden_methods(); I != E; ++I) {
      const CXXRecordDecl *OverriddenBase = (*I)->getParent();

      // As an optimization, check immediately whether we're overriding
      // something from the undecided set.
      if (UndecidedVBases.erase(OverriddenBase)) {
        VBasesRequiringVtorDisp.insert(OverriddenBase);
        if (UndecidedVBases.empty()) return;
        continue;
      }

      // Otherwise, just collect it.
      OverriddenBases.insert(OverriddenBase);
    }
  }

  // Walk the undecided v-bases and check whether they (non-virtually)
  // provide any of the overridden bases.
  for (llvm::SmallPtrSet<const CXXRecordDecl*>::const_iterator
        I = UndecidedVBases.begin(), E = UndecidedVBases.end(); I != E; ++I) {
    if (hasAnyNonVirtualBase(*I, OverriddenBases))
      VBasesRequiringVtorDisp.insert(*I);
  }
}

Where hasAnyNonVirtualBase just recursively walks the non-virtual
class hierarchy and checks whether any of the bases are in the set.

Also I refactor condition in 1472 line RecordLayoutBuilder.cpp.

Yes, that part looks fine; you can separate that out as a different patch, I think.

John.

Please see this patch.

You don't need the store both the set of vbases with vtordisps and the
additional information in the virtual-base-offset map. I would prefer that
you rely only on the second; it has the distinct virtual of imposing no
extra overload to the layouts of classes without virtual bases.

I am not sure that new algorithm very efficient, but I think new algorithm is better than old one.

Why is this influenced by whether the class has a new virtual function?

This check need if we have something like this:
class first{
public:
   virtual void asdf() {}
     virtual void g(){}
};
class second : public virtual first {
public :
   second(){}
   virtual void g(){}
   virtual void asdf() {}
};
class third : public virtual second {
public:
   third(){}
   virtual void g() {}
   virtual void asdf() {}
};
Only first will have vtordisp. Also flags in condition help to ignore this check in next case:
struct AV {
   virtual void foo();
};
struct BV : AV {
};
struct C : virtual BV {
   C();
   virtual void foo();
};

Here's my suggested (untested) implementation:

void RecordLayoutBuilder::getRequiredVtorDisps(const CXXRecordDecl *RD, llvm::SmallPtrSet<const CXXRecordDecl*> &VBasesRequiringVtorDisp) {
   if (RD->vbases_empty()) return;

   // As an optimization, remember the set of virtual bases that we haven't decided yet.
   llvm::SmallPtrSet<const CXXRecordDecl *> UndecidedVBases;
   for (CXXRecordDecl::base_class_const_iterator I = RD->vbases_begin(),
        E = RD->vbases_end(); I != E; ++I) {
     UndecidedVBases.insert(*I);
   }

   // A virtual base requires a vtor disp if any of our base classes require it to have one.
   for (CXXRecordDecl::base_class_const_iterator I = RD->bases_begin(),
        E = RD->bases_end(); I != E; ++I) {
     const CXXRecordDecl *BaseDecl = I->getType()->getAsCXXRecordDecl();
     const ASTRecordLayout&BaseLayout = Context.getASTRecordLayout(BaseDecl);
     for (ASTRecordLayout::VBaseOffsetsMapTy::const_iterator
           VI = BaseLayout.VBaseOffsetsMap.begin(), VE = BaseLayout.VBaseOffsetsMap.end(); VI != VE; ++VI) {
       if (!VI->second.IsVtorDispPresent) continue;
       if (!UndecidedVBases.erase(VI->first)) continue;
       VBasesRequiringVtorDisp.insert(VI->first);
       if (UndecidedVBases.empty()) return;
     }
   }

   // Otherwise, a virtual base requires a vtor disp if any the non-destructor
   // virtual methods declared in this class directly override a method provided
   // by that virtual base.

   // Collect the set of bases directly overridden by any method in this class.
   // It's possible that some of these classes won't be virtual bases, or won't be
   // provided by virtual bases, or won't be virtual bases in the overridden
   // instance but are virtual bases elsewhere. Only the last matters for what
   // we're doing, and we can ignore those: if we don't directly override
   // a method provided by a virtual copy of a base class, but we do directly
   // override a method provided by a non-virtual copy of that base class,
   // then we must indirectly override the method provided by the virtual base,
   // and so we should already have collected it in the loop above.
   llvm::SmallPtrSet<const CXXRecordDecl*> OverriddenBases;
   for (CXXRecordDecl::method_iterator M = RD->method_begin(),
        E = RD->method_end(); M != E; ++M) {
     if (!M->isVirtual() || isa<CXXDestructorDecl>(M)) continue;
     for (CXXMethodDecl::method_iterator I = M->begin_overridden_methods(),
           E = M->end_overridden_methods(); I != E; ++I) {
       const CXXRecordDecl *OverriddenBase = (*I)->getParent();

       // As an optimization, check immediately whether we're overriding
       // something from the undecided set.
       if (UndecidedVBases.erase(OverriddenBase)) {
         VBasesRequiringVtorDisp.insert(OverriddenBase);
         if (UndecidedVBases.empty()) return;
         continue;
       }

       // Otherwise, just collect it.
       OverriddenBases.insert(OverriddenBase);
     }
   }

   // Walk the undecided v-bases and check whether they (non-virtually)
   // provide any of the overridden bases.
   for (llvm::SmallPtrSet<const CXXRecordDecl*>::const_iterator
         I = UndecidedVBases.begin(), E = UndecidedVBases.end(); I != E; ++I) {
     if (hasAnyNonVirtualBase(*I, OverriddenBases))
       VBasesRequiringVtorDisp.insert(*I);
   }
}

Where hasAnyNonVirtualBase just recursively walks the non-virtual
class hierarchy and checks whether any of the bases are in the set.

Thx for your algorithm, it works fine in general. I add one condition and write hasAnyNonVirtualBase.
If realization is ok, then I cleanup code and relocate it to ToT.
Also I include patch for condition in Layout function, this patch must be committed before vtordisp patch.

  - Dmitry.

ms-layout.patch (1.32 KB)

new-vtordisp.patch (19.5 KB)

Ping...
Is patch ok?
I include relocated patch to newer revision.

  - Dmitry.

new-vtordisp.patch (19 KB)

I'm sorry, I'm going to be on vacation for two weeks and won't get the chance to review this.

John.

Ok, I remind you later.
Thx for your response.

  • Dmitry.

Hi!

Sorry for the delay, but at now I am writing my diploma so I don't have enough time.
I attached patch with vtordisp functionality.
Please review.

  - Dmitry.

vtordisp.patch (19 KB)

I revised this a bit more and checked in as r155905 (tweaked in r155906).

I added an extra test case (in the 'test1' namespace) to the bottom of
ms_class_layout.cpp, but I'm only guessing that this is the right result;
I'd appreciate it if you'd check this:

namespace test1 {
  struct A { virtual void foo(); };
  struct B : A {};
  struct C : virtual A, virtual B { C(); virtual void foo(); };
  void test() { C *c; }
}

I claim that both B and A should have vtordisps in C.

John.

Hi.

Yes you are right, B and A have vtordisp fields in C.

  - Dmitry Sokolov.