Adding support for multiple non-virtual inheritance for -cxx-abi microsoft

Hi John,

I wanted to work on the multiple non-virtual inheritance support for
the Microsoft ABI.
I already have a local patch that generates code that works, but I
think it requires quite some polishing.

There are a few general questions I'd like to ask first as the Clang
architecture assumes some high-level things that are not true in the
Microsoft ABI.

Let's assume we have
struct A { virtual void a(); }
struct B { virtual void b(); }
struct C : A, B { virtual void b(); }

In Itanium ABI, C::b takes C* as an implicit "this" parameter;
in order to work with B* pointers, there is an adjusting thunk in the
C's vtable slot for "b".

In Microsoft ABI, C::b takes B* as an implicit "this" parameter.
No thunks are needed for vtable generation.
I've only observed thunks when I took a member pointer for a class
with multiple inheritance, but it was not specific to any particular
method.

So here are the questions:

Q1) Passing this to overriden methods
I assume the type of C::b should be "void <...>(%struct.B* %this)" -
does this look reasonable?
Can you suggest how to change the CodeGen to handle this appropriately?
Do you know a non-painful way to do that?
I think I'll need to:
a) adjust CodeGenTypes::arrangeCXXMethodType
  ... and now it needs not only the RD and FTP, but MD also.
  Instead of "argTypes.push_back(GetThisType(Context, RD));"
  I'll probably need to ask CGCXXABI on the this type?
  Is there an easy way to tell the most base class which declared a
given virtual method?

b) adjust all the places where the method may be called.
btw, is there an easy way to know the base class offset in a given
class in the CGCall ?
Or do I need to move some code from VTableBuilder for that?

Q2) thunks
As thunks are not needed for vtable generation, I don't think we need
to emit them when there are no virtual bases.
I'm not sure how to handle this in VTableBuilder as there are quite a
few places where this can be possible.
One option is to short-circuit the method iteration in
VTableBuilder::ComputeThisAdjustments if there are no virtual bases,
the other is to change the ComputeThisAdjustment method to work as if
the adjustment was not needed.

Q3) vtable layout
In Itanium ABI, there's one vtable for each class.
The non-virtual bases get address points somewhere in the middle of this vtable.

In Microsoft ABI, each base gets its own vtable, e.g. for the class C we'll have
"C vtable for the A part" and "C vtable for the B part".

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.
We can go our own Itanium-like way and emit one vtable with address
points and still make this ABI-compatible (this is how my local patch
works) but I have a bad feeling that this may fail to work when we
start supporting virtual inheritance.

Do you have any thoughts or suggestions on this?

Q3) vtable layout
In Itanium ABI, there's one vtable for each class.
The non-virtual bases get address points somewhere in the middle of this
vtable.

In Microsoft ABI, each base gets its own vtable, e.g. for the class C
we'll have
"C vtable for the A part" and "C vtable for the B part".

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

It's actually more than that. Because a non-virtual base can appear more
than once as a subobject, you need the base of the current subobject and
the inheritance path to the most derived class.

This came up in the virtual base table mangling patch I sent. I assume
we'll need similar logic at some point.

We can go our own Itanium-like way and emit one vtable with address
points and still make this ABI-compatible (this is how my local patch
works) but I have a bad feeling that this may fail to work when we
start supporting virtual inheritance.

IMO we should have two separate vtable builders, like we have two name
manglers. The comments on VTableContext::isMicrosoftABI() talk about
adding such a layer. They may be able to produce the same layout classes
when done. I think we can go incremental and cross that bridge later, but
that's what we should aim for.

Q3) vtable layout
In Itanium ABI, there's one vtable for each class.
The non-virtual bases get address points somewhere in the middle of this
vtable.

In Microsoft ABI, each base gets its own vtable, e.g. for the class C
we'll have
"C vtable for the A part" and "C vtable for the B part".

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

It's actually more than that. Because a non-virtual base can appear more
than once as a subobject, you need the base of the current subobject and the
inheritance path to the most derived class.

This came up in the virtual base table mangling patch I sent. I assume
we'll need similar logic at some point.

Agree!

We can go our own Itanium-like way and emit one vtable with address
points and still make this ABI-compatible (this is how my local patch
works) but I have a bad feeling that this may fail to work when we
start supporting virtual inheritance.

IMO we should have two separate vtable builders, like we have two name
manglers. The comments on VTableContext::isMicrosoftABI() talk about adding
such a layer. They may be able to produce the same layout classes when
done. I think we can go incremental and cross that bridge later, but that's
what we should aim for.

Sure.
I've added the comment btw :slight_smile:

I wanted to work on the multiple non-virtual inheritance support for
the Microsoft ABI.
I already have a local patch that generates code that works, but I
think it requires quite some polishing.

There are a few general questions I'd like to ask first as the Clang
architecture assumes some high-level things that are not true in the
Microsoft ABI.

Let's assume we have
struct A { virtual void a(); }
struct B { virtual void b(); }
struct C : A, B { virtual void b(); }

In Itanium ABI, C::b takes C* as an implicit "this" parameter;
in order to work with B* pointers, there is an adjusting thunk in the
C's vtable slot for "b".

In Microsoft ABI, C::b takes B* as an implicit "this" parameter.
No thunks are needed for vtable generation.
I've only observed thunks when I took a member pointer for a class
with multiple inheritance, but it was not specific to any particular
method.

Interesting. Does MSVC make a thunk for B's vtable in this, or does it
just double-emit the function body?
  struct A { virtual void a(); };
  struct B { virtual void a(); };
  struct C : A, B { virtual void a(); };

I also find it curious that MSVC uses a thunk for member pointers, since
the required this-adjustment is already plainly expressible in the member
pointer value.

Q1) Passing this to overriden methods
I assume the type of C::b should be "void <...>(%struct.B* %this)" -
does this look reasonable?

Yes, that seems reasonable.

Can you suggest how to change the CodeGen to handle this appropriately?
Do you know a non-painful way to do that?
I think I'll need to:
a) adjust CodeGenTypes::arrangeCXXMethodType
... and now it needs not only the RD and FTP, but MD also.
Instead of "argTypes.push_back(GetThisType(Context, RD));"
I'll probably need to ask CGCXXABI on the this type?

I would just make sure to pass the right RD down. The code calling
this will need to be significantly different anyway, since the call algorithm
is different.

Is there an easy way to tell the most base class which declared a
given virtual method?

That doesn't have a unique answer.

You can walk up the override chains, although it's possible that you'll
need to stop at a virtual base boundary.

b) adjust all the places where the method may be called.
btw, is there an easy way to know the base class offset in a given
class in the CGCall ?

You'll need to construct a base path as you walk the override chains,
and then basically do that derived-to-base conversion.

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

As Reid suggested, I think the way to go here is to make VTableBuilder
become ItaniumVTableBuilder and make a new MicrosoftVTableBuilder.

There should be a lot of common behavior you can share between
the implementations.

John.

I wanted to work on the multiple non-virtual inheritance support for
the Microsoft ABI.
I already have a local patch that generates code that works, but I
think it requires quite some polishing.

There are a few general questions I'd like to ask first as the Clang
architecture assumes some high-level things that are not true in the
Microsoft ABI.

Let's assume we have
struct A { virtual void a(); }
struct B { virtual void b(); }
struct C : A, B { virtual void b(); }

In Itanium ABI, C::b takes C* as an implicit "this" parameter;
in order to work with B* pointers, there is an adjusting thunk in the
C's vtable slot for "b".

In Microsoft ABI, C::b takes B* as an implicit "this" parameter.
No thunks are needed for vtable generation.
I've only observed thunks when I took a member pointer for a class
with multiple inheritance, but it was not specific to any particular
method.

Interesting. Does MSVC make a thunk for B's vtable in this, or does it
just double-emit the function body?
  struct A { virtual void a(); };
  struct B { virtual void a(); };
  struct C : A, B { virtual void a(); };

Great question - it does emit a B-to-(A,C) adjusting thunk for the "C
vtable for the B part"!

I also find it curious that MSVC uses a thunk for member pointers, since
the required this-adjustment is already plainly expressible in the member
pointer value.

Me too actually.
Reid, wdyt?

Q1) Passing this to overriden methods
I assume the type of C::b should be "void <...>(%struct.B* %this)" -
does this look reasonable?

Yes, that seems reasonable.

Can you suggest how to change the CodeGen to handle this appropriately?
Do you know a non-painful way to do that?
I think I'll need to:
a) adjust CodeGenTypes::arrangeCXXMethodType
... and now it needs not only the RD and FTP, but MD also.
Instead of "argTypes.push_back(GetThisType(Context, RD));"
I'll probably need to ask CGCXXABI on the this type?

I would just make sure to pass the right RD down.

Ah, yeah, I've missed this idea :slight_smile:

The code calling
this will need to be significantly different anyway, since the call algorithm
is different.

Is there an easy way to tell the most base class which declared a
given virtual method?

That doesn't have a unique answer.

You can walk up the override chains, although it's possible that you'll
need to stop at a virtual base boundary.

b) adjust all the places where the method may be called.
btw, is there an easy way to know the base class offset in a given
class in the CGCall ?

You'll need to construct a base path as you walk the override chains,
and then basically do that derived-to-base conversion.

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

As Reid suggested, I think the way to go here is to make VTableBuilder
become ItaniumVTableBuilder and make a new MicrosoftVTableBuilder.

There should be a lot of common behavior you can share between
the implementations.

Should they have a common base / interface or just separate at first?

I think it allows them to avoid the union between non-virtual methods
and virtual methods. Seems a bit cleaner and more obvious to me, but
it has tradeoffs in terms of code size at the call site and the number
of conditional vs. indirect branches that you have to do:
indirect to thunk and indirect through vtable, vs conditional between
two indirect calls

Actualy, in MSVC API, the pointer to member function are just sizeof(void*) in
the single inheritance case. It does not contains enough bits to put an offset.
(We had that discussion before :slight_smile: )

http://blogs.msdn.com/b/oldnewthing/archive/2004/02/09/70002.aspx

If you think there's something that can be interestingly abstracted out, feel
free, but I'm not sure what it would be off-hand.

John.

Oh, does MSVC not do the union thing? They always make a thunk to do
the virtual call?

If so, this "thunk" is potentially quite a bit more than just a thunk — it may
actually have ABI pointer-equality requirements on it for e.g. member
pointer equality tests.

John.

We were originally just talking about the multiple-inheritance case here,
although it seems like there may have been confusion — it sounds like there
may be a thunk for virtual member functions regardless of inheritance model.

John.

I wanted to work on the multiple non-virtual inheritance support for
the Microsoft ABI.
I already have a local patch that generates code that works, but I
think it requires quite some polishing.

There are a few general questions I'd like to ask first as the Clang
architecture assumes some high-level things that are not true in the
Microsoft ABI.

Let's assume we have
struct A { virtual void a(); }
struct B { virtual void b(); }
struct C : A, B { virtual void b(); }

In Itanium ABI, C::b takes C* as an implicit "this" parameter;
in order to work with B* pointers, there is an adjusting thunk in the
C's vtable slot for "b".

In Microsoft ABI, C::b takes B* as an implicit "this" parameter.
No thunks are needed for vtable generation.
I've only observed thunks when I took a member pointer for a class
with multiple inheritance, but it was not specific to any particular
method.

Interesting. Does MSVC make a thunk for B's vtable in this, or does it
just double-emit the function body?
struct A { virtual void a(); };
struct B { virtual void a(); };
struct C : A, B { virtual void a(); };

Great question - it does emit a B-to-(A,C) adjusting thunk for the "C
vtable for the B part"!

I also find it curious that MSVC uses a thunk for member pointers, since
the required this-adjustment is already plainly expressible in the member
pointer value.

Me too actually.
Reid, wdyt?

Q1) Passing this to overriden methods
I assume the type of C::b should be "void <...>(%struct.B* %this)" -
does this look reasonable?

Yes, that seems reasonable.

Can you suggest how to change the CodeGen to handle this appropriately?
Do you know a non-painful way to do that?
I think I'll need to:
a) adjust CodeGenTypes::arrangeCXXMethodType
... and now it needs not only the RD and FTP, but MD also.
Instead of "argTypes.push_back(GetThisType(Context, RD));"
I'll probably need to ask CGCXXABI on the this type?

I would just make sure to pass the right RD down.

Ah, yeah, I've missed this idea :slight_smile:

The code calling
this will need to be significantly different anyway, since the call algorithm
is different.

Is there an easy way to tell the most base class which declared a
given virtual method?

That doesn't have a unique answer.

You can walk up the override chains, although it's possible that you'll
need to stop at a virtual base boundary.

b) adjust all the places where the method may be called.
btw, is there an easy way to know the base class offset in a given
class in the CGCall ?

You'll need to construct a base path as you walk the override chains,
and then basically do that derived-to-base conversion.

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

As Reid suggested, I think the way to go here is to make VTableBuilder
become ItaniumVTableBuilder and make a new MicrosoftVTableBuilder.

Do you think it's OK to do the rename as the first step, with
post-commit review?

There should be a lot of common behavior you can share between
the implementations.

Should they have a common base / interface or just separate at first?

If you think there's something that can be interestingly abstracted out, feel
free, but I'm not sure what it would be off-hand.

Just wanted to clarify if I'm reading this correctly: your suggested
default approach is "just write a new separate MicrosoftVTableBuilder
from scratch", right?
Do you think it's OK to put it into the same VTableBuilder.cpp file so
it's easier to reuse some handy static functions out there?
Probably a better filename would be VTableBuilders.cpp as it'll be
building different kinds of vtables (one-for-all vtable in ItaniumABI,
vftables and maybe vbtables in Microsoft ABI).

I wanted to work on the multiple non-virtual inheritance support for
the Microsoft ABI.
I already have a local patch that generates code that works, but I
think it requires quite some polishing.

There are a few general questions I'd like to ask first as the Clang
architecture assumes some high-level things that are not true in the
Microsoft ABI.

Let's assume we have
struct A { virtual void a(); }
struct B { virtual void b(); }
struct C : A, B { virtual void b(); }

In Itanium ABI, C::b takes C* as an implicit "this" parameter;
in order to work with B* pointers, there is an adjusting thunk in the
C's vtable slot for "b".

In Microsoft ABI, C::b takes B* as an implicit "this" parameter.
No thunks are needed for vtable generation.
I've only observed thunks when I took a member pointer for a class
with multiple inheritance, but it was not specific to any particular
method.

Interesting. Does MSVC make a thunk for B's vtable in this, or does it
just double-emit the function body?
struct A { virtual void a(); };
struct B { virtual void a(); };
struct C : A, B { virtual void a(); };

Great question - it does emit a B-to-(A,C) adjusting thunk for the "C
vtable for the B part"!

I also find it curious that MSVC uses a thunk for member pointers, since
the required this-adjustment is already plainly expressible in the member
pointer value.

Me too actually.
Reid, wdyt?

Q1) Passing this to overriden methods
I assume the type of C::b should be "void <...>(%struct.B* %this)" -
does this look reasonable?

Yes, that seems reasonable.

Can you suggest how to change the CodeGen to handle this appropriately?
Do you know a non-painful way to do that?
I think I'll need to:
a) adjust CodeGenTypes::arrangeCXXMethodType
... and now it needs not only the RD and FTP, but MD also.
Instead of "argTypes.push_back(GetThisType(Context, RD));"
I'll probably need to ask CGCXXABI on the this type?

I would just make sure to pass the right RD down.

Ah, yeah, I've missed this idea :slight_smile:

The code calling
this will need to be significantly different anyway, since the call algorithm
is different.

Is there an easy way to tell the most base class which declared a
given virtual method?

That doesn't have a unique answer.

You can walk up the override chains, although it's possible that you'll
need to stop at a virtual base boundary.

b) adjust all the places where the method may be called.
btw, is there an easy way to know the base class offset in a given
class in the CGCall ?

You'll need to construct a base path as you walk the override chains,
and then basically do that derived-to-base conversion.

We do the same in Clang, but that'd require some API changes to
VTableBuilder - i.e. you'll need to know a pair of classes to generate
a vtable, not just one.

As Reid suggested, I think the way to go here is to make VTableBuilder
become ItaniumVTableBuilder and make a new MicrosoftVTableBuilder.

Do you think it's OK to do the rename as the first step, with
post-commit review?

Sure.

There should be a lot of common behavior you can share between
the implementations.

Should they have a common base / interface or just separate at first?

If you think there's something that can be interestingly abstracted out, feel
free, but I'm not sure what it would be off-hand.

Just wanted to clarify if I'm reading this correctly: your suggested
default approach is "just write a new separate MicrosoftVTableBuilder
from scratch", right?

More or less.

Do you think it's OK to put it into the same VTableBuilder.cpp file so
it's easier to reuse some handy static functions out there?

Yeah, that's fine.

Probably a better filename would be VTableBuilders.cpp as it'll be
building different kinds of vtables (one-for-all vtable in ItaniumABI,
vftables and maybe vbtables in Microsoft ABI).

Abstractly, yes, but it's such a minor win that I'm not sure it's worthwhile.

John.

I don't believe there's any need for that -- comparisons on pointers to
virtual member functions produce unspecified results (see [expr.eq]p2,
second-last sentence).

Oh, interesting. I guess Itanium made it work because it fell out easily enough to do so.

John.

Even on Itanium, it (arguably) doesn't always work:

struct A { virtual void f() = 0; };
struct B { virtual void f() = 0; };
struct C : A, B { virtual void f(); };
void (C::*af)() = &A::f;
void (C::*bf)() = &B::f;
int main() { return af == bf; } // returns 0

True, although under this definition of “same member” you would presumably expect &A::f to be (indirectly) convertible to type ‘void (B::*)()’ and then be validly dereferenced on an object of a dynamic type that didn’t include A. We could actually make that work, but only by emitting an obscene amount of code for member conversions. :slight_smile:

Also, I don’t think your example needs to be even that complex; you could probably just compare &B::f (offset sizeof(void*) when converted into C) against &C::f (offset zero). Any sort of repeat declaration will screw it up. But Itanium will succeed if you start with the address of the same declaration.

At any rate, it’d be interesting to know to what extent MSVC does try to make this work: absent any conversion, do member pointers for the same virtual member fail to compare equal:

  • if produced at different times from the same expression? (pretty unlikely)
  • if produced at different times from different instantiations of the same expression? (also pretty unlikely)
  • if produced from different expressions within the same translation unit?
  • if produced from different translation units within the same linkage unit?

You could get all the way out to the last pretty easily by just emitting the thunk with a well-defined mangling and weak linkage.

John.