Potential missed optimization - unnecessary reload of vtable ptr inside loop body

Hello all!

I posted a question about a potential missed optimization to llvm-dev, but was
directed here since it concerned more C++-specific bits of code. The previous
conversation can be found at [0] and [1].

The code in question is here: Compiler Explorer

My main question here is about assembly lines 24 and 46, where I think the
vtable pointer for the Rect object is being reloaded every iteration of the
loop. nbjoerg on #llvm said that's due to the possibility of placement new being
used somewhere inside the called function, but I'm not entirely sure that
placement new can change what vtable the vtable pointer points to.
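
For readers without the Compiler Explorer link at hand, the pattern under discussion looks roughly like the sketch below. This is a hypothetical stand-in, not the linked code: only the name Rect comes from the post, while Shape, area() and total_area() are assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy; only the name Rect appears in the post.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {
        assert(w_ >= 0.0 && h_ >= 0.0);
    }
    // Defined inline only to keep the sketch self-contained. The question
    // is about callees the optimizer cannot see into.
    double area() const override { return w * h; }
};

// Every iteration makes a virtual call through s. If the compiler cannot
// prove that the callee leaves the dynamic type of *s alone, it loads the
// vtable pointer from *s again before each call instead of hoisting the
// load out of the loop.
double total_area(const Shape* s, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += s->area();
    }
    return sum;
}
```

Whether a given compiler actually emits the reload for this exact sketch depends on how much of the callee it can see, so treat it as an illustration of the shape of the loop rather than a reproducer.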

(I'm new to this language lawyering stuff, so please let me know what I mess up)

As far as I understand, paragraph 8 of 6.6.3 [basic.life] in the most recent
draft of the standard says that references or names of an object that has been
"replaced" by placement new are only "redirected" to the new object if the new
object is the same type and no other class derives from that type; otherwise,
the reference/name refers to an object whose lifetime has ended. Thus, any uses
of the "this" pointer after a member function is called are only valid if the
placement new'd object is the same type, and so has the same vtable, which means
the vtable pointer does not have to be reloaded.
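
A minimal sketch of the same-type case, under the reading of [basic.life] given above. Counter, reset() and read_after_reset() are made-up names for illustration. The sketch assumes the object is a complete (most derived) Counter with no const or reference members.

```cpp
#include <cassert>
#include <new>

struct Counter {
    int value;
    explicit Counter(int v) : value(v) {}
    virtual ~Counter() = default;
    virtual int get() const { return value; }

    // Ends the lifetime of *this and starts the lifetime of a new Counter
    // in the same storage. The new object has the same type, so it also
    // has the same vtable.
    void reset(int v) {
        this->~Counter();
        new (this) Counter(v);
    }
};

// After reset() returns, the reference c is used again. Under the reading
// above this is only valid because the replacement object is also a
// Counter. Had reset() constructed some other type in the storage, c would
// refer to an object whose lifetime has ended.
int read_after_reset(Counter& c, int v) {
    c.reset(v);
    assert(c.get() == v);
    return c.get();
}
```

In the valid case the vtable pointer stored in the object is the same before and after the call, which is the basis for the argument that the reload is unnecessary.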

The example for point 6.5 of paragraph 6 of 6.6.3 sort of supports this
interpretation, since calling B::mutate() changes the type of *this, which
causes pb to point to an object whose lifetime has ended, and further method
calls through pb result in undefined behavior.

Is this reasoning correct? ICC 18, Clang trunk, Clang 5.0, GCC 7.3, GCC trunk,
and MSVC 19 all perform the reload, so I'm guessing I'm wrong, but I'm not sure
how.

If the vtable pointer reload is required, is there a way to indicate to Clang
that such a reload will not be necessary, even though the compiler can't verify
that (sort of like __restrict)? I tried adding [[gnu::pure]] to the function
declarations and definitions, but the vtable pointer reload remained. Does Clang
take [[gnu::pure]]/[[gnu::const]] into account for code generation/optimization?
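
For concreteness, attaching the attribute to both the declarations and the out-of-line definition might look like the sketch below. The hierarchy is hypothetical (only the name Rect comes from the post), and the sketch shows where the attribute goes rather than any effect on code generation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical hierarchy; only the name Rect appears in the post.
struct Shape {
    virtual ~Shape() = default;
    // gnu::pure promises that the function has no side effects other than
    // its return value; it may still read memory.
    [[gnu::pure]] virtual double area() const = 0;
};

struct Rect : Shape {
    double w, h;
    Rect(double w_, double h_) : w(w_), h(h_) {
        assert(w_ >= 0.0 && h_ >= 0.0);
    }
    [[gnu::pure]] double area() const override;
};

// Out-of-line definition, as it would be if it lived in another file.
double Rect::area() const { return w * h; }

// Same loop shape as in the question: one virtual call per iteration.
double total_area(const Shape& s, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += s.area();
    }
    return sum;
}
```

The attribute is a GNU extension that GCC and Clang accept; other compilers may ignore it or warn about an unknown attribute.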

Thanks for the help!

Alex

    [0]: http://lists.llvm.org/pipermail/llvm-dev/2018-February/121439.html
    [1]: http://lists.llvm.org/pipermail/llvm-dev/2018-March/121486.html

Your reasoning is right, but it's proven to be very difficult to write a reliable general
optimization that only triggers on the wrong cases and not when, say, constructing
different base-class subobjects of a class. Our optimization can be enabled with
-fstrict-vtable-pointers, but it's still experimental.

John.

Your reasoning is right, but it's proven to be very difficult to write a reliable general
optimization that only triggers on the wrong cases and not when, say, constructing

Wrong cases? Guessing you meant right cases?

different base-class subobjects of a class. Our optimization can be enabled with
-fstrict-vtable-pointers, but it's still experimental.

Even when dealing with constructing/destructing objects, I thought that the
object's type/vtable pointer effectively changes only between the base
constructor finishing and the next constructor starting, or vice-versa for
destruction. So even if full devirtualization isn't possible, the vtable pointer
could still be hoisted out of loops. Is that kind of optimization just too
specific to spend time on, compared to the benefits from getting
-fstrict-vtable-pointers implemented correctly?
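
The construction/destruction behaviour described above can be observed with a small sketch. Base, Derived and trace_lifetime() are assumed names for illustration. Each constructor and destructor records which override a virtual call resolves to at that point.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Base {
    std::vector<std::string>* log;
    explicit Base(std::vector<std::string>* l) : log(l) {
        assert(log != nullptr);
        // While Base's constructor runs, the dynamic type is Base.
        log->push_back(name());
    }
    virtual ~Base() {
        // Derived's destructor has already finished, so this is Base again.
        log->push_back(name());
    }
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    explicit Derived(std::vector<std::string>* l) : Base(l) {
        // Base's constructor has finished; the dynamic type is now Derived.
        log->push_back(name());
    }
    ~Derived() override { log->push_back(name()); }
    std::string name() const override { return "Derived"; }
};

// Constructs and destroys one Derived and returns what each stage saw.
std::vector<std::string> trace_lifetime() {
    std::vector<std::string> log;
    {
        Derived d(&log);
    }
    return log;
}
```

The recorded sequence is Base, Derived, Derived, Base: the type seen by virtual calls changes only at the boundaries between constructors (and between destructors), not in the middle of one.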

Also, if I were interested in enabling this flag in my codebase, are there any
issues beyond the ones listed in this 2015 Nov. email [0]?

Thanks,
Alex

    [0]: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092384.html

Your reasoning is right, but it's proven to be very difficult to write a reliable general
optimization that only triggers on the wrong cases and not when, say, constructing

Wrong cases? Guessing you meant right cases?

Yes, sorry.

different base-class subobjects of a class. Our optimization can be enabled with
-fstrict-vtable-pointers, but it's still experimental.

Even when dealing with constructing/destructing objects, I thought that the
object's type/vtable pointer effectively changes only between the base
constructor finishing and the next constructor starting, or vice-versa for
destruction. So even if full devirtualization isn't possible, the vtable pointer
could still be hoisted out of loops.

Yes, absolutely. I'm just describing the issues that I understand to make the analysis
difficult, not saying that it's impossible.

Is that kind of optimization just too
specific to spend time on, compared to the benefits from getting
-fstrict-vtable-pointers implemented correctly?

I think everybody agrees that this is likely to be an extremely powerful optimization.

Also, if I were interested in enabling this flag in my codebase, are there any
issues beyond the ones listed in this 2015 Nov. email [0]?

Hopefully the people working on the optimization can answer that.

John.