If you have a call to a virtual function inside a loop, LLVM is unable to devirtualize the call with InstCombine alone. A single call outside a loop is devirtualized; the problem shows up once the virtual call is put inside a loop.
You can recreate the output for inst-combine with:
It seems to me the problem is that InstCombine cannot hoist the load of the virtual function pointer out of the loop; is that right? Can we possibly do better than this?
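For readers without the original listing, here is a minimal sketch of the shape of code being discussed. It is a reconstruction, not the poster's actual source: only the names `test`, `sub::method`, `devirt_good` and `devirt_bad_fixed` appear in the thread, while `base`, `devirt_bad`, the namespace, and all function bodies are assumptions. Whether a given compiler version devirtualizes each function may differ from what the comments suggest.

```cpp
namespace devirt_demo {

// Hypothetical class hierarchy; the thread only names sub::method.
struct base {
    virtual ~base() = default;
    virtual int method(int x) = 0;
};

struct sub : base {
    int method(int x) override { return x + 1; }
};

// The object arrives as a base reference, so this call is virtual in the source.
inline int test(base &b, int x) { return b.method(x); }

// Single call: once test() is inlined, the vtable pointer stored by the
// constructor is visible, so the optimizer can fold the indirect call.
int devirt_good(int x) {
    sub s;
    return test(s, x);
}

// Same call inside a loop (assumed name): the optimizer has to show that the
// vtable pointer is unchanged on every iteration, not just the first one.
int devirt_bad(int n) {
    sub s;
    int acc = 0;
    for (int i = 0; i < n; ++i) acc = test(s, acc);
    return acc;
}

// Calling through the object itself: the dynamic type is known to the
// frontend, which can bind the call directly without optimizer help.
int devirt_bad_fixed(int n) {
    sub s;
    int acc = 0;
    for (int i = 0; i < n; ++i) acc = s.method(acc);
    return acc;
}

} // namespace devirt_demo
```

All three functions compute the same results either way; the only difference under discussion is whether the generated code contains an indirect call.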
Pretty sure the inlined version works because the frontend devirtualizes (since the concrete type is known without any analysis). Once that’s not possible, the single call (or probably any non-looping chain of calls) can be devirtualized until a non-inline virtual call is hit. At that point the compiler can’t see that the vtable pointer hasn’t changed. This is a limitation of LLVM, not of C++ in general: C++ guarantees that the vtable pointer is the same after a virtual call as it was before, but it doesn’t guarantee that it never changes during the call (you can destroy the object, placement new some other object in its storage, then reverse that before the function returns, and that’s valid).
For example: https://godbolt.org/z/8zrhx8 - two calls to ‘test’ if ‘sub::method’ is not defined in this translation unit: First is devirtualized because the ctor is inline and the vtable pointer is seen from that, but then the compiler assumes the vtable pointer might’ve been modified by that call, so the second call is not devirtualized. If you make the ctor non-inline, you’ll see both are indirect/not devirtualized.
The loop presents the general case of this problem - can’t provide an inductive proof that every call will result in the same vtable pointer, so have to err on the side of caution.
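The destroy / placement-new / restore sequence described above can be sketched as follows. This is an illustration under assumptions, not code from the thread: the types `shape` and `other`, the member `poke`, and the helper `run_demo` are all hypothetical, and the claim that the pattern is well defined rests on the explanation given in this thread plus my reading of the object lifetime rules.

```cpp
#include <new>

namespace vptr_demo {

struct shape {
    virtual ~shape() = default;
    virtual int id() const = 0;
    virtual int poke() = 0;
};

struct other final : shape {
    int id() const override { return 2; }
    int poke() override { return 0; }
};

struct sub final : shape {
    int id() const override { return 1; }
    int poke() override;
};

// The temporary object must fit in the storage of the original one.
static_assert(sizeof(other) == sizeof(sub) && alignof(other) == alignof(sub),
              "storage must be reusable for both types");

int sub::poke() {
    void *storage = this;              // remember the storage before ending the lifetime
    this->~sub();                      // end the lifetime of the current object
    shape *tmp = new (storage) other;  // storage now holds an 'other' (different vtable pointer)
    int seen = tmp->id();              // dispatches to other::id
    tmp->~shape();                     // destroy the temporary object
    new (storage) sub;                 // restore an object of the original type before returning
    return seen;
}

// Returns (id observed during the call) * 10 + (id observed after the call).
int run_demo() {
    sub s;
    shape &r = s;
    int during = r.poke();
    int after = r.id();
    return during * 10 + after;
}

} // namespace vptr_demo
```

The point is that an optimizer looking only at the body of `poke` sees stores to the vtable pointer slot, so without extra information it cannot conclude that the pointer loaded before the call is still valid afterwards, even though the language guarantees it for the caller.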
I think there’s an old bug kicking around somewhere that boils down to “maybe we could use the llvm.assume intrinsic after every virtual call, to assert that the vtable pointer is the same before and after the call”, but I’m not sure what state that bug is in or how practical that strategy would be.
Not adding much, but I see that there’s the sub::method call outside of the loop (loop peeling?) that is devirtualized but not inlined. If the peeling doesn’t happen or that can be inlined somehow, it should probably work out for this particular case.
Pretty sure the inlined version works because the frontend devirtualizes (since the concrete type is known without any analysis)
I assume you are referring to “devirt_bad_fixed”. Yes, exactly: this is getting devirtualized by clang. In fact, I tried to remedy this in my code base by making the “test” function a template over the input object, to force clang to see the types and devirtualize safely.
The devirt_good case, however, is being devirtualized by InstCombine; that is what I see from print-after-all.
once that’s not possible, the single call (or probably any non-looping chain of calls) can be devirtualized until a non-inline virtual call is hit - at that point the compiler can’t see that the vtable pointer hasn’t changed (this is a limitation of LLVM, not a limitation of C++ in general - C++ in general guarantees that the vtable pointer won’t change over a virtual call - but it doesn’t guarantee that it won’t change at all (you can placement delete, placement new some other object, then reverse that before the function returns and that’s valid)).
So IIUC, you are saying that because the bad case is in a loop, the first call to the “method()” might result in changing the vtable pointer, so every call after the first will have to go get the function pointer again?
That kind of explains what I am seeing here (https://godbolt.org/z/bvsbE7), and what Hiroshi mentioned: the first, peeled iteration is devirtualized but the rest are not.
For example: https://godbolt.org/z/8zrhx8 - two calls to ‘test’ if ‘sub::method’ is not defined in this translation unit: First is devirtualized because the ctor is inline and the vtable pointer is seen from that, but then the compiler assumes the vtable pointer might’ve been modified by that call, so the second call is not devirtualized. If you make the ctor non-inline, you’ll see both are indirect/not devirtualized.
Yep, I am not even trying to devirtualize something that is not fully inlinable.
Pretty sure the inlined version works because the frontend devirtualizes (since the concrete type is known without any analysis)
I assume you are referring to “devirt_bad_fixed”, yes exactly this is getting de-virted by clang. In fact I tried to remedy this in my code base by making the “test” function a template over the input object to force clang to see the types and de-virt safely.
Right, I’d expect a template could probably tickle the frontend devirtualization, but the frontend might still have to be able to see the concrete type of the object, not just an object reference (because there could be a further derived class; /maybe/ if the type is “final” that’d suffice).
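To make the caveat about references concrete, here is a small sketch with hypothetical types (`more_derived`, `sealed`) and helper names (`run_*`), none of which come from the thread. It shows why a template over a non-final type cannot bind the call statically, and how `final` removes the possibility of a further override. Whether a particular compiler actually emits a direct call for the `final` case is not something this sketch checks.

```cpp
namespace template_demo {

struct base {
    virtual ~base() = default;
    virtual int method(int x) { return x; }
};

struct sub : base {
    int method(int x) override { return x + 1; }
};

// A further derived class: a 'sub&' may really refer to one of these.
struct more_derived : sub {
    int method(int x) override { return x + 10; }
};

// No class can derive from this one, so a 'sealed&' has a known dynamic type.
struct sealed final : base {
    int method(int x) override { return x + 2; }
};

// "test" as a template over the static type of the argument.
template <typename T>
int test(T &obj, int n) {
    int acc = 0;
    for (int i = 0; i < n; ++i) acc = obj.method(acc);
    return acc;
}

int run_sub(int n) { sub s; return test<sub>(s, n); }

// Same instantiation, test<sub>, but the object is a more_derived, so the
// call has to stay virtual to pick up the override.
int run_more_derived(int n) { more_derived d; return test<sub>(d, n); }

// With a final type the frontend is allowed to call sealed::method directly.
int run_sealed(int n) { sealed f; return test<sealed>(f, n); }

} // namespace template_demo
```

`run_sub` and `run_more_derived` go through the same `test<sub>` instantiation yet must reach different functions, which is exactly why the static type alone is not enough unless it is `final`.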