I want to investigate whether virtual function calls can be optimized. As I understand it, the overhead of a virtual call is two pointer dereferences: one to load the vtable pointer from the object and one to load the function address from the vtable. I know about alternatives like CRTP and std::variant, but for this investigation I'm interested only in traditional dynamic polymorphism. One of my ideas is to cache the address of the method that actually gets called, so it is computed only once. Obviously this is not always possible, but there are some cases where it is.
I have a virtual function call in a loop, and the same method is called on every iteration. The loop is in method work; the virtual call is to method id.
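A minimal sketch of the kind of setup described here, for readers who want something to paste into a compiler. The class names `Base` and `Derived`, the value returned by `id`, and the signature of `work` are all assumptions made for illustration; this is not the exact listing the assembly below was generated from.

```cpp
#include <cstddef>

// Hypothetical interface: id() is the virtual method called in the loop.
struct Base {
    virtual int id() const = 0;
    virtual ~Base() = default;
};

// Hypothetical implementation; the returned value is arbitrary.
struct Derived : Base {
    int id() const override { return 7; }
};

// b refers to the same object for the whole loop, yet every iteration
// goes through the vtable again to find id().
long work(const Base& b, std::size_t n) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += b.id();  // virtual call, resolved at run time
    }
    return sum;
}
```

If the compiler can see the dynamic type at the call site it may devirtualize the call entirely, so reproducing the two loads usually requires that the object's type is not visible where `work` is called.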
We can see that method work gets inlined, but inside the loop, there are always two pointer dereferences:
mov rax, qword ptr [r14]
call qword ptr [rax]
Since the object referred to by b never changes, the result of this lookup could be cached, at least in this case.
My assembly might be rusty, but before the loop, we could have:
mov r13, qword ptr [r14]
mov r13, qword ptr [r13]
and inside the loop we would only have:
call r13
My questions are:
Do we have a mechanism in C++ to explicitly store the result of the lookup in the vtable without additional overhead? Some sort of cache for this result so that we do not do the same computation over and over again? I’m specifically looking for this solution, not alternatives to dynamic polymorphism like CRTP or std::variant. I couldn’t find one.
For functional-programming-style “pure functions”, we would simply store the result once and reuse it:
int res = pure_function();
Is there anything in the C++ standard that prevents such an optimization?
How would we identify the cases in which such an optimization is possible and the ones in which it is not?
Are there any other reasons why such an optimization would be undesirable?
N.B.: I realize the last two questions might be difficult to answer, but could you at least point me in the right direction for investigating this myself?
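To make the idea of a "cache" concrete, here is a hand-rolled sketch in standard C++ of resolving the target once and reusing it in the loop. The names `IdFn`, `id_fn`, and `work_cached` are hypothetical, and this is a manual workaround rather than a language mechanism: it costs an extra virtual function per cached method, which is exactly the kind of overhead the first question hopes to avoid.

```cpp
#include <cstddef>

struct Base;

// Plain function pointer that receives the object explicitly.
using IdFn = int (*)(const Base&);

struct Base {
    virtual int id() const = 0;
    // Hypothetical hook: returns a plain function that does the same work
    // as id() for the dynamic type of *this.
    virtual IdFn id_fn() const = 0;
    virtual ~Base() = default;
};

struct Derived : Base {
    int id() const override { return 7; }
    IdFn id_fn() const override {
        // A captureless lambda converts to a plain function pointer.
        return [](const Base& self) -> int {
            // Qualified call: bound statically, no vtable lookup.
            return static_cast<const Derived&>(self).Derived::id();
        };
    }
};

long work_cached(const Base& b, std::size_t n) {
    const IdFn f = b.id_fn();  // the only virtual call, done before the loop
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += f(b);           // indirect call through the cached pointer
    }
    return sum;
}
```

The call inside the loop is still indirect, so this removes the repeated vtable loads but not the indirect branch itself. It is also only valid while `b` keeps referring to an object of the same dynamic type.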