Question: How to access a C++ vtable pointer to use as a Value* in an LLVM pass

Dear Mailing List,

This might sound unconventional, but I am trying to access a C++
object's vtable to pass as an argument to a call to a library
function I created. Creating and inserting the function call at the
correct location in LLVM is already done.

I have learned that C++ objects are represented as struct types, but
I'm not sure how to get at the vtable pointer inside one when looking
at the interface of the Value class. Clang, more specifically
CGClass.cpp, deals with C++ constructor and destructor
initialization, and its API is straightforward, but I can't find
similar API calls in the LLVM counterpart.

So far I am able to get the class object itself from a LoadInst or
CallInst, and I can iterate through the StructType and the element
types it contains via element_begin()/element_end() to confirm that
what I am looking at is the object, e.g.:

i32 (...)***   (this is how the vtable is represented, according to
               online sources, as a generic pointer)
i32            (class member, in this case an int)

But this doesn't give me a Value* handle I can grab and use later.
How can I use this Value to get at the vtable pointer it contains?

2nd question: What happens if the struct object is from a derived
class; iterating over the struct again, it looks like the vtable ptr
is tangled even deeper within the object:

%class.Base.base = type <{i32 (...)**, i32 }>
i32

I looked at the ThreadSanitizer.cpp pass for inspiration. It seems to
use MD_tbaa metadata as a hint for whether a load/store
isVTableAccess(), but it doesn't need the Value itself. Maybe MDNode
metadata could be of use here?

TLDR: How can I use a Value of StructType generated from a C++ object
to get its vtable ptr in LLVM, so I can pass it as a Value to a
to-be-inserted function call?

Thank you in advance!

Sincerely,

Christopher Jelesnianski
Graduate Research Assistant, Virginia Tech

Thanks for the information,

I already knew the theory side of "where" a vtable is located in a C++
object; I need more information on how to access/manipulate it using
the LLVM API. Can you confirm, then, that the LLVM API treats the
object struct ptr as the vtable ptr as well, so that if I were to pass
it as a function call argument, the function would be able to
manipulate that vtable ptr specifically?

Sincerely,

Chris

> But this doesn't give me a Value* handle I can grab and use later.
> How can I use this Value to get at the vtable pointer it contains?

You need to get that from an instance rather than by iterating over
the Type; almost certainly from a pointer to an instance since classes
are hardly ever loaded as a whole in LLVM, just individual fields when
needed. It sounds like you'll have one lying around.

After that, you'd write:

    %vtable.ptr = getelementptr %MyStruct, %MyStruct* %obj, i32 0, i32 0
    %vtable = load i32 (...)**, i32 (...)*** %vtable.ptr

The first index on that GEP is there just because your object may be
in an array; the second selects the vtable pointer field. Loading it
gives you a *pointer* to the vtable (so a pointer to the object
instance is effectively a pointer to a pointer to the vtable). It's
essentially what you'd get if you'd had a Value * for Clang's
@_ZTV8MyStruct directly (via Module::getGlobalVariable), except that
the stored pointer targets the address point inside that global,
just past the offset-to-top and RTTI entries.

You have to bitcast it to the correct type, of course, because at the
moment it's pretending to be an i32 (...)**. But that's probably what
you want to pass to your library function if it's expecting a vtable.
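
To make concrete what that GEP + load pair computes, here is a minimal
sketch in plain C++ (no LLVM required). It assumes an Itanium-style
ABI, where the vtable pointer is the first pointer-sized word of a
polymorphic object with a single primary base; other ABIs, or classes
with multiple or virtual inheritance, may lay things out differently.
Base, Derived, and vtablePointerOf are hypothetical names used only
for illustration.

```cpp
#include <cassert>
#include <cstring>

struct Base {
    virtual ~Base() = default;
    virtual int id() const { return 1; }
    int member = 0;
};

struct Derived : Base {
    int id() const override { return 2; }
};

// Hypothetical helper: copies the first pointer-sized word out of the
// object. Under the assumed ABI that word is the vtable pointer, i.e.
// the same value the IR obtains by loading field 0 of the struct.
const void *vtablePointerOf(const Base *obj) {
    assert(obj != nullptr);
    const void *vptr = nullptr;
    std::memcpy(&vptr, static_cast<const void *>(obj), sizeof(vptr));
    return vptr;
}
```

Under that assumption, two objects of the same dynamic type should
yield the same pointer, while a Base and a Derived should yield
different ones, which is the property a library function receiving
the vtable pointer would typically rely on.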

> 2nd question: What happens if the struct object is from a derived
> class; iterating over the struct again, it looks like the vtable ptr
> is tangled even deeper within the object:

It can get horribly complicated, with multiple vtables inside the
object at different locations and vtables within vtables; sometimes
even different vtables at different stages of the object's life. The
specification of what goes where is here:
https://itanium-cxx-abi.github.io/cxx-abi/abi.html

This is a good time to point out that all of this is platform
dependent. MSVC in particular does things very differently, and Clang
on Windows follows it.

Cheers.

Tim.

Thanks for the super detailed answer! The ABI link is a great
resource! Thanks for showing it.

So my end goal is to have a function pass that inserts my custom call
before every virtual call in a program. Reading your response, I
noticed code similar to what you described appearing before each
virtual call. It looks like the IR is "fetching" the needed vtable
(%vtable) and then extracting the appropriate virtual function (%vfn).
Below is sample code from a simple main that calls the virtual
function after the constructor call. I annotated where the custom
call would go.

1  call void @_ZN7DerivedC2Ev(%class.Derived* %0) #3, !dbg !955
   --- constructor call
2  store %class.Derived* %0, %class.Derived** %derv, align 8, !dbg !953
3  %3 = load %class.Derived*, %class.Derived** %derv, align 8, !dbg !958
4  %4 = bitcast %class.Derived* %3 to i32 (%class.Derived*)***, !dbg !959
5  %vtable = load i32 (%class.Derived*)**, i32 (%class.Derived*)*** %4, align 8, !dbg !959
6  %vfn = getelementptr inbounds i32 (%class.Derived*)*, i32 (%class.Derived*)** %vtable, i64 0, !dbg !959
   ---- too far, I don't need a virtual function call
7  %5 = load i32 (%class.Derived*)*, i32 (%class.Derived*)** %vfn, align 8, !dbg !959
8  ~~~~ Insert custom call here ~~~~~~
9  %call1 = call i32 %5(%class.Derived* %3), !dbg !959
   ---- I used this CallInst operand Value to get this far.

In this case, can I "piggyback" off the 5th instruction (%vtable) and
use that Instruction as my Value*? It should be doable to iterate
backwards from the CallInst (line 9) and store that Value. The
approach you are suggesting is to write my own two instructions that
do the same thing as lines 4 and 5, correct?

Thanks again!
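
The backwards walk described above (from the indirect call at line 9
to the %vtable load at line 5) can be sketched with a toy model in
plain C++. This is only an illustration of the pattern match, not
LLVM code: ToyInst, Opcode, and findVTableLoad are hypothetical
stand-ins, and a real pass would perform the analogous checks on
llvm::CallInst, llvm::LoadInst, and llvm::GetElementPtrInst operands.
It also assumes unoptimized IR shaped like the listing; optimized IR
may not match this shape.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for IR instructions; these are NOT LLVM classes.
enum class Opcode { Load, GetElementPtr, BitCast, Call };

struct ToyInst {
    Opcode opcode;
    const ToyInst *operand;  // pointer operand, or the callee for a Call
    std::string name;
};

// Starting from an indirect call, walk back through:
//   callee -> load of the function pointer
//          -> optional GEP selecting the vtable slot
//          -> load of the vtable pointer
// Returns nullptr when the chain does not have that shape.
const ToyInst *findVTableLoad(const ToyInst *call) {
    if (!call || call->opcode != Opcode::Call) return nullptr;

    const ToyInst *fnLoad = call->operand;
    if (!fnLoad || fnLoad->opcode != Opcode::Load) return nullptr;

    const ToyInst *slot = fnLoad->operand;
    // The slot-selecting GEP may be absent when slot 0 is used directly.
    if (slot && slot->opcode == Opcode::GetElementPtr) slot = slot->operand;

    if (!slot || slot->opcode != Opcode::Load) return nullptr;
    return slot;  // corresponds to %vtable in the listing
}
```

Because the matched instruction already dominates the call, its
result could be reused as the argument of the inserted call instead
of emitting a second bitcast and load.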