As part of an MSIL (i.e. C#) to LLVM frontend that I am currently working on ( https://github.com/xen2/SharpLang ), I would appreciate some help or hints on how to properly design “PInvoke callbacks”.
Through the “PInvoke” mechanism, .NET allows you to call C functions, e.g.:
[DllImport("libc.so")] extern void memcpy(void* dest, void* src, int size); // declaration of C function
memcpy(ptr1, ptr2, 32); // let’s use it in C# code
That’s quite easy to support.
However, the tricky part is that C# function pointers (callbacks, a.k.a. delegates) can be passed to those C functions:
delegate void CallbackType(int result); // ← pointer to member function (needs “this”)
[DllImport("mylib.so")] extern void MethodWithCallback(CallbackType callback);
An extra “this” parameter is needed to call the real method, but of course the calling C code doesn’t know about it (MethodWithCallback expects a non-member function pointer).
This is similar to C++ not being able to cast a pointer to member function (which needs “this”) to a normal function pointer, because the two are not compatible.
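To make the mismatch concrete, here is a minimal C sketch of the two shapes involved. The names (callback_t, lowered_method_t, struct delegate, struct counter, counter_add, delegate_invoke) are hypothetical and only illustrate how a delegate might look once lowered; they are not what SharpLang or .NET actually emit.

```c
/* What the native library expects: a plain function pointer. */
typedef void (*callback_t)(int result);

/* What a lowered C# instance method looks like: hidden "this" comes first. */
typedef void (*lowered_method_t)(void *self, int result);

/* Hypothetical stand-in for a delegate: target object plus method. */
struct delegate {
    void *self;
    lowered_method_t method;
};

/* Hypothetical target object and instance method. */
struct counter { int total; };

void counter_add(void *self, int result) {
    ((struct counter *)self)->total += result;
}

/* Invoking the delegate needs both fields, while callback_t
   only has room for a single code address. */
void delegate_invoke(const struct delegate *d, int result) {
    d->method(d->self, result);
}
```

The native side only ever stores and calls a callback_t, so the “self” half of the delegate has to be recovered some other way.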
With a JIT this would be quite easy to deal with (generate a thunk at runtime), but I would like to support a full AOT scenario where executable memory can’t be modified (and without having to embed/use LLVM at runtime, just like a plain C/C++ executable).
One (rather complicated) option I was thinking of is:
Define a maximum number of such callbacks alive at once (let’s say 4096) – it’s not per function signature, but global
Define i8* thunkTargets[4096]
Define i8* ThunkIdToFuncPtr[4096]
Define 4096 funcs (not even sure how to do that with LLVM, might need to emit assembly?):
Thunk0: jmp thunkTargets[0];
Thunk1: jmp thunkTargets[1];
...
Thunk4095: jmp thunkTargets[4095];
When I call a C function from C# with a callback, what happens is:
Find an unused slot in this thunk table (X)
Register the C# pointer to member function in ThunkIdToFuncPtr[X]
Set thunkTargets[X] to the address of “RedirectMethodFuncWithIntParameter” (one such function per callback signature)
This redirect method would receive the arguments unmodified from the C function (since the previous call was a simple jmp)
It would inspect the call stack to find which slot X is being called (if the call instruction up the call stack is “call Thunk3”, i.e. it targets the address of Thunk3, we know X is 3 – this will need assembly, might be difficult to compute and won’t be portable…)
ThunkIdToFuncPtr[X] would give us the actual method to forward to
RedirectMethodFuncWithIntParameter would call ThunkIdToFuncPtr[X]
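For comparison, here is a minimal C sketch of the same slot bookkeeping for a single signature, void(int). It is a variant, not the jmp-based design above: each thunk is a tiny function that hard-codes its own slot index, so no call-stack inspection or assembly is needed, at the cost of one set of thunks per signature instead of one global set. All names are hypothetical, and MAX_SLOTS is 4 rather than 4096 to keep the listing short.

```c
#include <stddef.h>

#define MAX_SLOTS 4  /* 4096 in the design above; kept small here */

typedef void (*native_cb_t)(int result);              /* what the C code calls */
typedef void (*managed_cb_t)(void *self, int result); /* lowered delegate target */

/* Per-slot state, playing the role of ThunkIdToFuncPtr[X]. */
static struct { void *self; managed_cb_t method; int used; } slots[MAX_SLOTS];

/* The single dispatcher for this signature; the slot index is passed explicitly. */
static void redirect_int(size_t slot, int result) {
    slots[slot].method(slots[slot].self, result);
}

/* One tiny thunk per slot, each hard-coding its own index. */
#define DEFINE_THUNK(n) \
    static void thunk_##n(int result) { redirect_int(n, result); }
DEFINE_THUNK(0) DEFINE_THUNK(1) DEFINE_THUNK(2) DEFINE_THUNK(3)

static const native_cb_t thunks[MAX_SLOTS] = { thunk_0, thunk_1, thunk_2, thunk_3 };

/* Find an unused slot, bind the delegate to it, and return the plain
   function pointer to hand to native code (NULL if every slot is taken). */
native_cb_t thunk_alloc(void *self, managed_cb_t method) {
    for (size_t i = 0; i < MAX_SLOTS; i++) {
        if (!slots[i].used) {
            slots[i].self = self;
            slots[i].method = method;
            slots[i].used = 1;
            return thunks[i];
        }
    }
    return NULL;
}

/* Release a slot once native code can no longer call the pointer. */
void thunk_free(native_cb_t fn) {
    for (size_t i = 0; i < MAX_SLOTS; i++)
        if (thunks[i] == fn) slots[i].used = 0;
}

/* Hypothetical delegate target used to exercise the table. */
struct counter { int total; };
void counter_add(void *self, int result) {
    ((struct counter *)self)->total += result;
}
```

Calling thunk_alloc(&obj, counter_add) gives a pointer that can be passed where MethodWithCallback expects its callback. The trade-off versus the jmp table is code size: the thunks are duplicated for every callback signature in use, whereas the jmp-based thunks are shared across all signatures.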
This design allows using the slot number X of ThunkX to differentiate the actual C# callback to call (the thunk code is small, so it is OK to have many, 4096 in this case), while still having only ONE actual dispatcher/redirect method per signature (whose code is much bigger).
Does that seem feasible? I don’t like the fact that I would have to step out of LLVM bitcode and generate some non-portable assembly code (I was trying to stick with LLVM bitcode so far).
Any other ideas? Or is there some LLVM infrastructure/system/subproject, or another LLVM frontend that had the same issue, that might help me here?
Also, I am not sure whether LLVM trampolines could help me here (not sure if they can do that, or if they work in full AOT scenarios, where JIT is not allowed).