Frontend: How to use Member to Function Pointer as callbacks

Hello,

As part of an MSIL (i.e. C#) to LLVM frontend I am currently working on ( https://github.com/xen2/SharpLang ), I need some help/hints on how to properly design “PInvoke callbacks”.

Through the “PInvoke” mechanism, .NET allows you to call C functions, e.g.:

C#:
[DllImport("libc.so")] static extern void memcpy(void* dest, void* src, int size); // declaration of C function
memcpy(ptr1, ptr2, 32); // let’s use it in C# code

That’s quite easy to support.
However, the tricky part is that C# function pointers (callbacks, a.k.a. delegates) can be passed to those C functions:

C:
void MethodWithCallback(void(*callback)(int));

C#:
delegate void CallbackType(int result); // ← wraps a pointer to a member function (needs “this”)
[DllImport("mylib.so")] static extern void MethodWithCallback(CallbackType callback);

An extra “this” parameter is needed to call the real method, but of course the calling C code doesn’t know about it (MethodWithCallback expects a non-member function pointer).

This is similar to C++, where a pointer to member function (which needs “this”) cannot be cast to a normal function pointer, because the two are not compatible.

Using a JIT, this would be quite easy to deal with (generate thunk code at runtime), but I would like to support a full AOT scenario where executable memory can’t be modified (and also not have to embed/use LLVM at runtime, just like a plain C/C++ executable).

One (rather complicated) option I was thinking of is:

  • Define a maximum number of those callbacks alive at once (let’s say 4096); the limit is global, not per function signature

  • Define i8* thunkTargets[4096]

  • Define i8* ThunkIdToFuncPtr[4096]

  • Define 4096 funcs (not even sure how to do that with LLVM, might need to emit assembly?)
    Thunk0: jmp thunkTargets[0];
    Thunk1: jmp thunkTargets[1];
    ...
    Thunk4095: jmp thunkTargets[4095];

  • When I call a C function from C# with a callback, what happens is:

    • Find an unused slot in this thunk table (X)

    • Register the C# member function pointer in ThunkIdToFuncPtr[X]

    • Replace thunkTargets[X] with the address of “RedirectMethodFuncWithIntParameter” (one such function per callback signature)

      • This redirect method would receive the arguments unmodified from the C function (since the previous call was a simple jmp)

      • It would check in the call stack which slot X is being called (up in the call stack, if the call instruction is “call Thunk3” from address Thunk3, we know X is 3; it will need assembly, might be difficult to compute and won’t be portable…)

      • ThunkIdToFuncPtr[X] would give us the actual method to forward to

      • RedirectMethodFuncWithIntParameter would call ThunkIdToFuncPtr[X](arg1)

This design allows using the slot number X of ThunkX to identify the actual C# callback to call (each thunk is small, so it is OK to have many, 4096 in this case), while still having only ONE actual dispatcher/redirect method per signature (whose code is much bigger).
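To make the slot idea concrete, here is a minimal C sketch of one possible variant. It is not the exact scheme above: instead of signature-agnostic `jmp` thunks plus call-stack inspection, each thunk is generated for one signature (here `void(int)`) and hard-codes its own slot index, which it passes to the single dispatcher. That keeps everything portable, at the assumed cost of needing one thunk set per callback signature. All names (`alloc_thunk`, `redirect_void_int`, `method_with_callback`, `counter_add`) are hypothetical, the table is shrunk from 4096 to 4 entries, and a per-slot receiver pointer is stored alongside the method pointer so the “this” argument can be supplied.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_THUNKS 4  /* 4096 in the post; kept tiny for the sketch */

typedef void (*c_callback)(int);                  /* what the C library expects */
typedef void (*managed_method)(void *self, int);  /* delegate target, explicit "this" */

/* Per-slot state, filled in when a delegate is handed to native code. */
static managed_method thunk_id_to_func[MAX_THUNKS];
static void *thunk_id_to_this[MAX_THUNKS];

/* The one dispatcher for the void(int) signature. */
static void redirect_void_int(size_t slot, int arg) {
    thunk_id_to_func[slot](thunk_id_to_this[slot], arg);
}

/* Each thunk is tiny and knows its slot at compile time,
 * so no call-stack inspection is needed. */
#define DEFINE_THUNK(n) \
    static void thunk_##n(int arg) { redirect_void_int(n, arg); }
DEFINE_THUNK(0)
DEFINE_THUNK(1)
DEFINE_THUNK(2)
DEFINE_THUNK(3)

static const c_callback thunks[MAX_THUNKS] = { thunk_0, thunk_1, thunk_2, thunk_3 };

/* Find an unused slot and bind it; returns NULL when the table is full. */
static c_callback alloc_thunk(managed_method fn, void *self) {
    for (size_t i = 0; i < MAX_THUNKS; ++i) {
        if (thunk_id_to_func[i] == NULL) {
            thunk_id_to_func[i] = fn;
            thunk_id_to_this[i] = self;
            return thunks[i];
        }
    }
    return NULL;
}

/* Stand-in for the native library function from the post. */
static void method_with_callback(c_callback callback) { callback(42); }

/* Stand-in for a C# instance method compiled ahead of time. */
struct counter { int total; };
static void counter_add(void *self, int value) {
    ((struct counter *)self)->total += value;
}
```

Calling `alloc_thunk(counter_add, &some_counter)` yields a plain `void(*)(int)` that can be given to `method_with_callback`; two different receivers get two different thunks, so the native side never sees the extra parameter.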

Does that seem feasible? I don’t like the fact that I would have to step out of LLVM bitcode and generate some non-portable assembly code (I was trying to stick with LLVM bitcode so far).
Any other ideas, or maybe some LLVM infrastructure/system/subproject or another LLVM frontend that had the same issue and might help me here?

Also, could LLVM trampolines help me here? I am not sure whether they can do this, or whether they work in full AOT scenarios where JIT is not allowed.

Thanks,

Hi Virgile,

One (rather complicated) option I was thinking is:
- Define a maximum number of those callback alive (let's say 4096) -- it's
not per function signature, but global
- Define i8* thunkTargets[4096]
- Define i8* ThunkIdToFuncPtr[4096]
- Define 4096 funcs (not even sure how to do that with LLVM, might need to
emit assembly?)
    Thunk0: jmp thunkTargets[0];
    Thunk1: jmp thunkTargets[1];
    ...
    Thunk4095: jmp thunkTargets[4095];

- When I call a C function from C# with a callback, what happens is:
  - Find an unused slot in this thunk table (X)
  - Register C# member to function pointer in ThunkIdToFuncPtr[X]
  - Replace thunkTargets[X] with pointer address to
"RedirectMethodFuncWithIntParameter" (one such function per callback
signature)
    - This redirect method would receive arguments unmodified from C
functions (since previous call was a simple jmp)
    - It would check in the call stack the current slot X being called (up
in the callstack, if call instruction is "call Thunk3" from address Thunk3
we know X is 3 -- it will need assembly, might be difficult to compute and
won't be portable...)
    - ThunkIdToFuncPtr[X] would give us the actual method to forward to
    - RedirectMethodFuncWithIntParameter would call
ThunkIdToFuncPtr[X](arg1)

I couldn't quite understand this approach -- at what point do you
figure out what the value of 'this' is?

-- Sanjoy

My bad, it would be captured as well during thunk allocation in a ThunkIdToThis[] array, and last line should be:

  • RedirectMethodFuncWithIntParameter would call ThunkIdToFuncPtr[X](ThunkIdToThis[X], arg1)

That feature sounds pretty hard to implement efficiently with an AOT compiler. =/

What is the lifetime of the callback object in .NET? And how frequently are they created? That seems like an important constraint.

I think you meant to use a table like this:
mov thunkThisPtrs[0] → ecx ; or scratch reg
jmp thunkTargets[0]
That’s probably the best you can do without creating the thunks at runtime.
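For comparison, the “load this, then jump to the target” table can be approximated in portable C, with the caveat that C thunks must be written per signature, whereas the assembly form above is signature-agnostic. The sketch below is only an illustration under that assumption: the names (`acquire_slot`, `release_slot`, `direct_thunk_N`, `record_value`) are hypothetical, the table has 2 entries, and `release_slot` stands in for whatever point the runtime decides a callback is no longer alive, which is the lifetime question raised above.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 2  /* deliberately tiny for the sketch */

typedef void (*native_cb)(int);              /* pointer type seen by C code */
typedef void (*target_fn)(void *self, int);  /* real method, explicit "this" */

static target_fn thunk_targets[SLOT_COUNT];
static void *thunk_this_ptrs[SLOT_COUNT];

/* C approximation of: mov thunkThisPtrs[n] -> reg ; jmp thunkTargets[n] */
#define DIRECT_THUNK(n) \
    static void direct_thunk_##n(int arg) { thunk_targets[n](thunk_this_ptrs[n], arg); }
DIRECT_THUNK(0)
DIRECT_THUNK(1)

static const native_cb direct_thunks[SLOT_COUNT] = { direct_thunk_0, direct_thunk_1 };

/* Bind the first free slot; returns its index, or -1 if none is free. */
static int acquire_slot(target_fn fn, void *self) {
    for (int i = 0; i < SLOT_COUNT; ++i) {
        if (thunk_targets[i] == NULL) {
            thunk_targets[i] = fn;
            thunk_this_ptrs[i] = self;
            return i;
        }
    }
    return -1;
}

/* Make a slot reusable once its callback is no longer needed. */
static void release_slot(int slot) {
    thunk_targets[slot] = NULL;
    thunk_this_ptrs[slot] = NULL;
}

/* Stand-in target: stores the received value into the int it is bound to. */
static void record_value(void *self, int value) { *(int *)self = value; }
```

With a fixed-size table, how often callbacks are created and how soon their slots can be released decides whether the limit is ever hit, which is why the lifetime question matters for this design.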