Perfect forwarding?

Hey all, it's been a while since I have posted on this list, but I've been continuing my work with LLVM and getting lots done :)

One question I wanted to ask is whether anyone had any advice on how to implement "perfect forwarding" in a runtime (as opposed to compile-time) context with LLVM. The general idea is that you have a bunch of methods that have different call signatures, and you want to have some generic handler function that can intercept these calls and then forward each call to another method that has the same signature as the original call.

A typical example of how this would be used is something like Mockito or EasyMock - you have some interface, and you dynamically create an implementation of that interface which is able to intercept all of the method calls and record the order in which they are called and the value of the arguments. The unit test code then performs some validation on the recording to ensure that the invocations match what was expected.

The key point is that the handler method doesn't know at compile-time the call signature of the various methods that are going to be intercepted. In the case of Java and C#, the VM is able to box all of the arguments that are primitive types, and pass to the handler method an array of type Object[]. Similarly, one can call method.invoke() on a method, passing in that same array of objects, and the VM will unbox the primitive types and then call the actual method with the right argument types.

One obvious way to do this is to generate two stub functions for every function emitted by the compiler, one that takes the original argument signature and creates a list of boxed objects, and one that takes a list of boxed objects, unboxes them and then calls the function. However, that's a lot of extra code to generate, and I am concerned about the amount of code bloat that would create.
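
For reference, here is a rough sketch of what that stub pair might look like for one function. This is only an illustration - the Box type is a made-up stand-in for whatever boxed representation the runtime would actually use:

#include <cstdint>
#include <vector>

// Hypothetical boxed value: a type tag plus the raw bits.
struct Box {
    enum Kind { Int, Float, Ptr } kind;
    union { int64_t i; double f; void* p; };
};

// The function the compiler actually emitted.
int target(int a, float b) { return a + static_cast<int>(b); }

// Stub 1: original signature in, boxed argument list out, so a generic
// handler can record or inspect the call.
std::vector<Box> target_box(int a, float b) {
    std::vector<Box> args(2);
    args[0].kind = Box::Int;   args[0].i = a;
    args[1].kind = Box::Float; args[1].f = b;
    return args;
}

// Stub 2: boxed argument list in; unbox and call the real function.
int target_unbox(const std::vector<Box>& args) {
    return target(static_cast<int>(args[0].i),
                  static_cast<float>(args[1].f));
}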

What would be better is if there was some way for the generic handler function to be able to dynamically get a picture of the layout of the call frame. On the receiving end, this would be something similar to a varargs call, where the called function unpacks the argument list based on a runtime-provided schema. One problem with the varargs approach is that I wouldn't want every method call to have to use the varargs calling convention.
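
To make the receiving end concrete, here is a rough sketch of what such a handler could look like - ArgKind and ArgSchema are hypothetical stand-ins for whatever runtime type descriptors the compiler would emit, and alignment is ignored for brevity:

#include <cstring>

// Hypothetical runtime schema: one type tag per argument slot.
enum class ArgKind { I32, F64, Ptr };

struct ArgSchema {
    const ArgKind* kinds;
    unsigned count;
};

// Generic handler: walks a raw, packed argument buffer according to the
// schema, recovering each argument without knowing the signature at
// compile time.
void genericHandler(const ArgSchema& schema, const void* frame) {
    const char* p = static_cast<const char*>(frame);
    for (unsigned i = 0; i < schema.count; ++i) {
        switch (schema.kinds[i]) {
        case ArgKind::I32: { int v;         std::memcpy(&v, p, sizeof v); p += sizeof v; /* record v */ break; }
        case ArgKind::F64: { double v;      std::memcpy(&v, p, sizeof v); p += sizeof v; /* record v */ break; }
        case ArgKind::Ptr: { const void* v; std::memcpy(&v, p, sizeof v); p += sizeof v; /* record v */ break; }
        }
    }
}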

On the calling side, I'm not sure that any language has a means to dynamically construct an argument list for a call. If such a thing did exist, what I suppose it would do is given a description of a function's calling signature, figure out which arguments should be put in which registers and which should be pushed on the stack. In other words, to do at runtime what LLVM currently does at compile time.

-- Talin

It so happens that I'm working on a language (and compiler for same)
that does just that and more. It has built-in support for runtime
code generation, as well as arbitrary compile-time code execution.

Basically, all functions are compiled by having my compiler create
code that issues LLVM calls to create the actual function. Thus, each
"normal" function is created by having the compiler emit and then JIT
and run a function whose purpose is to create the "normal" function.
When it's all done, the Module is written out, and opt can clean up
the mess. But if you want runtime code-generation, these
function-generating-functions can actually be called at runtime, and
you can pass parameters to them, and get basically everything you
would get from C++ templates and more. And the "normal" function code
has hooks to have its generator-function call other custom functions -
those are either the usual kind of function-generating-functions, or
"normal" functions in their own right with explicit LLVM calls to
generate code to be incorporated into the calling function's target.
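
As a rough illustration, one of these generator functions boils down to
something like the following (a minimal sketch against the current LLVM
C++ API; the real generators take parameters that steer what gets
emitted, much like template arguments):

#include <llvm/IR/IRBuilder.h>
#include <llvm/IR/Module.h>

using namespace llvm;

// A "function-generating function": when executed (e.g. via the JIT), it
// creates a new, ordinary function inside the module. Here it builds a
// trivial i32 add(i32, i32).
Function* generateAdd(Module& m) {
    LLVMContext& ctx = m.getContext();
    FunctionType* fnTy = FunctionType::get(
        Type::getInt32Ty(ctx),
        {Type::getInt32Ty(ctx), Type::getInt32Ty(ctx)},
        /*isVarArg=*/false);
    Function* f = Function::Create(fnTy, Function::ExternalLinkage, "add", &m);

    IRBuilder<> b(BasicBlock::Create(ctx, "entry", f));
    b.CreateRet(b.CreateAdd(f->getArg(0), f->getArg(1)));
    return f;
}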

I'm not sure how this could best be used in conjunction with an
existing language, but I'd be happy to post as much detail as y'all
are interested in. A "basic" variant of the language that handles all
existing LLVM types but doesn't add class constructs, or even names
for struct fields, is working fairly well as a prototype on my system,
and I'm about to start writing a standard library for it and adding
extensions such as overloaded operators and class descriptions with
named fields & methods. As it stands, "perfect forwarding" should be
well within its capabilities. (Especially since I'm not putting in
C++-like "references")

BLAST! LLVM mailing list headers are still royally screwed up...
My message is below...

Actually, many languages have ways to dynamically construct an argument
list for a call; Python is probably the easiest. You can even do it in
C++ using Boost.Fusion (Boost.Fusion has an invoke function that can
call an arbitrary function/functor with a fusion vector of parameters;
it has a lot more than that, including pieces that get C++ closer to
perfect forwarding than it has ever been before). Although Boost.Fusion
itself is not usable in the LLVM world, the way it works is. I have
even used it to implement a C++ network RPC system that serializes up
the arguments, handles pointers/references/const-ness correctly,
handles synchronization of network objects, and is used like any normal
C++ function. It let me create this syntax:

// Example class used in the calls below:
class someClass : public NetworkIDObject {
public:
    void someMethod(float f) {
        // do other stuff
    }
};

// If someClass did not have NetworkIDObject as a base anywhere in its
// multibase hierarchy, then the object itself would be serialized up as
// if by value, then passed to the remote function by pointer when
// deserialized (on the stack, eh, eh?). It has to be default
// constructible to support that, or the registerRPCCall below will not
// even compile - yes, this is completely typesafe in every way.
void _testFunc(int i, someClass *ptr) {
    // do something
}

// Example registration syntax:
networkWorld::RPCCaller testFunc =
    networkWorld->registerRPCCall("testFunc", &_testFunc, enumCallOnClientOnly);
networkWorld::RPCCaller someMethod =
    networkWorld->registerRPCCall("someMethod", &someClass::someMethod, enumCallOnServerOnly);

// Then use it as a normal function: it uses near-perfect forwarding
// (more perfect than anything else in C++) if it will be called on the
// local machine, or serializes the arguments if called remotely:
someClass s;

testFunc(42, &s);

someMethod(&s, 3.14);

No macros, no pre-processing, all pure C++ thanks to Boost.Fusion.
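
The core trick is just Fusion's invoke: you pack the arguments into a
heterogeneous sequence, and Fusion expands the sequence back into an
ordinary call. A minimal sketch (assuming only make_vector and invoke,
nothing RakNet-specific):

#include <boost/fusion/container/generation/make_vector.hpp>
#include <boost/fusion/functional/invocation/invoke.hpp>
#include <iostream>

int add(int a, float b) { return a + static_cast<int>(b); }

int main() {
    // Pack the arguments into a typed, heterogeneous sequence...
    auto args = boost::fusion::make_vector(40, 2.5f);
    // ...and let Fusion expand it back into a normal call.
    std::cout << boost::fusion::invoke(&add, args) << "\n";  // prints 42
}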

The way it works is that the creator, registerRPCCall, builds up a
recursive function (which, I have noticed, every modern C++ compiler
flattens to a single call in assembly), one layer for each argument the
function takes; Boost.Phoenix's bind then binds the method to the
networkWorld instance and returns it in a struct that has only two
members: the built-up call and the original function pointer.

At call time (like calling testFunc above) the struct recursively
builds up another function list (which the compiler again flattens to a
single call), one layer for each passed-in type. If the call will be
local, it invokes the stored function pointer with the passed-in
parameters (near-perfectly forwarded; try getting it this perfect in
C++ any other way, I dare you). If the call is remote, it calls the
built-up function with the bitstream of the network library I wrote
this for (RakNet - the 'new' RPC functionality in it is what I wrote;
download the code and take a look); the built-up function then
serializes the arguments into the bitstream and sends it out across the
network.

So, as you can see, it calls handler functions that are built up based
on the parameters, and you can do the same thing in LLVM (heck, use
Boost.Fusion to build an example, compile it with llvm-gcc, and see
what LLVM code it creates). :)

Then there are things like Python, where all non-keyword arguments
arrive in a tuple and all keyword arguments arrive in a dictionary; it
just hides it all behind normal function call syntax, but you can still
do things like this:

def testFunc(a, b, c):
    pass  # do stuff

testFunc(1, 2, 3)
testFunc(1, c=3, b=2)    # = testFunc(1,2,3)
arr = [1, 2, 3]
testFunc(*arr)           # = testFunc(1,2,3)
arr = [1]
kw = {'b': 2, 'c': 3}
testFunc(*arr, **kw)     # = testFunc(1,2,3)
kw = {'a': 1, 'b': 2, 'c': 3}
testFunc(**kw)           # = testFunc(1,2,3)

and so forth. Slower dispatching, but very powerful.

I should have been more specific - I'm aware of Python's abilities in this regard, and I've used it a number of times before. What I should have said was that I wasn't aware of any statically compiled language that can dynamically construct a call frame.

What C++ does is something different - it pre-generates at compile time a stub function for each permutation of statically typed arguments. These stub functions box/unbox the arguments. That is a powerful feature, but I don't think it is as powerful as, say, Java's "Proxy" capability, which can operate on whole classes rather than individual methods. C++ templates have no way to iterate over all of the members of a class; however, if the stub-function generator were built into the compiler to begin with, you wouldn't need to - once you have the ability to intercept any method call generically, you can do at runtime what you would normally use templates to do at compile time. (I've noticed that metaprogramming and reflection tend to get used for similar things.) Which approach (runtime or compile time) is better is really just a time vs. space tradeoff.

One approach that I have been considering is, rather than generating a pair of stub functions for each normal function, to generate a pair of stub functions for each unique type signature - the idea being that two methods with the same calling signature can share the same argument boxing/unboxing logic. The number of permutations could be reduced further by weakening the type signature - i.e. converting all pointer arguments to void* and so on.

Each stub function would have a function name which is generated from the type signature of the arguments, and then these functions would be given "linkonce" linkage, so that duplicates would get coalesced at link time.
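
Roughly what I have in mind, sketched against the LLVM C++ API (getOrCreateStub and the __box_stub_ naming are made up for illustration):

#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/Module.h>
#include <string>

// Get or create the boxing stub for a given (already mangled) signature
// string. "linkonce" linkage lets the linker coalesce identical stubs
// emitted into different modules.
llvm::Function* getOrCreateStub(llvm::Module& m, llvm::FunctionType* stubTy,
                                const std::string& mangledSig) {
    std::string name = "__box_stub_" + mangledSig;  // name derived from the signature
    if (llvm::Function* existing = m.getFunction(name))
        return existing;
    return llvm::Function::Create(stubTy, llvm::GlobalValue::LinkOnceODRLinkage,
                                  name, &m);
}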

What's missing from this picture is a scheme for encoding a set of LLVM argument types as a string function name. In particular, I'd need some way to accommodate "structs as first class data types" arguments.
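
One possible encoding, as a sketch over llvm::Type (the letters chosen here are arbitrary; structs are handled by recursing into their element types, so first-class struct arguments get a structural name):

#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Type.h>
#include <llvm/Support/Casting.h>
#include <string>

// Hypothetical mangling: map an LLVM argument type to a short string.
std::string mangleType(llvm::Type* ty) {
    if (ty->isIntegerTy())
        return "i" + std::to_string(ty->getIntegerBitWidth());
    if (ty->isFloatTy())  return "f";
    if (ty->isDoubleTy()) return "d";
    if (ty->isPointerTy()) return "p";  // weaken every pointer to a plain void*
    if (auto* st = llvm::dyn_cast<llvm::StructType>(ty)) {
        // First-class struct argument: encode the field types in order.
        std::string out = "s";
        for (llvm::Type* elt : st->elements())
            out += mangleType(elt);
        return out + "e";
    }
    return "x";                         // anything else: treat as opaque
}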

Blast! LLVM mailing server *still* has the headers broken...

Boost.Fusion has adapters to convert structs/classes into
tuples/kwtuples. You have to write the short and simple adapter for
your classes, but it would allow you to iterate over
everything/anything in it. I have done this exact thing. Not quite
fully what you want, but Boost.Fusion and other bits of other boost
metalibraries seem to do most of what you want. Might be interesting
to look at?
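
For example (a minimal sketch; Packet is just an illustration), once a
struct is adapted with BOOST_FUSION_ADAPT_STRUCT, Fusion can walk its
fields like any other sequence:

#include <boost/fusion/adapted/struct/adapt_struct.hpp>
#include <boost/fusion/algorithm/iteration/for_each.hpp>
#include <iostream>

struct Packet { int id; float value; };

// The "short and simple adapter": after this, Packet is a Fusion sequence.
BOOST_FUSION_ADAPT_STRUCT(Packet, (int, id)(float, value))

int main() {
    Packet p = { 7, 3.5f };
    // Iterate over every field generically, e.g. to serialize or record it.
    boost::fusion::for_each(p, [](const auto& field) { std::cout << field << ' '; });
}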

Possibly. Bear in mind that part of my goal is to make something like Boost obsolete - because my language's type inference engine is more powerful than C++'s template resolution algorithms, it will allow many of the same things that Boost does, but without requiring the complex metaprogramming techniques that Boost uses to implement them. At least, that's the plan.

Gah, headers not set appropriately, message I sent is forwarded below
(it still takes a good extra 45-90 seconds to change the "To" line on
this device...):