Hi,
I've been working on a proof of concept for a new configuration language for LLVM. It is aimed specifically at my needs in llvmc2, but I have tried to make it as generic as possible so that other projects throughout LLVM can use it if they like. It's a compiler that compiles a near-subset of Standard ML to C++, with an architecture deliberately very similar to TableGen.
The code is not yet ready to be merged by any means - it has many failure cases and may not compile at any given time - but I thought that before I go further I should send a proposal to the list. The WIP code, for the curious, is here:
http://github.com/pcwalton/llvm-nw/tree/miniml
If TableGen is a language that allows users to specify records of domain-specific information, TableML is a configuration language that allows users to specify how to *construct* records of domain-specific information. TableML has a plugin architecture in which one of several backends is in use at any given time, just as in TableGen. Each backend specifies one or more record types and definitions. TableML then reads a configuration file, evaluates the definitions, and passes the results to the backend for serialization.
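To make that flow a bit more concrete, here is a rough C++ sketch of how a backend might plug in. Everything in it is an assumption made for illustration: the names Definition, Backend, StringListBackend, and runBackend are hypothetical and are not taken from the WIP code, and the sketch handles string lists only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of one evaluated definition: its name, its TableML type
// written as a string, and its evaluated values (string lists only here).
struct Definition {
  std::string name;
  std::string type;
  std::vector<std::string> values;
};

// Hypothetical backend interface: a backend declares the definitions it
// expects and serializes them once the input file has been evaluated.
class Backend {
public:
  virtual ~Backend() = default;
  virtual std::map<std::string, std::string> expectedDefinitions() const = 0;
  virtual std::string serialize(const std::vector<Definition> &defs) const = 0;
};

// Example backend that expects a single definition "Names : string list"
// and serializes it as a C++ array of string literals.
class StringListBackend : public Backend {
public:
  std::map<std::string, std::string> expectedDefinitions() const override {
    return {{"Names", "string list"}};
  }

  std::string serialize(const std::vector<Definition> &defs) const override {
    std::ostringstream out;
    for (const Definition &def : defs) {
      out << "static const char *const " << def.name << "[] = {";
      for (std::size_t i = 0; i < def.values.size(); ++i)
        out << (i ? ", " : "") << '"' << def.values[i] << '"';
      out << "};\n";
    }
    return out.str();
  }
};

// Driver step: check every evaluated definition against the types the
// backend declared, then hand the results over for serialization.
inline std::string runBackend(const Backend &backend,
                              const std::vector<Definition> &defs) {
  const std::map<std::string, std::string> expected =
      backend.expectedDefinitions();
  for (const Definition &def : defs) {
    auto it = expected.find(def.name);
    assert(it != expected.end() && it->second == def.type);
    (void)it;  // silence unused-variable warnings when NDEBUG is defined
  }
  return backend.serialize(defs);
}
```

In this sketch the type check is a plain string comparison; a real implementation would presumably compare the types produced by the typechecker instead.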
For instance, we might have a RegisterInfo backend that declares a definition of "RegisterNames : string list". Then we could have a TableML input file like this:
def val RegisterInfo = [ "eax", "ebx", "ecx", "edx" ]
Or we could have a more complex one that performs computation to produce the result:
val make32bit = (fn x => strcat("e", x))
def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]
Obviously, this example is somewhat contrived, but it illustrates that arbitrary computation is allowed (and is performed at compile time), as long as the definitions end up with the correct types. This could be thought of as a generalization of the "class" and "multiclass" concepts in TableGen. Also notice that, like all ML-based languages, TableML is strongly typed, and it makes heavy use of Hindley-Milner type inference. (The parser, lexer, and typechecker are all coded already, by the way; they are just not very well tested at the moment.) The subset of Standard ML that TableML supports is essentially the one shown here:
http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi
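For readers who don't read ML, the computation in the register example above corresponds roughly to the following C++. This is only a hand-written stand-in for what TableML would evaluate at compile time, not output from the tool; the function name evalRegisterInfo is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// C++ stand-in for: val make32bit = (fn x => strcat("e", x))
inline std::string make32bit(const std::string &x) { return "e" + x; }

// C++ stand-in for: map make32bit [ "ax", "bx", "cx", "dx" ]
inline std::vector<std::string> evalRegisterInfo() {
  const std::vector<std::string> input = {"ax", "bx", "cx", "dx"};
  std::vector<std::string> result;
  result.reserve(input.size());
  // Apply make32bit to each element, preserving order, as map does.
  std::transform(input.begin(), input.end(), std::back_inserter(result),
                 make32bit);
  assert(result.size() == input.size());
  return result;
}
```

The returned list is the same [ "eax", "ebx", "ecx", "edx" ] that the first, literal version of the input file spells out by hand.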
Now the upshot of this for the compiler driver is that function types are acceptable types for definitions. This means that, unlike with TableGen, backends that want to allow scripting (currently just llvmc2) don't have to define their own programming languages. Instead, they can simply request a definition with a function type (e.g. SomeFunction : int -> int). TableML will hand the AST for the function, as well as its values, over to the backend for emission as C++ code. The backend is free to generate any C++ code it wants for the typed ASTs (of course, some support routines could be added to the base to make this easier).
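As a rough idea of what emitting C++ from a typed AST could look like, here is a minimal sketch for a definition of type int -> int. The AST classes (Expr, IntLit, Var, BinOp), the IntFunction record, and emitFunction are all hypothetical; TableML's real AST and any support routines may look quite different.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical typed AST for integer expressions.
struct Expr {
  virtual ~Expr() = default;
  virtual std::string emit() const = 0;  // render as a C++ expression
};

struct IntLit : Expr {
  int value;
  explicit IntLit(int v) : value(v) {}
  std::string emit() const override { return std::to_string(value); }
};

struct Var : Expr {
  std::string name;
  explicit Var(std::string n) : name(std::move(n)) {}
  std::string emit() const override { return name; }
};

struct BinOp : Expr {
  char op;
  std::unique_ptr<Expr> lhs, rhs;
  BinOp(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
  // Always parenthesize so operator precedence never changes the meaning.
  std::string emit() const override {
    return "(" + lhs->emit() + " " + op + " " + rhs->emit() + ")";
  }
};

// A definition of type int -> int: a name, one parameter, and a body.
struct IntFunction {
  std::string name;
  std::string param;
  std::unique_ptr<Expr> body;
};

// Emit the definition as a C++ function taking and returning int.
inline std::string emitFunction(const IntFunction &fn) {
  assert(fn.body && "function definition needs a body");
  return "int " + fn.name + "(int " + fn.param + ") { return " +
         fn.body->emit() + "; }";
}
```

Under these assumptions, a definition such as fn x => x + 1 bound to SomeFunction would be built as a BinOp over a Var and an IntLit, and emitFunction would turn it into an ordinary C++ function that the generated code can call directly.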
So, in summary, there are two main benefits to TableML that I see, depending on the backend/use case:
(1) Users of backends that don't need scripting support can use arbitrary computation to express their records, rather than being limited to the macro facility that TableGen provides.
(2) Users of backends that do need scripting support don't have to define their own programming languages, and they incur no run-time performance loss compared to TableGen.
I'd definitely appreciate any comments on this proposal! I'd also be happy to clarify any issues with this explanation.
Patrick