Proposal for TableML, llvmc2 configuration language

Patrick_Walton · November 26, 2008, 9:34pm

Hi,

I've been working on a proof of concept for a new configuration language for LLVM: specifically for my needs in llvmc2, but I have tried to make it as generic as possible for use throughout LLVM if other projects would like to make use of it. It's a compiler that compiles a near-subset of Standard ML to C++, with an architecture deliberately very similar to TableGen.

The code is not yet ready to be merged by any means - it has many failure cases and may not compile at any given time - but I thought that before I go further I should send a proposal to the list. The WIP code, for the curious, is here:

http://github.com/pcwalton/llvm-nw/tree/miniml

If TableGen is a language that allows users to specify records of domain-specific information, TableML is a configuration language designed to allow users to specify how to *construct* records of domain-specific information. TableML has a plugin architecture in which at any given time one of several backends is in use, just as in TableGen. The backends specify one or more record types and definitions. TableML then reads a configuration file, evaluates the definitions, and passes the results to the backend for serialization.
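To make the backend contract a little more concrete, here is a minimal C++ sketch of what such a plugin interface might look like. Everything in it is an assumption for illustration: the names Backend, declared, serialize, Definitions and RegisterInfoBackend are hypothetical and are not taken from the WIP code, and evaluated values are restricted to string lists to keep it short.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: evaluated definitions, restricted to string lists for brevity.
using StringList = std::vector<std::string>;
using Definitions = std::map<std::string, StringList>;

// Hypothetical plugin interface; not the actual TableML API.
class Backend {
public:
  virtual ~Backend() = default;
  // Definition names the backend expects, mapped to their ML-style types.
  virtual std::map<std::string, std::string> declared() const = 0;
  // Receives the already-evaluated definitions and serializes them.
  virtual std::string serialize(const Definitions &defs) const = 0;
};

class RegisterInfoBackend : public Backend {
public:
  std::map<std::string, std::string> declared() const override {
    return {{"RegisterNames", "string list"}};
  }

  std::string serialize(const Definitions &defs) const override {
    std::ostringstream out;
    out << "static const char *const RegisterNames[] = {";
    const StringList &names = defs.at("RegisterNames");
    for (std::size_t i = 0; i < names.size(); ++i)
      out << (i ? ", " : " ") << '"' << names[i] << '"';
    out << " };";
    return out.str();
  }
};
```

In this sketch the driver would typecheck the input file against declared(), evaluate it, and only then call serialize(), so the backend never sees unevaluated or ill-typed values.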

For instance, we might have a RegisterInfo backend that declares a definition of "RegisterNames : string list". Then we could have a TableML input file like this:

def val RegisterInfo = [ "eax", "ebx", "ecx", "edx" ]

Or we could have a more complex one that performs computation to produce the result.

val make32bit = (fn x => strcat("e", x))
def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]

Obviously, this example is somewhat contrived, but it's just to illustrate that arbitrary computation is allowed (and is performed at compile time), as long as the definitions end up with the correct types. This could be thought of as a generalization of the "class" and "multiclass" concepts in TableGen. Also notice that, like all ML-based languages, TableML is strongly typed, and it makes heavy use of Hindley-Milner type inference. (The parser, lexer, and typechecker are all coded already, by the way, just not very well tested at the moment.) The subset of Standard ML that TableML supports is essentially the one shown here:
http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi
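As a rough illustration of what "performed at compile time" means for the make32bit example, the following C++ sketch mirrors the computation the TableML evaluator would carry out before handing the result to the backend. It is only an analogy under stated assumptions: strcat2 is a hypothetical stand-in for the strcat builtin, and evalRegisterInfo is an invented name, not part of TableML.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for TableML's strcat builtin.
static std::string strcat2(const std::string &a, const std::string &b) {
  return a + b;
}

// Mirrors `val make32bit = (fn x => strcat("e", x))`.
static std::string make32bit(const std::string &x) { return strcat2("e", x); }

// Mirrors `map make32bit [ "ax", "bx", "cx", "dx" ]`.
// The backend would only ever see the resulting list of strings.
std::vector<std::string> evalRegisterInfo() {
  const std::vector<std::string> names = {"ax", "bx", "cx", "dx"};
  std::vector<std::string> result;
  result.reserve(names.size());
  std::transform(names.begin(), names.end(), std::back_inserter(result),
                 make32bit);
  return result;
}
```

The point is that none of this code would exist in the generated output: the backend receives the same four strings as in the first, literal version of the input file.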

Now the upshot of this for the compiler driver is that function types are acceptable types for definitions. This means that, unlike TableGen, backends that want to allow scripting (which is currently just llvmc2) don't have to define their own programming languages. Instead, they can simply request a definition with a function type (e.g. SomeFunction : int -> int). TableML will hand the AST for the function, as well as its values, over to the backend for emission as C++ code. The backend is free to generate any C++ code it wants for the typed ASTs (of course, some support routines could be added to the base to make this easier).
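To sketch how a backend might turn a function-typed definition into C++, here is a small, hypothetical typed AST with an emitter for a definition such as SomeFunction : int -> int. The node classes (Expr, IntLit, Var, Add, IntFn) and the helper emitFunction are assumptions made for this example; they are not the classes used in the WIP code.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical AST node; each node renders itself as a C++ expression.
struct Expr {
  virtual ~Expr() = default;
  virtual std::string emit() const = 0;
};

struct IntLit : Expr {
  int value;
  explicit IntLit(int v) : value(v) {}
  std::string emit() const override { return std::to_string(value); }
};

struct Var : Expr {
  std::string name;
  explicit Var(std::string n) : name(std::move(n)) {}
  std::string emit() const override { return name; }
};

struct Add : Expr {
  std::unique_ptr<Expr> lhs, rhs;
  Add(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::string emit() const override {
    return "(" + lhs->emit() + " + " + rhs->emit() + ")";
  }
};

// A function of type int -> int: one parameter name and a body.
struct IntFn {
  std::string param;
  std::unique_ptr<Expr> body;
};

// Emits a C++ function definition for an int -> int TableML function.
std::string emitFunction(const std::string &name, const IntFn &fn) {
  return "int " + name + "(int " + fn.param + ") { return " +
         fn.body->emit() + "; }";
}

// Builds the AST for `fn x => x + 1`.
IntFn makeIncrement() {
  return IntFn{"x", std::make_unique<Add>(std::make_unique<Var>("x"),
                                          std::make_unique<IntLit>(1))};
}
```

With these definitions, emitFunction("SomeFunction", makeIncrement()) yields the string `int SomeFunction(int x) { return (x + 1); }`. A real backend would of course be free to emit something quite different for the same AST.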

So, in summary, there are two main benefits to TableML that I see, depending on the backend/use case:
(1) Users of backends that don't need scripting support can benefit from arbitrary computation in order to express the records, more than the macro facility that TableGen provides.
(2) Users of backends that do need scripting support don't have to define their own programming languages, and they incur no run-time performance loss compared to TableGen.

I'd definitely appreciate any comments on this proposal! I'd also be happy to clarify any issues with this explanation.

Patrick

Mikhail_Glushenkov · November 27, 2008, 9:42pm

Hi Patrick,

> I've been working on a proof of concept for a new configuration language
> for LLVM: specifically for my needs in llvmc2, but I have tried to make
> it as generic as possible for use throughout LLVM if other projects
> would like to make use of it.

Your proposal seems interesting - I especially like that you are using a
functional language. Once your compiler is able to generate llvmc plugins,
it will provide a nice TableGen alternative for llvmc.

> val make32bit = (fn x => strcat("e", x))
> def val RegisterInfo = map make32bit [ "ax", "bx", "cx", "dx" ]

It'd probably be nice if it was possible to syntactically distinguish between
what is evaluated at run-time and at compile-time (like in Template Haskell).

> The subset of Standard ML that TableML supports is essentially
> the one shown here:
> http://www.macs.hw.ac.uk/ultra/compositional-analysis/type-error-slicing/slicing.cgi

As I understand from this link, TableML supports only lists and some primitive
types (no algebraic datatypes).

That'd be enough for llvmc, but I can't speak for the other
backends; you'll probably need to integrate some additional
syntactic sugar to cater to their needs.

> This means that, unlike TableGen,
> backends that want to allow scripting (which is currently just llvmc2)
> don't have to define their own programming languages. Instead, they can
> simply request a definition with a function type (e.g. SomeFunction :
> int -> int). TableML will hand the AST for the function, as well as its
> values, over to the backend for emission as C++ code.

Another (pie-in-the-sky) option is to compile TableML to LLVM IR and integrate
llvmc with the JIT engine.
That way llvmc won't even need a C++ compiler present to support plugins.
But that's probably too heavyweight for a humble compiler driver:)

Patrick_Walton · November 28, 2008, 1:35am

> It'd probably be nice if it was possible to syntactically distinguish between
> what is evaluated at run-time and at compile-time (like in Template Haskell).

Well, it is in a sense: things evaluated at run time will always be inside lambda functions, while things evaluated at compile time aren't.
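A minimal C++ sketch of that staging rule, assuming a toy expression AST: anything outside a lambda is evaluated (here, constant-folded) at compile time, while a lambda is returned untouched so its body can be emitted as run-time code. The Expr layout and the helpers mkInt, mkAdd, mkLambda and stage are all hypothetical names chosen for this example.

```cpp
#include <memory>
#include <string>
#include <utility>

struct Expr;
using ExprPtr = std::shared_ptr<const Expr>;

// Hypothetical toy AST: integers, variables, addition, and lambdas.
struct Expr {
  enum Kind { Int, Var, Add, Lambda } kind = Int;
  int value = 0;     // used by Int
  std::string name;  // Var name, or Lambda parameter name
  ExprPtr lhs, rhs;  // Add operands; a Lambda keeps its body in lhs
};

ExprPtr mkInt(int v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Int;
  e->value = v;
  return e;
}

ExprPtr mkAdd(ExprPtr l, ExprPtr r) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Add;
  e->lhs = std::move(l);
  e->rhs = std::move(r);
  return e;
}

ExprPtr mkLambda(std::string param, ExprPtr body) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Lambda;
  e->name = std::move(param);
  e->lhs = std::move(body);
  return e;
}

// Compile-time stage: fold additions of known integers, but never look
// inside a lambda. Its body is deferred to run time, i.e. to emitted code.
ExprPtr stage(const ExprPtr &e) {
  switch (e->kind) {
  case Expr::Add: {
    ExprPtr l = stage(e->lhs);
    ExprPtr r = stage(e->rhs);
    if (l->kind == Expr::Int && r->kind == Expr::Int)
      return mkInt(l->value + r->value);
    return mkAdd(l, r);
  }
  case Expr::Lambda:
    return e;  // body left as-is for the backend to emit
  default:
    return e;
  }
}
```

Under this sketch, staging `2 + 3` produces the integer 5, while staging `fn x => 2 + 3` returns the lambda with its addition still in place. An optimizing implementation might fold inside the lambda as well; the sketch leaves it alone only to make the compile-time/run-time boundary visible.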

> As I understand from this link, TableML supports only lists and some primitive
> types (no algebraic datatypes).
>
> That'd be enough for llvmc, but I can't speak for the other
> backends; you'll probably need to integrate some additional
> syntactic sugar to cater to their needs.

The current plan is that backends will be able to define their own datatypes in the Standard ML sense, with explicit constructors.

> Another (pie-in-the-sky) option is to compile TableML to LLVM IR and integrate
> llvmc with the JIT engine.
> That way llvmc won't even need a C++ compiler present to support plugins.
> But that's probably too heavyweight for a humble compiler driver:)

At first I considered that, but this might create a bootstrapping problem: if TableML is to become an alternative to TableGen, then we could get into a situation in which TableML is needed to compile LLVM, and LLVM is needed to compile TableML.

Thanks for the feedback!

Patrick

Mike_Stump1 · November 29, 2008, 9:58pm

Not lisp? [ runs away ducking ]
