Test compiler help

Hi all,

I'm writing a test compiler to understand the overall structure of
LLVM and I managed to produce IR for expressions, functions and
function calls.

I'm following the Kaleidoscope example and had a hard time de-tangling
the language specifics from LLVM syntax. Anyhow, I got here and I'd
like some more specific help to finish my example.

My very simple language has global variables and static functions
only, so I can focus on the code generation. My problems are below.

1. Kaleidoscope uses function calls, which I believe prepare the
stack, save the link register and so on, none of which I'm interested
in, as everything is static. What's the correct way to create a
simple jump instead of a call?

2. Each static function is called a "state". At the end, it MUST jump
to the next state, even if it's itself. If it doesn't, the program
must end. I also want to create an end of the program whenever the token
"end" is parsed. How can I do it?

3. When the program is running, I'd like to print some values. As
Kaleidoscope "returns" the value to the JIT, it's easy to give the
answer, but in my case it won't happen. Do I need to create a simple
IR routine to print doubles or is there something easy to use for that
purpose?

Thanks!
--renato

PS: Where's the best place to find such information?

  1. Kaleidoscope uses function calls, which I believe prepare the
    stack, save the link register and so on, none of which I’m interested
    in, as everything is static. What’s the correct way to create a
    simple jump instead of a call?

Jumps in LLVM are branch (br) instructions to BasicBlocks, and a branch
can only target a block within the same function.

  2. Each static function is called a “state”. At the end, it MUST jump
    to the next state, even if it’s itself. If it doesn’t, the program
    must end. I also want to create an end of the program whenever the token
    “end” is parsed. How can I do it?

I’m not sure I understand this. (Are you implementing a Turing machine?)
It sounds like “state” here is global, rather than per-function, so make it a global array.

  3. When the program is running, I’d like to print some values. As
    Kaleidoscope “returns” the value to the JIT, it’s easy to give the
    answer, but in my case it won’t happen. Do I need to create a simple
    IR routine to print doubles or is there something easy to use for that
    purpose?

Yes, you’ll probably need to implement printing functionality.

PS: Where’s the best place to find such information?

http://llvm.org/docs/ has a search box at the top of the page.

3. When the program is running, I'd like to print some values. As
Kaleidoscope "returns" the value to the JIT, it's easy to give the
answer, but in my case it won't happen. Do I need to create a simple
IR routine to print doubles or is there something easy to use for that
purpose?

Yes, you'll probably need to implement printing functionality.

Or you could link in clang-compiled C code that does it, or call out
to code that does it.

Reid

3. When the program is running, I'd like to print some values. As
Kaleidoscope "returns" the value to the JIT, it's easy to give the
answer, but in my case it won't happen. Do I need to create a simple
IR routine to print doubles or is there something easy to use for that
purpose?

Yes, you'll probably need to implement printing functionality.

libc normally gets linked in automatically, so functions like puts or
printf are available: declare them in your module and call them. This
is easy to see in the output of the C demo compiler:

http://llvm.org/demo/index.cgi

Or you could link in clang-compiled C code that does it, or call out
to code that does it.

And along with this, you can use the llvm-gcc family of compilers to
generate bitcode from C, C++, Objective-C, etc. to implement your
runtime. LLVM also supports the C calling convention, so you can link
against normal C libraries. However, LLVM won't be able to optimize
across calls into natively compiled libraries as well as it can with a
bitcode runtime.
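
As a rough illustration of the "runtime written in C" idea, a printing
helper could be as small as the sketch below. The name print_double is
made up for this example; the generated IR would have to declare an
external function with the same name and a double parameter, then call
it.

```c
#include <stdio.h>

/* Hypothetical runtime helper (the name is an assumption, not part of
   LLVM or Kaleidoscope). Generated code would declare it as an external
   function taking a double and call it whenever a value is printed. */
int print_double(double value)
{
    /* printf returns the number of characters written,
       or a negative value on error. */
    return printf("%f\n", value);
}
```

Compiled to bitcode or to a normal object file, this gets linked with
the generated program like any other C function.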

1. Kaleidoscope uses function calls, which I believe prepare the
stack, save the link register and so on, none of which I'm interested
in, as everything is static. What's the correct way to create a
simple jump instead of a call?

One way to do this is to just use one big Function and use branches
to do the jumping. If you're just making a simple state machine,
this may be the best approach.
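
To make the single-function approach concrete, here is a C analogy
rather than real LLVM API code. Each label stands in for a BasicBlock,
each goto for an unconditional branch, and the if/else for a conditional
branch. The state names, the counter global and run_machine are all
invented for this sketch.

```c
/* Hypothetical three-state machine that counts `counter` up to `limit`. */
static double counter = 0.0;   /* a "global variable" of the toy language */

int run_machine(double limit)
{
    int transitions = 0;       /* how many states were entered */
    counter = 0.0;

state_start:
    transitions++;
    goto state_count;          /* unconditional jump to the next state */

state_count:
    counter += 1.0;
    transitions++;
    if (counter < limit)
        goto state_count;      /* a state may jump to itself */
    else
        goto state_end;

state_end:
    return transitions;        /* the "end" token: return from the function */
}
```

In IR, every block would similarly end in exactly one terminator:
either a branch to another block or a return.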

Another way to do this is with tail calls. In short, you put a call at
the end of each function which "calls" the next function, marked
with the "tail" keyword, and mark all the functions and calls with the
"fastcc" keyword. Check out LangRef.html for details on these
keywords. Then run llc with the -tailcallopt flag, and it should
optimize the call into a simple jump. This approach ought to be more
suitable for more involved functional-style programming languages,
though I don't personally have experience using LLVM in this way.
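
The shape of the tail-call model can be sketched in C, with the caveat
that C itself does not guarantee the calls become jumps; that guarantee
is what "tail", "fastcc" and -tailcallopt are for on the LLVM side. The
function names and the remaining global are invented for this example.

```c
/* Hypothetical two-state program written in tail-call style. */
static int remaining = 0;

static int state_finish(int visited);

static int state_loop(int visited)
{
    if (remaining > 0) {
        remaining--;
        return state_loop(visited + 1);  /* tail position: jump to itself */
    }
    return state_finish(visited + 1);    /* tail position: jump to next state */
}

static int state_finish(int visited)
{
    return visited + 1;                  /* no further call: the program ends */
}

/* Entry point: sets up the global and enters the first state. */
int run_program(int loops)
{
    remaining = loops;
    return state_loop(0);
}
```

Every state ends in either a call to the next state or a plain return,
so nothing is left to do in the caller after the call.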

2. Each static function is called a "state". At the end, it MUST jump
to the next state, even if it's itself. If it doesn't, the program
must end. I also want to create an end of the program whenever the token
"end" is parsed. How can I do it?

The program exits after main returns, so with the tail call model,
the way to exit the program is to have a function just return,
instead of calling another function. For the branch model, just
return from main.

Dan

Hi All,

Thanks for all the ideas. It seems the easiest approach is to create
a single main function, jump back and forth between blocks goto-style,
and put a return at the end of every block that doesn't jump, to avoid
an unwanted state change.

Although, if I ever want to merge two pieces of code, which is not at
all unlikely, I might have non-trivial problems. In that case, the
tail call optimization seems a good idea to follow.

Luckily, in both cases, a simple return should suffice to exit the program.

I'm trying to make my code as educational as possible and maybe that
can also help other people to understand the basics of LLVM (when it's
finished)...

cheers,
--renato


Alternatively, you could do your own trampolining, so that you don't
have to worry about whether tail call optimization is working. Make
each state be a function that returns a function pointer to the next
state or NULL, and then make the main function call that pointer in a
loop.

Reid
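
A minimal sketch of that trampoline in C follows. C cannot directly
spell "a function returning a pointer to its own type", so the pointer
is wrapped in a struct. The state names, the ticks global and
run_states are assumptions made up for this example.

```c
#include <stddef.h>

/* A state returns the next state to run, or NULL to stop. */
struct state { struct state (*run)(void); };

static int ticks = 0;                 /* hypothetical global of the toy program */

static struct state state_done(void);
static struct state state_tick(void);

static struct state state_tick(void)
{
    ticks++;
    /* Jump to itself until three ticks have happened, then move on. */
    struct state next = { ticks < 3 ? state_tick : state_done };
    return next;
}

static struct state state_done(void)
{
    struct state stop = { NULL };     /* NULL means "end of program" */
    return stop;
}

/* The trampoline: the only real call site; every state returns here. */
int run_states(struct state current)
{
    int executed = 0;                 /* number of states that ran */
    while (current.run != NULL) {
        current = current.run();
        executed++;
    }
    return executed;
}
```

Because each state returns before the next one starts, the stack never
grows, whether or not the compiler performs tail call optimization.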