LLVM has long needed a tutorial for people who are interested in using it to implement their favorite language and to demonstrate how to use the JIT. To help solve this, I've put together a little tutorial that runs through the implementation and extension of a toy language here:
At this point, the tutorial is feature complete, but might still need some final editing. Before I try to get other sites to link to it, I thought it would be good to get some more eyeballs on it and get some thoughts and feedback from you.
Anyone have thoughts or feedback?
Nice job. The only bit that is not immediately clear is the 'Proto' variable, but is clear when looking through the code at the end of the page.
Could do with a link to the LLVMBuilder class reference material.
A big thanks for it. I only had a glance at it (so I may be wrong), but I would like an exact explanation of llvm/examples/HowToUseJIT/. In particular, it seems that the exact way of calling the just JIT-ed code is skipped in this tutorial. Maybe I missed some link.
Nice work, Chris! This is a much needed tutorial. Some comments:
– It would be helpful to add some navigation links at the top and bottom of the pages.
– Not clear why PrototypeAST::Codegen returns a Function* instead of a Value* like the other CodeGen methods?
– It can be convenient to use Visitor methods on the AST classes for code generation, instead of hard-coding the CodeGen methods into the AST classes. Many languages will need other traversals besides the CodeGen operations, e.g., for type checking, class layout, etc.
– In the optimization section, how much code size reduction have you seen in practice by doing constant folding in the LLVMFoldingBuilder (instead of just letting the later optimization pass take care of it)?
– One of the harder parts is generating type declarations correctly for recursively connected types (e.g., for classes). Without extending Kaleidoscope, it would be helpful to add a link to the Programmers Manual section on recursive type construction.
Found some typos in LangImpl2.html:
/// identifierexpr
/// ::= identifer
/// ::= identifer '(' expression* ')'
Fixed, thanks!
-- It would be helpful to add some navigation links at the top and bottom of the pages.
I added a TOC to each chapter, thanks.
-- Not clear why PrototypeAST::Codegen returns a Function* instead of a Value* like the other CodeGen methods?
I clarified this in the text. The short version is that PrototypeAST doesn't correspond to an expression value.
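To illustrate the distinction, here is a minimal sketch that uses stand-in `Value` and `Function` types rather than LLVM's real classes. The `HasBody` flag and the `unique_ptr` ownership are assumptions made for this example only (the tutorial returns raw pointers, and its `FunctionAST` also holds a body expression, which is omitted here). It also shows what `Proto` refers to in `FunctionAST::Codegen`: the prototype node that carries the function's name and argument names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in IR types for illustration only; these are NOT LLVM's classes.
struct Value {
  virtual ~Value() = default;
};
struct Function : Value {
  std::string Name;
  std::vector<std::string> Args;
  bool HasBody = false; // hypothetical flag standing in for "has basic blocks"
};

// A prototype is a function signature: a name plus argument names.
class PrototypeAST {
  std::string Name;
  std::vector<std::string> Args;

public:
  PrototypeAST(std::string N, std::vector<std::string> A)
      : Name(std::move(N)), Args(std::move(A)) {}

  // Returns a Function, not a generic Value: a prototype declares a
  // function, it is not an expression that computes a value.
  std::unique_ptr<Function> Codegen() const {
    auto F = std::make_unique<Function>();
    F->Name = Name;
    F->Args = Args;
    return F;
  }
};

// A function definition: its prototype (Proto) plus, in the tutorial, a body.
class FunctionAST {
  std::unique_ptr<PrototypeAST> Proto;

public:
  explicit FunctionAST(std::unique_ptr<PrototypeAST> P) : Proto(std::move(P)) {}

  std::unique_ptr<Function> Codegen() const {
    // The caller needs Function-specific operations on the result, which
    // is why a plain Value would be inconvenient here.
    std::unique_ptr<Function> F = Proto->Codegen();
    F->HasBody = true; // stand-in for emitting the body into the function
    return F;
  }
};
```

Because the caller immediately needs to attach a body to the result, returning the more specific type avoids a downcast at every call site.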
-- It can be convenient to use Visitor methods on the AST classes for code generation, instead of hard-coding the CodeGen methods into the AST classes. Many languages will need other traversals besides the CodeGen operations, e.g., for type checking, class layout, etc.
Yep, the tutorial also leaks memory, uses global variables, and commits a number of other sins against good software engineering practice. However, I added a mention of the possibility of using a visitor to the text.
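For readers curious what the visitor approach could look like, here is a minimal sketch. It is not the tutorial's code: the `ASTVisitor`, `EvalVisitor`, `PrintVisitor`, and `makeSample` names are hypothetical, and the two traversals (evaluate and print) merely stand in for passes such as type checking or code generation.

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>

class NumberExprAST;
class BinaryExprAST;

// One visit() overload per concrete node type.
class ASTVisitor {
public:
  virtual ~ASTVisitor() = default;
  virtual void visit(const NumberExprAST &N) = 0;
  virtual void visit(const BinaryExprAST &B) = 0;
};

class ExprAST {
public:
  virtual ~ExprAST() = default;
  // The only traversal hook the AST classes need to know about.
  virtual void accept(ASTVisitor &V) const = 0;
};

class NumberExprAST : public ExprAST {
public:
  double Val;
  explicit NumberExprAST(double V) : Val(V) {}
  void accept(ASTVisitor &V) const override { V.visit(*this); }
};

class BinaryExprAST : public ExprAST {
public:
  char Op;
  std::unique_ptr<ExprAST> LHS, RHS;
  BinaryExprAST(char O, std::unique_ptr<ExprAST> L, std::unique_ptr<ExprAST> R)
      : Op(O), LHS(std::move(L)), RHS(std::move(R)) {}
  void accept(ASTVisitor &V) const override { V.visit(*this); }
};

// Traversal 1: evaluate the expression.
class EvalVisitor : public ASTVisitor {
public:
  double Result = 0;
  void visit(const NumberExprAST &N) override { Result = N.Val; }
  void visit(const BinaryExprAST &B) override {
    B.LHS->accept(*this);
    double L = Result;
    B.RHS->accept(*this);
    double R = Result;
    switch (B.Op) {
    case '+': Result = L + R; break;
    case '-': Result = L - R; break;
    case '*': Result = L * R; break;
    default:  Result = 0; break; // unknown operator in this sketch
    }
  }
};

// Traversal 2: print the expression fully parenthesized.
class PrintVisitor : public ASTVisitor {
public:
  std::ostringstream Out;
  void visit(const NumberExprAST &N) override { Out << N.Val; }
  void visit(const BinaryExprAST &B) override {
    Out << '(';
    B.LHS->accept(*this);
    Out << ' ' << B.Op << ' ';
    B.RHS->accept(*this);
    Out << ')';
  }
};

// Builds the tree for 1 + 2 * 3.
std::unique_ptr<ExprAST> makeSample() {
  auto Mul = std::make_unique<BinaryExprAST>(
      '*', std::make_unique<NumberExprAST>(2), std::make_unique<NumberExprAST>(3));
  return std::make_unique<BinaryExprAST>(
      '+', std::make_unique<NumberExprAST>(1), std::move(Mul));
}
```

Adding another traversal then means writing one more `ASTVisitor` subclass, with no change to the AST classes themselves. The cost is that adding a new node type touches every visitor.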
-- In the optimization section, how much code size reduction have you seen in practice by doing constant folding in the LLVMFoldingBuilder (instead of just letting the later optimization pass take care of it)?
Depends on the language. For C, it can be significant. In any case, there is no reason not to use the folding builder, so it doesn't hurt anything.
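The idea behind a folding builder can be sketched without LLVM at all. In the hedged example below, `Value`, `ConstantFP`, `Argument`, `AddInst`, and `FoldingBuilder` are stand-in types written for this illustration, not LLVM's classes or the real LLVMFoldingBuilder interface. When both operands are constants, the builder returns a new constant instead of emitting an instruction.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in IR types for illustration only; these are NOT LLVM's classes.
struct Value {
  virtual ~Value() = default;
};
struct ConstantFP : Value {
  double Val;
  explicit ConstantFP(double V) : Val(V) {}
};
struct Argument : Value {
  std::string Name;
  explicit Argument(std::string N) : Name(std::move(N)) {}
};
struct AddInst : Value {
  Value *LHS, *RHS;
  AddInst(Value *L, Value *R) : LHS(L), RHS(R) {}
};

class FoldingBuilder {
  std::vector<std::unique_ptr<Value>> Owned; // owns every value it creates

public:
  unsigned NumInstructions = 0; // how many real instructions were emitted

  Value *getConstant(double V) {
    Owned.push_back(std::make_unique<ConstantFP>(V));
    return Owned.back().get();
  }

  Value *getArgument(std::string Name) {
    Owned.push_back(std::make_unique<Argument>(std::move(Name)));
    return Owned.back().get();
  }

  Value *createAdd(Value *L, Value *R) {
    auto *LC = dynamic_cast<ConstantFP *>(L);
    auto *RC = dynamic_cast<ConstantFP *>(R);
    // Both operands constant: fold at build time, emit nothing.
    if (LC && RC)
      return getConstant(LC->Val + RC->Val);
    // Otherwise fall back to emitting an add instruction.
    ++NumInstructions;
    Owned.push_back(std::make_unique<AddInst>(L, R));
    return Owned.back().get();
  }
};
```

With this sketch, building `1 + 2 + x` emits a single add of the constant 3 and `x`, because the first add is folded as it is created. A later optimization pass would reach the same result, but the folded form never has to be allocated or revisited.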
-- One of the harder parts is generating type declarations correctly for recursively connected types (e.g., for classes). Without extending Kaleidoscope, it would be helpful to add a link to the Programmers Manual section on recursive type construction.
I added a link to chapter 8, thanks for the feedback!
-Chris
Hi All,
I edited "The basic language, with its lexer" somewhat and I am
attaching a .html file. This is simply spelling/grammar editing and not
content.
It includes a few small changes to things like - ending a sentence with
a noun and starting the next sentence with the same noun, clarifying
some statements, not using "and" too many times in a sentence...little
things like that.
A simple diff should show the changes. Please let me know if this format
(i.e. an HTML file) is acceptable for your use. I don't think I changed
anything too drastically, but please let me know if I took too many
liberties. You may also tell me not to be so picky, if you like
Thanks,
K.Wilson
P.S. Good work. This is a well written and much needed tutorial.
I edited "The basic language, with its lexer" somewhat and I am
attaching a .html file. This is simply spelling/grammar editing and not
content.
Thanks, I merged them in!
A simple diff should show the changes. Please let me know if this format
(i.e. an HTML file) is acceptable for your use. I don't think I changed
anything too drastically, but please let me know if I took too many
liberties. You may also tell me not to be so picky, if you like
All the changes you made look great, I appreciate it. If you are interested in making future changes, please send me a diff of the change you make. This makes it easier for me, because I don't know which version of the file you started from, making it harder to pull the difference out on my end.
Very nice. Here's a couple comments on the first 6 chapters:
http://llvm.org/docs/tutorial/LangImpl1.html
"We handle comments by skipping to the end of the line and then
returning the next comment."
Shouldn't this say "returning the next token"?
http://llvm.org/docs/tutorial/LangImpl2.html
I was a bit confused at first because the AST node classes are called
ASTs. Instead of saying "ExprAST node" all the time, why not just call
the class ExprNode or ExprAstNode?
Also it looks like there's a typo in ch 8:
As one trivial example, it is possible to add language-specific
optimization passes that "known" things about code compiled for a
language.
Very nice. Here's a couple comments on the first 6 chapters:
http://llvm.org/docs/tutorial/LangImpl1.html
"We handle comments by skipping to the end of the line and then
returning the next comment."
Shouldn't this say "returning the next token"?
Fixed.
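For context, the comment handling being discussed can be sketched as follows. This is a simplified, string-backed lexer written for illustration (the `Lexer` struct, `peek` helper, and `End` constant are hypothetical, and the tutorial's own lexer reads from standard input instead). On `#` it skips to the end of the line and then returns whatever token comes next.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>

// Negative codes for known tokens; any other character is returned
// as its own character value.
enum Token { tok_eof = -1, tok_identifier = -2, tok_number = -3 };

struct Lexer {
  static constexpr int End = -1; // sentinel for "no more input"

  std::string Input;
  std::size_t Pos = 0;
  std::string IdentifierStr; // filled in for tok_identifier
  double NumVal = 0;         // filled in for tok_number

  explicit Lexer(std::string S) : Input(std::move(S)) {}

  // Current character, or End when the input is exhausted.
  int peek() const {
    return Pos < Input.size() ? static_cast<unsigned char>(Input[Pos]) : End;
  }

  int gettok() {
    // Skip any whitespace.
    while (peek() != End && std::isspace(peek()))
      ++Pos;

    int C = peek();
    if (C == End)
      return tok_eof;

    if (std::isalpha(C)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
      IdentifierStr.clear();
      while (peek() != End && std::isalnum(peek()))
        IdentifierStr += Input[Pos++];
      return tok_identifier;
    }

    if (std::isdigit(C) || C == '.') { // number: [0-9.]+
      std::string Num;
      while (peek() != End && (std::isdigit(peek()) || peek() == '.'))
        Num += Input[Pos++];
      NumVal = std::strtod(Num.c_str(), nullptr);
      return tok_number;
    }

    if (C == '#') {
      // Comment: skip to the end of the line, then return the next token.
      while (peek() != End && peek() != '\n' && peek() != '\r')
        ++Pos;
      return gettok();
    }

    ++Pos; // any other character is its own token
    return C;
  }
};
```

The recursive call is what makes comments invisible to the parser: the caller asked for a token and always gets one, never a comment.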
http://llvm.org/docs/tutorial/LangImpl2.html
I was a bit confused at first because the AST node classes are called
ASTs. Instead of saying "ExprAST node" all the time, why not just call
the class ExprNode or ExprAstNode?
I don't think ExprNode or ExprASTNode is clearer than ExprAST. "ExprAST" is the right thing because they are ASTs and they are Expr specific.
Nice job. The only bit that is not immediately clear is the 'Proto'
variable, but is clear when looking through the code at the end of the page.
Where in the tutorial? What would you suggest that I say?
First reference in 'Function *FunctionAST::Codegen()'. What Proto is is not totally clear till you look at the complete code at the end of the page. A minor point.
Could do with a link to the LLVMBuilder class reference material.
I added a link to the doxygen info, thanks!
It might be an idea to add a download link for the complete code to save users having to cut and paste it to try it. A minor point.
Nice job. The only bit that is not immediately clear is the 'Proto'
variable, but is clear when looking through the code at the end of the
page.
Where in the tutorial? What would you suggest that I say?
First reference in 'Function *FunctionAST::Codegen()'. What Proto is is not
totally clear till you look at the complete code at the end of the page. A
minor point.
It's still unclear where you are talking about this. I clarified the prose in chapter 3 that first describes FunctionAST::Codegen. If you still think it is unclear, please propose new wording, thanks!
It might be an idea to add a download link for the complete code to save
users having to cut and paste it to try it. A minor point.
As I told Owen: "If the user can't copy and paste, there is no hope for them". It's not worth the burden to maintain yet another copy of every piece of source that has to be updated when changes are made.
I have finished with the "Implementing a Parser and AST" chapter and
attached a diff file. Once again, let me know if the format of the diff
is not acceptable (i.e. I used subdirs for the two different files and a
'diff -ru').