Hi Chris,
Thanks for the supportive words! I'm pushing the document both for
selfish motives (like so many others, I'll learn a ton from this document)
and for altruistic motives - the easier it is to implement a new language,
the more interesting and well-thought-out languages we will see in
the future. And I see it as my purpose, as a mostly black-box user of
LLVM, to enhance the experience for newcomers so that they don't turn away
and waste time on other projects just because it all seems rather
overwhelming at first.
I couldn't recall having heard of the "alloca trick", but a Google search
revealed that this is described in the Kaleidoscope sample. I will be more
than happy to include it - that's precisely what the document is for:
teaching people all the things that cannot easily be said in a Language
Reference. In a way, the name is already becoming a poor fit, because I am
beginning to see a User's Guide forming on the horizon. And that
would go really well with the Language Reference; most products have both.
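
For readers who have not met the "alloca trick" mentioned above, here is a minimal sketch of the idea as the Kaleidoscope tutorial describes it: the frontend gives each mutable local a stack slot in the entry block and uses plain loads and stores, leaving it to the mem2reg pass to promote the slot to SSA registers and insert any phi nodes. The function name and its body are hypothetical, and the syntax assumes a recent LLVM with opaque pointers (older releases spell the pointer type as `i32*`).

```llvm
; Hypothetical function: returns %a if %cond is true, otherwise %b.
; The mutable local "x" lives in a stack slot, so the frontend never
; has to construct phi nodes itself.
define i32 @select_value(i1 %cond, i32 %a, i32 %b) {
entry:
  %x = alloca i32                 ; stack slot for x, placed in the entry block
  store i32 %a, ptr %x            ; x = a
  br i1 %cond, label %done, label %otherwise

otherwise:
  store i32 %b, ptr %x            ; x = b
  br label %done

done:
  %result = load i32, ptr %x      ; read x; mem2reg can turn this into a phi
  ret i32 %result
}
```

Keeping the alloca in the entry block matters here, since mem2reg only considers entry-block allocas for promotion.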
TBQH I'm pretty set on this being a guide for language frontends, rather
than a general "user's guide" for the IR. The IR has at least two very
different classes of users: optimization writers (which are mostly
transforming IR) and language frontend writers (which are mostly creating
IR). Almost everything in this document is geared toward language frontend
writers (or more generally "people generating IR"), rather than
optimization writers (we already have pretty good docs for them).
I've added all of your suggestions to my to-do list, which I'll write into
the document later today, so that none of the suggestions get lost. Yes,
I thought about unions at some point but then forgot about them again. I
also feel that there needs to be good documentation of GEP and extractvalue
- when to use one and when to use the other. In fact, the whole
structure/union aspect seems mostly overlooked because I got too
preoccupied with the class stuff.
GEP is for forming addresses, and extractvalue/insertvalue is for
extracting/inserting fields from aggregate-typed SSA values.
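
A small sketch may make the distinction above concrete. The `%Pair` type and the function names are hypothetical, and the syntax assumes a recent LLVM with opaque pointers. In the first function the struct lives in memory, so GEP only computes the field's address and a separate load reads it; in the other two the struct is itself an SSA value, so no address exists and extractvalue/insertvalue operate on the value directly.

```llvm
%Pair = type { i32, double }

; Struct in memory: GEP forms the address of field 1, the load reads it.
define double @second_via_memory(ptr %p) {
  %addr = getelementptr inbounds %Pair, ptr %p, i32 0, i32 1
  %v = load double, ptr %addr
  ret double %v
}

; Struct as an SSA value: extractvalue pulls out field 1, no memory access.
define double @second_via_value(%Pair %pair) {
  %v = extractvalue %Pair %pair, 1
  ret double %v
}

; insertvalue yields a new aggregate value with field 1 replaced.
define %Pair @with_second(%Pair %pair, double %d) {
  %updated = insertvalue %Pair %pair, double %d, 1
  ret %Pair %updated
}
```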
I am not at all opposed to working directly from llvm.org/docs; the only
thing is that I do a lot of small commits with an occasional large commit
in between, and I wouldn't want to provoke a review whenever I change a
single line. The reason I use GitHub is that it provides a
nifty, colorized page (
https://github.com/archfrog/llvm-doc/blob/master/MappingHighLevelConstructsToLLVMIR.rst)
that people (including myself) can view without going through the trouble
of installing and running Sphinx locally. And also, it allows people to
submit reviews by forking, creating an issue, or attaching a comment (all
three of which have already been used). I think it is better that I do
it on GitHub for the time being as I tend to make many small, stupid
mistakes that I usually discover quite quickly and then fix. Then when I
feel I've got something interesting to show people, I can submit a commit
and everybody can join in the review.
If it's easier for your workflow to iterate on GitHub, that's fine,
although eventually we will want to move it into docs/. The document
definitely has a bit of a "grab bag" feel; as the content solidifies, I'd
like to see a better organization.
There's some content, though, that might be easier to develop in-tree, like
how to hint the optimizers to get maximum performance. Especially alias
analysis (both TBAA, and the various function/parameter attributes) and
alignment, as most non-C languages can provide very strong aliasing and
alignment guarantees.
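
As a rough illustration of the kind of hints meant here, the sketch below shows a frontend that knows its two buffers never overlap and are always 16-byte aligned. The function, the guarantees, and the TBAA type names are all hypothetical; the syntax assumes a recent LLVM with opaque pointers. The `noalias`, `readonly`, and `align` parameter attributes, the `align` operand on loads and stores, and `!tbaa` metadata are the standard mechanisms.

```llvm
; Hypothetical function: writes 2 * src[0] to dst[0].
; noalias:  the frontend guarantees dst and src do not overlap.
; align 16: the frontend guarantees 16-byte alignment of both pointers.
; readonly: nothing is written through src.
define void @scale(ptr noalias align 16 %dst, ptr noalias readonly align 16 %src) {
  %v = load float, ptr %src, align 16, !tbaa !2
  %r = fmul float %v, 2.0
  store float %r, ptr %dst, align 16, !tbaa !2
  ret void
}

; Minimal TBAA tree for a hypothetical language with a single "float" type.
!0 = !{!"hypothetical language TBAA root"}
!1 = !{!"float", !0, i64 0}     ; scalar type node, child of the root
!2 = !{!1, !1, i64 0}           ; access tag: base type, access type, offset
```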
-- Sean Silva