How the LLVM tools work together

Hi,

I've been reading through some of the documentation and I'm a little confused.

What I'm wondering is if someone could explain how the different tools in LLVM (llvmc, clang, llvm-gcc, llvm-ar, etc.) work together to go from the C code I create through to a running executable (after linking).

Apologies if this isn't the right list. I'm not a compiler developer so I'm rather a novice with how LLVM works.

Cheers,

Stephen

Hi Stephen,

This is probably *not* the right list, since this is the developers'
list, not the users' (if there is such a thing). But it doesn't hurt to
point you in the right direction anyway.

From the high level of your question, LLVM is a compiler like any
other. Clang should behave like GCC in most ways, so that's more than
you need to know if you just want to compile code with LLVM.

You can get a glimpse of the LLVM tools and what they do here:
http://llvm.org/docs/GettingStarted.html#tools

If you want to go deeper, at http://llvm.org/docs/ you'll find all
sorts of documentation and presentations, high and low level, from
tool manuals to internal design information.

happy reading! ;)

cheers,
--renato

Renato Golin <rengolin@systemcall.org> writes:

Hi Stephen,

This is probably *not* the right list, since this is the developers'
list, not the users' (if there is such a thing). But it doesn't hurt to
point you in the right direction anyway.

The developers' mailing list (this one) is purposely the user list too,
so this question is perfectly on topic here.

[snip]

Most of the tools are really just compiler hacker tools that we use
for development, test, and demonstration. LLVM is designed to be used
as a set of libraries instead of a set of tools. However, there's
nothing stopping you from doing each step individually (and it can be
quite informative).

clang contains a driver, much like gcc, that takes the source files
and options you provide and produces the desired output. This can be
anything from just preprocessing all the way down to a final
executable.

So the command:
% clang -O3 source.c -o prog.exe

Can be broken down into:

* Pre-process
% clang -E source.c -o source.i
* Compile to the LLVM intermediate representation
    - This file is a human-readable representation of the C input code
      for the specified target.
% clang -S -emit-llvm source.i -o source.ll
* Optimize
    - This runs a set of optimizations on source.ll and outputs the
      optimized version as binary-encoded LLVM IR (bitcode). Use the
      -S option to get readable output.
% opt -O3 source.ll -o source-opt.bc
* Generate machine code
    - This lowers the LLVM IR to the target instruction set and
      optimizes it along the way.
% llc -O3 source-opt.bc -o source.s
* Assemble
% as source.s -o source.o
* Link
% ld source.o -o prog.exe

clang doesn't directly run all these commands. It uses the libraries
internally to do everything up to assembly output, and on some
platforms it even does the assembling internally.
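To see the stages above as one unit, here is a minimal sketch of a script that prints them in order and only runs them when asked. The function names (stage_commands, run_stages) and the RUN variable are made up for this illustration, and the preprocessed file is named source.i here on the assumption that the usual suffix for preprocessed C is wanted. It assumes clang, opt, llc, as and ld are on your PATH if you set RUN=1.

```shell
#!/bin/sh
# Sketch only: hypothetical helper names, not part of LLVM.

# Print one command per stage for a single C source file.
stage_commands() {
    src=$1                  # e.g. source.c
    base=${src%.c}          # e.g. source
    out=${2:-prog.exe}      # final executable name
    echo "clang -E $src -o $base.i"
    echo "clang -S -emit-llvm $base.i -o $base.ll"
    echo "opt -O3 $base.ll -o $base-opt.bc"
    echo "llc -O3 $base-opt.bc -o $base.s"
    echo "as $base.s -o $base.o"
    echo "ld $base.o -o $out"
}

# Dry run by default: echo each command with a "% " prefix.
# With RUN=1 in the environment, also execute each command and
# stop at the first one that fails.
run_stages() {
    stage_commands "$@" | while IFS= read -r cmd; do
        echo "% $cmd"
        if [ "${RUN:-0}" = 1 ]; then
            sh -c "$cmd" || exit 1
        fi
    done
}

run_stages source.c prog.exe
```

Treat the output as a map of the stages rather than a build recipe. The bare ld line in particular is a simplification: linking a real C program normally also needs the C runtime startup files and libc, which the clang driver adds for you. Exact flags can also differ between LLVM releases; for example, recent clang versions mark unoptimized functions so that a later opt run may leave them untouched.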

* llvm-{as,dis} are just used to convert between bitcode and
human-readable LLVM IR.
* llvm-ar is for creating standard archives containing bitcode.
* llvmc ... I'm still confused about the exact reason for this one.
* llvm-diff produces intelligent diffs between two llvm-ir files
ignoring names. Makes it much easier to tell what semantics changed
when values are renamed.
* llvm-ld is really just a driver for the system linker. It can also
produce scripts that run the bitcode via lli.
* llvm-link links llvm-ir files together.
* llvm-mc is the machine code playground. It can be used as an
assembler, disassembler, and other things.
* llvm-nm is classic unix nm for llvm-ir. It dumps the symbol table.

And I don't know what the rest are for exactly.

You don't need to know about any of these to use clang or llvm-gcc,
but they can be useful when playing with llvm.
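As a small example of playing with these tools, the sketch below round-trips a tiny hand-written module through llvm-as and llvm-dis and then lists its symbols with llvm-nm. The helper names (have_tools, roundtrip_demo) and the file names are invented for this illustration, and the script simply reports that it was skipped if the tools are not installed.

```shell
#!/bin/sh
# Sketch only: hypothetical helper names, not part of LLVM.

# Succeed only if every named program can be found on PATH.
have_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || return 1
    done
}

# Assemble readable IR to bitcode, disassemble it again, then dump
# the bitcode's symbol table. Works in a temporary directory.
roundtrip_demo() {
    have_tools llvm-as llvm-dis llvm-nm || {
        echo "skipped: LLVM tools not found"
        return 0
    }
    dir=$(mktemp -d) || return 1
    cat > "$dir/demo.ll" <<'EOF'
define i32 @answer() {
  ret i32 42
}
EOF
    llvm-as "$dir/demo.ll" -o "$dir/demo.bc" &&
    llvm-dis "$dir/demo.bc" -o "$dir/demo-roundtrip.ll" &&
    llvm-nm "$dir/demo.bc"
    status=$?
    rm -rf "$dir"
    return $status
}

roundtrip_demo
```

If the tools are present, llvm-nm should list the answer function defined in the module. Remove the rm -rf line to keep the temporary files and compare demo.ll with demo-roundtrip.ll.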

- Michael Spencer

Thanks Michael,

Your information was extremely helpful. Coming from a non-compiler background it is interesting to see how all the different components go together, from the code developers write to the final output.

Cheers,

Stephen