Assembly file compilation flow


The LLVM MC blog post explains fairly well how the backend MC project works. However, it’s not clear to me how LLVM converts an assembly file (.s) to an object file, and the driver's --verbose option does not tell me much either. I want to know the flow. For example, what CodeGen phases does this file pass through? Is it ever converted to a MachineFunction? Where does it hook into the backend infrastructure? Pointers to any documentation would also be helpful.

Passing an assembly file to clang does not go through any CodeGen phases. It operates entirely in the MC layer, so no MachineFunctions are created. The file is simply run through the MCAsmParser. Most of the setup for this is in clang’s tools/driver/cc1as_main.cpp, in the function ExecuteAssembler.

I think a key point to understand is the MCStreamer interface, which has two main implementations: one for object-file generation, and one for .s file generation. CodeGen/AsmPrinter can attach to either of these implementations. MCAsmParser will generally attach to the object-file version of MCStreamer.
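To make the streamer idea concrete, here is a minimal toy sketch of the pattern. It is not LLVM's actual API: ToyStreamer, ToyAsmStreamer, ToyObjectStreamer and parseAssembly are hypothetical names invented for illustration, and the one-byte encodings are a hard-coded toy table. The point is only that the parser talks to an abstract interface and does not care whether text or bytes come out the other end.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the streamer interface; NOT LLVM's MCStreamer.
struct ToyStreamer {
  virtual ~ToyStreamer() = default;
  virtual void emitLabel(const std::string &Name) = 0;
  virtual void emitInstruction(const std::string &Mnemonic) = 0;
};

// Textual implementation: writes assembly back out.
struct ToyAsmStreamer : ToyStreamer {
  std::ostringstream OS;
  void emitLabel(const std::string &Name) override { OS << Name << ":\n"; }
  void emitInstruction(const std::string &Mnemonic) override {
    OS << "\t" << Mnemonic << "\n";
  }
};

// Binary implementation: appends one byte per instruction from a toy table.
struct ToyObjectStreamer : ToyStreamer {
  std::vector<std::uint8_t> Bytes;
  // A real object streamer would record a symbol at the current offset.
  void emitLabel(const std::string &) override {}
  void emitInstruction(const std::string &Mnemonic) override {
    // Hard-coded toy encodings; unknown mnemonics become 0x00.
    Bytes.push_back(Mnemonic == "nop" ? 0x90 : Mnemonic == "ret" ? 0xC3 : 0x00);
  }
};

// Toy parser: one statement per line. "name:" is a label, anything else is
// treated as an instruction. It only ever sees the abstract interface.
void parseAssembly(const std::string &Source, ToyStreamer &Out) {
  std::istringstream In(Source);
  std::string Line;
  while (std::getline(In, Line)) {
    const auto Begin = Line.find_first_not_of(" \t");
    if (Begin == std::string::npos)
      continue; // skip blank lines
    const auto End = Line.find_last_not_of(" \t");
    const std::string Stmt = Line.substr(Begin, End - Begin + 1);
    if (Stmt.back() == ':')
      Out.emitLabel(Stmt.substr(0, Stmt.size() - 1));
    else
      Out.emitInstruction(Stmt);
  }
}
```

Calling parseAssembly with a ToyObjectStreamer corresponds roughly to the .s to .o case, and calling it with a ToyAsmStreamer corresponds to parsing assembly and printing assembly again. In this sketch a code generator could drive the same two streamers directly, without any parser involved.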


Note that it is possible (even via the llvm-mc command-line interface) to attach the parser to the assembly-output streamer instead, so you can parse assembly and generate assembly. This can be interesting on architectures such as MIPS that include a lot of pseudos, because it will expand them to individual instructions (MIPS dla, for example, expands to one of four different things depending on assembler state and target).

This path is also used in testing, to ensure that assembly instructions are parsed to their canonical representation.
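As a rough illustration of why the assembly-to-assembly round trip mentioned above makes pseudo expansion visible, here is a small toy sketch. It is not how LLVM or the MIPS backend implements this: expandStatement and the pseudo "li32" are hypothetical, loosely modeled on loading a 32-bit immediate as an upper-half load followed by an OR of the lower half. Any statement that is not the pseudo passes through unchanged.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Expands the hypothetical pseudo "li32 <reg>, <imm>" into two instructions.
// Assumes operands are written as "<reg>, <imm>" with a space after the
// comma, and that the immediate is well formed; every other statement is
// returned as-is.
std::vector<std::string> expandStatement(const std::string &Stmt) {
  std::istringstream In(Stmt);
  std::string Mnemonic, Reg, ImmText;
  In >> Mnemonic;
  if (Mnemonic != "li32")
    return {Stmt}; // not a pseudo: one statement in, one statement out

  In >> Reg >> ImmText;
  if (!Reg.empty() && Reg.back() == ',')
    Reg.pop_back(); // drop the trailing comma from "<reg>,"

  // Base 0 lets std::stoul accept decimal, octal, or 0x-prefixed hex.
  const auto Imm = static_cast<std::uint32_t>(std::stoul(ImmText, nullptr, 0));

  std::ostringstream Hi, Lo;
  Hi << "lui " << Reg << ", 0x" << std::hex << (Imm >> 16);
  Lo << "ori " << Reg << ", " << Reg << ", 0x" << std::hex << (Imm & 0xFFFF);
  return {Hi.str(), Lo.str()};
}
```

Feeding each parsed statement through something like this before printing is what turns one pseudo in the input into several instructions in the output. A real assembler makes the choice based on more state than this sketch does, as the dla example suggests.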