LLVM "Native" Backend

Having read through the documentation and browsed the code what are the missing pieces, if any, required to implement a “native” LLVM backend that:

  • is able to produce loadable (ELF) object files with DWARF debugging information
  • handles inline assembly
  • performs all necessary linking fixups

This assumes that there will be no native object file linking stage in the tool chain.

From what I can tell the ELF writer should be able to produce the object files, but a DWARF-in-ELF writer would have to be created to insert the debugging information into the ELF file.

Either the inline assembly would be reverse compiled or it would be handled by a custom pass. I’m not sure what the options are here…

Am I on the right track?

Hi Christopher,

Having read through the documentation and browsed the code what are
the missing pieces, if any, required to implement a "native" LLVM
backend that:

* is able to produce loadable (ELF) object files with DWARF debugging
information
* handles inline assembly
* performs all necessary linking fixups

This assumes that there will be no native object file linking stage in
the tool chain.

That assumes that there's only one translation unit. Given that
scenario, it's doable currently.

From what I can tell the ELF writer should be able to produce the
object files, but a DWARF-in-ELF writer would have to be created to
insert the debugging information into the ELF file.

We currently support DWARF/ELF on x86 and PPC. Not sure about other back
ends.

Either the inline assembly would be reverse compiled or handled by
custom pass. I'm not sure what the options are here...

Inline assembly is well supported, although not as completely as in gcc.
For example, the support is sufficient to compile the Linux kernel.

Having read through the documentation and browsed the code what are the missing pieces, if any, required to implement a "native" LLVM backend that:
* is able to produce loadable (ELF) object files with DWARF debugging information
* handles inline assembly
* performs all necessary linking fixups

Sure, this should be no problem. I'd start with the sparc backend; it's a very simple and straightforward place to start.

LLVM can currently produce .o files directly in some limited cases, or you can go through an assembler, which is much more robust. Any help improving the direct ELF writer would be appreciated.

This assumes that there will be no native object file linking stage in the tool chain.

LLVM works with your standard toolchain components, like the GNU linker, assembler, etc.

From what I can tell the ELF writer should be able to produce the object files, but a DWARF-in-ELF writer would have to be created to insert the debugging information into the ELF file.

Yep

Either the inline assembly would be reverse compiled or handled by custom pass. I'm not sure what the options are here...

Inline asm is very tricky: it basically amounts to LLVM having an assembler for each target. We're not opposed to this, but it is a significant amount of work.

-Chris

Having read through the documentation and browsed the code what are the
missing pieces, if any, required to implement a "native" LLVM backend that:
* is able to produce loadable (ELF) object files with DWARF debugging
information
* handles inline assembly
* performs all necessary linking fixups

Sure, this should be no problem. I'd start with the sparc backend; it's
a very simple and straightforward place to start.

LLVM can currently produce .o files directly in some limited cases, or you
can go through an assembler, which is much more robust. Any help
improving the direct ELF writer would be appreciated.

Which of the LLVM tools can currently produce .o files directly without going through the platform's as/ld? How do I specify this on the command line to any of the tools?

This assumes that there will be no native object file linking stage in the
tool chain.

LLVM works with your standard toolchain components, like the GNU linker,
assembler, etc.

I'm primarily considering this as an option for a custom target. My interest is in not having to do a binutils port, so currently I have no "real" assembler and linker.

From what I can tell the ELF writer should be able to produce the object
files, but a DWARF-in-ELF writer would have to be created to insert the
debugging information into the ELF file.

Yep

Either the inline assembly would be reverse compiled or handled by custom
pass. I'm not sure what the options are here...

Inline asm is very tricky: it basically amounts to LLVM having an
assembler for each target. We're not opposed to this, but it is a
significant amount of work.

Is there an infrastructure in place (or a design) to work on such an inline assembler? I'd think that a lot of the tblgen information used for the Code/Asm emitters could be reused for this.

LLVM can currently produce .o files directly in some limited cases, or
you can go through an assembler, which is much more robust. Any help
improving the direct ELF writer would be appreciated.

Which of the LLVM tools can currently produce .o files directly
without going through the platform's as/ld? How do I specify this on
the command line to any of the tools?

llc -filetype=obj

I think that this is disabled for ELF, but you can easily enable it. Mach-O works pretty well, but ELF is not entirely finished. Any help would be appreciated; basing the work on the Mach-O writer would be appropriate.

This assumes that there will be no native object file linking stage in
the tool chain.

LLVM works with your standard toolchain components, like the GNU linker,
assembler, etc.

I'm primarily considering this as an option for a custom target. My
interest is in not having to do a binutils port, so currently I have
no "real" assembler and linker.

LLVM doesn't have any support for linking native objects; you will need to get this from somewhere else.
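To make "linking fixups" a bit more concrete, here is a rough sketch of what a standalone tool would have to do for each relocation once symbol addresses are known. It is not LLVM code. It assumes i386 REL-style relocations (addend stored in the patched field) purely for illustration; a custom target would define its own relocation types, and the `apply_rel` helper and the addresses in the example are hypothetical.

```python
import struct

# i386 relocation type numbers from the System V ABI i386 supplement.
R_386_32 = 1    # absolute:    S + A
R_386_PC32 = 2  # PC-relative: S + A - P


def apply_rel(section, section_addr, offset, rel_type, sym_value):
    """Patch one 32-bit little-endian field of `section` (a bytearray) in place.

    section_addr is the address the section will be loaded at, offset is the
    position of the field inside the section, sym_value is the resolved
    address of the referenced symbol.
    """
    # REL-style relocations keep the addend A in the field being patched.
    (addend,) = struct.unpack_from("<i", section, offset)
    if rel_type == R_386_32:
        value = sym_value + addend
    elif rel_type == R_386_PC32:
        value = sym_value + addend - (section_addr + offset)
    else:
        raise ValueError(f"unsupported relocation type {rel_type}")
    struct.pack_into("<I", section, offset, value & 0xFFFFFFFF)


# Hypothetical example: a 5-byte `call` whose operand holds the addend -4,
# in a section placed at 0x1000, calling a symbol resolved to 0x2000.
code = bytearray(b"\xe8\xfc\xff\xff\xff")
apply_rel(code, 0x1000, 1, R_386_PC32, 0x2000)
print(code.hex())
```

If the target can resolve everything at code emission time, no step like this is needed; otherwise something has to play this role in place of a native linker.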

Inline asm is very tricky: it basically amounts to LLVM having an
assembler for each target. We're not opposed to this, but it is a
significant amount of work.

Is there an infrastructure in place (or a design) to work on such an
inline assembler? I'd think that a lot of the tblgen information used
for the Code/Asm emitters could be reused for this.

I agree with you, it should be a good starting place. However, we don't have the infrastructure for this yet.

-Chris

So I’ve increased the support for ELF writing, but it’s certainly not done yet. I fixed a few bugs that were preventing it from emitting correct ELF headers, so if there are no ELF relocations emitted (i.e. the target can handle all of the relocations at code emission) then this change set should allow one to write a perfectly valid ELF REL file.
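For anyone following along, this is roughly what a correct header for a relocatable (ET_REL) file looks like. The sketch below is not the ELF writer's code; it just packs a 32-bit little-endian ELF header per the ELF specification. The `build_elf32_rel_header` name and the EM_386 default are assumptions for illustration, and a custom target would use its own machine number.

```python
import struct

ET_REL = 1       # relocatable object file
EM_386 = 3       # machine number used here only as an example
EV_CURRENT = 1
ELFCLASS32 = 1
ELFDATA2LSB = 1  # little-endian

# Elf32_Ehdr layout: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx.
ELF32_EHDR = "<16sHHIIIIIHHHHHH"
ELF32_SHDR_SIZE = 40  # size of one Elf32_Shdr entry


def build_elf32_rel_header(e_machine=EM_386, e_shoff=0, e_shnum=0, e_shstrndx=0):
    """Return the 52-byte ELF header for a 32-bit little-endian ET_REL file."""
    e_ident = (b"\x7fELF"
               + bytes([ELFCLASS32, ELFDATA2LSB, EV_CURRENT, 0, 0])
               + bytes(7))  # OS ABI, ABI version, then padding
    return struct.pack(
        ELF32_EHDR,
        e_ident,
        ET_REL,
        e_machine,
        EV_CURRENT,
        0,                            # e_entry: no entry point in a .o file
        0,                            # e_phoff: no program headers
        e_shoff,                      # file offset of the section header table
        0,                            # e_flags
        struct.calcsize(ELF32_EHDR),  # e_ehsize
        0, 0,                         # e_phentsize, e_phnum
        ELF32_SHDR_SIZE,
        e_shnum,
        e_shstrndx,
    )


header = build_elf32_rel_header(e_shoff=52, e_shnum=1)
print(len(header), header[:4])
```

A real object file still needs the section header table, string tables, and section contents after this header; the sketch covers only the header fields.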

I’ve got preliminary support for relocations, the data structures to be precise, and refactored some of the ELF writer to be more like the MachO writer.

Should I try to submit a patch for the partial work that I’ve done?

Should a PR be opened for ELF output?

So I've increased the support for ELF writing, but it's certainly not done yet. I fixed a few bugs that were preventing it from emitting correct ELF headers, so if there are no ELF relocations emitted (i.e. the target can handle all of the relocations at code emission) then this change set should allow one to write a perfectly valid ELF REL file.

Nice!

Should I try to submit a patch for the partial work that I've done?

Absolutely.

-Chris

Sure, that would be a good way to track the feature. Patches should be sent to llvm-commits.

-Chris

This is now PR1294. I’ll work up a patch next week.