Binutils and LLVM - gathering information

Binutils and LLVM

As part of "owning our own toolchain", several people have expressed an interest in, and have been working on, tools that duplicate the functionality of tools available on other systems.

As a start, I'd like to summarize the current status, and ask people for help updating the list.

List taken from <http://www.gnu.org/software/binutils/>

* as (assembler) -- There is llvm-as, but it appears to be an assembler
for LLVM bitcode.

clang has an integrated assembler. llvm-mc will work as well.

* gprof (Displays profiling information) – ???

There is llvm-cov, but I don't know how well it works.

* readelf (Displays information from any ELF format object file) – ???

Probably not completely, but llvm-objdump can deal with anything that's not
there (or be adapted to do so).

-eric

==============

* ld (linker) -- lld is under active development

* as (assembler) -- There is llvm-as, but it appears to be an assembler for LLVM bitcode.

* addr2line (Converts addresses into filenames and line numbers) – ???

* ar (creates, modifies and extracts from archives) -- llvm-ar <http://llvm.org/docs/CommandGuide/llvm-ar.html> appears to do this for bitcode, object and archive files.

This will need major changes to handle integrating ranlib.

* c++filt (demangles encoded C++ symbols) -- this facility is built into libcxxabi, but I don't know of a tool that exposes it on the command line.

* dlltool (Creates files for building and using DLLs) – ???

* gprof (Displays profiling information) – ???

* nlmconv (Converts object code into a NetWare Loadable Module) – ???

I really don't think we care about this.

* nm (Lists symbols from object files) -- llvm-nm does this for bitcode, object and archive files.

* objcopy (Copies and translates object files) -- ???

This one is hard, as translating object files between formats doesn't
really make sense with modern formats. I believe the main
functionality people use is converting object files to flat binaries.

* objdump (Displays information from object files) -- llvm-objdump appears to do this for bitcode, object and archive files.

* ranlib (Generates an index to the contents of an archive) -- There is an llvm-ranlib, but the docs at <http://llvm.org/docs/CommandGuide/llvm-ranlib.html> say that it only indexes bitcode files.

Yep, needs object support.

* readelf (Displays information from any ELF format object file) – ???

llvm-readobj outputs the same format as readelf.

* size (Lists the section sizes of an object or archive file) -- llvm-size does this for bitcode, object and archive files.

* strings (Lists printable strings from files) -- I have written a program named llvm-strings that does this, and will be submitting it as a patch shortly.

* strip (discards symbols) – ???

* windmc (A Windows compatible message compiler) – ???

* windres (A compiler for Windows resource files) – ???

These two aren't high priority, but they are needed for a lot of MFC GUI code.

I'd appreciate it if people with more knowledge than myself could chime in with updates to this list.

Thanks!

-- Marshall
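For readers unfamiliar with the strings entry in the list above, here is a minimal Python sketch of what a strings-style tool does. It is not the llvm-strings program mentioned in the message; the function name `printable_strings`, the default minimum length of 4, and the choice to treat tab as printable are assumptions made for illustration.

```python
import re


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII bytes that are at least min_length long.

    Assumption: "printable" means space through tilde, plus tab.
    """
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Calling `printable_strings(open(path, "rb").read())` on an object file would list candidate strings; a real tool would also offer options such as other encodings and printing file offsets.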

So far we have solved naming collisions by making each tool work with
multiple formats, but I don't feel that is the right solution for
llvm-as. llvm-as and llvm-dis are very simple core tools for converting
between IR formats. Extending them to object-code assembly feels
wrong.

I think the solution is to move the LLVM developer-only tools
(llc, lli, opt, as, dis, bugpoint, bcanalyzer, diff, extract, link) into
a single llvm tool. It would have a git-like interface for accessing the
subtools. The other tools would remain user tools and still support
bitcode.

- Michael Spencer
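The git-like interface proposed above could be sketched roughly as follows. This is only an illustration in Python, not an existing LLVM driver: the subtool names come from the list in the message, while `dispatch`, `make_handler`, and the placeholder handlers are hypothetical.

```python
# Subtool names taken from the proposal above.
SUBTOOLS = ["llc", "lli", "opt", "as", "dis", "bugpoint",
            "bcanalyzer", "diff", "extract", "link"]


def make_handler(name):
    """Create a placeholder handler for one subtool."""
    def handler(args):
        # Placeholder: a real driver would call the tool's entry point here.
        return f"{name}: {' '.join(args)}"
    return handler


HANDLERS = {name: make_handler(name) for name in SUBTOOLS}


def dispatch(argv):
    """Route `llvm <subtool> [args...]` to the matching handler."""
    if not argv or argv[0] not in HANDLERS:
        raise SystemExit("usage: llvm <subtool> [args...]\n"
                         "subtools: " + ", ".join(SUBTOOLS))
    # Everything after the subtool name is passed through untouched.
    return HANDLERS[argv[0]](argv[1:])
```

With this shape, `llvm as foo.ll -o foo.bc` would reach the IR assembler, which would leave the plain name `as` free for an object-code assembler.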

Marshall,

* as (assembler) -- There is llvm-as, but it appears to be an assembler for LLVM bitcode.

The MC framework can definitely do this, but we don't have any tool (AFAIK) that is roughly driver-compatible with gas.

* gprof (Displays profiling information) – ???

This is loosely related to gcov, which has lots of issues that make replacement impossible.

* objdump (Displays information from object files) -- llvm-objdump appears to do this for bitcode, object and archive files.

I can say with confidence, from experience, that llvm-objdump is not a good drop-in replacement for objdump. The biggest issue I had was that objdump resolves addresses into symbol names, so you get things like jmp xxxx <foo+123>; llvm-objdump does not do this.
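The address-to-symbol resolution described above can be sketched as a lookup of the nearest preceding symbol. This Python sketch is not how objdump is implemented: the `symbolize` function and the sample symbol table are hypothetical, and it ignores symbol sizes, so any address past the last symbol is attributed to that symbol.

```python
import bisect


def symbolize(address, symbols):
    """Format an address as <name+offset> using the nearest preceding symbol.

    `symbols` is a list of (start_address, name) pairs sorted by address.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return hex(address)  # below the first symbol: nothing to resolve
    start, name = symbols[index]
    offset = address - start
    return f"<{name}>" if offset == 0 else f"<{name}+{offset:#x}>"
```

For a table holding `foo` at 0x1000 and `bar` at 0x1100, the address 0x107b resolves to `<foo+0x7b>`, which is the kind of annotation a disassembler can append to a jump target.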

I'd appreciate it if people with more knowledge than myself could chime in with updates to this list.

If we're gathering toolchain information, it's probably best to include the other programs and libraries that ship with gcc and gdb as well. For completeness, in addition to your list above, I'd add:
[gcc]
* gcc/g++/gobjc++/gobjc - clang/clang++
* gcj - vmkit, although it is only a frontend at bytecode level instead of source level
* gfortran - flang? I've heard it mentioned before, but it appears to have gotten nowhere.
* gccgo - [No Go frontend]
* gnat - [No Ada frontend]
* gcov - llvm-cov
* cpp - This is conceptually equal to clang -E.
* libada - [No Ada frontend]
* libffi - ???
* libgcc - compiler-rt
* libgfortran - [No Fortran frontend]
* libgo - [No Go frontend]
* libgomp - ???
* libiberty - ???
* libitm - ???
* libjava - What would be necessary is probably part of vmkit.
* libmudflap - ???
* libquadmath - ???
* libssp - ???
* libstdc++ - libc++ and libc++abi

[Most of the lib* stuff probably goes into compiler-rt]

[binutils]
* elfedit - ??? [You didn't have this in your list]
* libbfd - ???
* libopcodes - ???

[gdb]
* gcore - ???
* gdb - LLDB
* gdbserver - ???


....

* objcopy (Copies and translates object files) -- ???

This one is hard, as translating object files between formats doesn't
really make sense with modern formats. I believe the main
functionality people use is converting object files to flat binaries.

This is also used to embed CellSPU binaries inside PPC ELF files so they can be referenced, loaded, and run.

Alex
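The object-file-to-flat-binary conversion discussed above amounts to laying the loadable sections out as one contiguous memory image. The Python sketch below shows that layout step only and is not objcopy's implementation: it assumes the sections have already been extracted as (load address, bytes) pairs, and the `flatten` name and the zero fill byte are assumptions.

```python
def flatten(sections, fill=0x00):
    """Lay out (load_address, data) sections as one contiguous memory image.

    The image starts at the lowest load address; gaps are padded with `fill`.
    """
    if not sections:
        return b""
    ordered = sorted(sections, key=lambda section: section[0])
    base = ordered[0][0]
    image = bytearray()
    for address, data in ordered:
        offset = address - base
        if offset < len(image):
            raise ValueError(f"section at {address:#x} overlaps the previous one")
        # Pad the gap between the previous section and this one.
        image.extend(bytes([fill]) * (offset - len(image)))
        image.extend(data)
    return bytes(image)
```

All symbol and relocation information is lost in such an image, which is why this direction of conversion is simple compared with translating between full object formats.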

llvm-readobj outputs the same format as readelf.

For which of readelf’s options? I didn’t get too far on OS X:

bin $ echo "void a(){}" | ./clang -xc -c - -o tmp.o
bin $ ./llvm-readobj tmp.o
File Format : Mach-O 64-bit x86-64
Arch : x86_64
Address Size: 64 bits
LLVM ERROR: get_load_name() unimplemented in MachOObjectFile

-Greg

I am definitely interested in the topic of exactly what is needed to build a complete LLVM-based toolchain. Thanks for
putting together this list. Perhaps we should make the status more visible by tracking a list of such things on the
website?

I have summarized the information that I've gathered, and put it up at:
  http://marshall.calepin.co/binutils-replacements-for-llvm.html

Comments/updates welcome.
Also, if anyone has a suggestion for a place for this to live on llvm.org, I'm all ears…

-- Marshall

Marshall Clow Idio Software <mailto:mclow.lists@gmail.com>

Upd: llvm-symbolizer (possible addr2line replacement) was recently moved to //tools/llvm-symbolizer.
I’ll probably need to add documentation for it…

llvm/docs/BinutilsReplacement.rst?

Dmitri

Remember to link to
<http://www.llvm.org/docs/SphinxQuickstartTemplate.html> when you
suggest that someone write documentation.

-- Sean Silva