[RFC] Support disassembly of ARM and thumb mixed in single ELF file

Hi,

At the moment the llvm-objdump tool can disassemble ARM ELF binaries, but only with a lot of extra user-supplied information:

· The triple supplied on the command line has to be correct (ARM, Thumb, etc.)

· The ELF file cannot contain a mix of ARM and Thumb code

· There is no direct way of using it, such as llvm-objdump -d elf_file, which works for architectures such as Hexagon and X86.

Is there a way to make the tool friendlier to use by removing the need to specify these extra options on the command line, given that the information can be found in the file itself?

An ELF file for ARM (e_machine set to EM_ARM) does not have any header field that can be easily inspected to tell whether it is a Thumb-only binary, an ARM-only binary or a mix. There is, however, an optional ARM-specific section, .ARM.attributes, that can contain CPU and architecture information such as the ISA in use. The BFD and gold linkers use this section while linking to detect incompatibilities, and then emit a merged section (as per the ARM ABI extensions) in the final binary.
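To make the .ARM.attributes idea concrete, here is a minimal sketch of how the two ISA-use attributes could be turned into a coarse classification. It is only an illustration: the helper name classifyIsaUse and the IsaMix enum are hypothetical and are not taken from llvm-objdump or from the patch, and the code assumes the caller has already decoded the attribute values from the "aeabi" subsection. The tag numbers (Tag_ARM_ISA_use = 8, Tag_THUMB_ISA_use = 9) come from the ARM build attributes ABI, where an absent tag defaults to 0.

```cpp
#include <cstdint>

// Tag numbers from the ARM build attributes ABI ("aeabi" vendor subsection).
enum AttrTag : unsigned {
  Tag_ARM_ISA_use = 8,   // 0 = ARM code not permitted, 1 = permitted
  Tag_THUMB_ISA_use = 9  // 0 = Thumb code not permitted, non-zero = permitted
};

// Hypothetical result type for this sketch.
enum class IsaMix { Unknown, ArmOnly, ThumbOnly, ArmAndThumb };

// Hypothetical helper. The caller passes the already-decoded values of the
// two tags, using 0 for a tag that is absent (the ABI default).
IsaMix classifyIsaUse(uint64_t armIsaUse, uint64_t thumbIsaUse) {
  const bool arm = armIsaUse != 0;
  const bool thumb = thumbIsaUse != 0;
  if (arm && thumb)
    return IsaMix::ArmAndThumb;
  if (arm)
    return IsaMix::ArmOnly;
  if (thumb)
    return IsaMix::ThumbOnly;
  return IsaMix::Unknown; // neither tag present, or both zero
}
```

Note that these attributes only say which instruction sets the file is permitted to use. For a mixed file they do not say which function or address range uses which one, so they can at best pick a default.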

Any ideas for adding this functionality to the tool without completely redesigning llvm-objdump would be appreciated. There is already a patch that tries to do this using data from the ELF file.

It is at https://reviews.llvm.org/D24636

Thanks,

This can be done by looking for the marker (mapping) symbols $a, $t and $d,
at least if the file was not stripped.
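
As a rough sketch of the marker-symbol approach: the ARM ELF ABI defines mapping symbols named $a (ARM code), $t (Thumb code) and $d (data), optionally followed by a ".suffix", and each one marks the start of a region that lasts until the next mapping symbol. The types and function names below (Symbol, Region, buildRegions, modeAt) are made up for illustration and are not LLVM APIs. The sketch also assumes all symbols belong to one section, or that their addresses are already absolute.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

enum class Mode { Arm, Thumb, Data };

// Hypothetical, simplified views of a symbol and of a decoded region.
struct Symbol { uint64_t address; std::string name; };
struct Region { uint64_t start; Mode mode; };

// Returns true and sets `mode` if `name` is "$a", "$t" or "$d",
// optionally followed by ".<suffix>" (for example "$d.1").
bool parseMappingSymbol(const std::string &name, Mode &mode) {
  if (name.size() < 2 || name[0] != '$')
    return false;
  if (name.size() > 2 && name[2] != '.')
    return false;
  switch (name[1]) {
  case 'a': mode = Mode::Arm; return true;
  case 't': mode = Mode::Thumb; return true;
  case 'd': mode = Mode::Data; return true;
  default: return false;
  }
}

// Collects the mapping symbols and sorts them by start address.
std::vector<Region> buildRegions(const std::vector<Symbol> &symbols) {
  std::vector<Region> regions;
  for (const Symbol &s : symbols) {
    Mode m = Mode::Data;
    if (parseMappingSymbol(s.name, m))
      regions.push_back({s.address, m});
  }
  std::stable_sort(regions.begin(), regions.end(),
                   [](const Region &a, const Region &b) {
                     return a.start < b.start;
                   });
  return regions;
}

// The mode at `addr` is that of the last region starting at or before it.
// `fallback` is used when no mapping symbol precedes the address.
Mode modeAt(const std::vector<Region> &regions, uint64_t addr, Mode fallback) {
  auto it = std::upper_bound(regions.begin(), regions.end(), addr,
                             [](uint64_t a, const Region &r) {
                               return a < r.start;
                             });
  if (it == regions.begin())
    return fallback;
  return std::prev(it)->mode;
}
```

A disassembler loop could call modeAt for each address and switch decoders accordingly. In a stripped file the regions list would be empty and every lookup would return the fallback.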

Joerg

objdump is currently in a pretty bad state. The disassembler was basically
ripped from llvm-mc and has been hacked on since without the refactoring
it needed. I feel that at this point we need to stop and restructure it
before throwing more complexity onto the pile.

- Michael Spencer

One option for detecting ARM/Thumb is to look at the LSB of the symbol addresses (the SymbolRef::SF_Thumb flag), which specifies the instruction set for (the beginning of) the function. It will work more or less for stripped object files as well, based on the symbols in the dynsym section. I think the disassembler in the Mach-O part of objdump (same binary, different code path) already solves the problem this way.
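
For illustration, the LSB check on a raw ELF symbol could look roughly like the sketch below. In ARM ELF, a function symbol (type STT_FUNC) with bit 0 of st_value set denotes a Thumb function, and the real address is the value with that bit cleared. The names kSttFunc, FuncInfo, isFunctionSymbol and decodeFunctionSymbol are invented for this sketch, and it works on raw symbol-table fields rather than going through SymbolRef.

```cpp
#include <cstdint>

// ELF symbol type value for functions (STT_FUNC in the ELF specification).
constexpr unsigned char kSttFunc = 2;

// Hypothetical result type: instruction set and real start address.
struct FuncInfo {
  bool isThumb;
  uint64_t address;
};

// The low four bits of st_info hold the symbol type.
bool isFunctionSymbol(unsigned char stInfo) {
  return (stInfo & 0xf) == kSttFunc;
}

// Only meaningful for function symbols: bit 0 of st_value selects Thumb,
// and the remaining bits give the address of the first instruction.
FuncInfo decodeFunctionSymbol(uint64_t stValue) {
  return {(stValue & 1) != 0, stValue & ~uint64_t(1)};
}
```

This only gives the instruction set at the function entry point. Literal pools or a mode switch inside the function body would still need other information to be decoded correctly.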

Tamas