LLVM Disassembler question


By way of example, I have the following instruction:
44 8b 80 c8 03 00 00 movl 968(%rax), %r8d

  1. How is this represented in MCInst?

  2. Is there information in MCInst that would tell me which bytes of the instruction are responsible for the 968?

The reason I am asking is that I want to work with the bytes disassembled/decoded to an instruction at MCInst level. I then wish to apply customized relocation records at the MCInst level.

My customized relocation records will consist of a byte offset followed by an expression describing what to replace the following X bytes with, much like a normal relocation record.

I need a way to map relocation byte offsets to the correct operand in MCInst.

The main difference is that relocation records are generally applied before disassembly, whereas I wish to apply them after disassembly.

Kind Regards


The MCInst in this case would contain an opcode value from the instruction enum and 6 operands: one for the destination register and another 5 representing the maximal set of address components used by X86 (base, scale, index, displacement, and segment). The MCInst itself does not record which instruction bytes each operand came from. The only code that knows how to map those operands to the encoding is in lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp. It does this by looking up the TSFlags bits for the opcode and applying rules based on the different bits in TSFlags.
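
To illustrate what such a mapping has to compute, here is a minimal standalone sketch that locates the displacement bytes by walking the encoding directly. It does not use any LLVM API; the names DispField and findDisp are hypothetical, and it assumes 64-bit mode with only an optional REX prefix, a one-byte opcode, ModRM, an optional SIB byte, and the displacement (no legacy prefixes, no 0F escape, no VEX/EVEX). A real solution would need the full rules that X86MCCodeEmitter.cpp applies.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Location and value of the displacement field inside one encoded instruction.
struct DispField {
  std::size_t Offset;  // byte offset from the start of the instruction
  std::size_t Size;    // width in bytes (1 or 4)
  std::int32_t Value;  // sign-extended displacement
};

// Hypothetical helper, not part of LLVM. Handles only:
//   [REX] opcode ModRM [SIB] [disp8 | disp32]
std::optional<DispField> findDisp(const std::vector<std::uint8_t> &Bytes) {
  std::size_t Pos = 0;
  if (Pos < Bytes.size() && (Bytes[Pos] & 0xF0) == 0x40)
    ++Pos;  // skip REX prefix (0x40..0x4F in 64-bit mode)
  ++Pos;    // skip the one-byte opcode
  if (Pos >= Bytes.size())
    return std::nullopt;

  const std::uint8_t ModRM = Bytes[Pos++];
  const std::uint8_t Mod = ModRM >> 6;
  const std::uint8_t RM = ModRM & 7;
  if (Mod == 3)
    return std::nullopt;  // register-direct form, no memory operand

  std::size_t Size = 0;
  if (RM == 4) {  // a SIB byte follows ModRM
    if (Pos >= Bytes.size())
      return std::nullopt;
    const std::uint8_t SIB = Bytes[Pos++];
    if (Mod == 0 && (SIB & 7) == 5)
      Size = 4;  // no base register, disp32 only
  } else if (Mod == 0 && RM == 5) {
    Size = 4;  // RIP-relative disp32
  }
  if (Mod == 1)
    Size = 1;  // disp8
  if (Mod == 2)
    Size = 4;  // disp32

  if (Size == 0 || Pos + Size > Bytes.size())
    return std::nullopt;  // no displacement, or truncated input

  // The displacement is stored little-endian.
  std::uint32_t Raw = 0;
  for (std::size_t I = 0; I < Size; ++I)
    Raw |= static_cast<std::uint32_t>(Bytes[Pos + I]) << (8 * I);

  const std::int32_t Value =
      Size == 1 ? static_cast<std::int8_t>(static_cast<std::uint8_t>(Raw))
                : static_cast<std::int32_t>(Raw);
  return DispField{Pos, Size, Value};
}
```

For the bytes 44 8b 80 c8 03 00 00, this walks past the REX prefix (44), the opcode (8b), and the ModRM byte (80, mod=10, rm=000), so the 968 lives in the 4 bytes starting at offset 3. A relocation record whose byte offset falls inside that range would then apply to the displacement operand of the MCInst.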