Symbolic information in disassembler output


As I understand it, the old disassembler (based on libedis) could print symbolic information instead of, or alongside, the address operand of an instruction. It looks like the current disassembler has no such ability. Has this responsibility been shifted to some other component of lldb? Or was it considered useless and removed altogether?



I don't think anybody thought of it as useless. It's one of the things Jason has been trying to find time to do for a while now.

InstructionLLVMC is the main concrete Instruction subclass in lldb now, and relies on MCInst for most of the heavy lifting. So we should work with MCInst to figure out how to do this.


I believe libedis was deprecated many years ago and hasn't returned. We use the standard LLVM disassembler, so any features need to be built into llvm::MCInst.

Greg is right that this was a libedis feature and has no equivalent in LLDB today.

MCInst, however, doesn’t have enough information by itself to do this. The reason is that for many things that are considered “operands,” the MCInst has several underlying operands. For example, an operand that was expressed as a register + an offset would be represented in MCInst as a register operand and an immediate operand, and only correlating the opcode with the LLVM instruction tables (and possibly some special knowledge) would tell you that the two belong together.
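To make the flattening problem concrete, here is a minimal sketch in C++. Every type and function in it (FlatInst, RegOperand, ImmOperand, MemLayout, LookupLayout, RecoverMemOperand, and the opcode name "MOV32rm_sketch") is a hypothetical stand-in invented for illustration; none of it is the llvm::MCInst API. It only shows why a flat operand list needs an extra per-opcode table before "register + offset" can be recognized as a single high-level operand.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for a flat operand list. NOT the llvm::MCInst API.
struct RegOperand { std::string name; };
struct ImmOperand { std::int64_t value; };
using FlatOperand = std::variant<RegOperand, ImmOperand>;

struct FlatInst {
  std::string opcode;                 // opcode name (illustrative only)
  std::vector<FlatOperand> operands;  // flat list, with no grouping information
};

// The high-level operand a client actually wants: base register + displacement.
struct MemOperand {
  std::string base;
  std::int64_t disp;
};

// Hypothetical per-opcode knowledge: which flat indices form one memory operand.
struct MemLayout {
  std::size_t base_index;
  std::size_t disp_index;
};

// Toy "instruction table" with a single made-up entry.
std::optional<MemLayout> LookupLayout(const std::string &opcode) {
  if (opcode == "MOV32rm_sketch")
    return MemLayout{1, 2};
  return std::nullopt;
}

// Regroup the flat operands into a memory operand. Without the table entry,
// nothing says that the register and the immediate belong together.
std::optional<MemOperand> RecoverMemOperand(const FlatInst &inst) {
  const std::optional<MemLayout> layout = LookupLayout(inst.opcode);
  if (!layout)
    return std::nullopt;
  if (layout->base_index >= inst.operands.size() ||
      layout->disp_index >= inst.operands.size())
    return std::nullopt;
  const auto *base = std::get_if<RegOperand>(&inst.operands[layout->base_index]);
  const auto *disp = std::get_if<ImmOperand>(&inst.operands[layout->disp_index]);
  if (!base || !disp)
    return std::nullopt;
  return MemOperand{base->name, disp->value};
}
```

Given a FlatInst with operands {eax, rbp, -4} and the made-up opcode above, RecoverMemOperand yields base "rbp" and displacement -4; for any opcode missing from the table it yields nothing, which is the situation described above when only the flat operand list is available.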

Additionally, libedis could express the semantics of the instruction operands (e.g., “this is a source operand and represents the result of dereferencing rbp - 4”), and inform the client what ranges of characters in the generated string represented each high-level operand. Neither of these features is exposed anywhere at the moment, and in fact the underlying knowledge was lost when the edis TableGen backend was deprecated.

There are a few LLDB features that read instructions and attempt to interpret them:

  • The fast unwinder looks for specific bit patterns (see UnwindAssembly_x86::GetFastUnwindPlan in UnwindAssembly-x86.cpp);

  • The ARM instruction emulator has its own home-grown instruction table (see EmulateInstructionARM64.cpp); and

  • The crash diagnosis functionality actually parses the output strings from the disassembler (see DoGuessValueAt in StackFrame.cpp).
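As a rough illustration of the string-parsing approach in the last item, the sketch below extracts a base register and offset from an operand string. It is not the code in DoGuessValueAt. It assumes AT&T-style x86 operand text such as "-0x4(%rbp)", and the names ParsedMemOperand and ParseBaseOffsetOperand are hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type: the base register and signed offset of an operand.
struct ParsedMemOperand {
  std::string base_register;
  std::int64_t offset;
};

// Parse operand text of the assumed form "[-]0xHEX(%reg)" or "(%reg)".
// Anything else (index registers, scale factors, symbols) is rejected.
std::optional<ParsedMemOperand>
ParseBaseOffsetOperand(const std::string &text) {
  // Group 1: optional '-' sign; group 2: hex digits; group 3: register name.
  // The digit count is capped at 8 so the conversion below cannot overflow.
  static const std::regex pattern(
      R"re(^\s*(?:(-?)0x([0-9a-fA-F]{1,8}))?\(%([a-z0-9]+)\)\s*$)re");
  std::smatch match;
  if (!std::regex_match(text, match, pattern))
    return std::nullopt;

  std::int64_t offset = 0;
  if (match[2].matched) {
    offset = static_cast<std::int64_t>(std::stoull(match[2].str(), nullptr, 16));
    if (match[1].str() == "-")
      offset = -offset;
  }
  return ParsedMemOperand{match[3].str(), offset};
}
```

Parsing text this way depends on the disassembler's output syntax, so a sketch like this would need a separate pattern for each syntax flavor and for each architecture it is meant to handle.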


I see. I had hoped that this functionality had simply been moved to another library, even though I couldn’t find anything like it in lldb. I also assumed there was some specific reason for removing it. But since it just hasn’t been implemented yet, I have no more questions. Thank you all for the explanation!

Because the instruction formats are complicated, I would much rather not implement instruction tables again. DoGuessValueAt in StackFrame looks like what I need; thanks for the hint!