Reusing LLVM Mips instruction info in lldb

Hello everyone,

In http://reviews.llvm.org/D7696, bhushan added a mips64 UnwindAssembly plugin (a plugin that looks at assembly code to find out how to unwind the stack frame). Since I was about to write such a plugin myself (though for mips32), I used it as a starting point for a slightly different implementation [1], replacing the hard-coded instruction encodings with calls to the LLVM disassembler. This works great, except that the header defining the enum needed to interpret the opcode in an MCInst is generated by tablegen during the LLVM build and is hence not a public header. What is the best way to make this information usable from lldb (which needs to be able to build against a prebuilt copy of LLVM)? Would it make sense to move the appropriate .td to llvm/include/Target/Mips, so lldb could re-tablegen it and obtain the same header (I assume tablegening is deterministic?)?
Does anybody see any other good solutions?

Thanks,
Keno

[1] https://gist.github.com/Keno/ef471f766d8dddf074e7

Would it make sense to move the appropriate .td to llvm/include/Target/Mips, so lldb could re-tablegen it and obtain the same header (I assume tablegening is deterministic?)?

Ugh no. (Though yes, it is deterministic afaik).

Does anybody see any other good solutions?

Develop an interface that works and have lldb use that? Might need to change things to have certain bits be made public if necessary, but I’d want more details there.

-eric

The basic problem is the following. In LLDB, I get an instruction from the target and then ask the LLVM disassembler to disassemble it for me. Depending on the instruction and its arguments, I can then construct an unwind plan. I get an MCInst back from the disassembler, so what I want to do is something like:

if (Inst.getOpcode() == Mips::ADDiu || Inst.getOpcode() == Mips::DADDiu) {
  // This is an addiu instruction: look at the registers and construct the unwind plan
}

but that enum is the one that’s tablegen’d. I guess a possible interface would be to ask for the opcode of an instruction by name? Seems somewhat ugly though.

Equally ugly (and subject to silent breakage) would be to match the
opcode names, as given by MCInstrInfo::getName.

-Ahmed
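To make the name-based idea above concrete, here is a minimal sketch of resolving an opcode number from its name once, up front, and failing loudly if the name is missing. `OpcodeNameTable` and `lookupOpcode` are hypothetical stand-ins invented for illustration. In LLVM itself the names would come from MCInstrInfo::getName, and the loop bound is assumed to correspond to MCInstrInfo::getNumOpcodes.

```cpp
#include <cstring>
#include <vector>

// Hypothetical stand-in for the per-target opcode name table.
// Index into `names` plays the role of the opcode number.
struct OpcodeNameTable {
  std::vector<const char *> names;

  unsigned getNumOpcodes() const { return static_cast<unsigned>(names.size()); }
  const char *getName(unsigned opcode) const { return names[opcode]; }
};

// Look up the opcode whose name matches `name` exactly.
// Returns false when no such opcode exists, so a renamed or removed
// instruction is reported at setup time instead of silently never matching.
inline bool lookupOpcode(const OpcodeNameTable &table, const char *name,
                         unsigned &opcodeOut) {
  for (unsigned op = 0; op < table.getNumOpcodes(); ++op) {
    if (std::strcmp(table.getName(op), name) == 0) {
      opcodeOut = op;
      return true;
    }
  }
  return false;
}
```

A plugin could do the lookups once at initialization and afterwards compare Inst.getOpcode() against the cached numbers. This does not remove the dependence on opcode names, but it does turn a rename into an explicit lookup failure.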

We should really try to get this header file generated if possible.

Would you have any objections to making
lib/Target/*/MCTargetDesc/*MCTargetDesc.h public?

It seems pretty useless to expose a disassembly interface that can't tell
you anything interesting about the instruction.

It’s painful. I think I’d rather come up with a generic way to split/expose the backend data so that users can ask things, but I think this is a good step toward getting there. So let’s define a cpu interface?

Agreed.

-eric

What kind of interface are you suggesting? It seems like the opcode data is inherently target specific (and only used by sections of external code written for the same target), so I guess I’m not seeing how a generic interface would work here.

Would you have any objections to making lib/Target/*/MCTargetDesc/*MCTargetDesc.h public?

One worry that springs to mind is how easy it is to renumber the enum values used by the opcodes. If these numbers are public it will be hard to have stable releases that add new instructions.
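A small sketch of the renumbering concern, assuming the generated enum numbers instructions sequentially, so that inserting one shifts every later value. Both enums and the instruction name NEWOP are hypothetical and not taken from any LLVM release.

```cpp
// Hypothetical opcode enum as generated for one release.
namespace release_a {
enum Opcode { ADD, ADDiu, DADDiu, INSTRUCTION_LIST_END };
}

// Hypothetical enum for a later release: one instruction was inserted,
// which shifts the value of every enumerator that follows it.
namespace release_b {
enum Opcode { ADD, ADDiu, DADDa_placeholder_unused = -1, NEWOP = 2, DADDiu, INSTRUCTION_LIST_END };
}

// Client code compiled against the release_a header bakes in the old value.
inline bool isDADDiuCompiledAgainstA(unsigned opcode) {
  return opcode == static_cast<unsigned>(release_a::DADDiu);
}
```

Here release_a::DADDiu is 2 while release_b::DADDiu is 3, so a client built against the first header and run against the second library would mistake NEWOP for DADDiu.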

I don't believe that there's an expectation of ABI stability for the LLVM C++ interfaces, even in public headers - there's no expectation that you can link against any version of LLVM other than the exact version whose headers you used, unless you use the C interfaces. As long as this is not in llvm-c, then it shouldn't be an issue.

David

It's not expected to be stable between major releases like 3.6.0 and 3.7.0, but I believe it is supposed to be stable between, say, 3.6.0 and 3.6.1.
During the 3.5.1 release we spent a fair bit of time learning how to use an ABI-checking tool, and Tom Stellard asked me to make a few changes to preserve ABI compatibility with the 3.5.0 release.

Apparently some lld targets also need instruction encoding. It would
be nice to figure out one interface that can be used by both lld and
lldb.

Just want to make sure this doesn’t drop off the radar. I don’t know enough about how the backends are organized to make a coherent suggestion here, but I imagine the hard part here is coming up with a suggestion, while the actual changes should be fairly mechanical.