MCInst

Can someone explain what MCInst is vs. MachineInstr? I'm porting some
patches we have here that affect MachineInstrs and am wondering whether I
need to make similar changes in MCInst.

Why do we have two machine instruction representations?

                              -Dave

> Can someone explain what MCInst is vs. MachineInstr?

Sure. MCInst is designed to be part of the "MC" set of libraries, which deal with machine code. We're building a suite of assemblers and disassemblers out of them.

MCInst is integral to this plan. For an assembler you have two pieces:

1. "Recognize" an opcode + argument list to an MCInst (a .td instruction enum + MCOperands).
2. Run the "MCInst encoder" to emit a series of machine code bytes + relocation entries.
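
As a rough illustration of pieces 1 and 2, here is a minimal sketch over a made-up toy ISA. Nothing in it is the real MCInst API: `ToyOpcode`, `ToyInst`, `recognize`, `encode`, and the byte encodings are all assumptions invented for the example, and relocation entries are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy ISA, NOT LLVM's real MCInst API.
enum class ToyOpcode { NOP, LOADI };

struct ToyInst {
  ToyOpcode Opcode;
  std::vector<int64_t> Operands; // stand-in for a list of MCOperands
};

// Piece 1: "recognize" a mnemonic + argument list as an instruction.
std::optional<ToyInst> recognize(const std::string &Mnemonic,
                                 const std::vector<int64_t> &Args) {
  if (Mnemonic == "nop" && Args.empty())
    return ToyInst{ToyOpcode::NOP, {}};
  if (Mnemonic == "loadi" && Args.size() == 2)
    return ToyInst{ToyOpcode::LOADI, Args};
  return std::nullopt; // unknown mnemonic or wrong operand count
}

// Piece 2: encode the instruction into machine code bytes.
// Assumed toy encoding: NOP is 0x00; LOADI is (0x10 | reg) followed by imm8.
std::vector<uint8_t> encode(const ToyInst &Inst) {
  switch (Inst.Opcode) {
  case ToyOpcode::NOP:
    return std::vector<uint8_t>{0x00};
  case ToyOpcode::LOADI:
    assert(Inst.Operands.size() == 2 && "loadi takes a register and an imm");
    return std::vector<uint8_t>{
        static_cast<uint8_t>(0x10 | (Inst.Operands[0] & 0x0F)),
        static_cast<uint8_t>(Inst.Operands[1] & 0xFF)};
  }
  return {}; // not reached for valid opcodes
}
```

The point of the split is that the recognizer only needs to know about mnemonics and operand shapes, while the encoder only needs the instruction itself.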

For a disassembler, you have two other pieces:

3. Decode machine code bytes into an MCInst.
4. AsmPrint the MCInst to the ".s file" output.
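
Pieces 3 and 4 can be sketched the same way. This is a toy under stated assumptions, not the real disassembler: the ISA is invented (NOP is byte 0x00, LOADI is `0x10 | reg` followed by an 8-bit immediate), and `ToyInst`, `decode`, and `print` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy ISA, NOT LLVM's real MCInst API.
enum class ToyOpcode { NOP, LOADI };

struct ToyInst {
  ToyOpcode Opcode;
  std::vector<int64_t> Operands; // stand-in for a list of MCOperands
};

// Piece 3: decode one instruction starting at Bytes[Pos].
// On success, Pos is advanced past the decoded instruction.
std::optional<ToyInst> decode(const std::vector<uint8_t> &Bytes,
                              std::size_t &Pos) {
  if (Pos >= Bytes.size())
    return std::nullopt; // nothing left to decode
  const uint8_t B = Bytes[Pos];
  if (B == 0x00) {
    ++Pos;
    return ToyInst{ToyOpcode::NOP, {}};
  }
  if ((B & 0xF0) == 0x10 && Pos + 1 < Bytes.size()) {
    ToyInst Inst{ToyOpcode::LOADI,
                 {static_cast<int64_t>(B & 0x0F),
                  static_cast<int64_t>(Bytes[Pos + 1])}};
    Pos += 2;
    return Inst;
  }
  return std::nullopt; // unknown opcode byte or truncated instruction
}

// Piece 4: print the instruction as ".s"-style text.
std::string print(const ToyInst &Inst) {
  switch (Inst.Opcode) {
  case ToyOpcode::NOP:
    return "nop";
  case ToyOpcode::LOADI:
    assert(Inst.Operands.size() == 2 && "loadi takes a register and an imm");
    return "loadi r" + std::to_string(Inst.Operands[0]) + ", " +
           std::to_string(Inst.Operands[1]);
  }
  return "<unknown>"; // not reached for valid opcodes
}
```

Neither function needs anything beyond the instruction and the raw bytes, which is what keeps a standalone disassembler small.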

Daniel will be working on #1 soon, #2 is basically a heavily refactored version of X86CodeEmitter.cpp:emitInstruction, #3 will be contributed soon by Sean, and #4 is basically a heavily refactored version of asmprinter::printInstruction (which I'm working on).

A strong goal for me is to make it so that we can build very small (as in code size) assembler and disassembler tools. This means that none of this stuff can depend on (e.g.) libx86, because that brings in the huge target plus libcodegen plus libtarget plus vmcore, etc. As a key part of this factoring, instruction asmprinting (for example) will no longer work directly on MachineInstr. Instead, asmprinter::printInstruction will lower a MachineInstr to an MCInst, then call the MCInst asmprinter to do the hard formatting work. You can see a horribly simple skeleton of this idea in X86ATTAsmPrinter::printMachineInstruction.
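
To make the lowering idea concrete, here is a minimal sketch with invented types. `ToyMachineInstr`, `ToyMCInst`, `lower`, and the two print functions are hypothetical and only mimic the shape of the design; in particular, dropping implicit operands during lowering is an assumption made for the example, not a statement about what the real code does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical "codegen-side" instruction: carries context that a
// standalone printer should not need.
struct ToyMachineOperand {
  int64_t Value;
  bool IsImplicit; // assumed codegen-only detail, dropped when lowering
};

struct ToyMachineInstr {
  unsigned Opcode;
  std::vector<ToyMachineOperand> Operands;
  const void *ParentBlock; // stand-in for a link back into codegen state
};

// Hypothetical "MC-side" instruction: just an opcode and plain operands.
struct ToyMCInst {
  unsigned Opcode;
  std::vector<int64_t> Operands;
};

// Lower the heavy representation to the light one.
ToyMCInst lower(const ToyMachineInstr &MI) {
  ToyMCInst Out{MI.Opcode, {}};
  for (const ToyMachineOperand &Op : MI.Operands)
    if (!Op.IsImplicit)
      Out.Operands.push_back(Op.Value);
  return Out;
}

// The formatting work only ever sees the light representation.
std::string printMCInst(const ToyMCInst &Inst) {
  std::string S = "op" + std::to_string(Inst.Opcode);
  for (std::size_t I = 0; I < Inst.Operands.size(); ++I)
    S += (I == 0 ? " " : ", ") + std::to_string(Inst.Operands[I]);
  return S;
}

// The codegen-side entry point becomes a thin wrapper: lower, then print.
std::string printMachineInstr(const ToyMachineInstr &MI) {
  return printMCInst(lower(MI));
}
```

With this layering, `printMCInst` can live in a library that knows nothing about the codegen-side types, and only the thin wrapper has to link against them.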

There is a ton of refactoring and cleanup left to do, but a great benefit of this is that it really helps clean up the targets and make them do the right things in the right places.

> I'm porting some
> patches we have here that affect MachineInstrs and am wondering whether I
> need to make similar changes in MCInst.

You should almost certainly do everything on MachineInstr. MCInst is still very early on; if you make any changes to MachineInstr, I'll update MCInst to match. Please discuss changes to core data structures like MachineInstr before you make them, though.

> Why do we have two machine instruction representations?

Hopefully I covered that above.

-Chris

> asmprinter::printInstruction will lower a MachineInstr to an MCInst,
> then call the MCInst asmprinter to do the hard formatting work. You
> can see a horribly simple skeleton of this idea in
> X86ATTAsmPrinter::printMachineInstruction.

Yep, that's where I hit the problem. I'm patching the sources for the
comment emitter and of course MCInst doesn't have the right interfaces
yet.

> I'm porting some
> patches we have here that affect MachineInstrs and am wondering
> whether I
> need to make similar changes in MCInst.

> You should almost certainly do everything on MachineInstr. MCInst is
> still very early on; if you make any changes to MachineInstr, I'll
> update MCInst to match. Please discuss changes to core data
> structures like MachineInstr before you make them, though.

In this case LLVM won't build without MCInst updates. Should I go ahead
and do those and submit the patch to the dev list?

                                   -Dave

Sure, sounds good. This is even more reason to make your stream part of libsupport, not libcodegen :)

-Chris