Strategy for writing a new LLVM backend?


I'm playing around with an LLVM backend for the MC6809 8-bit processor, both so that I can learn LLVM and because I would like a halfway-decent cross-compiler for that processor. Yes, I am a sucker for punishment!

I'm getting very tied up in the implementation details, hence my request for advice.

The MC6809 instruction set is very clean at the assembly language level, but the binary opcodes are not so helpful. Some instructions have a prefix byte, and because of the rich addressing modes, instructions vary widely in length and are not necessarily neat or consistent. There is only very limited scope for packing known bit patterns like a $src or $dst field.
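
To make the variable-length point concrete, here is a minimal C++ sketch that puts the bytes for a few forms together by hand. It is not LLVM code: `encode`, `hex` and `checkExamples` are hypothetical helpers, and the opcode and postbyte values are as I understand the 6809 opcode map, so please check them against the datasheet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): optional page prefix, primary
// opcode, then whatever postbyte/offset/immediate bytes the mode needs.
std::vector<uint8_t> encode(uint8_t prefix, uint8_t opcode,
                            const std::vector<uint8_t> &operand) {
  std::vector<uint8_t> bytes;
  if (prefix != 0)  // 0 means "no prefix byte" in this sketch
    bytes.push_back(prefix);
  bytes.push_back(opcode);
  bytes.insert(bytes.end(), operand.begin(), operand.end());
  return bytes;
}

// Render the bytes as space-separated upper-case hex, e.g. "AB 01".
std::string hex(const std::vector<uint8_t> &bytes) {
  std::string out;
  char buf[4];
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    std::snprintf(buf, sizeof buf, "%02X", bytes[i]);
    if (i != 0)
      out += ' ';
    out += buf;
  }
  return out;
}

// The same mnemonic comes out at 2, 3 or 4 bytes depending on the mode.
void checkExamples() {
  assert(hex(encode(0, 0xAB, {0x01})) == "AB 01");                // ADDA 1,X
  assert(hex(encode(0, 0xAB, {0xB8, 0x7B})) == "AB B8 7B");       // ADDA [123,Y]
  assert(hex(encode(0, 0xFB, {0x20, 0xFF})) == "FB 20 FF");       // ADDB $20FF
  assert(hex(encode(0x10, 0x83, {0x12, 0x34})) == "10 83 12 34"); // CMPD #$1234
}
```

The register and offset end up inside the postbyte rather than in a fixed field of the opcode, which is why a single $src/$dst bit layout does not fit.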

Where it would be nice to have (e.g.)

  ADDA 1,X
  ADDB $20FF
  ADDD #23
  ADDA [123,Y]

... or ...

  NEG addr
  NEG [3,Y]

... all matched with (say) "[(set $dst, (<opcode> $dst, $src)),(set $dst, (add $dst, $imm))]" (note the two-address form), constructing the opcodes at the same time does not appear to be possible. I don't think I can do instruction matching this way, and multiclasses don't seem to quite get there either.

So far I've got the whole instruction set in MC6809Instr(Format|Info).td (the indexed modes need some work as the postbytes are messy), without any matching, and I suspect I'll need to do all the instruction selection in the MC6809ISel* files. All the different types (immediate, indexed, inherent, etc.) are there as separate instructions, ordered by primary opcode (plus prefix byte if there is one).
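
If selection does end up in C++, one way to group the separate instructions by function would be a table keyed by operation and addressing mode. This is only a sketch under that assumption: `Op`, `Mode` and `pickOpcode` are hypothetical names, nothing here is an LLVM API, and the opcode values should be checked against the datasheet.

```cpp
#include <cassert>
#include <cstdint>

enum class Op { ADDA, ADDB, ADDD, NEG };
enum class Mode { Immediate, Direct, Indexed, Extended };

// Look up the primary opcode for (op, mode). Returns false when the
// combination has no encoding, e.g. NEG has no immediate form.
bool pickOpcode(Op op, Mode mode, uint8_t &opcode) {
  // Rows follow Op, columns follow Mode; -1 marks "no such instruction".
  static const int table[4][4] = {
      {0x8B, 0x9B, 0xAB, 0xBB}, // ADDA
      {0xCB, 0xDB, 0xEB, 0xFB}, // ADDB
      {0xC3, 0xD3, 0xE3, 0xF3}, // ADDD
      {-1, 0x00, 0x60, 0x70},   // NEG (memory forms only)
  };
  int value = table[static_cast<int>(op)][static_cast<int>(mode)];
  if (value < 0)
    return false;
  opcode = static_cast<uint8_t>(value);
  return true;
}
```

A selector could call `pickOpcode(Op::ADDB, Mode::Extended, opc)` once it has decided which addressing mode the operand needs, and fall back to another strategy when the lookup fails.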

1) Am I making sense, and is my approach so far sane?
1a) Once I have defined all the instructions, will grouping them by function and selecting the right one(s) in C++ code in the *ISel*.cpp files make more sense?

2) Using MSP430 as a base, how do I force the TwoAddressInstructionPass to be run at the right time?

Thanks and happy new year to all!


Hi Mark,

To deal with instruction selection when the same opcode has multiple addressing modes, any of the following can be done:

1/ Define different instruction patterns for a given operator DAG node, each with different operands and emitting a different asm string.

2/ Define custom DAG nodes during target lowering (TargetLowering::LowerOperation) and write a dumb selector to emit the necessary asm strings for the custom nodes.

3/ Take a cue from the X86 ComplexPattern, where pattern matching for addressing modes is done using explicit C++ code.
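
As a rough illustration of option 3, the C++ side of an addressing-mode matcher could look something like the sketch below. It deliberately uses a toy `Node` type rather than LLVM's real DAG classes, and `selectIndexed`, `offsetBytes` and `postbyte` are hypothetical names; the postbyte layouts are as I understand the 6809 indexed mode (non-indirect, constant offset) and should be verified.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Toy stand-in for a DAG node; this is NOT LLVM's SDNode.
struct Node {
  enum Kind { Reg, Const, Add } kind = Const;
  int reg = 0;                // for Reg: 0=X, 1=Y, 2=U, 3=S
  int value = 0;              // for Const
  const Node *lhs = nullptr;  // for Add
  const Node *rhs = nullptr;  // for Add
};

Node makeReg(int r) { Node n; n.kind = Node::Reg; n.reg = r; return n; }
Node makeConst(int v) { Node n; n.kind = Node::Const; n.value = v; return n; }
Node makeAdd(const Node *l, const Node *r) {
  Node n; n.kind = Node::Add; n.lhs = l; n.rhs = r; return n;
}

struct IndexedAddr { int reg; int offset; };

// Match "reg" or "reg + const" (either operand order). Returning false
// lets the caller try a different addressing mode instead.
bool selectIndexed(const Node &n, IndexedAddr &out) {
  if (n.kind == Node::Reg) { out = {n.reg, 0}; return true; }
  if (n.kind == Node::Add && n.lhs && n.rhs) {
    const Node *a = n.lhs, *b = n.rhs;
    if (a->kind == Node::Const) std::swap(a, b);  // add is commutative
    if (a->kind == Node::Reg && b->kind == Node::Const) {
      out = {a->reg, b->value};
      return true;
    }
  }
  return false;
}

// Extra offset bytes needed after the postbyte.
int offsetBytes(int offset) {
  if (offset >= -16 && offset <= 15) return 0;    // fits the 5-bit field
  if (offset >= -128 && offset <= 127) return 1;  // 8-bit offset
  return 2;                                       // 16-bit offset
}

// Postbyte for a constant offset from a register, non-indirect.
uint8_t postbyte(const IndexedAddr &a) {
  uint8_t rr = static_cast<uint8_t>((a.reg & 3) << 5);
  switch (offsetBytes(a.offset)) {
  case 0: return static_cast<uint8_t>(rr | (a.offset & 0x1F));  // 0RRnnnnn
  case 1: return static_cast<uint8_t>(0x88 | rr);               // 1RR01000
  default: return static_cast<uint8_t>(0x89 | rr);              // 1RR01001
  }
}
```

In a real backend the matching function would be hooked up through a ComplexPattern in the .td file and would produce operands for the selected instruction; here it only fills in a base register and an offset so the idea can be tried in isolation.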

Hope this helps.