Hello,
I'm trying to write a backend for a simple 32-bit processor architecture with a single instruction format and no condition code registers.
The short 15-page document describing the Sabre architecture is at www.docm.mmu.ac.uk/STAFF/A.Nisbet/Sabre.pdf. Sabre is a research/teaching processor developed by Celoxica. Pages 5-8 contain the information relevant to targeting it from a new compiler backend, i.e. it is trivially simple, with 25 actual instructions. Note the typo on page 5: operand A is clearly bits 9-5.
The general form for instructions is:
opcode %a, %b, 17-bit signed immediate
%b is a source register.
%a is typically both the source and the destination register for the operation, i.e. %a = operation %a, %b, immediate.
%b and the immediate act like a virtual operand c that is the sum of register b's contents and the immediate value.
%b can be omitted if it refers to the "zero valued register %0".
The immediate can be omitted if it has a zero value.
The exceptions to this are the various forms of conditional branch instruction, which must compare the contents of two registers and specify a branch target address using the immediate (textually the immediate is a label; in machine code it is an offset relative to the PC).
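To make the "virtual operand c" description concrete, here is a minimal C++ sketch of how I read the semantics. The helper names (signExtend17, operandC, addInstr) are hypothetical, 32-bit wraparound on overflow is my assumption, and ADD is used only as an illustrative operation; none of this is taken from the Sabre document.

```cpp
#include <cstdint>

// Hypothetical helper: interpret the low 17 bits of `raw` as a signed
// two's-complement immediate (range -65536 .. 65535).
constexpr int32_t signExtend17(uint32_t raw) {
    return (raw & 0x10000u)
        ? static_cast<int32_t>(raw & 0x1FFFFu) - 0x20000  // bit 16 set: negative
        : static_cast<int32_t>(raw & 0x1FFFFu);           // bit 16 clear: non-negative
}

// Virtual operand c = contents of %b + immediate.
// Assumption: arithmetic wraps modulo 2^32.
constexpr uint32_t operandC(uint32_t regB, int32_t imm) {
    return regB + static_cast<uint32_t>(imm);
}

// Illustrative ADD: %a = %a + c, where c = %b + immediate.
// With %b = %0 (zero register) this degenerates to "ADD %a, immediate";
// with immediate = 0 it degenerates to "ADD %a, %b".
constexpr uint32_t addInstr(uint32_t regA, uint32_t regB, int32_t imm) {
    return regA + operandC(regB, imm);
}
```

Seen this way, the register-only and immediate-only spellings are the same instruction with one half of operand c set to zero.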
I have spent some time looking at the PPC and SPARC backends, but these are obviously much more complicated than what I need to implement. Consequently, I am not correctly grasping the interactions between ARCHInstrInfo.td and ARCHDAGToDAGISel.cpp. I did manage to hack something together based on a copy of SPARC (with a SABRE namespace etc.), but the instruction selection was incorrect and I obtained a "Cannot yet select:0x..." assertion failure from SABREDAGToDAGIsel::SelectCode when I attempted a
llc -march sabre helloworld.bc -o helloworld.s
Can anyone offer any guidance on how to proceed with debugging instruction selection issues? Or, probably more beneficial, a verbose explanation of how the pattern matching and instruction selection work for a single instruction, relating the processor's instruction set to the LLVM-supported instruction set and the actual code generation/printing?
WRT defining the instructions themselves: am I right in thinking that it is sensible (for instruction selection) to represent the instruction set as a collection of instructions targeting register-register and register-immediate operands? So, for example, I would create defs for
ADDrr to match ADD %a,%b
ADDri to match ADD %a, immediate
I have used a multiclass to achieve this. Previously I was attempting to match the general opcode %a, %b, immediate form directly.
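As I understand it, the ADDri variant can only apply when the constant fits the 17-bit signed immediate field; anything larger has to be placed in a register first and then use ADDrr. A minimal C++ sketch of that range check follows. The names fitsSImm17, AddForm and chooseAddForm are hypothetical and only illustrate the decision; they are not LLVM API.

```cpp
#include <cstdint>

// True if `value` is representable as a 17-bit signed immediate,
// i.e. lies in the range -2^16 .. 2^16 - 1.
constexpr bool fitsSImm17(int64_t value) {
    return value >= -(int64_t{1} << 16) && value <= (int64_t{1} << 16) - 1;
}

// Hypothetical classification of "ADD %a, <constant>".
enum class AddForm {
    RegImm,        // constant fits: use the ADDri form
    NeedsRegister  // constant too wide: load it into a register, use ADDrr
};

constexpr AddForm chooseAddForm(int64_t constant) {
    return fitsSImm17(constant) ? AddForm::RegImm : AddForm::NeedsRegister;
}
```

In the .td file I would expect the equivalent of fitsSImm17 to appear as the predicate on the immediate operand of the ri pattern, but that is my reading of the other backends rather than something I have working.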
Clearly I also need a way to load a 32-bit constant value into a register in order to be able to address more than 64K of memory. I know the PPC backend does something similar ...
So, for example, for SABRE the following instruction sequence would do what is needed:
MOVri %a, HI16(32 bit constant)
LSHri %a,16
ORri %a, LO16(same 32 bit constant)
LD %d, %a // i.e. load the contents of the memory at the address held in %a into register %d
where HI16/LO16 are evaluated at code generation time by LLVM. I'm a little confused as to how to specify this as a pattern in TableGen syntax, even with the PPC example.
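To check that the sequence above really rebuilds the constant, here is a small C++ model of it. The functions hi16, lo16 and materialize are hypothetical names, and I am assuming MOVri, LSHri and ORri behave as their names suggest (move immediate, logical shift left, bitwise or).

```cpp
#include <cstdint>

// Upper and lower 16-bit halves of a 32-bit constant.
constexpr uint32_t hi16(uint32_t value) { return value >> 16; }
constexpr uint32_t lo16(uint32_t value) { return value & 0xFFFFu; }

// Step-by-step model of the three-instruction sequence.
inline uint32_t materialize(uint32_t constant) {
    uint32_t a = hi16(constant);  // MOVri %a, HI16(constant)
    a <<= 16;                     // LSHri %a, 16
    a |= lo16(constant);          // ORri  %a, LO16(constant)
    return a;                     // %a now holds the full 32-bit constant
}
```

Both halves lie in 0..65535, which is inside the 17-bit signed immediate range, so neither immediate is sign-extended into a negative value and no adjustment of the high half should be needed.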
Apologies for the naivety of these questions.
Thanks,
Andy
Dr. Andy Nisbet: URL http://www.docm.mmu.ac.uk/STAFF/A.Nisbet
Department of Computing and Mathematics, John Dalton Building, Manchester
Metropolitan University, Chester Street, Manchester M1 5GD, UK.
Email: A.Nisbet@mmu.ac.uk, Phone:(+44)-161-247-1556; Fax:(+44)-161-247-1483.