Hi Hal,
I'm not sure which parts of the implementation most people would
consider unusual, but I'll try my best to list them.
The NDS32 V3 ISA is a mixed 32/16 bit instruction set.
A 32 bit instruction may map to a 16 bit instruction with the same
semantics, but the 16 bit form has more restrictive operand limits.
E.g.
ADD ra, ra, rb -> ADD333 ra3, rb3, rc3
The ra3, rb3, rc3 operands are each encoded in 3 bits,
so the encoded register number has to be in the range 0~7.
To describe these low register number operands,
the lGPR register class is defined in NDS32RegisterInfo.td.
The mGPR register class is for register operands that have to be encoded
in 4 bits, where the register number range is 0~11, 16~19 according to
the NDS32 V3 ISA spec.
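As a rough illustration of the two register ranges above, here is a
minimal sketch of the kind of check that decides whether a register can
be used in a 16 bit operand. The function names are hypothetical and are
not taken from the NDS32 backend; registers are passed as plain GPR
indices.

```cpp
// Hypothetical helpers for illustration only (not the backend's code).
// Register numbers are plain GPR indices: r0 = 0, r1 = 1, ...

// lGPR: the operand is encoded in 3 bits, so only r0..r7 qualify.
constexpr bool isLowGPR(unsigned reg) { return reg <= 7; }

// mGPR: the operand is encoded in 4 bits and covers r0..r11 and r16..r19.
constexpr bool isMidGPR(unsigned reg) {
  return reg <= 11 || (reg >= 16 && reg <= 19);
}

// A three-register ADD can only shrink to ADD333 if every operand
// is in the lGPR range.
constexpr bool canUseAdd333(unsigned rt, unsigned ra, unsigned rb) {
  return isLowGPR(rt) && isLowGPR(ra) && isLowGPR(rb);
}
```

In the real backend this constraint is expressed through the register
classes in the .td file rather than through hand-written predicates.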
The FPReg and SPReg register classes are used to describe the 16 bit
load/store instructions that use fp/sp as an implied base register.
To convert the 32 bit fp/sp based load/store to the 16 bit form,
I added a check function and do the conversion in eliminateFrameIndex().
There are offset limitations for the [reg + offset] addressing mode.
The immediate offset of a word load/store must be word aligned.
E.g. lwi ra, [rb + 4] is allowed; lwi ra, [rb + 6] is not.
The immediate offset of a half word load/store must be half word aligned.
E.g. lhi ra, [rb + 2] is allowed; lhi ra, [rb + 3] is not.
The offset of lwi is encoded in the bit field as (offset >> 2).
The offset of lhi is encoded in the bit field as (offset >> 1).
So the valid range of the 15 bit signed immediate offset in lwi is
isIntN(15 + 2, offset) && (offset % 4 == 0).
The immediate offset range checking for load/store is defined in
NDS32ISelDAGToDAG.cpp:selectAddrFrameIndexOffset.
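The offset rule can be sketched as a small standalone predicate. This is
only an illustration, not the code in selectAddrFrameIndexOffset:
fitsSignedBits is a local stand-in for llvm::isIntN, the function names
are hypothetical, and the lhi check assumes by analogy that lhi uses the
same 15 bit field width as lwi.

```cpp
#include <cstdint>

// Local stand-in for llvm::isIntN: true if x fits in an n-bit signed field.
constexpr bool fitsSignedBits(unsigned n, int64_t x) {
  return x >= -(int64_t(1) << (n - 1)) && x < (int64_t(1) << (n - 1));
}

// lwi: the 15 bit field stores (offset >> 2), so the byte offset may use
// 15 + 2 signed bits and must be a multiple of 4.
constexpr bool isLegalLwiOffset(int64_t offset) {
  return fitsSignedBits(15 + 2, offset) && offset % 4 == 0;
}

// lhi: assumption by analogy, the field stores (offset >> 1), so the byte
// offset may use 15 + 1 signed bits and must be a multiple of 2.
constexpr bool isLegalLhiOffset(int64_t offset) {
  return fitsSignedBits(15 + 1, offset) && offset % 2 == 0;
}
```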
NDS32 V3 has post-increment load/store instructions,
which are the instructions named with a .bi suffix,
e.g. sw.bi, lwi.bi.
The assembly form for post-increment load/store is
instr [base], increment_offset
E.g.
sw.bi ra, [rb], rc
lwi.bi ra, [rb], imm
To generate the post-increment instructions,
I implemented getPostIndexedAddressParts to allow the transformation
from normal load/store to the post-increment form, just like ARM does.
One slight difference from ARM is that NDS32 V3 allows increment_offset
to be a register or an immediate, so we don't have to check the
immediate limitation in getPostIndexedAddressParts: if the immediate
can't encode the offset, we can move it to a register and generate the
register offset form.
The assembly form of NDS32 V3 multiple load/store is
lmw.{suffix} rb, [ra], re, enable4
where [ra] is the register containing the base address,
rb is the first register in the register list,
re is the last register in the register list,
and the register numbers from rb to re must be consecutive.
enable4 is a 4 bit field that encodes FP/GP/LP/SP.
Each bit in enable4 indicates whether the corresponding register is
present in the register list.
E.g. enable4 = 10 means FP (bit 3) and LP (bit 1) are in the register list.
The register list in the SelectionDAG could be
{base_reg, r1, r2, r3, .., FP, LP}.
Before printing it as assembly, the InstPrinter has to traverse the
register list and find the base register, the first register, the last
register and the value of enable4.
The register list print method is implemented in
NDS32/InstPrinter/NDS32InstPrinter.cpp: printRegisterList
According to the V3 ISA spec, there is a special case:
if the register list does not contain any of r0~r27, then rb and re
should be set to SP.
E.g.
RegisterList = {base_reg, FP,LP}
lmw.{suffix} SP , [base_reg], SP, 10
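The traversal described above could look roughly like the following
sketch. It is not the actual printRegisterList code; the struct and
function names are hypothetical. It assumes that FP/GP/LP/SP are
numbered r28..r31, and that GP is bit 2 and SP is bit 0 of enable4
(inferred from the FP/GP/LP/SP order with FP at bit 3 and LP at bit 1).
The base register is assumed to have been split off already.

```cpp
#include <algorithm>
#include <vector>

// Assumed numbering: r0..r27 are ordinary GPRs, then FP/GP/LP/SP.
constexpr unsigned FP = 28, GP = 29, LP = 30, SP = 31;

struct LmwOperands {
  unsigned rb;      // first register of the consecutive r0..r27 run
  unsigned re;      // last register of the consecutive r0..r27 run
  unsigned enable4; // assumed layout: bit 3 FP, bit 2 GP, bit 1 LP, bit 0 SP
};

// regs is the register list without the base register.
LmwOperands analyzeRegisterList(const std::vector<unsigned> &regs) {
  // Special case default: with no r0..r27 present, rb and re are both SP.
  LmwOperands ops{SP, SP, 0};
  bool seenGPR = false;
  for (unsigned r : regs) {
    switch (r) {
    case FP: ops.enable4 |= 1u << 3; break;
    case GP: ops.enable4 |= 1u << 2; break;
    case LP: ops.enable4 |= 1u << 1; break;
    case SP: ops.enable4 |= 1u << 0; break;
    default:
      // Track the lowest and highest ordinary register seen so far.
      ops.rb = seenGPR ? std::min(ops.rb, r) : r;
      ops.re = seenGPR ? std::max(ops.re, r) : r;
      seenGPR = true;
      break;
    }
  }
  return ops;
}
```

For the list {FP, LP} this yields rb = re = SP and enable4 = 10, which
matches the example above.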
The i/d in {suffix} indicates whether the lmw/smw increments or
decrements the memory address.
The b/a in {suffix} indicates whether the lmw/smw uses [ra] or
[ra +/- 4] as the base memory address.
E.g.
lmw.ai: multiple load with increment memory address and use [ra + 4]
as base address
smw.ad: multiple store with decrement memory address and use [ra - 4]
as base address
lmw.bi: multiple load with increment memory address and use [ra]
as base address
An m in {suffix} indicates that the base register value is modified.
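To make the suffix rules concrete, here is a small sketch that decodes
a suffix string and computes the first memory address accessed. The
names are hypothetical, and it assumes the suffix is written as b/a
first, then i/d, then an optional trailing m (e.g. "bim"); that ordering
is my assumption for the m case.

```cpp
#include <cstdint>
#include <string>

struct LmwSuffix {
  bool after;      // 'a': start at [ra +/- 4]; 'b': start at [ra]
  bool decrement;  // 'd': addresses decrease; 'i': addresses increase
  bool modifyBase; // trailing 'm': the base register is updated
};

// Parses suffixes such as "bi", "ad" or "bim". The input is assumed to be
// well formed; std::string::at throws if it is shorter than two characters.
LmwSuffix parseLmwSuffix(const std::string &s) {
  return LmwSuffix{s.at(0) == 'a', s.at(1) == 'd',
                   s.size() > 2 && s[2] == 'm'};
}

// First address accessed, following the b/a and i/d rules described above.
uint32_t firstAddress(uint32_t ra, const LmwSuffix &sfx) {
  if (!sfx.after)
    return ra;
  return sfx.decrement ? ra - 4 : ra + 4;
}
```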
An NDS32LoadStore optimization pass is added to generate multiple
load/store instructions with the different suffixes.
Another unusual part is that instructions are always encoded in
big-endian, no matter whether the toolchain is big or little endian.
You can see there is only big-endian encoding in
MCTargetDesc/NDS32MCCodeEmitter.cpp: encodeInstruction->EmitInstruction.
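The idea is simply that the instruction bytes are always written most
significant byte first. A minimal standalone sketch (the function name
and signature are hypothetical, not the code in NDS32MCCodeEmitter.cpp):

```cpp
#include <cstdint>
#include <vector>

// Appends the low `size` bytes of `bits` to `out`, most significant byte
// first. `size` would be 4 for a 32 bit instruction and 2 for a 16 bit one.
// The target's data endianness is deliberately not consulted.
void emitInstructionBE(uint32_t bits, unsigned size,
                       std::vector<uint8_t> &out) {
  for (unsigned i = 0; i < size; ++i) {
    unsigned shift = 8 * (size - 1 - i);
    out.push_back(static_cast<uint8_t>(bits >> shift));
  }
}
```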
Please let me know if there are other parts of the implementation that
might be unusual and that I didn't mention;
I'll try my best to explain them in more detail.
Thanks,
Shiva