NDS32 V3 backend

Hi all,

On behalf of Andes Technology Corp,
I am proposing a backend targeting the NDS32 V3 ISA.

NDS32 V3 is a mixed 16/32-bit instruction set architecture
developed by AndesTech.
You can find more information on the Andes website <http://www.andestech.com/>;
the AndeStar ISA Manual (V3 ISA) is available from the document download page.

This is an experimental port.
Performance-wise, it still lacks target-specific optimizations.
We have focused on correctness first and would greatly appreciate reviewers' help
in pointing us in the right direction.

I split the patch the way RISC-V did, hoping that makes it easier to review.
I have submitted a series of 22 patches implementing code generation
and assembler support.
Please let me know if you'd like to be CCed or added as a reviewer
on future patches.

Please find the current set of patches for your review here:
* <⚙ Query: Advanced Search;

Your reviews and comments are very important to us for making this
contribution better.

Thank you for taking the time to review our contribution.

Best regards,
Shiva Chen

Hi Shiva,

Is there anything unusual, from LLVM's perspective, about the ISA? Any higher-level design decisions you'd like to point out?


P.S. You don't need to cc llvm-commits on messages to llvm-dev.

Hi Hal,

I'm not sure which implementation details most people would consider unusual,
but I'll try my best to list them.

NDS32 V3 is a mixed 32/16-bit instruction set.

Some 32-bit instructions have a 16-bit form with the same semantics
but tighter operand restrictions.

    ADD ra, ra, rb -> ADD333 ra3, rb3, rc3
    The ra3, rb3, and rc3 operands are each encoded in 3 bits,
    so their register numbers must be in the range 0~7.

To describe these low-numbered register operands,
the lGPR register class is defined in NDS32RegisterInfo.td.

The mGPR register class is for register operands that must be encoded in 4 bits;
according to the NDS32 V3 ISA spec, the allowed register numbers are 0~11 and 16~19.
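
As a rough illustration of these two operand restrictions (not code from the patches), the checks can be sketched as below. All function names are hypothetical, and the mapping of r16~r19 onto the 4-bit values 12~15 is an assumption that should be checked against the ISA manual.

```cpp
// Hypothetical helpers for illustration; these names are not from the patches.

// lGPR: registers that fit a 3-bit operand field (r0-r7).
constexpr bool isLowGPR(unsigned RegNo) { return RegNo <= 7; }

// mGPR: registers allowed in a 4-bit operand field (r0-r11 and r16-r19).
constexpr bool isMiddleGPR(unsigned RegNo) {
  return RegNo <= 11 || (RegNo >= 16 && RegNo <= 19);
}

// ASSUMPTION: r0-r11 encode as 0-11 and r16-r19 as 12-15.
// Returns -1 when the register is not in mGPR.
constexpr int encodeMiddleGPR(unsigned RegNo) {
  return RegNo <= 11 ? static_cast<int>(RegNo)
         : (RegNo >= 16 && RegNo <= 19) ? static_cast<int>(RegNo) - 4
                                        : -1;
}

// A 32-bit ADD can only shrink to ADD333 if all three operands are in lGPR.
constexpr bool canUseAdd333(unsigned Rt, unsigned Ra, unsigned Rb) {
  return isLowGPR(Rt) && isLowGPR(Ra) && isLowGPR(Rb);
}

static_assert(canUseAdd333(0, 3, 7), "r0, r3, r7 all fit in 3 bits");
static_assert(!canUseAdd333(0, 3, 8), "r8 needs the 32-bit form");
static_assert(encodeMiddleGPR(17) == 13, "r17 under the assumed mapping");
```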

The FPReg and SPReg register classes are used to describe the 16-bit load/store
instructions that use fp/sp as an implied base register.
To convert 32-bit fp/sp-based loads/stores to the 16-bit form,
I added a check function and convert them in eliminateFrameIndex().

There are offset limitations for the [reg + offset] addressing mode.
The immediate offset of a word load/store must be word aligned.
E.g. lwi ra, [rb + 4] is allowed; lwi ra, [rb + 6] is not.

The immediate offset of a half-word load/store must be half-word aligned.
E.g. lhi ra, [rb + 2] is allowed; lhi ra, [rb + 3] is not.

The offset of lwi is encoded in the bit field as (offset >> 2).
The offset of lhi is encoded in the bit field as (offset >> 1).

So the valid range for an lwi offset, given its 15-bit signed immediate field, is
isIntN(15 + 2, offset) && (offset % 4 == 0).
The immediate offset range checking for load/store define in
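
A minimal standalone sketch of this legality check follows. It reimplements an isIntN with the same contract as the LLVM helper so that it compiles on its own; isLegalLoadStoreOffset and encodeOffsetField are hypothetical names, and applying the same 15-bit field width to lhi is an assumption (the text above only states it for lwi).

```cpp
#include <cstdint>

// True if X fits in an N-bit signed integer (1 <= N <= 64).
constexpr bool isIntN(unsigned N, int64_t X) {
  return N >= 64 ||
         (-(INT64_C(1) << (N - 1)) <= X && X < (INT64_C(1) << (N - 1)));
}

// Hypothetical helper. Shift is log2 of the access size: 2 for lwi, 1 for lhi.
// ASSUMPTION: every access size uses a 15-bit signed field.
constexpr bool isLegalLoadStoreOffset(int64_t Offset, unsigned Shift) {
  return isIntN(15 + Shift, Offset) &&
         (Offset % (INT64_C(1) << Shift)) == 0;
}

// Value stored in the bit field: the byte offset with its zero low bits dropped.
// Only meaningful when isLegalLoadStoreOffset(Offset, Shift) holds.
constexpr int64_t encodeOffsetField(int64_t Offset, unsigned Shift) {
  return Offset / (INT64_C(1) << Shift);
}

static_assert(isLegalLoadStoreOffset(4, 2), "lwi ra, [rb + 4] is allowed");
static_assert(!isLegalLoadStoreOffset(6, 2), "lwi ra, [rb + 6] is not");
static_assert(!isLegalLoadStoreOffset(3, 1), "lhi ra, [rb + 3] is not");
```

Dividing instead of shifting keeps the sketch well defined for negative offsets under older C++ standards.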

NDS32 V3 has post-increment load/store instructions,
which are named with a .bi suffix,
e.g. sw.bi, lwi.bi.

The assembly form for post-increment load/store is
instr [base], increment_offset
    sw.bi ra, [rb], rc
    lwi.bi ra, [rb], imm

To generate the post-increment instructions,
I implemented getPostIndexedAddressParts to allow the transformation
from a normal load/store to the post-increment form, just as ARM does.
One slight difference from ARM is that NDS32 V3 allows increment_offset to be
either a register or an immediate.
So we don't have to check the immediate limitation in
Because if the immediate can't encode the offset, we can move the offset into a
register and generate the register-offset form.
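
The fallback described above can be sketched as a simple form selection. This is only an illustration: PostIncForm and selectPostIncForm are hypothetical names, and the immediate field width is passed in as a parameter because it is not stated in this thread.

```cpp
#include <cstdint>

enum class PostIncForm { Immediate, Register };

// True if Value fits in a signed field of the given width (1 <= Bits < 64).
constexpr bool fitsSigned(int64_t Value, unsigned Bits) {
  return -(INT64_C(1) << (Bits - 1)) <= Value &&
         Value < (INT64_C(1) << (Bits - 1));
}

// ImmBits is a HYPOTHETICAL parameter: the real width of the immediate
// increment field has to come from the ISA manual. Any alignment or scaling
// rule for the immediate is ignored in this sketch.
constexpr PostIncForm selectPostIncForm(int64_t Increment, unsigned ImmBits) {
  return fitsSigned(Increment, ImmBits) ? PostIncForm::Immediate
                                        : PostIncForm::Register;
}

// With an assumed 8-bit field, 4 fits but 1000 has to go through a register.
static_assert(selectPostIncForm(4, 8) == PostIncForm::Immediate, "fits");
static_assert(selectPostIncForm(1000, 8) == PostIncForm::Register, "too big");
```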

The assembly form of NDS32 V3 multiple load/store is

lmw.{suffix} rb, [ra], re, enable4

where [ra] is the register containing the base address,
rb is the first register in the register list,
re is the last register in the register list,
and the register numbers from rb to re must be contiguous.
enable4 is a 4-bit field that encodes FP/GP/LP/SP.
Each bit in enable4 indicates whether the corresponding register is present in the register list.
E.g. enable4 = 10 (0b1010) means FP (bit 3) and LP (bit 1) are in the register list.
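
A small sketch of how the enable4 value could be assembled is shown below. The positions of FP (bit 3) and LP (bit 1) come from the example above; placing GP at bit 2 and SP at bit 0 is an assumption inferred from the FP/GP/LP/SP ordering, and makeEnable4 is a hypothetical name.

```cpp
#include <cstdint>

// FP and LP positions are taken from the text.
// ASSUMPTION: GP is bit 2 and SP is bit 0, following the FP/GP/LP/SP order.
constexpr unsigned Enable4FP = 1u << 3;
constexpr unsigned Enable4GP = 1u << 2;
constexpr unsigned Enable4LP = 1u << 1;
constexpr unsigned Enable4SP = 1u << 0;

// Build the 4-bit field from the presence of each special register.
constexpr unsigned makeEnable4(bool HasFP, bool HasGP, bool HasLP, bool HasSP) {
  return (HasFP ? Enable4FP : 0u) | (HasGP ? Enable4GP : 0u) |
         (HasLP ? Enable4LP : 0u) | (HasSP ? Enable4SP : 0u);
}

// FP and LP present gives 0b1010, i.e. 10, matching the example in the text.
static_assert(makeEnable4(true, false, true, false) == 10, "FP + LP");
```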

The register list in the SelectionDAG could be {base_reg, r1, r2, r3, .., FP, LP}.
Before printing the assembly, the InstPrinter has to traverse the register list
to find the base register, first register, last register, and the value of enable4.
The register list print method is implemented in
NDS32/InstPrinter/NDS32InstPrinter.cpp: printRegisterList

According to the V3 ISA spec, there is a special case:
if the register list does not contain any of r0~r27, then rb and re should be set to SP.
    RegisterList = {base_reg, FP,LP}
    lmw.{suffix} SP , [base_reg], SP, 10
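
The derivation of rb, re, and enable4 from a register list, including the special case, might look roughly like the sketch below. It is not the printRegisterList code from the patch: the list is modelled as a 32-bit mask (bit N set means rN is in the list, base register excluded), all names are hypothetical, and the numbering FP=r28, GP=r29, LP=r30, SP=r31 as well as the GP/SP bit positions in enable4 are assumptions.

```cpp
#include <cstdint>

struct LmwOperands {
  unsigned Rb;       // first register of the r0-r27 run
  unsigned Re;       // last register of the r0-r27 run
  unsigned Enable4;  // FP/GP/LP/SP presence bits
};

// ASSUMPTION: FP=r28, GP=r29, LP=r30, SP=r31.
constexpr unsigned SPRegNo = 31;

constexpr LmwOperands lmwOperands(uint32_t RegMask) {
  const uint32_t Low = RegMask & 0x0FFFFFFFu;  // r0-r27 only
  // ASSUMPTION: enable4 bit 3 = FP, bit 2 = GP, bit 1 = LP, bit 0 = SP.
  const unsigned Enable4 = (((RegMask >> 28) & 1u) << 3) |
                           (((RegMask >> 29) & 1u) << 2) |
                           (((RegMask >> 30) & 1u) << 1) |
                           ((RegMask >> 31) & 1u);
  if (Low == 0)  // special case: no r0-r27, so rb and re are both SP
    return LmwOperands{SPRegNo, SPRegNo, Enable4};
  unsigned Rb = 0;
  while (((Low >> Rb) & 1u) == 0) ++Rb;  // lowest register present
  unsigned Re = 27;
  while (((Low >> Re) & 1u) == 0) --Re;  // highest register present
  return LmwOperands{Rb, Re, Enable4};
}

// The r0-r27 part of the list must be one contiguous run of registers.
constexpr bool lowRegsAreContiguous(uint32_t RegMask) {
  uint32_t Low = RegMask & 0x0FFFFFFFu;
  if (Low == 0) return true;
  while ((Low & 1u) == 0) Low >>= 1;  // drop trailing zeros
  return (Low & (Low + 1u)) == 0;     // what remains must be all ones
}

// {FP, LP} only: prints as "SP, [base_reg], SP, 10".
static_assert(lmwOperands((1u << 28) | (1u << 30)).Rb == 31, "rb is SP");
static_assert(lmwOperands((1u << 28) | (1u << 30)).Enable4 == 10, "FP + LP");
```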

The i/d in {suffix} indicates whether the lmw/smw increments or decrements the memory address.
The b/a in {suffix} indicates whether the lmw/smw uses [ra] or [ra +/- 4]
as the base memory address.
    lmw.ai: multiple load with incrementing memory address, using [ra + 4] as the base address
    smw.ad: multiple store with decrementing memory address, using [ra - 4] as the base address
    lmw.bi: multiple load with incrementing memory address, using [ra] as the base address

An m in {suffix} indicates that the base register value is modified.
An NDS32LoadStore optimization pass is added to generate multiple
load/store instructions
with the different suffixes.
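
To make the suffix rules concrete, here is a small sketch that builds the suffix letters and computes the base memory address for each mode. MultiMode, makeSuffix, and baseAddress are hypothetical names, and placing m after the b/a and i/d letters is an assumption about the mnemonic spelling.

```cpp
#include <cstdint>

struct MultiMode {
  bool After;       // 'a' (use [ra +/- 4]) instead of 'b' (use [ra])
  bool Decrement;   // 'd' instead of 'i'
  bool ModifyBase;  // trailing 'm'
};

struct Suffix {
  char Text[4];  // NUL-terminated, e.g. "ai" or "bim"
};

// ASSUMPTION: letter order is b/a, then i/d, then an optional m.
constexpr Suffix makeSuffix(MultiMode M) {
  return Suffix{{M.After ? 'a' : 'b', M.Decrement ? 'd' : 'i',
                 M.ModifyBase ? 'm' : '\0', '\0'}};
}

// Base memory address as described in the text: [ra] for 'b',
// [ra + 4] for 'a' with increment, [ra - 4] for 'a' with decrement.
constexpr int64_t baseAddress(int64_t Ra, MultiMode M) {
  return !M.After ? Ra : (M.Decrement ? Ra - 4 : Ra + 4);
}

// smw.ad starts at [ra - 4]; lmw.bi starts at [ra].
static_assert(baseAddress(0x1000, MultiMode{true, true, false}) == 0x0FFC, "");
static_assert(baseAddress(0x1000, MultiMode{false, false, false}) == 0x1000, "");
```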

Another unusual part is that instructions are always encoded in big-endian,
no matter whether the toolchain is big- or little-endian.
You can see that there is only a big-endian encoding in
MCTargetDesc/NDS32MCCodeEmitter.cpp: encodeInstruction->EmitInstruction.
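
As an illustration of what "always big-endian" means for the emitted bytes (this is not the code from NDS32MCCodeEmitter.cpp), the sketch below writes the most significant byte first for both instruction widths. The type and function names are hypothetical, and the instruction words used are arbitrary bit patterns rather than real NDS32 encodings.

```cpp
#include <cstdint>

struct Bytes4 { uint8_t B[4]; };
struct Bytes2 { uint8_t B[2]; };

// Emit a 32-bit instruction word most significant byte first,
// independent of the data endianness of the target.
constexpr Bytes4 emit32(uint32_t Insn) {
  return Bytes4{{static_cast<uint8_t>(Insn >> 24),
                 static_cast<uint8_t>(Insn >> 16),
                 static_cast<uint8_t>(Insn >> 8),
                 static_cast<uint8_t>(Insn)}};
}

// Same byte order for a 16-bit instruction.
constexpr Bytes2 emit16(uint16_t Insn) {
  return Bytes2{{static_cast<uint8_t>(Insn >> 8),
                 static_cast<uint8_t>(Insn)}};
}

// Arbitrary bit patterns, used only to show the byte order.
static_assert(emit32(0x12345678u).B[0] == 0x12, "MSB comes first");
static_assert(emit32(0x12345678u).B[3] == 0x78, "LSB comes last");
```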

Please let me know if there are other parts of the implementation that might be
unusual and that I didn't mention;
I'll try my best to explain in more detail.