Backend implementation for an architecture with only majority operation instruction

Hello everyone,

I was trying to create an LLVM backend for a processor with a very simple architecture and that does all instructions like load, store, arithmetic and logical instructions using a bunch of majority functions. The processor has only one instruction(majority function) in its ISA and breaks down all other instructions into a number of majority instructions depending on what instruction it is. All the instructions have different combinations of majority operations. Is there any way to implement this without creating a new Selection DAG node for the majority operation?

I was thinking of creation of a new Selection DAG node and mapping all the other instructions like loads, stores as pseudo instructions and breaking them up. Can someone please help me with this?

Thanks!

Sreejita

I’m having a hard time grasping what this ISA actually looks like.

When you say that it has a single instruction that is a majority function, I assume something like this:

MAJ rDst ← rSrc0, rSrc1, rSrc2
Semantics:
for (int i = 0; i < REGISTER_WIDTH; i++) {
rDst[i] = maj(rSrc0[i], rSrc1[i], rSrc2[i]);
}
Where maj(a, b, c) = (a & b) | (a & c) | (b & c)

But that doesn’t make sense given your question.

MAJ is a bitwise operation, so how do you implement arithmetic instructions with it? You would need at least one other instruction (such as a bit shift) to establish dependency chains between bits.

Also, how do you decompose load/store into majority functions? It’s not even clear to me what that would actually mean. You need to access the memory/IO bus somehow, and if your only instruction only reads/writes to registers, the only way to do that would be to have special registers that interface to the IO bus?

– Sean Silva

Hey Sean,
So the processor does in-memory computing, it reads instructions and operands from the memory array, performs the majority operations within the memory array itself.
It does instructions using resistive majority which is AB’+B’C+AC
Like it does AND operation as
1: 0, 1, @C; //C=0
2: 0, 1, @Binv; //Binv=0
3: 1, @B, @Binv; //Binv=B
4: @A, @Binv, @C; //C=A.B

where each operation is a resistive majority and operations are directly performed on the storage of C. It reads @A in a register , @B , reads A and B and directly writes into the memory @C. There are shift operators as well that are also performed in a similar way, loads, stores are also performed like this. So I am trying to define this resistive majority instruction in my ISA.

Thanks!
-Sreejita

If your architecture is alien enough (and so far this sounds pretty alien to me) it may make sense to just create a custom backend that goes from the IR representation to whatever machine format/assembly you are using.
While CodeGen gives you a lot of infrastructure that fits "normal" CPUs well (ABI lowering, instruction selection, register allocation, scheduling, object file formats, debug info). On the other hand if your target is a hypothetical CPU (just making guesses from what little you wrote) you may not need all that:
- Register allocation usually only makes sense if you have a bounded number of registers or special constraints.
- Scheduling usually only makes sense if you have a machine model with different latencies or want to optimize register pressure to minimize spill reloads (for bounded number of regs).
- If your object file format isn't ELF/macho/coff then you won't gain much from the MC infrastructure and debug generation infrastructure.

If you don't need most of these things then you are probably better off manually implementing a TargetMachine without using CodeGen (instead of implementing an LLVMTargetMachine with all the CodeGen utilities).

- Matthias

Hey Can you refer me to some documentation on manually creating a backend without codegen?

I don't think there is much documentation. In general it should work by implementing the TargetMachine interface (but not LLVMTargetMachine) and using RegisterTargetMachine to get the target into the target registry, possibly also some patches to Triple if you want it to get picked up based on the target triple and not just by -target.

This is from looking around the existing headers, I don't have actually done this myself so I may be missing some things.

- Matthias

I guess this is relevant here: https://en.wikipedia.org/wiki/One_instruction_set_computer

Hey Sean,
So the processor does in-memory computing, it reads instructions and
operands from the memory array, performs the majority operations within the
memory array itself.
It does instructions using resistive majority which is AB'+B'C+AC
Like it does AND operation as
1: 0, 1, @C; //C=0
2: 0, 1, @Binv; //Binv=0
3: 1, @B, @Binv; //Binv=B
4: @A, @Binv, @C; //C=A.B
where each operation is a resistive majority and operations are directly
performed on the storage of C. It reads @A in a register , @B , reads A
and B and directly writes into the memory @C. There are shift operators as
well that are also performed in a similar way, loads, stores are also
performed like this.

Can you show an example of the load/store and add/shift? For example, how
does your ISA represent this function?

void add(int *dst, int *src0, int *src1) {
  *dst = *src0 + *src1;
}

-- Sean Silva

Hey,

So for doing such pointer arithmetic it doesn’t load the values as the controller has no execution unit as such it does al computation in the memory itself.

So for such a function it would do just an add which looks like this :
ADD @A, @B=
XOR @A,@B,@T; / / T = A xor B
AND @A,@B,@R; / / R = A · B
iRM 3w 0 , 1 ,@CI ; / / CI = 0
iRM 3w 0 , 1 ,@U; / / U = 0
iRM 3w 0 , 1 ,@V; / / V = 0
iRM 3w 1 ,@T,@U; / / U = T

iRM 3 @CI+0 ,@U+0 ,@V+ 1 ; / / V=[0, (c 0 · t 0 ), . . . , 0]
iRM 3 @R+ 0 , 0 , CI + 1 ; / / CI=[0, r 0 , . . . , 0]
iRM 3 @V+1,0,@CI+1; // CI=[0, (r 0 + (c 0 · t 0 )), . . . ]

.
iRM 3 @CI+62 @U+62 @V+ 6 3 ;
iRM 3 @R+ 6 2 , 0 ,@CI ;
iRM 3 @V+ 6 3 , 0 ,@CI+ 6 3 ;
XOR @T, @CI ,@A; / / A = A xor B xor CI

It doesnt print out the AND and XOR instructions , it performs them is a similar fashion, you can look up the AND I have written earlier,

The load is done:
iRM 3w 0 , 1 , Accum ;
L i + n : iRM 3w A, 1 , Accum ;

where the value of A is known at compile time from @A thats signified. IRM 3w is basically a bitwise resistive majority done on an entire word(iRM 3) . Resistive majority is done as majority of A,B’,C

It’s quite complex

How do you think the instruction definition would work for such an architecture.

Thanks

Hey,
So for doing such pointer arithmetic it doesn't load the values as the
controller has no execution unit as such it does al computation in the
memory itself.

So for such a function it would do just an add which looks like this :

ADD @A, @B=
XOR @A,@B,@T; / / T = A xor B
AND @A,@B,@R; / / R = A · B
iRM 3w 0 , 1 ,@CI ; / / CI = 0
iRM 3w 0 , 1 ,@U; / / U = 0
iRM 3w 0 , 1 ,@V; / / V = 0
iRM 3w 1 ,@T,@U; / / U = T

iRM 3 @CI+0 ,@U+0 ,@V+ 1 ; / / V=[0, (c 0 · t 0 ), . . . , 0]
iRM 3 @R+ 0 , 0 , CI + 1 ; / / CI=[0, r 0 , . . . , 0]
iRM 3 @V+1,0,@CI+1; // CI=[0, (r 0 + (c 0 · t 0 )), . . . ]
..
.
iRM 3 @CI+62 @U+62 @V+ 6 3 ;
iRM 3 @R+ 6 2 , 0 ,@CI ;
iRM 3 @V+ 6 3 , 0 ,@CI+ 6 3 ;
XOR @T, @CI ,@A; / / A = A xor B xor CI

It doesnt print out the AND and XOR instructions , it performs them is a
similar fashion, you can look up the AND I have written earlier,

The load is done:
            iRM 3w 0 , 1 , Accum ;
L i + n : iRM 3w A, 1 , Accum ;

where the value of A is known at compile time from @A thats signified.

So if I understand correctly, this ISA can only operate on statically known
addresses in the memory array, correct? E.g. you could not walk a linked
list.

IRM 3w is basically a bitwise resistive majority done on an entire
word(iRM 3) .

To make sure I understand correctly, iRM 3w is equivalent to a sequence of
64 operations:

iRM 3w @A, @B, @C :=

iRM 3 @A+0, @B+0, @C+0
iRM 3 @A+1, @B+1, @C+1
...
iRM 3 @A+63, @B+63, @C+63

Is that correct?

Resistive majority is done as majority of A,B',C
It's quite complex
How do you think the instruction definition would work for such an
architecture.

You probably want to expand each input operation into equivalent
instructions for your isa, as you have done in the example above. That will
produce simple but correct code. You may want to write a follow-up peephole
pass to do algebraic simplifications using identities of the resistive
majority function.

You may want to watch 2016 LLVM Developers’ Meeting: A. Bougacha & Q. Colombet & T. Northover “Global Instr.." - YouTube
You'll probably be able to identify places where the simplicity of your ISA
makes certain parts trivial. E.g. in your case, there is only one "register
bank" so that whole aspect of the lowering is trivialized. I haven't been
keeping up to date with GlobalISel, so you might be better off using
SelectionDAG. I'll let someone more familiar with the backend comment on
that, but your target seems simple enough that even writing the backend
from scratch (directly implementing TargetMachine like Matthias suggested)
would be feasible. Hooking into LLVM's backend infrastructure seems like it
will mostly just save you from writing a register allocator, so it might be
worth doing just for that.

-- Sean Silva