Writing a backend for a very limited CPU

Hi all,
I’ve build a pretty limited 8-bit CPU from 74xx chips (Discrete logic IC CPU | Ivan's blog) which I want to port llvm to. My CPU has in total four 8-bit registers, two of which are used as a pair for memory addressing and jumps. It has no hardware stack right now, but I’m planning to add one. Due to the limitations, it’s quite tricky to program it. For example, adding two 16-bit numbers from memory can be done like that:

ldi ph, hi(var1) ; high byte of the address
ldi pl, lo(var1) ; low byte of the address
ld b             ; register B is loaded from memory at PH:PL, low byte of var1
ldi ph, hi(var2)
ldi pl, lo(var2)
ld a             ; A is now the low byte of var2
add b, a         ; B is the sum of low bytes, carry flag is set
ldi pl, lo(var2 + 1)
ld a             ; A is the high byte of var2
ldi ph, hi(var1 + 1)
ldi pl, lo(var1 + 1)
ld pl            ; PL is the high byte of var2
adc a, pl        ; A is sum of high bytes and carry
ldi ph, hi(result)
ldi pl, lo(result)
st b             ; low byte is stored
inc pl
st a             ; high byte is stored

As you can see, you have to juggle the registers and really make use of them. You can’t just dedicate PH and PL for addresses and use B and A for arithmetics, it will make code really inefficient (and sometimes even make some things impossible to implement).

I have already got some basic understanding about llvm backend architecture. I started porting and, to be honest, I’m completely overwhelmed with the complexity and that’s why I ask your advice.

My idea with llvm is to make some kind of virtual machine on top of my processor, which will have some number of “registers” of various width in memory and will be able to perform arbitrary arithmetics between them, each operation unfolding into bunch of real assembly. Another idea is to have no registers at all exposed to the code generator and make it perform all calculations on stack (which is going to be a floating mapping from one memory region to another; SP-relative addressing will be just an access by a constant address). Both ways are actually the same because in both cases computations will be performed between constant locations in memory, but they are different from the code generator point of view.

Considering all of this, I have following questions:

  1. Is it even realistic to port LLVM on that?
  2. What’s the best way to make virtual instructions and unfold them into real machine code?
  3. Which approach - based on virtual registers or based on stack is easier to implement on llvm?
  4. Maybe I’m overseeing something and there’s a better approach?

Thanks in advance for your replies.

This doesn’t answer your questions directly but there is a 6502 backend for llvm and there was a talk about it recently.

The limited registers and hardware stack seem to line up. That could at least give you an idea of what is possible overall, even if the details are still difficult to work out.

1 Like

Thank you, David! This looks surprisingly very close to what I want to do. They have even gone stackless for non-recursive cases. I did the same in my small compiler.