Stack implementation

I’m translating the source of stack-based virtual machine into LLVM IR and my plan is to implement the stack in LLVM IR (using alloca/load/store) in order to emulate the VM’s stack and then use the optimization phase “mem2reg”. Therefore I’m going to have a stack pointer that points to the top of my stack. I’m curious whether I will have to implement that in a linked-list fashion, i.e. each stack node would contain the pointer to the previous node as well as the current value of the top of the stack or maybe alloca instruction works in a way that the subsequent calls to it allocates the contiguous memory area and I could get rid of the “pointer to previous thing” and only count the number of calls to alloca instruction.

Another thing is that there might be a better (less expensive or easier) way of doing the whole stack → SSA transformation which I didn’t came up with, but someone else did?

Kind regards

Piotr Kaleta <piotrek.kaleta@gmail.com> writes:

I'm translating the source of stack-based virtual machine into LLVM IR and
my plan is to implement the stack in LLVM IR (using alloca/load/store) in
order to emulate the VM's stack and then use the optimization phase
"mem2reg". Therefore I'm going to have a stack pointer that points to the
top of my stack. I'm curious whether I will have to implement that in a
linked-list fashion, i.e. each stack node would contain the pointer to the
previous node as well as the current value of the top of the stack or maybe
alloca instruction works in a way that the subsequent calls to it allocates
the contiguous memory area and I could get rid of the "pointer to previous
thing" and only count the number of calls to alloca instruction.

Another thing is that there might be a better (less expensive or easier) way
of doing the whole stack -> SSA transformation which I didn't came up with,
but someone else did?

My compiler also emits instructions for a virtual stack-based
machine. The LLVM backend is implemented as a translator from those
instructions to LLVM IR.

In principle, creating an alloca for every value that is pushed into the
stack would work: when the VM should push a value to the stack, you
create the alloca and store the value into it; then you load the alloca
when the virtual machine should pop the value from the stack. You don't
need linked lists, just a stack (!) data structure that emulates the
state of the VM, where you push and pop alloca's as the VM does with the
real values. In essence, your translator would work like an interpreter
over the VM instructions that, instead of working with real data, works
with the alloca's where the real values will be stored when the LLVM
code is executed. Apart from that, you don't need to care about the
alloca's anymore.

The above method should work but has the inconvenience of generating
lots of alloca/store/load instructions. That's is fine if you don't care
too much about compilation time or readability of the unoptimized LLVM
IR code. My translator uses general LLVM values instead of just allocas,
so code like this:

that would translate to this with the alloca method:

(b, c and d are values already on the stack)

# pop the alloca for b
# pop the alloca for c
# generate LLVM load (for b)
# generate LLVM load (for c)
# generate LLVM instruction for b + c
# create LLVM alloca for result (let's call it T)
# push the alloca
# generate the LLVM store for T into the alloca
# pop the alloca for T
# pop the alloca for d
# generate LLVM load (for T)
# generate LLVM load (for d)
# generate LLVM instruction for T * d
# create LLVM alloca for result (let's call it U)
# push the alloca
# generate the LLVM store for U into the alloca
# pop
# generate LLVM load for U
# generate LLVM store for U into memory reserved for variable `a'

And with the general LLVM instruction method:

# pop (for b)
# pop (for c)
# generate LLVM instruction for (b + c)
# push the resulting LLVM instruction
# pop (for the previous result)
# pop (for d)
# generate LLVM instruction for T * d
# generate LLVM instruction for storing the previous result into the
  memory reserver for variable `a'

If any `b', `c', `d' are variables, you need to detect that and generate
the corresponding `load'. Note that `push' and `pop' are operations on
the stack of your translator, not LLVM instructions.

The key here is to realize that any LLVM instruction that yields a value
(imul, add, etc) *is* a LLVM Value object, the same as the result of a
`load' instruction.

HTH.

Hi Oscar,

Thank you for your reply. Unfortunately my stack instruction set contains instructions, that it’s impossible to predict how much stack space would they require (Unwind/Eval instructions from G-Machine code). Therefore I doubt that I would be able to implement it using compile-time stack only (and I believe that this is what you’re suggesting). I think I have to implement some sort of run-time stack and my question was what is the best way to do this. Now I know that subsequent alloca’s doesn’t allocate contiguous area of memory on the stack, so I believe that the only way to go would be to allocate stack items consisting of 2 elements:

  • the pointer to the previous stack cell
  • actual value of the cell

Do you see any better way of doing that?

Kind regards