TableGen operand question

Hi,

As a way to learn LLVM, I’m trying to write a backend for the Microchip PIC18 8-bit microcontroller.
On this device, the hardware stack is very small and is used only for storing function return addresses.
A “real” software stack implementation is not very efficient because of the limited instruction set, so for all the non-reentrant functions I’ve decided to use a similar solution to the one used by Microchip’s own XC8 compiler, which is to simulate the stack by assigning constant memory locations to every function.
Local variables/parameters could then be accessed by using constant memory addresses. Since this microcontroller can perform most instruction directly on such addresses, this seems the most efficient way to handle the stack.

The issue I’m facing right now is how to resolve the stack addresses in the LLVM backend.
For instance, when calling a function, I’d like to store its parameters directly in the memory area assigned to this function.
If I understand correctly, the exact stack size (and thus the memory needed by each function) cannot be determined before the epilogue/prologue insertion pass.This means that the target memory location is not known before this step.

My idea is to create an operand that would hold an identifier of the target function and the memory offset, and replace this identifier with a real memory address in a pass after emit prologue/epilogue.
However, I’m not sure how to achieve such thing in TableGen, and whether it’s at all possible.
I would be grateful for any tips/directions

Thanks,
Dawid Retkiewicz

That’s going to be a really challenging architecture to target from LLVM, and you’ll be fighting LLVM assumptions all the way, It would be a challenge for an LLVM expert, let alone an LLVM beginner.

Of course, if you succeed then you will be an expert :slight_smile:

Converting functions to use statically-allocated memory instead of stack memory would be a cool feature to have anywhere, and would also be almost essential on other stack-unfriendly ISAs such as 6502.

To do it properly, you’d want to have functions that can never be part of same call chain (i.e. never active at the same time) share memory locations. I guess the way to do that (once you’ve worked out the frame size for each function) would be to walk the complete program call tree and give each function’s frame the next addresses after its deepest caller.

And then there’s recursion. If a function calls itself, or a function up towards the root of the call tree, then you’ll have to save and restore all the function frames between to somewhere else.

Re-entrancy (e.g. things called from interrupt handlers) is even nastier.

Hi Dawid,

Oh, wow. Flashback time.

PIC18 is unfortunately not a good candidate for an LLVM backend. The separation of program and data memory in the ISA, accumulator based design, and banked data memory are the heavy hitters, but there’s plenty more once you get further. Efficient use of access memory vs. banked memory can get “fun,” for example.

The approach you’re describing for static activation records is a good start. You don’t want to try to assign explicit memory addresses at compile time, though. Do that in the linker. Have the compiler create a symbol for the activation record, including function arguments, that callers know how to reference (no need to get fancy. use something like __Afoo for function foo or something along those lines). Then have a metadata section that records for the linker which other functions get referenced by that function (or have the linker reconstruct via relocations). Use that to build a call graph at link time and allocate the locations for those activation records so that the memory for functions which can ever be live at the same time don’t overlap. Caveats: that means no recursion and no function pointers and no variable length argument lists. Getting bank selection done efficiently for referencing the activation records is hard (for globals in general, really, but extra important for locals). Also means that no function called from the main program can be called, directly or indirectly, from an interrupt function. All of that is solvable. None of it is trivial, and very tricky corner cases abound. All that being said, if you’re going to push forwards, I strongly recommend just doing a software stack to start with. That’s what the various INC/DEC addressing modes on the INDF<n> registers were added for (well, that and better array walking for stuff like memcpy()).

Now, as an alternative, the dsPIC, PIC24 and PIC30 micros are much more amenable to an LLVM backend. There’s still some really crazy stuff to deal with (24-bit wide program memory is happy times), but it’s a much more compiler friendly instruction set and deliberately so.

-Jim

Hi Dawid,

I've had to do something similar for a different backend, whereby I use
memory locations as a substitute for registers in something rather
register-contrained, and I hope this may be relevant to the issue you're
trying to solve.

How I would tackle this is to create a register class full of pseudo
registers where each one represents one memory location, and do
instruction selection and register allocation on that. Once that is
complete, I would have a pass which goes through each instruction in a
MachineFunction and replace the operands with the reference to memory
and emit that. In my case these pseudo registers lived on a stack, so
the replacement wasn't too difficult, I don't think replacing registers
with symbols would be any harder than that. Once complete it's just a
matter of having your assembler and linker handle these references
(though you should be able to generate some assembly directives to help
here also). Note may have to store some data in your targets
MachineFunctionInfo class to help keep track of the real size of a
frame, as asking a MachineFunction will now give the wrong result.

One consideration would be the size of this register class (that is, the
number of pseudo registers you create). If you make it too large and you
may end up wasting memory, though I'm not sure exactly what the register
allocations will do in this case if given far too many registers for a
given problem. If you make it too small, you'll have to start thinking
about how to spill memory locations to other memory locations (or
rewriting generated code post-regalloc to avoid this).

Hope this helps.

Thanks,
Simon

In your scheme, do you reduce the required stack storage to match the
number of pseudo-registers actually used?

The register allocator currently tries to use the minimum number of
registers, which in this case is exactly what you want. There was some
recent discussion on adding the ability to try to use the maximum
number of architectural registers in order to reduce false
dependencies http://lists.llvm.org/pipermail/llvm-dev/2016-May/099494.html

Alex