Well, the stack pointer is a single byte, so pushing things on there
doesn't work terribly well.
Assuming I pass by reference, that's 128 values at the absolute most before
it wraps around and silently clobbers itself. It also means single byte
values will be incredibly inefficient... Tricky stuff.
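To make the wraparound concrete, here is a toy C model of a 256-byte
stack driven by an 8-bit pointer. The names (HwStack, hw_push and so on)
are made up for illustration, and starting the pointer at $FF is just the
usual convention, not something the hardware enforces.

```c
#include <stdint.h>

/* Toy model of the 6502 hardware stack: 256 bytes (page 1 on real
   hardware) addressed through an 8-bit pointer that wraps silently.
   All names here are hypothetical. */
typedef struct {
    uint8_t mem[256]; /* stands in for $0100-$01FF */
    uint8_t s;        /* 8-bit stack pointer, like the S register */
} HwStack;

static void hw_init(HwStack *st) {
    st->s = 0xFF;     /* assumed convention: start at the top of the page */
}

/* Push one byte: store at S, then decrement. uint8_t arithmetic wraps
   from 0x00 back to 0xFF, which is the silent wraparound in question. */
static void hw_push(HwStack *st, uint8_t v) {
    st->mem[st->s] = v;
    st->s = (uint8_t)(st->s - 1);
}

/* Push a 16-bit value (e.g. a reference), high byte first. */
static void hw_push16(HwStack *st, uint16_t v) {
    hw_push(st, (uint8_t)(v >> 8));
    hw_push(st, (uint8_t)(v & 0xFF));
}

/* The byte at the top of the page, where the very first push landed. */
static uint8_t hw_first_byte(const HwStack *st) {
    return st->mem[0xFF];
}
```

After 128 calls to hw_push16 the pointer is back where it started, and
push number 129 lands on top of the first value with no error of any kind.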
You absolutely don't want anything on the hardware stack except function
return addresses and possibly very short-lived temp storage, e.g. PHA
(push A); do something that destroys A; PLA (pull A). Or you could use a ZP
temp for that. STA ZP; LDA ZP is a cycle faster (6 vs 7), PHA/PLA is two
bytes smaller (2 vs 4) ... size usually wins.
The "C" local variables stack absolutely needs to live somewhere else, be
bigger, and use a pair of ZP locations as its stack pointer (SP). You
can't index off the hardware stack pointer directly, for a start (you'd
have to copy it into X with TSX first).
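A rough C model of that arrangement, in case it helps: a 16-bit software
stack pointer (which on the 6502 would sit in two adjacent ZP bytes) plus
byte offsets into the current frame. The start address $C000, the array
"ram" and the function names are all assumptions for the sketch.

```c
#include <stdint.h>

static uint8_t  ram[65536];   /* flat 64K address space */
static uint16_t sp = 0xC000;  /* software SP; the start address is arbitrary.
                                 On a 6502 this would be two ZP bytes. */

/* Reserve / release a frame of `size` bytes; the stack grows downward. */
static void frame_enter(uint8_t size) { sp = (uint16_t)(sp - size); }
static void frame_leave(uint8_t size) { sp = (uint16_t)(sp + size); }

/* Models  LDY #off ; LDA (SP),Y  */
static uint8_t local_load(uint8_t off) {
    return ram[(uint16_t)(sp + off)];
}

/* Models  LDY #off ; STA (SP),Y  */
static void local_store(uint8_t off, uint8_t v) {
    ram[(uint16_t)(sp + off)] = v;
}
```

Because the offset goes through Y, a frame can address at most 256 bytes
of locals without adjusting SP, but the stack as a whole can be as big as
you have RAM for.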
As mentioned before, you'd want to statically allocate as many local vars
as possible, as LDA $nnnn is a byte smaller and nearly twice as fast (4
cycles vs 7, or 8 if the indexing crosses a page) as LDY #nn; LDA (SP),Y.
(You'll sometimes be able to use INY or DEY instead of the load, or just
reuse the last value. But still...)
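For anyone who wants to check the arithmetic, the comparisons above can be
tallied with a few lines of C. The byte and cycle figures are the standard
NMOS 6502 ones (base cycles, so LDA (zp),Y costs one more when the
indexing crosses a page); the struct and table names are just made up for
this sketch.

```c
#include <stddef.h>

typedef struct { const char *text; int bytes; int cycles; } Insn;
typedef struct { int bytes; int cycles; } Cost;

/* Standard NMOS 6502 sizes and base cycle counts. */
static const Insn LDA_ABS   = {"LDA $nnnn",  3, 4};
static const Insn LDY_IMM   = {"LDY #nn",    2, 2};
static const Insn LDA_IND_Y = {"LDA (zp),Y", 2, 5}; /* +1 on page crossing */
static const Insn PHA       = {"PHA",        1, 3};
static const Insn PLA       = {"PLA",        1, 4};
static const Insn STA_ZP    = {"STA zp",     2, 3};
static const Insn LDA_ZP    = {"LDA zp",     2, 3};

/* The instruction sequences being compared. */
static const Insn *const STATIC_LOAD[] = {&LDA_ABS};
static const Insn *const STACK_LOAD[]  = {&LDY_IMM, &LDA_IND_Y};
static const Insn *const PUSH_PULL[]   = {&PHA, &PLA};
static const Insn *const ZP_TEMP[]     = {&STA_ZP, &LDA_ZP};

/* Sum the size and base cycle count of a sequence of n instructions. */
static Cost seq_cost(const Insn *const *seq, size_t n) {
    Cost c = {0, 0};
    for (size_t i = 0; i < n; i++) {
        c.bytes  += seq[i]->bytes;
        c.cycles += seq[i]->cycles;
    }
    return c;
}
```

seq_cost(STATIC_LOAD, 1) comes to 3 bytes and 4 cycles against 4 bytes and
7 cycles for seq_cost(STACK_LOAD, 2); PHA/PLA is 2 bytes and 7 cycles
against 4 bytes and 6 cycles for the ZP temp.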
With regard to code layout, ideally everything would get inlined since I
have gobs of memory compared to everything else. I wouldn't need to worry
as much about the stack as long as real values don't get stored there.
I actually think that the ideal 6502 compiler would output actual 6502
code (mostly) only for leaf functions, and everything else should be
compiled to some kind of byte code. The 6502 was a very very cheap chip to
build hardware wise, but the code is BULKY. Even when operating on 8 bit
values it's worse than, say, Thumb2, due to the lack of registers. On 16 or
32 bit values it's diabolical if everything is done inline.
Wozniak's "Sweet 16" is still not a terrible design for this, but I think a
bit more thought can come up with something better. The Sweet16 interpreter
is pretty small though (under 512 bytes I think?), which is pretty
The criteria for whether to use native code or bytecode for a given
function are pretty similar to the inlining decision. And a decent, compact
bytecode interpreter design could be reused on other 8 bit CPUs, some of
which are still in active use today and so even commercially important,
e.g. 8051, AVR, and PIC.
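To give a feel for what I mean, here is a very small 16-bit register
bytecode interpreter sketched in C. It borrows the general flavour of
Sweet16 (sixteen 16-bit registers, R0 as accumulator, register number in
the low nibble of the opcode) but the encoding below is invented for this
example and is NOT Sweet16's actual opcode map.

```c
#include <stdint.h>

/* Hypothetical opcodes: high nibble = operation, low nibble = register. */
enum { OP_RTN = 0x0, OP_SET = 0x1, OP_LD = 0x2, OP_ST = 0x3, OP_ADD = 0x4 };

typedef struct { uint16_t r[16]; } Vm;   /* r[0] doubles as accumulator */

/* Run bytecode starting at pc until OP_RTN (or an unknown opcode). */
static void vm_run(Vm *vm, const uint8_t *pc) {
    for (;;) {
        uint8_t b   = *pc++;
        uint8_t op  = (uint8_t)(b >> 4);
        uint8_t reg = (uint8_t)(b & 0x0F);
        switch (op) {
        case OP_SET:  /* Rn = 16-bit immediate, little-endian */
            vm->r[reg] = (uint16_t)(pc[0] | (pc[1] << 8));
            pc += 2;
            break;
        case OP_LD:   /* R0 = Rn */
            vm->r[0] = vm->r[reg];
            break;
        case OP_ST:   /* Rn = R0 */
            vm->r[reg] = vm->r[0];
            break;
        case OP_ADD:  /* R0 = R0 + Rn, wrapping at 16 bits */
            vm->r[0] = (uint16_t)(vm->r[0] + vm->r[reg]);
            break;
        default:      /* OP_RTN or anything unrecognised: stop */
            return;
        }
    }
}

/* R1 = $1234; R2 = $0FFF; R3 = R1 + R2 */
static const uint8_t demo_prog[] = {
    0x11, 0x34, 0x12,   /* SET R1, $1234 */
    0x12, 0xFF, 0x0F,   /* SET R2, $0FFF */
    0x21,               /* LD  R1        */
    0x42,               /* ADD R2        */
    0x33,               /* ST  R3        */
    0x00                /* RTN           */
};
```

Running demo_prog on a zeroed Vm leaves $2233 in R3. The load/add/store
part is 3 bytes of bytecode; the same 16-bit add done inline on zero-page
variables is CLC plus six two-byte instructions, i.e. 13 bytes. And since
the interpreter core is just a fetch and a switch, porting it to another
8 bit CPU means rewriting the interpreter, not the compiled code.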
Erm .. are we boring the rest of llvmdev yet?