Hi Ali,
Thanks for bringing this up. You're definitely under very tight design constraints from the hardware. I can certainly sympathize.
I think two design elements are being conflated here, and it would be worthwhile splitting them out. For correctness, you need to make sure any routines called from an ISR don't clobber equivalent routines called from mainline code. For efficiency, you want to overlay the static stack frames of functions as much as possible when you can prove those frames do not interfere. I believe you can solve these problems orthogonally with a bit of fiddling.
Parallel call trees for the ISR and the mainline have the following primary components:
* Constructing the parallel trees themselves such that there's no overlap
* Keeping things straight in the presence of user-written assembly
* Indirect calls (function pointers)
To create parallel call trees straightforwardly, one for the mainline and one for the ISR, I believe it will be best to create two versions of every compiled function very early in the process, before any lowering of frames to static addresses or anything of that sort is done. The ISR-tree functions should be flagged as such and have their names mangled to reflect that. Any call instruction in an ISR-tree function should be modified to call the equivalent ISR-tree version of the callee. Thus, with no further analysis, we have disjoint call trees for mainline code and ISR code. We also have a bunch of extra functions lying around, but they are unreachable, so existing optimizations should get rid of them for us.
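To make the cloning step concrete, here is a minimal sketch on a toy call graph. Everything in it is made up for illustration: struct func, struct module, the "$isr" suffix, and the fixed array sizes are stand-ins, not what the real compiler data structures would look like.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_FUNCS 16
#define MAX_CALLS 4

/* Hypothetical, simplified stand-in for a compiler's function record. */
struct func {
    char name[32];
    int is_isr_tree;          /* set on the ISR-tree clone */
    int callees[MAX_CALLS];   /* indices into module.funcs */
    int ncallees;
};

struct module {
    struct func funcs[MAX_FUNCS];
    int nfuncs;
};

/* Clone every function. The clone of funcs[i] lands at index n + i, is
 * flagged, gets a mangled name (assumed suffix "$isr"), and has each call
 * retargeted at the clone of its callee. Returns 0 on success, -1 if the
 * module or a mangled name does not fit. */
static int clone_for_isr(struct module *m)
{
    int n = m->nfuncs;
    if (2 * n > MAX_FUNCS)
        return -1;
    for (int i = 0; i < n; i++) {
        struct func *src = &m->funcs[i];
        struct func *dst = &m->funcs[n + i];
        *dst = *src;
        dst->is_isr_tree = 1;
        int len = snprintf(dst->name, sizeof dst->name, "%s$isr", src->name);
        if (len < 0 || len >= (int)sizeof dst->name)
            return -1;
        for (int c = 0; c < dst->ncallees; c++)
            dst->callees[c] += n;
    }
    m->nfuncs = 2 * n;
    return 0;
}

/* Mark everything reachable from root; anything left unmarked after
 * visiting both roots is a clone nobody needs. */
static void mark_reachable(const struct module *m, int root, int *live)
{
    assert(root >= 0 && root < m->nfuncs);
    if (live[root])
        return;
    live[root] = 1;
    for (int c = 0; c < m->funcs[root].ncallees; c++)
        mark_reachable(m, m->funcs[root].callees[c], live);
}
```

Marking from the mainline entry point and from the ISR-tree clone of the interrupt handler leaves exactly the copies each tree needs; the mainline copy of the handler and the ISR copy of main end up unmarked.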
When it comes time to allocate static memory space for the call frames, we simply have to do it twice: once for the mainline functions and once for the ISR functions. We can assert that nothing overlaps between the trees, since we know the trees are disjoint except via the ISR itself. Thus, we don't overlay anything from one tree with anything from the other tree. Varargs should come along for the ride without any special handling, since the functions are distinct well before varargs are lowered to reference static buffers.
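As a rough sketch of "do it twice", the same placement routine can be run once per tree, with the ISR tree starting where the mainline tree ends. The struct fn layout, the names, and the overlay rule used here (a frame sits just past the deepest caller that can be live when it runs) are my assumptions for illustration, and the routine only makes sense for call graphs without recursion.

```c
#include <assert.h>

#define MAX_CALLS 4

/* Hypothetical per-function record; each tree gets its own array. */
struct fn {
    unsigned frame_size;
    unsigned frame_addr;
    int placed;
    int callees[MAX_CALLS];
    int ncallees;
};

/* Place a frame at base unless an earlier path already pushed it at least
 * that far out, then place the callees just past it. Siblings that never
 * call each other end up sharing addresses. Assumes no recursion. */
static void place(struct fn *fns, int idx, unsigned base)
{
    struct fn *f = &fns[idx];
    assert(f->ncallees <= MAX_CALLS);
    if (f->placed && f->frame_addr >= base)
        return;
    f->placed = 1;
    f->frame_addr = base;
    for (int c = 0; c < f->ncallees; c++)
        place(fns, f->callees[c], base + f->frame_size);
}

/* Allocate one tree starting at base; returns the first free address
 * past every frame in that tree. */
static unsigned allocate_tree(struct fn *fns, int nfns, int root, unsigned base)
{
    unsigned end = base;
    place(fns, root, base);
    for (int i = 0; i < nfns; i++) {
        unsigned frame_end = fns[i].frame_addr + fns[i].frame_size;
        if (fns[i].placed && frame_end > end)
            end = frame_end;
    }
    return end;
}
```

Calling allocate_tree on the mainline array first and passing its return value as the base for the ISR array guarantees the two regions never share an address, whatever overlaying happens inside each tree.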
User assembly throws a big wrench into the works, but it's not irreconcilable. I suggest requiring the user to flag assembly functions as being either for mainline or for ISR code. Doing so via a naming convention would likely tie in most easily with the function cloning described above. With the convention being sufficiently magic (read: not legal C identifiers, so as to avoid collisions), the assembler can issue a diagnostic whenever it sees a call instruction that references a function not in the proper tree. That is, the disjointness of the trees will be enforceable even in user assembly with a bit of cooperation from the assembler.
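The assembler-side check could be as small as the following sketch. The "$isr" suffix and both function names are hypothetical; any marker that isn't legal in a C identifier would do the same job.

```c
#include <string.h>

/* Assumed marker for ISR-tree symbols. Standard C identifiers cannot
 * contain '$', so user C code cannot collide with it. */
#define ISR_SUFFIX "$isr"

/* Returns 1 if the symbol name carries the ISR-tree marker, else 0. */
static int in_isr_tree(const char *sym)
{
    size_t n = strlen(sym);
    size_t k = strlen(ISR_SUFFIX);
    return n > k && strcmp(sym + n - k, ISR_SUFFIX) == 0;
}

/* Returns 1 if a call from caller to callee stays inside one tree.
 * The assembler would issue a diagnostic when this returns 0. */
static int call_stays_in_tree(const char *caller, const char *callee)
{
    return in_isr_tree(caller) == in_isr_tree(callee);
}
```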
Function pointers are where things get fun. To do these, we need to determine at run time whether we need to call the ISR or the mainline version of a function, and since the same pointer can be dereferenced from both trees, and we're not guaranteed the pointer assignment will take place in the same tree, we can't rely on static analysis to figure out which of the two versions to reference. We need to defer 'til run time, and do so in a way that's not too horribly inefficient. We would likewise prefer not to have function pointers become unmanageably large, since they already essentially need to carry around a pointer to the function itself and a pointer to the argument frame for that function. Doubling that is not reasonable. If we use descriptor handles, however, we can reduce the load of what we're passing around at runtime to a single value. We build a list of functions whose addresses are taken, and create a descriptor table for them. Eventually, the descriptor handle will be relocated to the actual address of the proper half of the descriptor (since the invocation point knows at compile time which tree it's in).
Conceptually, something like this. Depending on what level of program memory access you have (sorry, hazy memory about what the enhanced PIC16 was or wasn't going to do about that), a trickier actual implementation may be better.
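    struct fptr_descriptor {
        void (*main_tree_fn)();   /* mainline copy of the function */
        void *main_tree_args;     /* static argument frame of the mainline copy */
        void (*isr_tree_fn)();    /* ISR-tree copy of the function */
        void *isr_tree_args;      /* static argument frame of the ISR-tree copy */
    };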
Each translation unit builds up a table of these, with function pointers being a relocation indicating the appropriate slot in the table. For global unreachable function removal to work, we just need to consider all functions in this table as being called (both versions). Good analysis should enable even better results, but the simple solution should still be pretty good.
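Here is a hedged sketch of how a call through such a table might look when written out in plain C. The blink functions, their one-int argument frames, and the two call helpers are invented for the example; in the real thing the choice of half would be baked into each call site by the compiler rather than made by a helper function.

```c
struct fptr_descriptor {
    void (*main_tree_fn)(void);
    void *main_tree_args;
    void (*isr_tree_fn)(void);
    void *isr_tree_args;
};

/* Two copies of one hypothetical function, each with its own static
 * argument frame, so an interrupt cannot clobber a mainline call. */
static int blink_main_frame[1];
static int blink_isr_frame[1];
static int main_total;
static int isr_total;

static void blink_main(void) { main_total += blink_main_frame[0]; }
static void blink_isr(void)  { isr_total += blink_isr_frame[0]; }

/* One slot per function whose address is taken. */
static const struct fptr_descriptor fptr_table[] = {
    { blink_main, blink_main_frame, blink_isr, blink_isr_frame },
};

/* The run-time "function pointer" is just a handle to a table slot. */
typedef const struct fptr_descriptor *fptr_handle;

/* A mainline call site stores the argument in the mainline frame and
 * uses the mainline half of the descriptor. */
static void call_from_mainline(fptr_handle h, int arg)
{
    *(int *)h->main_tree_args = arg;
    h->main_tree_fn();
}

/* An ISR-tree call site does the same with the ISR half. */
static void call_from_isr(fptr_handle h, int arg)
{
    *(int *)h->isr_tree_args = arg;
    h->isr_tree_fn();
}
```

The same handle value can be passed around freely and dereferenced from either tree, and only a single value travels at run time.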
All of this will work without any consideration of optimizing call frame allocation. That can be done completely separately without interfering (sorry for the pun) with any of these bits, including the function pointers, since we've kept the complete separation of ISR and mainline trees.
To summarize, make two copies of each function very early and keep the ISR and mainline trees separate all the way through. Unreachable function removal will get rid of any copies that aren't needed. Function pointers become handles into a descriptor table that references both copies, and the runtime invokes the proper one based on which tree the call point is in.
Best regards,
Jim