[RFC] Interprocedural MIR-level outlining pass

Many years ago, Green Hills Software implemented CodeFactor(tm), an "outliner" like yours. BTW, I think "factor" is a good name to describe the new out-of-line code sequence, including the return instruction(s).

CodeFactor also used a Suffix Tree. Later evaluation showed it used a LOT of memory.

I think a better data structure and algorithm might be to create a fixed-size record for each instruction, containing the PC followed by some number of bytes of instructions (enough bytes to compute the break-even point; 12 or 16 bytes for typical RISC architectures). Now sort the records on the instruction bytes. This brings together all records with the same prefix. The point is that the sort can use external storage, and the sorted file can be processed sequentially, a little at a time. You still need random access to the program, for example to check how far a match extends beyond the bytes stored in the record.
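To make the idea concrete, here is a rough sketch in Python. This is not how CodeFactor worked; it only illustrates the sort-based scheme described above. Everything in it is an assumption for illustration: a fixed 4-byte instruction width, a 16-byte window, the PC taken as a byte offset into the code, and the hypothetical names `build_records`, `find_candidates` and `common_length`. The in-memory `sort()` stands in for an external sort.

```python
from itertools import groupby

INSN_SIZE = 4   # assumed fixed instruction width (typical RISC)
WINDOW = 16     # assumed number of instruction bytes kept per record


def build_records(code: bytes):
    """One fixed-size record per instruction: (window bytes, PC).

    The PC here is simply the byte offset into `code`. Instructions too
    close to the end to fill a whole window are skipped.
    """
    return [(code[pc:pc + WINDOW], pc)
            for pc in range(0, len(code) - WINDOW + 1, INSN_SIZE)]


def find_candidates(code: bytes):
    """Return [(window, [pc, ...])] for every window seen more than once."""
    records = build_records(code)
    records.sort()  # stand-in for an external (on-disk) sort
    candidates = []
    # After sorting, equal windows are adjacent, so a single sequential
    # pass is enough to group them.
    for window, group in groupby(records, key=lambda rec: rec[0]):
        pcs = [pc for _, pc in group]
        if len(pcs) > 1:
            candidates.append((window, pcs))
    return candidates


def common_length(code: bytes, pc_a: int, pc_b: int) -> int:
    """Bytes of whole instructions shared by the sequences at two PCs.

    This is the step that needs random access to the program. It does
    not guard against the two sequences overlapping each other.
    """
    n = 0
    while (pc_a + n + INSN_SIZE <= len(code)
           and pc_b + n + INSN_SIZE <= len(code)
           and code[pc_a + n:pc_a + n + INSN_SIZE]
           == code[pc_b + n:pc_b + n + INSN_SIZE]):
        n += INSN_SIZE
    return n
```

A real pass would also have to discard overlapping occurrences and apply the break-even test to each group before creating a factor; the sketch leaves both out.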

Craig Franklin