We are getting ready to implement several heuristics for correctly using LEAs to avoid stalls in the address generator on Atom. Our plan is to:
1. Disable LEA generation on Atom in X86ISelDAGToDAG::SelectLEAAddr() for all but a few pseudo-instructions.
2. Identify loads and stores in an X86PassConfig::addPreEmitPass() pass and examine several preceding instructions to determine whether an add, subtract, or mov can profitably be turned into an LEA.
The heuristics for using LEAs efficiently must know how many cycles pass between the generation of an address and its use. This requires LEAs to be added after scheduling and register allocation, when the final instruction order is known. Also, LEAs should not be used for math operations whose results are not used as addresses, due to a 3-cycle stall between the execution stage and the address generator.
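To make the intended scan concrete, here is a rough, self-contained sketch of the backward search the pre-emit pass might perform. It is not taken from the attached patch and does not use any LLVM classes: Opcode, Inst, kAGUStallWindow, isLEACandidate() and findLEACandidate() are all hypothetical names, each instruction's cycle cost is supplied by hand, and the rule "convert only if fewer than 3 cycles separate the definition from the memory access" is an assumption derived from the 3-cycle stall mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-ins for machine instructions. None of these are LLVM types.
enum class Opcode { Add, Sub, Mov, Lea, Load, Store, Other };

struct Inst {
  Opcode op;
  int defReg;   // register written by this instruction, or -1 if none
  int addrReg;  // base register used to form a memory address, or -1
  int cycles;   // assumed issue cost in cycles (hand-supplied here)
};

// Assumption: a definition closer than this many cycles to its use as an
// address would stall the address generator unless it is an LEA.
constexpr int kAGUStallWindow = 3;

bool isMemoryAccess(const Inst &I) {
  return I.op == Opcode::Load || I.op == Opcode::Store;
}

// Only add, subtract and mov are considered for conversion.
bool isLEACandidate(Opcode Op) {
  return Op == Opcode::Add || Op == Opcode::Sub || Op == Opcode::Mov;
}

// Walk backwards from the memory access at MemIdx. Return the index of the
// nearest instruction that defines the address register if it is an
// add/sub/mov within the stall window; otherwise return std::nullopt.
std::optional<std::size_t> findLEACandidate(const std::vector<Inst> &Block,
                                            std::size_t MemIdx) {
  assert(MemIdx < Block.size() && "MemIdx must index into Block");
  const Inst &Mem = Block[MemIdx];
  if (!isMemoryAccess(Mem) || Mem.addrReg < 0)
    return std::nullopt;

  int Distance = 0; // cycles between the candidate and the memory access
  for (std::size_t I = MemIdx; I-- > 0;) {
    const Inst &Cand = Block[I];
    if (Cand.defReg == Mem.addrReg) {
      // Nearest definition found; earlier definitions are irrelevant.
      if (isLEACandidate(Cand.op))
        return I;
      return std::nullopt;
    }
    Distance += Cand.cycles;
    if (Distance >= kAGUStallWindow)
      return std::nullopt; // far enough away that no stall is expected
  }
  return std::nullopt;
}
```

A real pass would take its cycle counts from the scheduling model and would also have to check that the rewritten instruction's flag results are not needed, since LEA does not set EFLAGS; the sketch ignores both points.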
Attached is an incomplete patch that disables isel LEA generation and includes an empty pre-emit pass that will contain the LEA selection heuristics.
Any feedback you may have on this plan is welcome.
UseEarlyAG_Template_svn.patch (4.35 KB)