This is the most straightforward solution: it disables LOHs entirely, at the cost of some performance. I believe the performance loss introduced by the outliner itself would be significantly larger than the loss from dropping LOHs, which could probably be ignored. This should be acceptable, since we want to maximize the code size wins and some performance loss is expected. What I am not sure about is how much performance we would lose and whether it is worth trading for a 2% code size reduction.
Move AArch64CollectLOH after MachineOutliner
By letting the outliner run first, we can still collect LOHs as usual and keep benefiting from them. However, adjusting the optimization pipeline seems to be a nontrivial change, and I am not sure whether it is doable or acceptable to the community.
For anyone following along, LOH (linker optimization hint) is a MachO-specific directive that tells the linker where it can apply relocation optimizations. ELF uses a different mechanism to achieve a similar result.
I would guess the performance difference from enabling LOH is small; modern cores are very good at integer arithmetic.
Moving AArch64CollectLOH from addPreEmitPass to addPreEmitPass2 is reasonable at first glance… but I might be missing something; I only briefly glanced at the code.
In theory, LOHs shouldn’t have any impact on compiler codegen, so IMO the fact that it was inhibiting outlining was a bug, and moving LOH collection later is the correct fix.
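To make the ordering argument concrete, here is a toy model of the late codegen pipeline. It is not LLVM code: `buildLatePipeline` and `runsBefore` are hypothetical helpers invented for illustration, and the only assumption carried over from LLVM is that the outliner is scheduled between the `addPreEmitPass` and `addPreEmitPass2` hooks.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the late codegen pipeline (hypothetical, not LLVM code).
// The LOH collector is placed in one of the two target hooks; the outliner
// is assumed to sit between them.
std::vector<std::string> buildLatePipeline(bool lohInPreEmit2) {
  std::vector<std::string> passes;
  // Models addPreEmitPass(): target passes that run before the outliner.
  if (!lohInPreEmit2)
    passes.push_back("AArch64CollectLOH");
  // Target-independent pass scheduled between the two hooks.
  passes.push_back("MachineOutliner");
  // Models addPreEmitPass2(): target passes that run after the outliner.
  if (lohInPreEmit2)
    passes.push_back("AArch64CollectLOH");
  return passes;
}

// Returns true if pass `a` is scheduled before pass `b`.
// Both passes must be present in the pipeline.
bool runsBefore(const std::vector<std::string> &passes, const std::string &a,
                const std::string &b) {
  auto ia = std::find(passes.begin(), passes.end(), a);
  auto ib = std::find(passes.begin(), passes.end(), b);
  assert(ia != passes.end() && ib != passes.end());
  return ia < ib;
}
```

With `lohInPreEmit2 == false` the collector runs before the outliner, which is the ordering being blamed for inhibiting outlining; with `true` the outliner sees the code first and the hints are computed on the already-outlined result.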
We also used to disable LOH to reduce the binary size.
Moving it to addPreEmitPass2 also seems reasonable. Although the uncompressed size might remain similar to when LOH is disabled, this approach can improve the compressed size because the linker will emit more NOPs.
As a side note, if we aim to reduce the size further in this context (without emitting NOPs), tools like BOLT or other post-link optimization techniques could theoretically accomplish this.
I have identified a couple of issues after moving AArch64CollectLOH to addPreEmitPass2 on my local machine. My plan is to fix them individually and then adjust the optimization pipeline once all of them are resolved.