I think I’ve fixed everything left to be fixed in block placement, and I would really like for it to be on by default in 3.1:
- I’ve been testing this as thoroughly as I can across a large amount of real-world code, so it shouldn’t be too flaky.
- The code has been checked in and reviewed previously.
- It completes a significant missing feature in LLVM with builtin_expect.
- It “fixes” some of the regressions introduced with the inlining work by more effectively ensuring the loops are laid out correctly. (yacr2 specifically exhibited this problem)
- I see more positive movement in performance than negative, but that may be an artifact of the testing I’ve been doing.
I’m planning to check in a change to enable it by default later tonight (PST) and hopefully allow nightly testers to run through it and report any significant problems it introduces. It should be relatively easy to revert if folks are opposed to it being in 3.1, or if the testers turn up any problems*.
Thoughts? Others agree with turning it on? Strong desire for it to be delayed until after the branch? I do understand the desire to avoid last minute features… I just wasn’t able to isolate the single big performance regression I saw until today, despite working on it for two weeks… :: sigh ::
[*]: There is one problem I’m ready for – the code is very assert heavy. FNT runs on other architectures may turn up other invalid asserts. I’ll be watching build bots for this case.