I believe you all can access the test file and the diff in my bug report https://bugs.llvm.org/show_bug.cgi?id=45690
In short, I’m experimenting with optimization and wrote branchless code, but -O2 generates a jump instruction, which makes my loop twice as slow. When I changed the jne+movl to a cmove by hand, the loop seemed to take half as long.
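To illustrate the kind of code I mean, here is a hypothetical sketch. It is not the test file from the bug report, and the function names are made up for this example. It shows a conditional select written two ways: the ternary form is the sort of thing a compiler may lower to either a cmov or a compare-and-jump, while the mask form spells the select out with arithmetic. The optimizer is still free to recognize and rewrite either one, so this only shows the pattern and is not a guaranteed fix.

```c
#include <stdint.h>

/* Hypothetical example: ternary select. The compiler decides whether this
   becomes a conditional move or a compare-and-jump. */
uint32_t select_ternary(int cond, uint32_t a, uint32_t b) {
    return cond ? a : b;
}

/* Hypothetical example: mask-based select with no branch in the source.
   mask is all ones when cond is nonzero and all zeros otherwise. */
uint32_t select_mask(int cond, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(cond != 0);
    return (a & mask) | (b & ~mask);
}
```

Both functions return `a` when `cond` is nonzero and `b` otherwise; which instructions they compile to has to be checked in the generated assembly.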
I’m a little worried that this will interfere with other optimizations I make in the same function (the bug report is a small reproducer). Is there a temporary workaround I can use? I’m thinking not, and that I’ll have to suck it up, but I’ve seen stranger things, so I thought I’d ask. I guess for now I’ll keep experimenting with the slow jumps in place and change it by hand once I’m done optimizing that function.