Branches &,&&

I'm turning to you regarding an unwanted optimization (or de-optimization) that clang++ generates in all versions that I tested; see the code at the following link:

http://goo.gl/oiTPX5

The assembly code generated for the two methods, "amp" and "ampamp", is practically the same when using the optimization flag "-O3". However, I would like a single jump for the code in the method "amp", as the branch misprediction penalty is very high otherwise. Using a single ampersand (&) should generate a single jump, which icc13 already does (try it in the link above).

Is there any optimization flag that I should set in order to avoid this behavior when using "-O3"?

I asked the g++ community a similar question; there, however, this seems to be a bug (a performance bug) in g++.

Thank you in advance
F. 

+llvmdev

Hi Fisnik,

For reference, here’s the (slightly reduced) source code:

$ cat t.c
void foo(int x, int y);
void ampamp(int x, int y) {
  if (x < 3 && y > 10)
    foo(x, y);
}
void amp(int x, int y) {
  if ((x < 3) & (y > 10))
    foo(x, y);
}

This looks like an instruction selection issue.

The LLVM optimization passes reduce the && case down to & (neither
case needs a branch here, since y > 10 has no side effects), but then
the instruction selector chooses a branchy instruction sequence for
some reason.
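
To illustrate why the two forms are interchangeable in this case, here is a
minimal sketch. All names in it (rhs_evaluations, y_gt_10_counted, the
counted_* and pure_* functions, check_pure_forms_agree) are hypothetical and
are not part of t.c. It assumes only standard C semantics: && skips its right
operand when the left one is false, while & always evaluates both operands.
That difference is observable only when the right operand has a side effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter: records how often the right operand ran. */
int rhs_evaluations = 0;

/* Right-hand operand with a visible side effect (bumps the counter). */
bool y_gt_10_counted(int y) {
    rhs_evaluations++;
    return y > 10;
}

/* Short-circuit form: the right operand runs only when x < 3. */
bool counted_ampamp(int x, int y) {
    return x < 3 && y_gt_10_counted(y);
}

/* Bitwise form: both operands are always evaluated. */
bool counted_amp(int x, int y) {
    return (x < 3) & y_gt_10_counted(y);
}

/* Side-effect-free forms, mirroring the conditions in t.c. */
bool pure_ampamp(int x, int y) { return x < 3 && y > 10; }
bool pure_amp(int x, int y)    { return (x < 3) & (y > 10); }

/* Compare the two pure forms over a small range of inputs. */
void check_pure_forms_agree(void) {
    for (int x = -5; x <= 5; x++)
        for (int y = 5; y <= 15; y++)
            assert(pure_ampamp(x, y) == pure_amp(x, y));
}
```

With x = 5, counted_ampamp leaves rhs_evaluations untouched, whereas
counted_amp increments it. The pure forms have no such observable difference,
which is presumably why the optimizer is free to treat them as the same
computation.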

Note that -O0 and -O3 use completely different instruction selectors:
FastISel and SelectionDAG, respectively. It looks like FastISel is
doing the obvious thing, but for some reason SelectionDAG reintroduces
a branch.

To answer the question you actually asked, I don’t think there’s a way
to choose FastISel at other optimization levels.

Actually, there is a way to use FastISel at other optimization levels: (-mllvm) -fast-isel.
That being said, this is a backend option, so there is no guarantee that it will remain stable or supported in future releases. I do not expect it to change anytime soon, but it is something to be aware of.

Cheers,
-Quentin