OpenCL 'select' as a builtin?

Is there a Clang builtin that implements the same semantics as OpenCL’s “select” function, but which can be used from C and C++?

I tried the obvious name ‘__builtin_select’ but that doesn’t exist, and a trawl through the online docs didn’t reveal anything either.



No, there is no such intrinsic yet. We also need it for AVX-512 masked intrinsics.

So it would be worth adding.

In the meantime we use the __builtin_ia32_select[b|w|d…]* builtins, which are not polymorphic.


Wondering if it would make sense to share __builtin_select among languages, assuming it has the same format…


Hi Anastasia,

My own thoughts on this were to mirror the structure of ‘__builtin_shufflevector’, only with true/false instead of element indices. I chose OpenCL’s ‘select’ because it is the construct already supported by Clang/LLVM that comes closest to what I want, and there was a chance that it already had some built-in form that I had not discovered (my code search answered my question anyway; thanks also to Elana for your response).

For example, given an OpenCL ‘select’ of:

float4 f1, f2;
int4 sel = { 0, -1, -1, 0 };
float4 res = select(f1, f2, sel);

This might be represented by a built-in something like:

float4 res = __builtin_select(f1, f2, false, true, true, false);

As with ‘__builtin_shufflevector’, IR normalisation/canonicalisation could then transform code expressed in many different ways into a single canonical form using ‘__builtin_select’. Also as with ‘__builtin_shufflevector’, the selectors for ‘__builtin_select’ would be restricted to constants, just as the indices are for ‘__builtin_shufflevector’, so it would not be as flexible as OpenCL’s actual ‘select’, which can accept a variable selector. This doesn’t really raise any OpenCL versus C/C++ versus other-language issues, as ‘__builtin_select’ would be language agnostic.
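To make the intended semantics concrete, here is a rough sketch in C using the GCC/Clang vector extensions. ‘__builtin_select’ does not exist, so the helper ‘select_0110’ is a hypothetical stand-in for ‘__builtin_select(a, b, false, true, true, false)’. Where ‘__builtin_shufflevector’ is available, a constant per-lane select can already be written as a shuffle in which lane i takes index i (from the first vector) or i + 4 (from the second); otherwise the sketch falls back to element-wise assignment.

```c
#include <assert.h>

/* Assumption: GCC/Clang vector extensions; float4 here is a local typedef,
   not the OpenCL built-in type. */
typedef float float4 __attribute__((vector_size(16)));

#if defined(__has_builtin)
#  if __has_builtin(__builtin_shufflevector)
#    define HAVE_SHUFFLEVECTOR 1
#  endif
#endif

/* Hypothetical stand-in for __builtin_select(a, b, false, true, true, false):
   lanes 0 and 3 come from a, lanes 1 and 2 come from b. */
float4 select_0110(float4 a, float4 b) {
#ifdef HAVE_SHUFFLEVECTOR
    /* Indices 0..3 pick lanes of a, indices 4..7 pick lanes of b. */
    return __builtin_shufflevector(a, b, 0, 5, 6, 3);
#else
    /* Portable fallback: copy a, then overwrite the selected lanes. */
    float4 r = a;
    r[1] = b[1];
    r[2] = b[2];
    return r;
#endif
}

/* Small usage example. */
void select_0110_demo(void) {
    float4 f1 = {1.0f, 2.0f, 3.0f, 4.0f};
    float4 f2 = {10.0f, 20.0f, 30.0f, 40.0f};
    float4 res = select_0110(f1, f2);
    assert(res[0] == 1.0f && res[1] == 20.0f);
    assert(res[2] == 30.0f && res[3] == 4.0f);
}
```

The shuffle form works only because the selector is a compile-time constant, which is the same restriction proposed for ‘__builtin_select’.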

The reason that I wondered about this built-in in the first place, was that I wanted to use lane-predication for some operations. For example, let’s say I have:

int4 selectiveMultiply(int4 values, int4 selector, int multiplier) {
    /* OpenCL select; 'multiplier' is splatted across the lanes */
    return select(values, values * multiplier, selector);
}


If the ‘selector’ is the set ‘{false, true, true, false}’ then the code generated for the above is pretty convoluted, even when that set is known at compile-time via inlining. But if I could reduce this to a ‘__builtin_select’, my custom lowering implementation could lower this to a single lane-predicated multiply instruction.
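For comparison, the same operation can be sketched in plain C with the GCC/Clang vector extensions and a bitwise blend. This is only an illustration of the semantics, not of any particular code generation: the name ‘selective_multiply_blend’ is hypothetical, and the blend assumes every selector lane is either 0 or -1 (all bits set), which is narrower than OpenCL’s ‘select’, where only the most significant bit of each lane matters.

```c
#include <assert.h>

/* Assumption: GCC/Clang vector extensions; int4 here is a local typedef,
   not the OpenCL built-in type. */
typedef int int4 __attribute__((vector_size(16)));

/* Hypothetical helper: multiply only the lanes whose selector is -1.
   Assumes each selector lane is exactly 0 or -1. */
int4 selective_multiply_blend(int4 values, int4 selector, int multiplier) {
    /* Explicit splat of the scalar multiplier across all four lanes. */
    int4 splat = {multiplier, multiplier, multiplier, multiplier};
    int4 product = values * splat;
    /* Keep 'values' where the selector is 0, take 'product' where it is -1. */
    return (values & ~selector) | (product & selector);
}

/* Small usage example. */
void selective_multiply_demo(void) {
    int4 values = {1, 2, 3, 4};
    int4 selector = {0, -1, -1, 0};
    int4 res = selective_multiply_blend(values, selector, 10);
    assert(res[0] == 1 && res[1] == 20 && res[2] == 30 && res[3] == 4);
}
```

Written this way, the multiply is computed for every lane and the unwanted results are masked away afterwards; recognising that pattern and folding it into one predicated instruction is the work a ‘__builtin_select’ form would make easier to share between targets.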

Of course, I can discover these patterns using exhaustive DAG-to-DAG analysis in a target-specific fashion, but if the idea were raised to a more abstract idiom within LLVM, then the analysis could be shared by all targets (as it is for shuffle), and only the target-specific lowering would really care. A target can always ‘expand’ if it can’t do anything special anyway - back to the generic status quo.


Hi Martin,

As far as I understand, you need a Clang builtin that would be mapped to the LLVM select instruction with vector types?

That seems fairly generic, and it is indeed how ‘select’ works in OpenCL. Unfortunately, we don’t have support for this in Clang yet.