Canonicalizing conditional loads

Hi,
our target machine has conditional (predicated) load instructions. However, they are hard to generate at DAG
instruction selection, because clang emits conditional branches to avoid performing the load in most cases.
E.g.: https://godbolt.org/z/ohKTnqeqz
All these functions do the same thing, but only the first one compiles without a conditional branch instruction.
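
For readers who cannot open the link, the variants are presumably along the following lines. This is only a sketch: the function names are hypothetical and the code is not copied from the godbolt page. It assumes the branch-free variant is the one that loads unconditionally and then selects, which is only equivalent to the others when the pointer is always safe to dereference.

```cpp
// Hypothetical variants for illustration; not the code behind the godbolt link.

// Unconditional load followed by a select. The load always executes,
// so p must be dereferenceable even when c is false.
int load_then_select(const int *p, bool c) {
  int v = *p;
  return c ? v : 0;
}

// Load guarded by an if statement. The load only happens when c is true,
// which a compiler will usually preserve with a conditional branch.
int load_in_branch(const int *p, bool c) {
  if (c)
    return *p;
  return 0;
}

// Load guarded by a conditional expression. Semantically the same as the
// if version: the load is still conditional on c.
int load_in_ternary(const int *p, bool c) {
  return c ? *p : 0;
}
```

With a valid pointer all three return the same value for the same inputs; they differ only in whether the load itself is conditional.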

Is there a way to tell clang to always generate code like the first function, or perhaps an LLVM pass that converts
the conditional branch/load into a select, or hoists the load before/after the branch, also using a select?

If not, would anyone else be interested in such a thing? Any ideas on which solution would be best?

Regards,
Diogo Sampaio

Is there a way to tell clang to always generate code like the first function, or perhaps an LLVM pass that converts the conditional branch/load into a select, or hoists the load before/after the branch, also using a select?

Have you tried enabling either the EarlyIfConverter pass (for selects/conditional-moves) or EarlyIfPredicator pass (for predicated instructions) in the LLVM backend? If you are writing a downstream target, you will need to implement a few TargetInstrInfo APIs to get either pass up and running. Some in-tree backends use the former pass (e.g., AArch64, PPC). I use the latter in a downstream backend with reasonable results.

There is also an IR-level speculator in SimplifyCFG that may catch some of your opportunities (see the caveats in its header comment):

/// Speculate a conditional basic block flattening the CFG.
///
/// Note that this is a very risky transform currently. Speculating
/// instructions like this is most often not desirable. Instead, there is an MI
/// pass which can do it with full awareness of the resource constraints.
/// However, some cases are "obvious" and we should do directly. An example of
/// this is speculating a single, reasonably cheap instruction.
///
/// There is only one distinct advantage to flattening the CFG at the IR level:
/// it makes very common but simplistic optimizations such as are common in
/// instcombine and the DAG combiner more powerful by removing CFG edges and
/// modeling their effects with easier to reason about SSA value graphs.
///
///
/// An illustration of this transform is turning this IR:
/// \code
///   BB:
///     %cmp = icmp ult %x, %y
///     br i1 %cmp, label %EndBB, label %ThenBB
///   ThenBB:
///     %sub = sub %x, %y
///     br label BB2
///   EndBB:
///     %phi = phi [ %sub, %ThenBB ], [ 0, %EndBB ]
///     ...
/// \endcode
///
/// Into this IR:
/// \code
///   BB:
///     %cmp = icmp ult %x, %y
///     %sub = sub %x, %y
///     %cond = select i1 %cmp, 0, %sub
///     ...
/// \endcode
///
/// \returns true if the conditional block is removed.
bool SimplifyCFGOpt::SpeculativelyExecuteBB(BranchInst *BI, BasicBlock *ThenBB,
const TargetTransformInfo
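
At the source level, the before/after IR in that comment corresponds roughly to the rewrite sketched below. The function names are made up for illustration and unsigned operands are an assumption. The subtraction can be executed speculatively because it cannot fault. A load is different: as far as I know, it is only a candidate for this kind of speculation when the pointer is known to be dereferenceable, which is likely why the branch survives in your examples.

```cpp
// Branchy form, mirroring the "before" IR: the subtraction only runs
// when x >= y.
unsigned sub_or_zero_branchy(unsigned x, unsigned y) {
  if (x < y)
    return 0;
  return x - y;
}

// Flattened form, mirroring the "after" IR: the subtraction is always
// executed and a select picks the result. Unsigned wraparound is well
// defined, so computing x - y when x < y is harmless.
unsigned sub_or_zero_select(unsigned x, unsigned y) {
  unsigned sub = x - y;
  return (x < y) ? 0u : sub;
}
```

Both functions return the same value for every input; only the control flow differs.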
