IR "canonicalization"

I’m implementing a compiler front end and I need to perform the most basic optimizations on the IR it generates, just as Clang does.

For example:

"if not x" becomes:

%1 = xor i1 %x, true
br i1 %1, label %2, label %3

and Clang (even with -O0) just swaps labels:

br i1 %x, label %3, label %2

Another optimization is to remove “empty blocks” that simply br to another label:

br i1 %x, label %emp1, label %emp2

emp1:
  br label %cont


which becomes simply:

br i1 %x, label %cont, label %emp2


Is there a pass Clang uses to do these basic optimizations? Where should I look in the Clang source code?

I don't know about Clang, but this is usually done by the frontend while it constructs the IR. The trick is to have two expression evaluators: one that is given a true block and a false block and lowers a boolean expression into conditional jumps to those blocks, and another that evaluates boolean expressions into 0/1 values. The branch version then looks something like this:

void emitBranchOnExpression(Expression *E, Block *TrueTarget, Block *FalseTarget) {
    if (E->isNot()) {
        // "not X": evaluate X with the two targets swapped; no xor is emitted.
        return emitBranchOnExpression(E->operand(), FalseTarget, TrueTarget);
    }
    // ...
}

You usually want the branch version anyway to implement the short-circuit behaviour of && and ||.
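
To make that a bit more concrete, here is a minimal self-contained sketch of the branch evaluator, including the short-circuit cases. It is only an illustration under assumptions: the Expr struct, the Emitter class, the make* helpers and the bbN label names are all made up for this example, and it prints LLVM-like text rather than using the real LLVM API. The value-producing (0/1) evaluator is left out.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical toy AST: a boolean variable, logical not, and short-circuit and/or.
struct Expr {
    enum Kind { Var, Not, And, Or };
    Kind kind = Var;
    std::string name;                // used by Var only
    std::unique_ptr<Expr> lhs, rhs;  // Not uses lhs only
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr makeVar(const std::string &name) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Var;
    e->name = name;
    return e;
}

ExprPtr makeNot(ExprPtr operand) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Not;
    e->lhs = std::move(operand);
    return e;
}

ExprPtr makeBinary(Expr::Kind kind, ExprPtr l, ExprPtr r) {
    auto e = std::make_unique<Expr>();
    e->kind = kind;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Hypothetical emitter that writes LLVM-like text into a string stream.
struct Emitter {
    std::ostringstream out;
    int nextLabel = 0;

    std::string newLabel() { return "bb" + std::to_string(nextLabel++); }

    // Branch evaluator: emits control flow that ends up at trueTarget when
    // e is true and at falseTarget when e is false.
    void emitBranchOnExpression(const Expr &e, const std::string &trueTarget,
                                const std::string &falseTarget) {
        switch (e.kind) {
        case Expr::Not:
            assert(e.lhs && "Not needs an operand");
            // Swap the targets instead of emitting an xor.
            emitBranchOnExpression(*e.lhs, falseTarget, trueTarget);
            return;
        case Expr::And: {
            // lhs false: go straight to falseTarget; otherwise test rhs.
            std::string rhsLabel = newLabel();
            emitBranchOnExpression(*e.lhs, rhsLabel, falseTarget);
            out << rhsLabel << ":\n";
            emitBranchOnExpression(*e.rhs, trueTarget, falseTarget);
            return;
        }
        case Expr::Or: {
            // lhs true: go straight to trueTarget; otherwise test rhs.
            std::string rhsLabel = newLabel();
            emitBranchOnExpression(*e.lhs, trueTarget, rhsLabel);
            out << rhsLabel << ":\n";
            emitBranchOnExpression(*e.rhs, trueTarget, falseTarget);
            return;
        }
        case Expr::Var:
            out << "  br i1 %" << e.name << ", label %" << trueTarget
                << ", label %" << falseTarget << "\n";
            return;
        }
    }
};
```

With this sketch, branching on "not x" with targets then and else emits the single line "br i1 %x, label %else, label %then", with no xor, because the Not case only swaps the two targets before recursing.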

- Matthias