Optimisation of select/cmp/br

Hi,

Is there anyway I can coerce LLVM into performing the following transformation?

   %1 = icmp eq i32 %flag, 1
   %2 = select i1 %1, %foo* %data, %struct.cell_t* null
   %3 = icmp eq %foo* %2, null
   br i1 %3, label %failure, label %success

Into:

   %1 = icmp eq i32 %flag, 1
   br i1 %1, label %success, label %failure

With %data substituted for %2 in %success and its successors, and null for %2 in %failure. The way that code generation deals with the select means that this is actually making a small but noticeable difference to performance due to the number of branches involved.

As way of justification, the code in question essentially comes from a C function similar to:

   foo *check_for_data() {
     if(flag == FULL) {
       return data;
     } else {
       return NULL;
     }
   }

which is called by code that branches on whether the result is NULL. After inlining and other optimisation passes I'm getting code like that above.

Best,
Pete

I think the first step in optimization here is to be able to convert:
%1 = icmp eq i32 %flag, 1
%2 = select i1 %1, %foo* %data, %struct.cell_t* null
%3 = icmp eq %foo* %2, null
br i1 %3, label %failure, label %success

to
%1 = icmp eq i32 %flag, 1
%2 = select i1 %1, %foo* %data, %struct.cell_t* null
%3 = icmp eq i32 %flag, 1
br i1 %3, label %failure, label %success

This would then permit instruction %2 to move down past the br.
This could also be achieved, in the short term, by changing your
source code to base the if after return on "flag" instead of the
returned value being NULL.

But what happens if %data is null? Can LLVM prove that it isn’t?

But what happens if %data is null? Can LLVM prove that it isn't?

It can't be null, but I admit that I hadn't considered whether LLVM could prove this. The actual code sequence in my case is as follows, and LLVM seems to be able to deduce that it is always non-null (if I add "if(cell == NULL) printf("HELLO");" to the function lower down this email, before "return cell", then the call to printf is not in the optimised bitcode).

   %temp.raw = getelementptr inbounds %instance.fast.1* %inst, i64 0, i32 2
   %temp = bitcast { i8*, { { void (%struct.worker_t*, %instance*, { i32 })*, %instance* } } }* %temp.raw to %struct.cell_t*
   %3 = bitcast { i8*, { { void (%struct.worker_t*, %instance*, { i32 })*, %instance* } } }* %temp.raw to i32*
   %4 = load i32* %3, align 4
   %5 = icmp eq i32 %4, 1
   %cell..i = select i1 %5, %struct.cell_t* %temp, %struct.cell_t* null
   %fcheck.2.temp = icmp eq %struct.cell_t* %cell..i, null
   br i1 %fcheck.2.temp, label %post.2, label %post.find.2.temp

The C function inlined is:

   __attribute__((always_inline)) cell_t *fast_cell_find(cell_t *cell) {
     if(cell->status == FULL) {
       return cell;
     } else {
       return NULL;
     }
   }

This function forms part of the runtime library for a language that I'm writing a compiler for. I would prefer not to change the interface, as there are other more complex versions of *_find and the one chosen depends on some properties of the channel being compiled. It's a bit neater/simpler to code gen if these versions share the same signature.

Thanks
Pete

It seems something that GVN should be able to handle if expressed with
branches instead of select. Unfortunately it currently doesn't :frowning:

It is a long project, but one way to fix this might be to extend
(rewrite?) GVN PRE (pr13307, pr10254) and then make GVN handle selects
too (on way to fix pr13590 and hopefully your example).

Cheers,
Rafael

I'm not at all familiar with the LLVM internals (just with the Ocaml bindings for generating IR), but could try and help out with this if there was agreement on how it needs doing / someone gave me a pointer in the right direction. I guess this case is the most tricky of the ones you listed since it relies on knowledge that %data is not null. I've filed this as a bug (#18754).

Pete

It seems something that GVN should be able to handle if expressed with
branches instead of select. Unfortunately it currently doesn't :frowning:

It is a long project, but one way to fix this might be to extend
(rewrite?) GVN PRE (pr13307, pr10254) and then make GVN handle selects
too (on way to fix pr13590 and hopefully your example).

I'm not at all familiar with the LLVM internals (just with the Ocaml
bindings for generating IR), but could try and help out with this if there
was agreement on how it needs doing / someone gave me a pointer in the right
direction. I guess this case is the most tricky of the ones you listed since
it relies on knowledge that %data is not null. I've filed this as a bug
(#18754).

Daniel Berlin was trying to rewrite GVN at some point. I am not sure
if it is still ongoing, but he should be able to comment on whether a
better GVN should handle these and give you some pointers if you want
to try.

Cheers,
Rafael