Telling the optimizer a value is always null at the start

How do I tell the optimizer that the (dereferenced) value of an i8**
parameter is NULL at the start so that it can eliminate the check?

I have code like:

       void test2(void** ex) {
         printf("go\n"); // does not change *ex
       }

       void call2(void** ex);

       void testeh(void** ex) {
         // I want to tell the optimizer *ex is null so it can eliminate the
first if below if test2 is inlined:
         test2(ex)
         if (*ex) return;
         printf("done")
       }

I tried with llvm.invariant.start/end but that doesn't seem to make any
difference, the icmp/br seems to stay. Can this be done with the current
optimizers?

My usecase is exception handling via an out parameter. The contract is that on entry ex always points to a stack position which value holds null.

; ModuleID = 'test.ll'
target datalayout = "e-m:w-p:32:32-i64:64-f80:32-n8:16:32-S32"
target triple = "i686-pc-windows-msvc"

@str1 = linkonce_odr unnamed_addr constant [3 x i8] c"go\00", align 1
@str2 = linkonce_odr unnamed_addr constant [5 x i8] c"done\00", align 1

define void @inlineeh(i8** nocapture readnone %ex) {
     %1 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([3 x
i8]* @str1, i32 0, i32 0))
     ret void
}

declare i32 @printf(i8* nocapture readonly, ...) #0

define void @testeh(i8** nocapture %ex) #0 {
     %1 = bitcast i8** %ex to i8*
     %2 = tail call {}* @llvm.invariant.start(i64 4, i8* %1)
     tail call void @llvm.invariant.end({}* %2, i64 4, i8* %1)
     %3 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([3 x
i8]* @str1, i32 0, i32 0))
     %4 = load i8** %ex, align 4
     %5 = icmp eq i8* %4, null
     br i1 %5, label %6, label %8

; <label>:6 ; preds = %0
     %7 = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([5 x
i8]* @str2, i32 0, i32 0))
     br label %8

; <label>:8 ; preds = %0, %6
     ret void
}

declare void @llvm.invariant.end({}*, i64, i8* nocapture)
declare {}* @llvm.invariant.start(i64, i8* nocapture)

Hm, I don't know of an explicit way in the IR to do this. If anyone else does, feel free to chime in.

One approach would be to add a branch at the beginning of the function to an unreachable block. If you're testeh function started with:
if( *ex != null) llvm_unreachable();

You might get lucky and have the GVN or EarlyCSE propagate the null value. You're going to run into pass ordering problems here though. I suspect we'd drop the unreachable block before GVN runs. I looked at something like this previously: http://www.philipreames.com/Blog/2014/02/15/tweaking-llvm-to-exploit-assumex/

If you're willing to tolerate an extra branch at runtime, you could simply use a return in place of the unreachable. That would definitely work (i.e. no pass ordering problem), but the compare and branch would be emitted at runtime.

Are you willing to extend the optimizer? If so, adding a bit of metadata and the rules to propagate it in the optimizer would be pretty straight forward. If you need this to work from C code (rather than IR), you'd also need to pass the information through clang.

Philip

Hm, I don't know of an explicit way in the IR to do this. If anyone else
does, feel free to chime in.

One approach would be to add a branch at the beginning of the function to an
unreachable block. If you're testeh function started with:
if( *ex != null) llvm_unreachable();

You might get lucky and have the GVN or EarlyCSE propagate the null value.
You're going to run into pass ordering problems here though. I suspect we'd
drop the unreachable block before GVN runs.

Yep - SimplifyCFG does a great job of ripping branches to unreachable
out pretty early on.

Hm, I don't know of an explicit way in the IR to do this. If anyone else
does, feel free to chime in.

One approach would be to add a branch at the beginning of the function to an
unreachable block. If you're testeh function started with:
if( *ex != null) llvm_unreachable();

You might get lucky and have the GVN or EarlyCSE propagate the null value.
You're going to run into pass ordering problems here though. I suspect we'd
drop the unreachable block before GVN runs.

Looks like, at least I tried the code below and while it removes the unreachable, it doesn't remove the second checck. Philips code looks like it could do this though.

Are you willing to extend the optimizer? If so, adding a bit of metadata
and the rules to propagate it in the optimizer would be pretty straight
forward. If you need this to work from C code (rather than IR), you'd also
need to pass the information through clang.

Willing for sure but I don't know enough about llvm internals yet to do that, However it looks like the hack above will work for now. Thanks!

define void @Test(i8** %exception) {
BasicBlock24:
   %0 = load i8** %exception
   %1 = icmp ne i8* %0, null
   br i1 %1, label %BasicBlock25, label %BasicBlock26

BasicBlock25: ; preds = %BasicBlock24
   unreachable

BasicBlock26: ; preds = %BasicBlock24
   call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [14 x i8] }* @str22 to %String*))
   call void @TestCall(i8** %exception)
   %2 = load i8** %exception
   %3 = icmp eq i8* %2, null
   br i1 %3, label %BasicBlock28, label %BasicBlock27

BasicBlock27: ; preds = %BasicBlock26
   ret void

BasicBlock28: ; preds = %BasicBlock26
   call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [18 x i8] }* @str23 to %String*))
   ret void
}

define void @TestCall(i8** %exception) {
BasicBlock29:
   %0 = load i8** %exception
   %1 = icmp ne i8* %0, null
   br i1 %1, label %BasicBlock30, label %BasicBlock31

BasicBlock30: ; preds = %BasicBlock29
   unreachable

BasicBlock31: ; preds = %BasicBlock29
   call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [14 x i8] }* @str24 to %String*))
   ret void
}

Into:

define void @Test(i8** nocapture readonly %exception) {
BasicBlock24:
   tail call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [14 x i8] }* @str22 to %String*))
   tail call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [14 x i8] }* @str24 to %String*))
   %0 = load i8** %exception, align 4
   %1 = icmp eq i8* %0, null
   br i1 %1, label %BasicBlock28, label %BasicBlock27

BasicBlock27: ; preds = %BasicBlock24
   ret void

BasicBlock28: ; preds = %BasicBlock24
   tail call void @ExtPrintf(%String* bitcast ({ i8*, i8*, i32, [18 x i8] }* @str23 to %String*))
   ret void
}

LLVM's can't exploit information introduced by unreachable() like GCC can (or like MSVC can for the equivalent __assume() expression). This has been a long-standing issue, see the later comments in
http://llvm.org/bugs/show_bug.cgi?id=810

- Stephan

From: "Stephan Tolksdorf" <st@quanttec.com>
To: "Carlo Kok" <ck@remobjects.com>, "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Monday, July 14, 2014 1:06:33 PM
Subject: Re: [LLVMdev] Telling the optimizer a value is always null at the start

>> Hm, I don't know of an explicit way in the IR to do this. If
>> anyone else
>> does, feel free to chime in.
>>
>> One approach would be to add a branch at the beginning of the
>> function
>> to an
>> unreachable block. If you're testeh function started with:
>> if( *ex != null) llvm_unreachable();
>>
>> You might get lucky and have the GVN or EarlyCSE propagate the
>> null
>> value.
>> You're going to run into pass ordering problems here though. I
>> suspect we'd
>> drop the unreachable block before GVN runs.
>
> Looks like, at least I tried the code below and while it removes
> the
> unreachable, it doesn't remove the second checck. Philips code
> looks
> like it could do this though.

LLVM's can't exploit information introduced by unreachable() like GCC
can (or like MSVC can for the equivalent __assume() expression). This
has been a long-standing issue, see the later comments in
810 – Make better use of conditional unreachable's in LLVM

FYI: I've started posting patches to the llvm-commits list for review which implement an 'invariants' intrinsic which will address this shortcoming.

-Hal