Helping the optimizer along (__assume)

Hi,

I'm interested in whether or not there is a way of providing
source-level annotations to help LLVM with optimizations, similar to
Visual C++'s __assume facility
(http://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx).

As part of our PHP compiler (phpcompiler.org), it would be great to be
able to annotate our generated C code with, for example, (var !=
NULL), or (var->type == STRING), and have that information passed
around (esp interprocedurally at link-time) by the LLVM optimizers. It
would also greatly simplify our code generation.
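
Roughly, the generated C might then look like this (just a sketch; __assume stands in for whatever mechanism LLVM would accept, and the type and function names are made up):

void write_string(struct php_var *var)
{
    __assume(var != NULL);           /* established by our code generator      */
    __assume(var->type == STRING);   /* checked before this function is called */
    print_str(var->str_value);       /* optimizer could drop NULL/type checks  */
}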

I can provide a detailed example if what I'm asking is unclear...

Thanks in advance,
Paul
phpcompiler.org

No, nothing of the sort exists. There have been some rough
discussions of what would be involved, but there isn't really a plan
for how to implement it. It would probably end up being similar to
TBAA.

-Eli

Sorry, I meant that the way we store such information will probably be
similar to whatever we end up doing for TBAA.

-Eli

For some odd reason I was thinking this was going to be done with an assert (or a special no-code-generating assert). Just gen up assert(i != 0); and put it in anytime it is true. The optimizers, in the fullness of time, would then recognize and propagate this information.

There is a way to provide source-level annotations.

int * foo (int *x __attribute__((annotate("I_say_not_null"))))
{
  return x;
}

llvm-gcc will produce the following IR:

@.str = internal constant [9 x i8] c"/tmp/a.c\00", section "llvm.metadata" ; <[9 x i8]*> [#uses=1]
@"\01LC" = internal constant [15 x i8] c"I_say_not_null\00" ; <[15 x i8]*> [#uses=1]

define i32* @foo(i32* %x) nounwind {
entry:
  %x_addr = alloca i32* ; <i32**> [#uses=1]
  %x_addr1 = bitcast i32** %x_addr to i8* ; <i8*> [#uses=1]
  call void @llvm.var.annotation(i8* %x_addr1, i8* getelementptr ([15 x i8]* @"\01LC", i32 0, i32 0), i8* getelementptr ([9 x i8]* @.str, i32 0, i32 0), i32 1)
  ret i32* %x
}

However, this mechanism (and the alternative approaches suggested by others) has not been extended to communicate the meaning of the annotated information to LLVM optimization passes.

Can't you implement __builtin_assume(cond) to codegen to something like:

  %cond = i1 ...
  br i1 %cond, label %always, label %never
never:
  unreachable
always:

Then in assert.h:

#ifdef NDEBUG
# define assert(cond) __builtin_assume((cond))
#else
# define assert(cond) ...
#endif
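
For example, with that mapping an ordinary assert in a release build could look like this to the optimizer (just a sketch; the function and names are made up):

#include <assert.h>

int first_char(const char *s)
{
    assert(s != 0);   /* with NDEBUG defined this becomes __builtin_assume((s != 0)) */
    if (s == 0)       /* ...so the optimizer may treat this branch as dead code      */
        return -1;
    return s[0];
}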

— Gordon

Mike Stump wrote:

> > As part of our PHP compiler (phpcompiler.org), it would be great to be
> > able to annotate our generated C code with, for example, (var !=
> > NULL), or (var->type == STRING), and have that information passed
> > around (esp interprocedurally at link-time) by the LLVM optimizers.
>
> For some odd reason I was thinking this was going to be done with an assert (or a special no-code-generating assert). Just gen up assert(i != 0); and put it in anytime it is true. The optimizers, in the fullness of time, would then recognize and propagate this information.

The Microsoft implementation is scary. I'd be much happier with __assume if it caused a syntax error in the misprint example given.

One generalization I would like to see (perhaps after spending enough time to understand how to safely inject attributes or _Pragma via macros) is how to make an assert generate syntax errors when it is provably violated even in release mode. This gets even better if you control the intermediate format and can store the flow-of-control syntax error conditions with the bytecode representation of the function.

(Am I misreading C99/C0X/C++98/C++0x: does the exact specification of the expansion of assert in release mode prohibit slipping in a _Pragma or other implementation-extension constructs to inject flow of control constraints?)

That is, instead of hoping for an extern "C" function so that a debug wrapper could be written to check function preconditions as:

#ifndef NDEBUG
#ifdef assert
#define foo(x,x_len) (assert(NULL!=x),assert(0<x_len),foo(x,x_len))
#endif
#endif

I'd like to have the following always "bubble up" syntax errors:

template<class T>
int foo(T* x, size_t x_len)
{
    assert(NULL!=x);
    assert(0<x_len);

    // ....
}

and then....

const char* test = NULL;

foo(test,0); // two syntax errors in both release and debug modes are so much better than a runtime error only in debug mode

Kenneth

Can't you implement __builtin_assume(cond) to codegen to something like:

  %cond = i1 ...
  br i1 %cond, label %always, label %never
never:
  unreachable
always:

The code generators will remove the branch to %never.
I already tried this :-) What would work is to define
an llvm.abort intrinsic, and do:

   %cond = i1 ...
   br i1 %cond, label %always, label %never
never:
   call void @llvm.abort()
   unreachable

At codegen time @llvm.abort() can be lowered to
nothing at all. I'm not saying that this is my
favorite solution, but it is simple.

Ciao,

Duncan.

How is this different than just branching to unreachable? Branching to unreachable says that "this condition is true or else the program has undefined behavior". This means that the condition must be true :-)

-Chris

> %cond = i1 ...
> br i1 %cond, label %always, label %never
> never:
> call void @llvm.abort()
> unreachable
>
> At codegen time @llvm.abort() can be lowered to
> nothing at all. I'm not saying that this is my
> favorite solution, but it is simple.

How is this different than just branching to unreachable? Branching
to unreachable says that "this condition is true or else the program
has undefined behavior". This means that the condition must be true :-)

The difference is that simplifycfg will remove the branch
to %never if %never only contains unreachable. The role
of the intrinsic call is to fool simplifycfg into not
doing this!

Ciao,

Duncan.

Hi all,

I've been thinking about this issue as well, since I'm working with an
architecture that can do hardware-based loops, but only when the loop count is
more than some minimal value. To properly use this, we need some way for the
code to specify that a loop variable has a minimum value.
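
For example, something like this (sketch only, using the __builtin_assume quoted below; the minimum count of 4 is made up):

void scale(float *a, int n)
{
    __builtin_assume(n >= 4);     /* promise at least 4 iterations, so the     */
    for (int i = 0; i < n; ++i)   /* backend could safely emit a hardware loop */
        a[i] *= 2.0f;
}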

Can't you implement __builtin_assume(cond) to codegen to something like:

  %cond = i1 ...
  br i1 %cond, label %always, label %never
never:
  unreachable
always:

I've looked into an approach like this as well, but it doesn't quite work. My
approach wasn't to use an unreachable, but simply a return (IIRC). However,
the problems I saw were as follows:
* If the optimizers are smart enough to actually verify that your assertion
   holds (ie, %cond is provably true), the branch is removed altogether (or
   at least %cond is simplified and replaced by true).
   
   Now, you might say that if the optimizers are smart enough to prove the
   condition true, they shouldn't need the assertion in the first place.
   However, it usually requires quite some reasoning to prove the condition,
   which is often done in incremental steps (sometimes even by different
   passes).

   Moreover, not all passes do the same complicated reasoning to prove some
   property, even though they might benefit from the conclusions. We don't
   want all passes to do this either, since drawing the same conclusions over
   and over again is pointless.

   Also, the code might want to make some assertions which are not provably
   true (e.g., preconditions on external functions can't be proven until after
   linking), but which can lead to significant optimizations.

* If the optimizers are not smart enough to verify the assertion, modeling
   such an assertion with a conditional branch instruction will leave the
   branch instruction in place, and you will end up with a branch instruction
   in the resulting code.

The above shows that using normal branches for marking assumptions is not a
very usable strategy. The example above tries to mark the branch as a special
branch by putting in an unreachable instruction. While this should prevent the
branch from showing up in the resulting code, as pointed out the branch gets
eliminated way too early.

One could think of making a new "codegen-unreachable" instruction, which is
counted as unreachable only just before or at codegen. This would help the
branch to stay alive longer, so the optimization passes can use the info it
encodes. However, this still allows the branch to be removed when the
condition can be proven somewhere along the way.

So, to really be able to encode this data, one could think of having an
"assume" intrinsic, i.e.:

  %cond = i1 ...
  call void @llvm.assume( i1 %cond )

Optimization passes won't just delete this, but we could teach codegen that
this intrinsic is not generated into any code. However, this still doesn't
completely solve the problem indicated at the first point above. If %cond is
provably true, we will end up with

  call void @llvm.assume( i1 true )

which is quite pointless.

I can see two ways of fixing this.
* Don't use normal IR in the encoding of assumptions, i.e.:

    call void @llvm.assume( [ 7 x i8 ] "p != 0" )

  (Not sure if this is proper IR for encoding a string, but well..)

  The main downside here is that you are limited in which assumptions you can
  encode, which have to be defined quite clearly. On the other hand, when
  encoding using IR, you can encode almost everything, but optimizations will
  still be able to understand a limited amount of it.

  Another downside here is that it is harder to keep assumptions correct when
  the code is transformed.
* Mark the instructions used for assumptions explicitly, so they won't get
   modified, i.e.:

    %cond = immutable icmp ne i32 %p, 0
    call void @llvm.assume( i1 %cond )

  This probably has lots of other problems (such as preventing other
  transformations from taking place and needing updates to almost every
  optimization pass we have), but seems like it could work.

Are there any other suggestions for solutions? I don't quite like either one
of the two I proposed, but can't think of any others just now.

Gr.

Matthijs

This certainly sounds like the simplest approach, even though it adds
an intrinsic. Is there general interest in adding this?

Thanks,
Paul

Hi,

* Mark the instructions used for assumptions explicitly, so they won't get
   modified, i.e.:

    %cond = icmp ne i32 %p, 0
    call void @llvm.assume( i1 %cond )

gcc uses the equivalent of:

  %p2 = call @llvm.assume(i32 %p, "ne", i32 0)

with %p2 being used for %p wherever this assumption
is valid (for example in the "true" branch of an
"if (%p != 0) {...}").

Ciao,

Duncan.

The thing I don't like about this is that it puts conditional branches all over the place, which break up basic blocks and so interfere with optimization passes. The idea would be to use a form that doesn't screw with the optimization passes as hard. Now, I'm just a front-end person, so if the backend people like it, who am I to say otherwise...

One generalization I would like to see (perhaps after spending enough
time to understand how to safely inject attributes or _Pragma via
macros) is how to make an assert generate syntax errors when it is
provably violated even in release mode.

I like this idea. Sounds good. One can imagine enlisting static analysis and automated theorem provers to help out on the harder problems. :-)

(Am I misreading C99/C0X/C++98/C++0x: does the exact specification of
the expansion of assert in release mode prohibit slipping in a _Pragma
or other implementation-extension constructs to inject flow of control
constraints?)

Technically, yes, but we can reword future standards to give implementations the latitude to issue compilation errors for conditions that can be proved to be false; then the implementation is conforming. We could always have a flag to control the behavior if people want/need it, though I can hardly see why they'd want it to compile if they assert something that is false.

Technically, yes, but we can reword future standards to give
implementations the latitude to issue compilation errors for conditions
that can be proved to be false; then the implementation is conforming.
We could always have a flag to control the behavior if people want/need
it, though I can hardly see why they'd want it to compile if they assert
something that is false.

Have you never seen assert(0 && "Not yet implemented"); ?
You may want to compile a program like this :-)

regards,

Cédric

So, to really be able to encode this data, one could think of having an
"assume" intrinsic, i.e.:

  %cond = i1 ...
  call void @llvm.assume( i1 %cond )

I like this the best.

Optimization passes won't just delete this, but we could teach codegen that
this intrinsic is not generated into any code. However, this still doesn't
completely solve the problem indicated at the first point above. If %cond is
provably true, we will end up with

  call void @llvm.assume( i1 true )

No, trivially the optimizer can be taught not to do this, if we don't want it to. The optimizer can see that this is @llvm.assume by checking the spelling (code). It all comes down to how much memory you want to burn to remember things that at one time you knew about the code, and how likely that knowledge is to be useful. Put another way, it depends on whether there are any downstream consumers of the information.

If we do:

   assert (p!=0);
   (base*)p;

when base is a virtual base, this must generate code to preserve 0 values:

   if (p != 0)
     p += *(int*)(((char*)p)+n)

this does the conversion if p is non-zero, and if it is 0, it leaves it alone.

And if for any reason we can figure out that p is not zero, we can then eliminate the assert, as it were, and remove the conditional branch (nice win). If there are no more downstream consumers of the information, the information itself can die at this optimization pass and save the memory. If this isn't the last consumer of the information, we can leave the assert untouched and just remove the conditional check. For example, after -O4 inlining, another virtual base conversion might then be exposed, and we figure out that:

   assert (p!=0)
   p += *(int*)(((char*)p)+n)
   if (p != 0)
     p += *(int*)(((char*)p)+n1)

should be optimized as:

   assert (p!=0)
   p += *(int*)(((char*)p)+n)
   p += *(int*)(((char*)p)+n1)

as we can know that the first p += vbase adjustment cannot wrap back to 0.

I don't see the value in the additional complexity. We already can express (p != 0), so I see this as isomorphic to assume (cond). I like the simplicity of just assume (cond).