Marking memory as immutable in LLVM IR

Hello LLVM devs,

I am working on an LLVM backend for the Chapel compiler. I’d like to use ‘llvm.invariant.start’ and constant type-based alias analysis (TBAA) metadata. I have read the documentation and looked at how clang uses ‘llvm.invariant.start’ in code generation, but it is still not clear to me how to use either of them correctly. My problem is knowing when I should use each of them and how they really differ.

Here is one possible example:

void f(int x, int z)
{
   const int y = x+g(z);
   //…
}

One way to compile the above code is:

%y = alloca i32
; … perform y computation here and store result into %y_tmp
store i32 %y_tmp, i32* %y
; … continue execution

From here, there are two choices for what to do with %y.

  1. I can use llvm.invariant.start on %y to mark that %y is never going to change. In this case, I’m unsure whether I should unmark %y with ‘llvm.invariant.end’ after I’m done executing.

  2. I can mark the store and subsequent loads using constant TBAA metadata.

Now here are a few questions I have:

  1. Should I go with 1, or with 2 in this case? If I have to go with 1, should I unmark memory with llvm.invariant.end after the function is done executing?

  2. In general, when should I use TBAA const and llvm.invariant? I can think of: global constants, local loop constants, local if constants, and constant function arguments.

  3. Which optimizations do llvm.invariant.start and TBAA impact? How can I check that I’ve added this metadata correctly and that it indeed helps? Possibly by seeing that some optimization occurred with the new information.

Cheers, Przemek

Hello LLVM devs,

I am working on an LLVM backend for the Chapel compiler. I'd like to use 'llvm.invariant.start' and constant type-based alias analysis (TBAA) metadata. I have read the documentation and looked at how clang uses 'llvm.invariant.start' in code generation, but it is still not clear to me how to use either of them correctly. My problem is knowing when I should use each of them and how they really differ.

Here is one possible example:

void f(int x, int z)
{
   const int y = x+g(z);
   //...
}

One way to compile the above code is:

%y = alloca i32
; ... perform y computation here and store result into %y_tmp
store i32 %y_tmp, i32* %y
; ... continue execution

From here, there are two choices for what to do with %y.

1. I can use `llvm.invariant.start` on %y to mark that %y is never going to change. In this case, I'm unsure whether I should unmark %y with 'llvm.invariant.end' after I'm done executing.

If you have a location which *never* changes (even before the point of use, *no* change can ever be visible to the compiler), then use !invariant.load. If you have a location which never changes *after a given point*, use invariant.start (without an end). If you have a *region* in which a value is known not to change, then use a start/end pair. Note that it's safest to use these only when the *backing memory location* is actually unchanging during the lifetime specified. The rules for reading through one assumed-constant pointer while another pointer writes to the same location get hairy quickly; I don't even remember what all the corner cases are.
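To make the three options above concrete, here is a minimal sketch. It assumes a recent LLVM with opaque pointers (`ptr`); on older typed-pointer releases the intrinsics are spelled `@llvm.invariant.start.p0i8` and take an `i8*`. The names `@table`, `@use`, `@never_changes` and `@region` are hypothetical and only serve the illustration.

```llvm
; Assumption: opaque-pointer syntax. All global and function names are hypothetical.
@table = external global i32

declare ptr @llvm.invariant.start.p0(i64, ptr)
declare void @llvm.invariant.end.p0(ptr, i64, ptr)
declare void @use(ptr)

; Location that never changes at all: tag each load with !invariant.load.
define i32 @never_changes() {
  %v = load i32, ptr @table, align 4, !invariant.load !0
  ret i32 %v
}

; Location that is unchanging only inside a region: bracket the region with
; a start/end pair. Leaving out the invariant.end call gives the
; "never changes after a given point" form.
define i32 @region(ptr %p) {
  %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %p)
  %a = load i32, ptr %p, align 4
  call void @use(ptr %p)        ; the frontend promises this does not write *%p
  %b = load i32, ptr %p, align 4
  call void @llvm.invariant.end.p0(ptr %inv, i64 4, ptr %p)
  store i32 0, ptr %p, align 4  ; writes are allowed again after the end
  %sum = add i32 %a, %b
  ret i32 %sum
}

!0 = !{}
```

The first argument of invariant.start is the size in bytes of the object, and the value it returns is what invariant.end takes as its first operand.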

2. I can mark the store and subsequent loads using constant TBAA metadata.

I don't remember whether we treat the TBAA is-constant flag as being control dependent. If we don't, then this use is invalid. You should check.
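For reference, this is roughly what a "constant" TBAA access tag looks like: the optional fourth operand of the tag is set to 1. The sketch only shows the syntax, assuming opaque pointers and a made-up type tree (the node names and `@read_const` are hypothetical). It does not settle the validity question raised above, in particular whether such a tag is safe on a location that is first initialized by a store.

```llvm
; Assumption: opaque-pointer syntax; the TBAA node names are invented.
define i32 @read_const(ptr %y) {
  %v = load i32, ptr %y, align 4, !tbaa !2
  ret i32 %v
}

!0 = !{!"Hypothetical Chapel TBAA root"}  ; root of the type tree
!1 = !{!"int", !0, i64 0}                 ; scalar type node: name, parent, offset
!2 = !{!1, !1, i64 0, i64 1}              ; access tag: base type, access type,
                                          ; offset, constant flag (1 = constant)
```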

Now here are a few questions I have:
1. Should I go with 1, or with 2 in this case? If I have to go with 1, should I unmark memory with llvm.invariant.end after the function is done executing?

In your example, the abstract memory location (the alloca) will never become non-constant once it has been initialized. invariant.start without an end would be fine here.
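Applied to the function f from the question, that advice might look like the following sketch. It assumes opaque pointers, and `@sink` is a hypothetical stand-in for whatever the elided body of f does with y.

```llvm
; Assumption: opaque-pointer syntax. @sink is hypothetical and stands in
; for the elided "//..." part of f.
declare i32 @g(i32)
declare void @sink(i32)
declare ptr @llvm.invariant.start.p0(i64, ptr)

define void @f(i32 %x, i32 %z) {
entry:
  %y = alloca i32, align 4
  %gz = call i32 @g(i32 %z)
  %y_tmp = add i32 %x, %gz
  store i32 %y_tmp, ptr %y, align 4
  ; From here on the 4 bytes at %y are declared unchanging.
  ; No invariant.end follows, so this holds for the rest of the function.
  %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %y)
  %a = load i32, ptr %y, align 4
  call void @sink(i32 %a)
  %b = load i32, ptr %y, align 4
  call void @sink(i32 %b)
  ret void
}
```

The initializing store comes before the invariant.start call, so the marker only covers the part of the function in which y is already a constant.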

2. In general, when should I use TBAA const and llvm.invariant? I can think of: global constants, local loop constants, local if constants, and constant function arguments.

This depends greatly on the exact semantics of your source language, which I don't know, so I can't answer this question directly. Please see the general advice above.

3. Which optimizations do llvm.invariant.start and TBAA impact? How can I check that I've added this metadata correctly and that it indeed helps? Possibly by seeing that some optimization occurred with the new information.

Most memory optimizations. I know that EarlyCSE and GVN have some support, but I haven't checked recently to see how complete it is.
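One way to see whether the information is being picked up is to feed a small test case to opt and compare the output with and without the marker. The sketch below assumes opaque pointers and a release where `opt` accepts the `-passes=` syntax; the file name and `@clobber` are hypothetical.

```llvm
; Hypothetical experiment, run for example as:
;   opt -passes=early-cse -S check.ll
; If the invariant information is used, the second load should be replaced
; by %a in the output. Deleting the invariant.start call and rerunning
; gives the baseline to compare against.

declare void @clobber()
declare ptr @llvm.invariant.start.p0(i64, ptr)

define i32 @check(ptr %p) {
  %inv = call ptr @llvm.invariant.start.p0(i64 4, ptr %p)
  %a = load i32, ptr %p, align 4
  call void @clobber()           ; opaque call that could otherwise write *%p
  %b = load i32, ptr %p, align 4
  %r = add i32 %a, %b
  ret i32 %r
}
```

The same experiment can be repeated with `-passes=gvn`, or with `!invariant.load` or a constant TBAA tag on the loads in place of the intrinsic call.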

Philip

p.s. If you haven't found it already, you might find the "Performance Tips for Frontend Authors" page in the LLVM documentation useful.