pragmas

Hello

Is there a "generic" pragma that is supported by LLVM and visible to optimization passes? Or is there any other way for a programmer to pass metadata to the compiler?
I am writing an analysis pass that could benefit from user-provided information. At this stage, I would like to keep the kind of information that the user can provide as general as possible. Examples would be "x and y are (not) aliased", "loop trip count = x" (where x is either a static constant or an expression), "branch condition is true most of the time" (or x% of the time), and other information that might be statically undecidable but that the developer knows the answer to.

thanks,
Anthony

You could encode this information as simple library function calls and then find them again in the generated LLVM IR. The client then just needs a header declaring the functions and information on what they mean. Since there are never any definitions of them they won't end up going anywhere.
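
A minimal sketch of what such a header and a client function might look like. The hint_* names and their signatures are hypothetical, invented here for illustration; they are not part of LLVM or llvm-gcc. In real use the functions would only be declared, and the pass would have to delete the calls before linking. The empty definitions are included only so the sketch links on its own.

```cpp
#include <cstddef>

// Hypothetical marker functions (the names are assumptions, not an LLVM API).
// extern "C" keeps the symbol names unmangled, so a pass can look them up
// by name in the IR, for example with Module::getFunction("hint_trip_count").
extern "C" {
void hint_no_alias(const void *a, const void *b);  // "a and b do not alias"
void hint_trip_count(long n);                      // "the following loop runs n times"
void hint_branch_prob(int cond, int percent);      // "cond is true percent% of the time"
}

// Stand-in definitions so this sketch links by itself. With the approach
// described above they would be omitted and the pass would strip the calls.
extern "C" {
void hint_no_alias(const void *, const void *) {}
void hint_trip_count(long) {}
void hint_branch_prob(int, int) {}
}

// Example client code: y[i] += a * x[i], annotated with two hints.
void saxpy(float *y, const float *x, float a, std::size_t n) {
    hint_no_alias(x, y);                    // x and y never overlap
    hint_trip_count(static_cast<long>(n));  // trip count of the loop below
    for (std::size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Passing the annotated values as call arguments keeps them visible as operands of the call in the IR, which is what lets the pass associate a hint with specific pointers or expressions.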

A more ambitious plan would be to modify llvm-gcc with new __builtins and create intrinsics in LLVM to map them to. There's really no advantage to this other than not needing the header file while compiling. There's a big disadvantage in that you end up mucking with both the front end and the llvm intrinsics.

Finally, you can modify llvm-gcc's pragma handling to insert the things that you want, but this is more work. You have to deal with both the C parser and the C++ parser, and understand more of the front-end internals. I would avoid this unless you want pragmas that have some sort of lexical semantics, and don't want to force people to use BEGIN and END macros.

Hope this is helpful,
Luke

Anthony Danalis wrote:

Pre-empting Chris's inevitable response: don't add intrinsics!

I really like the 'disappearing function calls' idea. Chris suggested
practically the same thing for a previous question about adding BigInt
support.

Anthony, whichever route you take in the end, please consider
documenting your 'code adventure' on the wiki so others can learn from
your experience.

Justing
Registered Wiki Pimp.

Thanks for the quick responses. "disappearing function calls" is by far the preferred way for me, as I want my pass to work with standard LLVM and not a hacked version that supports extra pragmas, or intrinsics. I am just new to LLVM and wanted to make sure that there isn't already a mechanism for passing meta-data between the user and the optimizer.

I am planning to contribute to the wiki soon, especially little howtos for things that take me a day to figure out how to do and turn out to be < 50 lines of code that I could have copy pasted from ... a wiki!

Anthony.

Anthony Danalis wrote:

> Thanks for the quick responses. "disappearing function calls" is by far the preferred way for me, as I want my pass to work with standard LLVM and not a hacked version that supports extra pragmas, or intrinsics. I am just new to LLVM and wanted to make sure that there isn't already a mechanism for passing meta-data between the user and the optimizer.

I wrote a patch that adds embedded metadata here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090316/075433.html

which I'm sure will land "soon" (this weekend?). Unfortunately it doesn't permit the metadata to reference registers in your function (so, if you have "%x = load ...", then you can't have metadata on %x).

There's always a phase two I suppose.

The design is here: http://nondot.org/sabre/LLVMNotes/EmbeddedMetadata.txt

Nick

I've been using this approach now for almost two years in a project and it
_seems_ to work fine. However, I'd like to get some feedback from the experts
on which kinds of code could be moved by the standard passes over the points
in code marked by such calls.
For me, loads, stores, and calls to other functions are most interesting. Do
the current optimization passes do that? (For example, by checking whether a
global's address would escape to external entities like the dummy function
that is the marker and allowing accesses to the global to be moved if it
doesn't escape.)

Thanks,
Torvald

Yes, there are current optimizations which do things like this. Global
variables are implicitly escaping, because they can be referenced by
name, but in other cases LLVM does determine variables that don't
escape and performs optimizations accordingly.
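
A small sketch of that distinction, with hypothetical names (region_marker, counter, example) chosen only for illustration. A call to an externally defined marker may read or write any global with external linkage, so accesses to such a global generally have to stay on their side of the call. A local variable whose address is never passed anywhere cannot be touched by the callee, so the optimizer is free to keep it in a register or fold it across the call.

```cpp
// Hypothetical marker. In real use it would be declared only, so the
// optimizer could not see its body; the empty definition here just lets
// the sketch link on its own.
extern "C" void region_marker() {}

// A global with external linkage: an opaque callee could refer to it by name.
int counter = 0;

int example() {
    int local = 1;     // address is never taken, so it cannot escape
    counter = 5;       // this store must be visible to an opaque callee
    region_marker();   // an opaque call might read or modify `counter`
    local += 1;        // may be folded to 2 regardless of the call
    // After an opaque call, `counter` would have to be reloaded from memory.
    return counter + local;
}
```

Because the stand-in body is visible in this file, a compiler can see that the call does nothing; the comments describe what would hold if only the declaration were available.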

Dan

>>> You could encode this information as simple library function calls
>>> and then find them again in the generated LLVM IR. The client then
>>> just needs a header declaring the functions and information on what
>>> they mean. Since there are never any definitions of them they won't
>>> end up going anywhere.
>>
>> I've been using this approach now for almost two years in a project
>> and it _seems_ to work fine. However, I'd like to get some feedback
>> from the experts on which kinds of code could be moved by the standard
>> passes over the points in code marked by such calls.
>> For me, loads, stores, and calls to other functions are most
>> interesting. Do the current optimization passes do that? (For example,
>> by checking whether a global's address would escape to external
>> entities like the dummy function that is the marker and allowing
>> accesses to the global to be moved if it doesn't escape.)
>
> Yes, there are current optimizations which do things like this. Global
> variables are implicitly escaping, because they can be referenced by
> name, but in other cases LLVM does determine variables that don't
> escape and performs optimizations accordingly.

Can you tell me more about "the other cases", or categorize them in
some way? For example, does this apply to thread-local variables only
(which would be okay in my case), or do LLVM passes also check whether
arbitrary pointers (eg, passed in via function arguments) escape?

Thanks,
Torvald

If the functions will never really execute because your optimization pass is
going to remove them, I would expect that calling setDoesNotAccessMemory()
on the call sites could help other optimizations.

Anthony