[PATCH] R600: Set the noduplicate attribute on barrier() intrinsics

This will prevent LLVM optimization passes from creating illegal uses
of the barrier() intrinsic (e.g. calling barrier() from a conditional
that is not executed by all threads).
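
To illustrate why a barrier() reached by only some threads is illegal, here is a minimal sketch that uses CPU threads as a stand-in for a GPU work-group. It is an analogy only, not the R600 implementation: run_work_group and reaches_barrier are hypothetical names, and the timeout exists only so that the example terminates (on real hardware the outcome would more likely be a hang or undefined behavior).

```python
import threading


def run_work_group(num_items, reaches_barrier, timeout=0.2):
    """Run num_items threads that share one barrier.

    Work-item i waits at the barrier only if reaches_barrier(i) is true.
    Returns a list holding "ok" or "stuck" for each work-item.
    """
    barrier = threading.Barrier(num_items)
    status = ["ok"] * num_items

    def work_item(i):
        if reaches_barrier(i):
            try:
                # Completes only once all num_items threads have arrived.
                barrier.wait(timeout=timeout)
            except threading.BrokenBarrierError:
                # Not every work-item arrived, so the barrier never released.
                status[i] = "stuck"

    threads = [threading.Thread(target=work_item, args=(i,))
               for i in range(num_items)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status


# Every work-item reaches the barrier: legal.
uniform = run_work_group(4, lambda i: True)

# Only even work-items reach it, as if the call sat inside a divergent branch.
divergent = run_work_group(4, lambda i: i % 2 == 0)

print(uniform)
print(divergent)
```

In the divergent run the even work-items never get past the barrier, which is the situation an optimization pass could create by copying a barrier() call into a branch that not all threads take.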

I may be misunderstanding the LLVM Language Reference, but can't we
specify noduplicate with alwaysinline? Or would that produce
undesired behavior?

From the language ref (http://llvm.org/docs/LangRef.html):

[quote]
alwaysinline

This attribute indicates that the inliner should attempt to inline this
function into callers whenever possible, ignoring any active inlining
size threshold for this caller.

...snip...

noduplicate

This attribute indicates that calls to the function cannot be
duplicated. A call to a noduplicate function may be moved within its
parent function, but may not be duplicated within its parent function.

A function containing a noduplicate call may still be an inlining
candidate, provided that the call is not duplicated by inlining. That
implies that the function has internal linkage and only has one call
site, so the original call is dead after inlining.
[/quote]

If that isn't an option, then there's also inlinehint... I'm not
NAK'ing this patch, just asking for clarification (if you happen to
know the answer).

Also, is there an upcoming patch to implement the intrinsic for
llvm.AMDGPU.barrier.global? I scanned recent llvm-commits posts, but
if there's one there, I must've missed it. If there's nothing yet,
let me know and I'll take a crack at it. I had started looking at
the necessary plumbing for this a while ago, but I hadn't written any
code except a few tests.

--Aaron

> I may be misunderstanding the LLVM Language Reference, but can't we
> specify noduplicate with alwaysinline? Or would that produce
> undesired behavior?

Are you asking about adding alwaysinline to the intrinsic calls?
The patch is already setting alwaysinline on the barrier function.

> If that isn't an option, then there's also inlinehint... I'm not
> NAK'ing this patch, just asking for clarification (if you happen to
> know the answer).
>
> Also, is there an upcoming patch to implement the intrinsic for
> llvm.AMDGPU.barrier.global? I scanned recent llvm-commits posts, but
> if there's one there, I must've missed it. If there's nothing yet,
> let me know and I'll take a crack at it. I had started looking at
> the necessary plumbing for this a while ago, but I hadn't written any
> code except a few tests.

I have not been working on this, and I don't have plans to work on it in
the near future.

-Tom