llvm memory barrier as a builtin

I would like access to the LLVM memory barrier instruction as a built-in from clang, which means that I need a name for it. In gcc, I see names like __builtin_ia32_mfence, but those refer to x86 SSE instructions that we support. I don't see a gcc name that has the same semantics as our barrier instruction. For a name, I was thinking of __builtin_memory_barrier or __builtin_llvm_memory_barrier. Does anyone object to adding it as a built-in or have a better idea for a name?

-- Mon Ping

What are the semantic differences?

Thanks,

Duncan.

As far as I know, there isn't a difference, just that the intrinsic
isn't platform-specific. I think SSE2 mfence is equivalent to
"llvm.memory.barrier(i1 true,i1 true,i1 true,i1 true,i1 true)".

-Eli

__sync_synchronize is the gcc builtin for a memory barrier.

Andrew
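
For reference, a minimal sketch of how __sync_synchronize is typically used, assuming a gcc-compatible compiler. The names payload, ready, publish and consume are hypothetical and exist only for this illustration; the builtin takes no arguments and acts as a full barrier.

```c
static int payload;          /* data being published (hypothetical name) */
static volatile int ready;   /* flag a consumer would poll (hypothetical name) */

/* Store the payload, then raise the flag. The full barrier in between
   keeps the payload store from being reordered after the flag store. */
static void publish(int value)
{
    payload = value;
    __sync_synchronize();    /* full barrier: takes no arguments */
    ready = 1;
}

/* Return the payload if the flag is set, or -1 if nothing was published. */
static int consume(void)
{
    if (!ready)
        return -1;
    __sync_synchronize();    /* pairs with the barrier in publish() */
    return payload;
}
```

Calling publish(42) on one thread and polling consume() on another is the intended pattern; there is no way here to ask for only a store-store or load-load barrier.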

By semantic difference, I only meant that the memory barrier is more generic than mfence, i.e., it has a different signature that allows us to express various kinds of memory barriers.

   -- Mon Ping

Thanks for the info. My impression is that __sync_synchronize takes no arguments and is the memory barrier, i.e.,
  "llvm.memory.barrier(i1 true,i1 true,i1 true,i1 true,i1 true)". Is that right? I would like a little finer control to express just a write barrier (st-st) or a read barrier.

   -- Mon Ping

Mon Ping Wang wrote:

Thanks for the info. My impression is that __sync_synchronize takes no arguments and is the memory barrier, i.e.,
  "llvm.memory.barrier(i1 true,i1 true,i1 true,i1 true,i1 true)". Is that right?

That's my understanding as well.

I would like a little finer control to express just a write barrier (st-st) or a read barrier.

My understanding is that the only types of finer-grained control in gcc are __sync_lock_test_and_set and __sync_lock_release, which appear to implement acquire/release style barriers.
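
As a rough sketch of that acquire/release pair, assuming a gcc-compatible compiler, the two builtins can be combined into a simple spinlock. The names lock_word, counter, spin_lock, spin_unlock and locked_increment are hypothetical and only serve the illustration.

```c
static volatile int lock_word;   /* 0 = free, 1 = held (hypothetical name) */
static int counter;              /* data protected by the lock */

static void spin_lock(void)
{
    /* Atomically writes 1 and returns the previous value. gcc documents
       this builtin as an acquire barrier, not a full barrier. */
    while (__sync_lock_test_and_set(&lock_word, 1))
        ;   /* spin while the previous value was 1 (lock already held) */
}

static void spin_unlock(void)
{
    /* Writes 0 to the lock word; documented as a release barrier. */
    __sync_lock_release(&lock_word);
}

/* Increment the counter under the lock and return the new value. */
static int locked_increment(void)
{
    spin_lock();
    int result = ++counter;
    spin_unlock();
    return result;
}
```

This gives acquire and release ordering around a lock, but still no standalone store-store or load-load barrier.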

I expect that there will be large changes once the memory model for C++0X is released, and there may be things implemented in gcc branches or even undocumented in the mainline that give you the kind of control you want.
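
For illustration only: the standardized memory model that later shipped in C11 and C++11 exposes fences that take an ordering argument. The sketch below uses C11's <stdatomic.h> and assumes a C11 compiler; it is not something the gcc __sync builtins offer, and the names data, flag, writer and reader are hypothetical.

```c
#include <stdatomic.h>

static int data;           /* plain data being handed over */
static atomic_int flag;    /* zero-initialized: nothing published yet */

/* Publish a value: the release fence orders the data store before
   the relaxed store that raises the flag. */
static void writer(int value)
{
    data = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Read the value if published, else return -1: the acquire fence orders
   the relaxed flag load before the data load. */
static int reader(void)
{
    if (!atomic_load_explicit(&flag, memory_order_relaxed))
        return -1;
    atomic_thread_fence(memory_order_acquire);
    return data;
}
```

The ordering argument (release, acquire, or sequentially consistent) plays roughly the role that the separate flags of the LLVM barrier play, though the two models do not map one-to-one.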

It might make sense to allow __sync_synchronize to be overloaded in the same way that LLVM's builtin is for use in clang, with the default version being the full memory barrier.

Luke

Hi Luke,

What you say makes sense but I'm not sure it is a good way to go. If we are using a gcc function name __sync_synchronize, I generally feel like we should support it with exactly the same signature and not try to extend it. Otherwise, it might lead to some confusion in the future unless they also plan to extend it the same way.

   -- Mon Ping

Nono, I definitely agree. I was just offering it as an option. I highly doubt that they intend to make any changes along these lines. My (very uneducated) impression is that they are more inclined to simply implement the upcoming memory model directly, rather than offering very fine-grained barriers to user-level code.

The back end will clearly have to spit out whatever asm barriers are required in order to convince the particular hardware to play along.

Luke