atomic operations for ARM

Hi,
I am missing atomic operations support for the ARM backend (see PR
#3887) and started trying to implement them.

Since this is the first time I have worked on this kind of thing (and on
LLVM), I am going to take the supposedly easy route and provide an
implementation that works on Linux systems.

This involves calling a special function which the kernel handles
itself. Details here:
http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00025.html
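
For readers unfamiliar with the helper, here is a minimal C sketch of the call. The Linux kernel documents the function at 0xffff0fc0 as `__kuser_cmpxchg`; it takes (oldval, newval, ptr) and returns zero if the swap happened. The names `kuser_cmpxchg_t`, `KUSER_CMPXCHG` and `arm_kernel_cmpxchg` are made up for illustration, and the non-ARM branch is only an assumed stand-in (using the GCC/Clang `__sync` builtin) so that the sketch also builds on other hosts.

```c
/* Contract of the kernel helper: returns 0 if *ptr was equal to oldval
 * and has been replaced by newval, non-zero otherwise. */
typedef int (kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);

#if defined(__arm__) && defined(__linux__)
/* Fixed entry point provided by the kernel on ARM Linux. */
#define KUSER_CMPXCHG (*(kuser_cmpxchg_t *)0xffff0fc0)
#else
/* Assumption: stand-in with the same contract for non-ARM hosts. */
static int kuser_cmpxchg_fallback(int oldval, int newval, volatile int *ptr)
{
    return __sync_bool_compare_and_swap(ptr, oldval, newval) ? 0 : 1;
}
#define KUSER_CMPXCHG kuser_cmpxchg_fallback
#endif

/* Hypothetical wrapper with the (ptr, old, new) argument order that the
 * LLVM intrinsic uses; note the helper itself wants (old, new, ptr). */
static int arm_kernel_cmpxchg(volatile int *ptr, int oldval, int newval)
{
    return KUSER_CMPXCHG(oldval, newval, ptr);
}
```

The point for the backend is that this is an ordinary call with three word-sized arguments in R0-R2 and a result in R0, so nothing beyond the normal calling convention should be needed.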

In order to extend the implementation within LLVM I read the respective
documentation. After taking a longer look at how it's done for x86 and
at ARMInstrInfo.td I came up with the following:

let isCall = 1,
  Defs = [R0, R1, R2, R3, R12, LR,
          D0, D1, D2, D3, D4, D5, D6, D7, CPSR] in {
  def ARM_KERNEL_CMPXCHG : ABXI<0b1011, (outs),
               (ins i32:$oldval, i32:$newval, addr:$dst),
               "bl 0xffff0fc0",
               [(ATOMIC_CMP_SWAP addr:$dst, i32:$oldval, i32:$newval)]>;
}

This is probably far from correct, but that's because there are some
things I could not find out:

Can I actually use 'i32' and 'addr' as shown above?

The line with 'ABXI' most likely has something to do with binary code
generation, right? Now I wonder how to tell it that the function at
"0xffff0fc0" is to be called.

Any help is appreciated. :)

Regards
Robert

Hi,
I forgot to attach a preliminary patch which also contains a small
modification to ARMISelLowering.cpp.

Btw: I don't know how to deal with 64-bit atomic compare and swap at
this time. AFAIK you can't do it on <= armv5 machines without kernel
support.

Regards
Robert

Robert Schuster wrote:

Hi,
I am missing atomic operations support for the ARM backend (see PR
#3887) and started trying to implement them.

[...]

llvm-armcas.patch (1.51 KB)

Hi,
I have reworked my previous example and got something which is accepted
by tblgen:

let isCall = 1,
  Defs = [R0, R1, R2, R3, R12, LR,
          D0, D1, D2, D3, D4, D5, D6, D7, CPSR] in {

def ARM_ATOMIC_CMP_SWAP : ABXI<0b1011, (outs GPR:$dst),
           (ins i32imm:$ptr, i32imm:$old, i32imm:$new),
           "do_something",
           [(set GPR:$dst,
                 (atomic_cmp_swap_32 globaladdr:$ptr, imm:$old, imm:$new))]>;
}

What I want to achieve first is that llc picks this definition when it
finds an occurrence of the @atomic.cmp.swap.i32 intrinsic.

Therefore I wrote a basic .ll file containing only a call to this
function. I then let it run through "llvm-as | llc".

With the above definition I expected "do_something" to appear in the
assembler output when -march=arm. Unfortunately atm I still get an
error telling me that the selection failed:

llvm-as < Atomics-32.ll | llc -march=arm
Cannot yet select: 0x1f7c540: i32,ch = AtomicCmpSwap 0x1f7cd00:1,
0x1f7cd00, 0x1f7c448, 0x1f7c350 <0x1f6fbe8:0> <volatile> alignment=4
Stack dump:
0. Program arguments: llc -march=arm
1. Running pass 'ARM Instruction Selection' on function
'@test_compare_and_swap'

I took a look at how atomic_compare_swap is implemented for PowerPC and
X86 but have no clue how it gets from atomic_cmp_swap_32 to the
target-specific variant. Well, for X86 it's done via a custom lowering.

Any idea what the DAG entry for ARM_ATOMIC_CMP_SWAP should look like so
that it is picked up for the @atomic.cmp.swap.i32 intrinsic?

Regards
Robert

It would be useful if you can post some example code and what you think the assembly code should look like.

What I want to achieve first is that llc picks this definition when it
finds an occurrence of the @atomic.cmp.swap.i32 intrinsic.

Therefore I wrote a basic .ll file containing only a call to this
function. I then let it run through "llvm-as | llc".

It would be useful if you post this.

With the above definition I expected "do_something" to appear in the
assembler output when -march=arm. Unfortunately atm I still get an
error telling me that the selection failed:

llvm-as < Atomics-32.ll | llc -march=arm
Cannot yet select: 0x1f7c540: i32,ch = AtomicCmpSwap 0x1f7cd00:1,
0x1f7cd00, 0x1f7c448, 0x1f7c350 <0x1f6fbe8:0> <volatile> alignment=4
Stack dump:
0. Program arguments: llc -march=arm
1. Running pass 'ARM Instruction Selection' on function
'@test_compare_and_swap'

It's hard to guess what the problem is from this. Are you able to step through the code in ARMGenDAGISel.inc to see why it fails to match?

Evan

Hi Evan,
thanks for your answer.

Evan Cheng wrote:

What I want to achieve first is that llc picks this definition when it
finds an occurrence of the @atomic.cmp.swap.i32 intrinsic.

Therefore I wrote a basic .ll file containing only a call to this
function. I then let it run through "llvm-as | llc".

It would be useful if you post this.

Here is the code. It is basically a stripped-down variant of
test/CodeGen/X86/Atomics-32.ll:

; RUN: llvm-as < %s | llc -march=x86

define void @test_compare_and_swap() nounwind {
entry:
  %a = malloc i32

  call i32 @llvm.atomic.cmp.swap.i32.p0i32( i32* %a, i32 100, i32 200 )

  br label %return

return:
  ret void
}

declare i32 @llvm.atomic.cmp.swap.i32.p0i32(i32*, i32, i32) nounwind

As an example, if I run 'llvm-as < Atomics-32.ll | llc -march=x86' to
generate x86 assembly I get

  .file "<stdin>"

  .text
  .align 16
  .globl test_compare_and_swap
  .type test_compare_and_swap,@function
test_compare_and_swap:
  subl $4, %esp
  movl $4, (%esp)
  call malloc
  movl %eax, %ecx
  movl $100, %eax
  movl $200, %edx
  lock
  cmpxchgl %edx, (%ecx) ; <- !!!
.LBB1_1: # return
  addl $4, %esp
  ret
  .size test_compare_and_swap, .-test_compare_and_swap

  .section .note.GNU-stack,"",@progbits

Here the marked cmpxchgl instruction is the one that resulted from the
matching of atomic_cmp_swap. I know that for X86 a custom selection is
applied, since X86ISelLowering.cpp contains:

  setOperationAction(ISD::ATOMIC_CMP_SWAP, MVT::i32, Custom);

However, no such line exists in the PowerPC implementation, so there it
is (AFAIU) done using a properly written SDNode.

All the processors which have a proper implementation of the atomic
operations luckily have assembler instructions which help. In the case
of ARM it is different. AFAIK every ARM ISA below armv6 lacks proper
atomic instructions. That is why they use this kernel hack (switch to
kernel mode, disable interrupts, do the compare and swap). So in my
case I just need to call a function and that's it. There is also no
need for a special argument format (like an 8-bit immediate). Whatever
fits as a function argument is OK.
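
One detail a call-based approach has to bridge: the intrinsic yields the value that was in memory before the operation, whereas the kernel helper only reports success (zero) or failure (non-zero). A rough C sketch of that bridging follows. It is an assumption about how such a lowering could behave, not what any existing backend does. The names `cas_fn`, `cmp_swap_val` and `host_cas` are hypothetical, and the helper is passed in as a function pointer so the logic can be tried on any host.

```c
/* Same contract as the kernel helper: returns 0 if *ptr == oldval and
 * newval was stored, non-zero otherwise. */
typedef int (*cas_fn)(int oldval, int newval, volatile int *ptr);

/* Hypothetical wrapper: yields the value previously held in *ptr,
 * which is what the atomic.cmp.swap intrinsic is defined to return. */
static int cmp_swap_val(cas_fn helper, volatile int *ptr, int cmp, int newval)
{
    for (;;) {
        int cur = *ptr;
        if (cur != cmp)
            return cur;   /* comparison fails: report the value we saw */
        if (helper(cmp, newval, ptr) == 0)
            return cmp;   /* swap happened: the old value was cmp */
        /* *ptr changed between the load and the helper call: retry */
    }
}

/* Assumption: portable stand-in for the kernel helper, built on the
 * GCC/Clang __sync builtin, for experimenting off-target. */
static int host_cas(int oldval, int newval, volatile int *ptr)
{
    return __sync_bool_compare_and_swap(ptr, oldval, newval) ? 0 : 1;
}
```

On an ARM Linux target the `helper` argument would be the function at 0xffff0fc0 instead of `host_cas`.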

For me as a beginner with LLVM these differences make it hard to know
where to start.

With the above definition I expected "do_something" to appear in the
assembler output when -march=arm. Unfortunately atm I still get an
error telling me that the selection failed:

It's hard to guess what the problem is from this. Are you able to step
through the code in ARMGenDAGISel.inc to see why it fails to match?

I will do this and report back. However, I thought that I had just made
a mistake in the definition of ARM_ATOMIC_CMP_SWAP, the kind of mistake
that only someone not yet versed in these things makes. :)

Do you think the definition should work the way I expect?

Regards
Robert