Hi all,
I'm trying to figure out how to add the instructions required for
llvm.atomic.cmp.swap.i32 on PowerPC. I figured out LWARX (patch
attached) but the other two (CMP_UNRESw and STWCX) require multiple
instructions:
let Defs = [CR0] in {
def STWCX : Pseudo<(outs), (ins GPRC:$rS, memrr:$dst, i32imm:$label),
"stwcx. $rS, $dst\n\tbne- La${label}_entry\nLa${label}_exit:",
[(PPCstcx GPRC:$rS, xoaddr:$dst, imm:$label)]>;
def CMP_UNRESw : Pseudo<(outs), (ins GPRC:$rA, GPRC:$rB, i32imm:$label),
"cmpw $rA, $rB\n\tbne- La${label}_exit",
[(PPCcmp_unres GPRC:$rA, GPRC:$rB, imm:$label)]>;
}
...and I can't figure out the syntax for that. Any suggestions?
Cheers,
Gary
lwarx.patch (694 Bytes)
Hi Gary,
You have to write custom encoding logic in C++ for this. This should go in PPCCodeEmitter::emitBasicBlock. I admit this isn't very elegant. A better solution would be to fix the lowering to produce two instructions instead of this pseudo instruction.
-Chris
Chris Lattner wrote:
> def CMP_UNRESw : Pseudo<(outs), (ins GPRC:$rA, GPRC:$rB, i32imm:
> $label),
> "cmpw $rA, $rB\n\tbne- La${label}_exit",
> [(PPCcmp_unres GPRC:$rA, GPRC:$rB, imm:
> $label)]>;
> }
>
> ...and I can't figure out the syntax for that. Any suggestions?
You have to write custom encoding logic in C++ for this. This
should go in PPCCodeEmitter::emitBasicBlock. I admit this isn't
very elegant. A better solution would be to fix the lowering to
produce two instructions instead of this pseudo instruction.
Hi Chris,
I'd prefer to fix the lowering if possible; the pseudo instructions
are only used in three places, so it shouldn't be a huge change.
I need to generate labels in PPCTargetLowering::LowerAtomicCMP_SWAP
however: how do I do that? FWIW the code it needs to emit is:
; inputs: ptr, oldval, newval
loop:
lwarx $tmp, 0, $ptr
cmpw $oldval, $tmp
bne- exit
stwcx. $newval, 0, $ptr
bne- loop
exit:
...
Cheers,
Gary
You need to insert new basic blocks and update CFG to accomplish this. There is a hackish way to do this right now. Add a pseudo instruction to represent this operation and mark it usesCustomDAGSchedInserter. This means the intrinsic is mapped to a single (pseudo) node. But it is then expanded into instructions that can span multiple basic blocks. See PPCTargetLowering::EmitInstrWithCustomInserter().
Evan
Evan Cheng wrote:
You need to insert new basic blocks and update CFG to accomplish this.
There is a hackish way to do this right now. Add a pseudo instruction
to represent this operation and mark it usesCustomDAGSchedInserter.
This means the intrinsic is mapped to a single (pseudo) node. But it
is then expanded into instructions that can span multiple basic
blocks. See PPCTargetLowering::EmitInstrWithCustomInserter().
How does this look? It's a big patch, but it basically does this:
- Adds ATOMIC_LOAD_ADD, ATOMIC_CMP_SWAP and ATOMIC_SWAP nodes,
and ATOMIC_LOAD_ADD_I{32,64}, ATOMIC_CMP_SWAP_I{32,64} and
ATOMIC_SWAP_I{32,64} pseudo-instructions with custom inserters.
- Replaces L[WD]ARX and ST[WD]CX pseudo-instructions with the
actual PPC instructions they represent.
- Removes CMP_UNRESERVE nodes and CMP_UNRES[wd]{,i} pseudo-
instructions.
Cheers,
Gary
new-atomics.patch (26.7 KB)
Hi Gary,
The patch looks great. But I do have one comment:
+let usesCustomDAGSchedInserter = 1 in {
+ let Uses = [CR0] in {
+ let Uses = [R0] in
+ def ATOMIC_LOAD_ADD_I32 : Pseudo<
The "let Uses = [R0]" is not needed. The pseudo instruction will be expanded like this later:
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
The second instruction defines R0 and the 3rd reads R0 which is enough to tell the register allocator what to do.
I do have a question, must it use R0? If it's not fixed, it's probably better to create a new virtual register and use that instead.
Thanks,
Evan
Hi Evan,
Evan Cheng wrote:
The patch looks great. But I do have one comment:
+let usesCustomDAGSchedInserter = 1 in {
+ let Uses = [CR0] in {
+ let Uses = [R0] in
+ def ATOMIC_LOAD_ADD_I32 : Pseudo<
The "let Uses = [R0]" is not needed. The pseudo instruction will be
expanded like this later:
+ BuildMI(BB, TII->get(is64bit ? PPC::LDARX : PPC::LWARX), dest)
+ .addReg(ptrA).addReg(ptrB);
+ BuildMI(BB, TII->get(is64bit ? PPC::ADD4 : PPC::ADD8), PPC::R0)
+ .addReg(incr).addReg(dest);
+ BuildMI(BB, TII->get(is64bit ? PPC::STDCX : PPC::STWCX))
+ .addReg(PPC::R0).addReg(ptrA).addReg(ptrB);
The second instruction defines R0 and the 3rd reads R0 which is
enough to tell the register allocator what to do.
I do have a question, must it use R0? If it's not fixed, it's
probably better to create a new virtual register and use that
instead.
It's not fixed. How do I create a new virtual register? This is my
first real go at adding to LLVM, so I'm kind of figuring it out as I
go along 
Cheers,
Gary
Look for createVirtualRegister. These are examples in PPCISelLowering.cpp.
Evan
Cool, thanks, I'll check that out.
Cheers,
Gary
Evan Cheng wrote:
Would it be acceptable to change MachineInstr::getRegInfo from private
to public so I can use it from PPCTargetLowering::EmitInstrWithCustomInserter?
Cheers,
Gary
Evan Cheng wrote:
PPCTargetLowering::EmitInstrWithCustomInserter has a reference
to the current MachineFunction for other purposes. Can you use
MachineFunction::getRegInfo instead?
Dan
Ah, didn't see that, that's what comes of trying to do something at
5pm
I attached an updated patch which creates a virtual register
instead of using R0. How does this look?
Cheers,
Gary
Dan Gohman wrote:
ppc-atomics-take2.patch (26.9 KB)
Looks good.
+ unsigned temp;
+ if (is64bit)
+ temp = RegInfo.createVirtualRegister(&PPC::GPRCRegClass);
+ else
+ temp = RegInfo.createVirtualRegister(&PPC::G8RCRegClass);
How about?
const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : &PPC:G8RCRegClass;
unsigned TmpReg = RegInfo.createVirtualRegister(RC);
Evan
Evan Cheng wrote:
How about?
const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass :
&PPC:G8RCRegClass;
unsigned TmpReg = RegInfo.createVirtualRegister(RC);
I tried something like that yesterday:
const TargetRegisterClass *RC =
is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass;
but I kept getting this error no matter how I arranged it:
error: conditional expression between distinct pointer types
‘llvm::PPC::GPRCClass*’ and ‘llvm::PPC::G8RCClass*’ lacks a cast
Any suggestions?
Cheers,
Gary
Just cast both values to const TargetRegisterClass*.
Evan
Cool, that worked. New patch attached...
Cheers,
Gary
Evan Cheng wrote:
ppc-atomics-take3.patch (26.9 KB)
Hi Gary,
This does not patch cleanly for me (PPCISelLowering.cpp). Can you prepare a updated patch?
Thanks,
Evan
Hi Evan,
Evan Cheng wrote:
This does not patch cleanly for me (PPCISelLowering.cpp). Can you
prepare a updated patch?
This should work, though I won't have access to my test box now until
next Thursday so no guarantees 
Cheers,
Gary
ppc-atomics-take4.patch (26.5 KB)