Problems with 64-bit register operands of inline asm on ARM

r175088 attempted to fix gcc inline asm compatibility with 64-bit
operands by forcing these into even/odd register pairs the same way gcc
always allocates such values.

While the fix appears to work as such, it is not always activated when
required. The patch makes the assumption that any inline asm statement
relying on the even/odd allocation will make use of the %Hn syntax to
reference the high (odd) register. However, this is not the case.

A common pattern where llvm still fails is code using the abbreviated
syntax for LDRD and friends supported by gas:

    __asm__ ("ldrd %0, [%1]" : "=r"(a) : "r"(b));

Here the second destination register is implicitly one higher than the
first. Because of this, the %H0 construct is never used, so the forced
even/odd allocation is skipped.

One possible fix, which I have tested, is to look for the specific
instructions requiring such a pair (LDRD/STRD and LDREXD/STREXD) in
addition to the 'H' modifier. However, there are probably other
creative ways in which inline asm might rely on the specific pairing.
Thus I believe the safest solution is to always force 64-bit operands
into even/odd pairs for any inline asm. In other words, we should
probably do something like this (untested):

--- a/lib/Target/ARM/ARMISelDAGToDAG.cpp
+++ b/lib/Target/ARM/ARMISelDAGToDAG.cpp
@@ -3457,19 +3457,6 @@ SDNode *ARMDAGToDAGISel::SelectInlineAsm(SDNode *N){
   bool Changed = false;
   unsigned NumOps = N->getNumOperands();

- ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(
- N->getOperand(InlineAsm::Op_AsmString));
- StringRef AsmString = StringRef(S->getSymbol());

One possible fix, which I have tested, is to look for the specific
instructions requiring such a pair (LDRD/STRD and LDREXD/STREXD) in
addition to the 'H' modifier. However, there are probably other
creative ways in which inline asm might rely on the specific pairing.

Hi Mans,

Either that method is ignoring an inline asm parser or there isn't one, but
I agree, we should be able to have something better than just grep for
possible extensions for paired registers.

Thus I believe the safest solution is to always force 64-bit operands

into even/odd pairs for any inline asm. In other words, we should
probably do something like this (untested):

I tested this, and it fails on other inline assembly tests. I think that
the non-paired asm is correctly selected by the table generated parser, but
when you pair things that didn't need pairing, the parser goes bad.

I don't think we should force pairing on every inline assembly either.
Maybe someone knows how to parse inline assembly in a better way than it is
currently being done? If there isn't any, than possibly creating a function
to return "needsPairedRegister()" or something would still be ugly, but
better than incrementing it inline.

cheers,
--renato

Hi Måns,
Always forcing 64-bit operands into even/odd pairs may lead to subpoptimal
register allocation because not all 64 bit data requires paired regs. It
seems there is no general pattern to match. Maybe we should treat it case by
case.

Jakob, do you have any suggestions?

Thanks,
Weiming

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation

Hi Renato,

It seems to me that LLVM doesn’t parse the inline asm body. It just checks the constraints, (ie. Input/output interface). During ASM writing, it then binding those constraints to placeholders like %0, %1.

So it a constraint is a 64-integer type, it probably needs paired GPR.

Weiming

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

Hi Renato,

It seems to me that LLVM doesn’t parse the inline asm body. It just checks the constraints, (ie. Input/output interface). During ASM writing, it then binding those constraints to placeholders like %0, %1.

This is correct.

So it a constraint is a 64-integer type, it *probably* needs paired GPR.

This is not. Consider the Thumb2 encodings of LDRD, LDREXD, etc, for example. They allow arbitrary Rt and Rt2, not just a register pair. It's only in ARM mode that the instructions are more constrained.

-Jim

I’m not sure I understand. How does one use unpaired registers in inline asm? Could you give an example?

/jakob

Ok, so maybe checking all possible ways to require paired registers is not
such a bad idea after all.

--renato

The constraints are the right way to do it. There shouldn’t be any magic beyond that.

-Jim

Renato Golin <renato.golin@linaro.org> writes:

One possible fix, which I have tested, is to look for the specific
instructions requiring such a pair (LDRD/STRD and LDREXD/STREXD) in
addition to the 'H' modifier. However, there are probably other
creative ways in which inline asm might rely on the specific pairing.

Hi Mans,

Either that method is ignoring an inline asm parser or there isn't one, but
I agree, we should be able to have something better than just grep for
possible extensions for paired registers.

Thus I believe the safest solution is to always force 64-bit operands
into even/odd pairs for any inline asm. In other words, we should
probably do something like this (untested):

I tested this, and it fails on other inline assembly tests. I think that
the non-paired asm is correctly selected by the table generated parser, but
when you pair things that didn't need pairing, the parser goes bad.

I don't think we should force pairing on every inline assembly either.
Maybe someone knows how to parse inline assembly in a better way than it is
currently being done? If there isn't any, than possibly creating a function
to return "needsPairedRegister()" or something would still be ugly, but
better than incrementing it inline.

Yes, ideally we should be able to look at the inline assembly and
determine, for each operand, whether it requires pairing. The problem
is that gcc _always_ allocates an even/odd pair, and I don't want to
attempt enumerating and detecting every possible way this could be
relied on. Searching for LDRD-family instructions is one thing, but I'm
sure there are other ways code could rely on the way gcc allocates
registers.

One possible fix, which I have tested, is to look for the specific
instructions requiring such a pair (LDRD/STRD and LDREXD/STREXD) in
addition to the ‘H’ modifier. However, there are probably other
creative ways in which inline asm might rely on the specific pairing.

Hi Mans,

Either that method is ignoring an inline asm parser or there isn’t one, but I agree, we should be able to have something better than just grep for possible extensions for paired registers.

Thus I believe the safest solution is to always force 64-bit operands
into even/odd pairs for any inline asm. In other words, we should
probably do something like this (untested):

I tested this, and it fails on other inline assembly tests. I think that the non-paired asm is correctly selected by the table generated parser, but when you pair things that didn’t need pairing, the parser goes bad.

I don’t think we should force pairing on every inline assembly either. Maybe someone knows how to parse inline assembly in a better way than it is currently being done? If there isn’t any, than possibly creating a function to return “needsPairedRegister()” or something would still be ugly, but better than incrementing it inline.

cheers,

–renato

Jim Grosbach <grosbach@apple.com> writes:

It seems to me that LLVM doesn’t parse the inline asm body. It just
checks the constraints, (ie. Input/output interface). During ASM
writing, it then binding those constraints to placeholders like %0,
%1.

This is correct.

Ok, so maybe checking all possible ways to require paired registers
is not such a bad idea after all.

The constraints are the right way to do it. There shouldn't be any
magic beyond that.

Since there is no special operand constraint for a register pair, there
is no way to tell at that level.

GCC has (implicitly) defined 64-bit register operands as residing in
even/odd pairs, thus leaving inline asm free to make all manner of
assumptions based on this. The only way I see to guarantee
compatibility is to mimic the gcc behaviour here. It may be slightly
suboptimal in a few cases, but it's the safe choice.

Sure, that’s fine for ARM mode. No realistic other option there. So long as Thumb2 code can get the more expressive syntax for the more relaxed regalloc availability, it’s all good. This basically falls into “using the constraints to figure it out.”

0Jim

Jim Grosbach <grosbach@apple.com> writes:

I can agree with this. Bugged me last time I was looking at it too.

-eric

Currently, there are possible issues in the inlineasm pair patch. I see two unit test fails when I enable paired reg for all ARM mode inline asm.
I'm looking into it.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

Hi,

I sent a patch to llvm-commits for this issue. Please help to review it.

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/168354.html

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation