Possible fix for Bug 1388 - CPY instruction emitted on < ARMv6T

Hi,

I’ve been thinking for some time about short-term ways to work around this bug:

http://llvm.org/bugs/show_bug.cgi?id=1388

One possible end-user workaround is to use at least one register above r7 in the MOV form of the instruction.

In that case, what the bug lists as the CPY instruction (the ARMv6 version, generated when both Rd and Rm are <= r7) becomes a MOV instruction that is valid all the way down to ARMv4T.

Even if flags aren’t modeled correctly on ARM yet, it seems like this instruction could safely be forced to a high register, but unfortunately I don’t understand the target code well enough yet to implement this.

The other (ugly and slow) option might be to encode the low-register versions as a PUSH Rm / POP Rd pair when targeting ARMv5 or earlier.
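To make the two cases concrete, here is a minimal sketch in Python (used purely for illustration; this is not LLVM code). The function names and the integer `arch_version` parameter are hypothetical. It assumes Thumb1 code and treats r0-r7 as the low registers: a plain MOV is emitted whenever it is encodable, and the PUSH/POP pair is used only for low-to-low copies before ARMv6.

```python
def is_low(reg: int) -> bool:
    """True for r0-r7, the registers most Thumb1 encodings can name."""
    return 0 <= reg <= 7


def needs_workaround(rd: int, rm: int, arch_version: int) -> bool:
    """The flag-preserving low-to-low MOV (the old CPY) needs ARMv6 or later."""
    return arch_version < 6 and is_low(rd) and is_low(rm)


def copy_sequence(rd: int, rm: int, arch_version: int) -> list:
    """Return Thumb1 assembly lines that copy rm into rd without touching flags."""
    if not needs_workaround(rd, rm, arch_version):
        # At least one high register, or ARMv6+: the plain MOV is available.
        return [f"mov r{rd}, r{rm}"]
    # Hypothetical fallback from the text above: go through the stack.
    # Both registers are low here, so Thumb PUSH/POP can encode them.
    return [f"push {{r{rm}}}", f"pop {{r{rd}}}"]
```

For example, `copy_sequence(0, 1, 5)` yields the PUSH/POP pair, while `copy_sequence(0, 8, 4)` yields a single `mov r0, r8`.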

Any thoughts?

-Gordon Keiser
Software Development Engineer
Arxan Technologies, Inc.
1305 Cumberland Ave, Ste 215
West Lafayette, IN 47906

Hi Gordon,

You're talking about Thumb mode code here, right? As I understand it, the ARM mode MOV instruction is valid on ARMv4 and everything later. (Nitpicky side note: CPY has been an obsolete mnemonic since the introduction of unified syntax.)

Assuming so, using a high register causes headaches, as r8-r11 are call-preserved registers and the Thumb push/pop instructions can't reference them. LLVM's Thumb1 prologue/epilogue code doesn't know how to save and restore them. It could, of course, but the code would be pretty ugly. r12 might work, though, since the lifetime for these uses is only two instructions.

These days, ARM does model the condition codes at least somewhat. There should be explicit or implicit defs of CPSR for any instruction that clobbers them. That gives us the basic tools to handle this, anyway. The register allocator expects to be able to insert a copy instruction without side effects, though, and not having one can potentially cause problems. Jakob can comment more specifically on what's required there.

Off the top of my head, I'd consider leaving the copy instructions as pseudos until the ARMExpandPseudos pass. There, keep track of whether the CPSR is live and, when we see a copy, do whatever tricks we can to use a good instruction, including reordering instructions if necessary and possible. That seems a bit heavyweight, though, so hopefully someone has a better idea.
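A toy model of that idea, written in Python for illustration only (it uses none of LLVM's real classes; `Inst` and `expand_copies` are hypothetical names). It assumes low-to-low copies on a pre-ARMv6 Thumb1 target, walks a straight-line block backward to track whether CPSR is live, and picks a flag-setting `adds rd, rm, #0` when the flags are dead or a PUSH/POP pair when they are not. The reordering trick is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Inst:
    text: str
    defs_cpsr: bool = False
    uses_cpsr: bool = False
    copy: Optional[Tuple[int, int]] = None  # (rd, rm) for a COPY pseudo


def expand_copies(block, cpsr_live_out=False):
    """Expand COPY pseudos using backward CPSR liveness over one block."""
    live = cpsr_live_out  # is CPSR read after the current position?
    out_reversed = []
    for inst in reversed(block):
        if inst.copy is not None:
            rd, rm = inst.copy
            if live:
                # Flags are needed later: use a sequence that preserves them.
                seq = [f"push {{r{rm}}}", f"pop {{r{rd}}}"]
            else:
                # Flags are dead here: a flag-setting copy is harmless.
                seq = [f"adds r{rd}, r{rm}, #0"]
            out_reversed.extend(reversed(seq))
        else:
            out_reversed.append(inst.text)
            # Standard liveness step: live-in = use or (live-out and not def).
            live = inst.uses_cpsr or (live and not inst.defs_cpsr)
    return list(reversed(out_reversed))


block = [
    Inst("COPY", copy=(3, 4)),
    Inst("cmp r0, #0", defs_cpsr=True),
    Inst("COPY", copy=(1, 2)),
    Inst("beq .Lexit", uses_cpsr=True),
]
expanded = expand_copies(block)
```

In this example the first copy sits where CPSR is dead and becomes `adds r3, r4, #0`; the second sits between the `cmp` and the branch that reads its flags, so it becomes the PUSH/POP pair.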

-Jim

That's right. The register allocator will insert copies and spill/fill instructions anywhere it feels like it.

It might be an option to simply bundle the CPSR def and use instructions, so CPSR is never live outside a bundle.
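As a rough illustration of the bundling idea, here is a Python sketch (a toy model, not LLVM's bundle API; `bundle_by_cpsr` and the tuple layout are made up for this example). Each instruction is a `(text, defs_cpsr, uses_cpsr)` tuple. A new bundle starts only at points where CPSR is dead, so every CPSR def stays glued to its uses, and a copy inserted between bundles can never sit between a flag def and its use.

```python
def bundle_by_cpsr(block, live_out=False):
    """Split a block into bundles so CPSR is never live across a boundary."""
    # live_in[i]: CPSR holds a value that block[i] or a later instruction reads.
    live_in = [False] * len(block)
    live = live_out
    for i in range(len(block) - 1, -1, -1):
        _, defs, uses = block[i]
        live = uses or (live and not defs)
        live_in[i] = live

    bundles = []
    for (text, _, _), live_before in zip(block, live_in):
        if bundles and live_before:
            # CPSR is live on entry: keep this instruction with its def.
            bundles[-1].append(text)
        else:
            bundles.append([text])
    return bundles


block = [
    ("ldr r2, [r1]", False, False),
    ("cmp r0, #0", True, False),
    ("ldr r3, [r1, #4]", False, False),
    ("beq .Lexit", False, True),
]
bundles = bundle_by_cpsr(block)
```

Here the first load is a bundle of its own, while `cmp`, the second load, and `beq` end up in one bundle because the flags are live from the compare to the branch.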

/jakob