thumb2 has divide instructions

The thumb2 instructions include unsigned and signed divide.
Attached are a patch and test routine.

div.diff (1.47 KB)

div.ll (426 Bytes)

Hello,

As I understand it, the divide instructions are only available on the v7-R profile of the v7 architecture. Is that incorrect?

-Jim

I'm working with a Cortex-M3 core which is v7-M profile, and it has udiv and sdiv.

bagel

Jim Grosbach wrote:

Ah, ok. I was comparing v7-A and v7-R only. The M3 is described in separate documentation (mostly since it lacks the ARM mode instructions, I suspect). In any case, as far as I can tell, not all v7 processors support the hardware divide instructions.

It's definitely desirable to support them for processors which do have them, but they need to be conditionalized such that they're only used when they're available. The instruction predicates are the best way to do that. For now, I would suggest adding a predicate such as "HasThumb2HardwareDivide" and hooking it up to a command-line option to enable it (see UseNEONFP in ARMSubtarget.cpp for an example of how to do that). You can then auto-enable it when the CPU string is "cortex-m3", as is done for the UseNEONFP option on the A8 (see the bottom of the ARMSubtarget::ARMSubtarget() constructor in ARMSubtarget.cpp).

Thanks for looking at this!

-Jim
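
[Editor's sketch] The suggestion above (a predicate flag, a command-line override, and auto-enabling on the CPU string) can be illustrated roughly as follows. This is a standalone mock, not LLVM's ARMSubtarget: the names SubtargetSketch and DivideOption are hypothetical, and the rule that an explicit option takes precedence over the CPU default is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical tri-state modelling a command-line option that may be absent.
enum class DivideOption { Unset, ForceOn, ForceOff };

// Hypothetical stand-in for a subtarget class (not LLVM's ARMSubtarget).
struct SubtargetSketch {
    // The flag an instruction predicate would consult.
    bool HasThumb2HardwareDivide = false;

    explicit SubtargetSketch(const std::string &cpu,
                             DivideOption opt = DivideOption::Unset) {
        // Auto-enable for a CPU known to provide udiv/sdiv in Thumb2.
        if (cpu == "cortex-m3")
            HasThumb2HardwareDivide = true;

        // Assumption: an explicit option overrides the CPU default.
        if (opt == DivideOption::ForceOn)
            HasThumb2HardwareDivide = true;
        else if (opt == DivideOption::ForceOff)
            HasThumb2HardwareDivide = false;
    }
};
```

With this sketch, SubtargetSketch("cortex-m3") has the flag set, while SubtargetSketch("cortex-a8") leaves it clear unless the option forces it on.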

OK, here's a patch that follows your suggestion. I'm not an authorized developer, so I can't commit it myself. The test case is also attached again.

bagel

div-v2.diff (3.45 KB)

div.ll (426 Bytes)

Hello

> OK, here's a patch that follows your suggestion. I'm not an authorized
> developer, so I can't commit it myself. The test case is also attached
> again.

"T2Divide" should be a subtarget feature bit. This way it can be
"automatically" assigned to the processor.
The instruction selection patterns for t2{S,U}DIV should also be
guarded by this predicate.

Also, for cortex-m3 it would be nice to have a separate V7M feature profile.

Anton Korobeynikov wrote:

> Hello
> "T2Divide" should be a subtarget feature bit. This way it can be
> "automatically" assigned to the processor.

I agree this is a better approach.

> The instruction selection patterns for t2{S,U}DIV should also be
> guarded by this predicate.

Is this necessary? Since the absence of the predicate causes lowering to expand divides, the pattern should never show up.

> Also, for cortex-m3 it would be nice to have a separate V7M feature profile.

Agreed. Now how do we get this done?

regards, bagel

Hello

> Is this necessary? Since the absence of the predicate causes lowering to
> expand divides, the pattern should never show up.

Just to guard against codegen bugs. If anything goes wrong (once the
predicates are in use), you'll get a nice "cannot yet select" assertion.

> Agreed. Now how do we get this done?

Just look at how ArmV7A is defined and do something similar...
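
[Editor's sketch] The point about guarding the patterns can be illustrated with a toy selector. Everything here is hypothetical (Op, needsExpansion, selectInstruction) and only mimics the idea: lowering should expand divides when the feature is absent, and the guarded pattern turns a lowering bug into a loud failure instead of silently emitting an unsupported instruction.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical operation kinds for a toy instruction selector.
enum class Op { Add, SDiv, UDiv };

// Toy lowering rule: without hardware divide, a divide must be expanded
// (for example into a library call) before selection ever sees it.
inline bool needsExpansion(Op op, bool hasHardwareDivide) {
    return (op == Op::SDiv || op == Op::UDiv) && !hasHardwareDivide;
}

// Toy selector: the divide patterns are guarded by the feature predicate.
// If a divide reaches selection without the feature, no pattern matches
// and the selector fails loudly.
inline std::string selectInstruction(Op op, bool hasHardwareDivide) {
    switch (op) {
    case Op::Add:
        return "add";
    case Op::SDiv:
        if (hasHardwareDivide) return "sdiv";
        break;
    case Op::UDiv:
        if (hasHardwareDivide) return "udiv";
        break;
    }
    throw std::runtime_error("cannot yet select");
}
```

If lowering is correct, selectInstruction is never called with a divide when the feature is off; the guard only matters when something upstream has gone wrong.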

Anton Korobeynikov wrote:

> Hello
>
>> Is this necessary? Since the absence of the predicate causes lowering to
>> expand divides, the pattern should never show up.
>
> Just to guard against codegen bugs. If anything goes wrong (once the
> predicates are in use), you'll get a nice "cannot yet select" assertion.
>
>> Agreed. Now how do we get this done?
>
> Just look at how ArmV7A is defined and do something similar...

It's not clear to me how to add the V7M architecture without breaking something. The predicates that use ARMArchEnum assume an ordering, but V7M is a subset of V7A (and of V6T2), so a strict ordering scheme starts to break down.

I think I'll just enter this in the bug list and let people who understand the subtarget implications do the fix.

regards, bagel
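
[Editor's sketch] The ordering problem described above can be shown with a deliberately simplified model. The enum values and feature sets below are hypothetical and reduced to three features as the thread describes them (v7-M has Thumb2 and hardware divide but no ARM mode); they are not LLVM's actual ARMArchEnum or feature definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical linear ordering, in the style of "arch >= X" predicates.
// V7M has to go somewhere; here it is placed last.
enum ArchOrder { V6T2 = 0, V7A = 1, V7M = 2 };

// An ordered predicate that implicitly assumes every later architecture
// is a superset of v6T2, ARM mode included.
inline bool orderedHasARMMode(ArchOrder a) { return a >= V6T2; }

// Alternative: independent feature bits, so no ordering is assumed.
enum Feature : std::uint32_t {
    ARMMode  = 1u << 0,
    Thumb2   = 1u << 1,
    HWDivide = 1u << 2
};

// Simplified feature sets, following the thread's description.
constexpr std::uint32_t V6T2Features = ARMMode | Thumb2;
constexpr std::uint32_t V7AFeatures  = ARMMode | Thumb2;
constexpr std::uint32_t V7MFeatures  = Thumb2 | HWDivide;

constexpr bool hasFeature(std::uint32_t set, Feature f) {
    return (set & f) != 0;
}
```

In this model orderedHasARMMode(V7M) returns true even though v7-M has no ARM mode, whereas hasFeature(V7MFeatures, ARMMode) is false and hasFeature(V7MFeatures, HWDivide) is true. Moving V7M elsewhere in the ordering only shifts the problem to a different predicate.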