[PATCH] Emit rbit, clz on ARM for __builtin_ctz

David_Conrad · January 15, 2010, 6:13am

Hi,

On ARMv6T2 this turns cttz into rbit, clz instead of the 4 instruction sequence it is now.
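
For reference, the identity behind the rbit, clz pair can be sketched in portable C. This is only an illustration, not code from the attached diff: `rbit32`, `clz32`, and `ctz_via_rbit_clz` are hypothetical names, and the two helpers are software stand-ins for the single RBIT and CLZ instructions.

```c
#include <stdint.h>

/* Software stand-in for RBIT: reverse all 32 bits by swapping
   progressively larger groups (bits, pairs, nibbles, bytes, halves). */
static uint32_t rbit32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* Software stand-in for CLZ: count leading zeros, returning 32 for a
   zero input as the ARM instruction does. */
static unsigned clz32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Reversing the word turns its trailing zeros into leading zeros,
   so counting trailing zeros reduces to clz(rbit(x)). */
static unsigned ctz_via_rbit_clz(uint32_t x) {
    return clz32(rbit32(x));
}
```

Note that this sketch returns 32 for a zero input, whereas `__builtin_ctz(0)` is undefined in C.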

I'm not sure if adding RBIT to ARMISD and doing this optimization in the legalize pass is the best option, but the only better way I could think of doing it was to add a bitreverse intrinsic to llvm ir, which itself might not be the best option since bitreverse probably isn't too common.

Other targets that I know of that could potentially benefit from this optimization being global (that have a clz and bitreverse instruction but not ctz) are AVR32 and C64x, neither of which llvm has backends for yet.

llvm-ctz-arm.diff (5.04 KB)

Chris_Lattner · January 15, 2010, 6:03pm

> Hi,
>
> On ARMv6T2 this turns cttz into rbit, clz instead of the 4 instruction sequence it is now.
>
> I'm not sure if adding RBIT to ARMISD and doing this optimization in the legalize pass is the best option, but the only better way I could think of doing it was to add a bitreverse intrinsic to llvm ir, which itself might not be the best option since bitreverse probably isn't too common.

I haven't looked at the patch in detail, but this approach makes sense to me.

> Other targets that I know of that could potentially benefit from this optimization being global (that have a clz and bitreverse instruction but not ctz) are AVR32 and C64x, neither of which llvm has backends for yet.

When/if another target wants this, we could add an ISD::RBIT operation; it doesn't need to be added at the llvm ir level.

-Chris

Richard_Osborne1 · January 15, 2010, 7:37pm

The XCore also has ctlz and bitreverse instructions and not cttz. At the moment in the XCore backend cttz is marked as legal and expanded to this pair of instructions in a pattern in the InstrInfo.td.

Sandeep_Patel · January 15, 2010, 8:04pm

Bit reversal turns up in most FFT algorithms, so it wouldn't hurt to
be able to add an instcombine that recognizes it, etc.
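
As an illustration of where bit reversal shows up, here is a minimal C sketch of the bit-reversal permutation used to reorder data in a radix-2 FFT. It is not taken from any poster's code: `reverse_bits` and `bit_reverse_permute` are hypothetical names, and the per-bit loop is just one of several source-level forms a recognizer would have to handle.

```c
#include <stddef.h>

/* Reverse the low `bits` bits of i, one bit per iteration. */
static unsigned reverse_bits(unsigned i, unsigned bits) {
    unsigned r = 0;
    for (unsigned b = 0; b < bits; b++) {
        r = (r << 1) | (i & 1u);
        i >>= 1;
    }
    return r;
}

/* In-place bit-reversal permutation of n = 2^bits elements:
   element i trades places with the element at the reversed index. */
static void bit_reverse_permute(float *data, unsigned bits) {
    size_t n = (size_t)1 << bits;
    for (size_t i = 0; i < n; i++) {
        size_t j = reverse_bits((unsigned)i, bits);
        if (i < j) { /* swap each pair only once */
            float t = data[i];
            data[i] = data[j];
            data[j] = t;
        }
    }
}
```

On a target with a bit-reverse instruction, `reverse_bits` could in principle become one reversal followed by a right shift of `32 - bits`, assuming 32-bit indices.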

deep

James_Grosbach · January 15, 2010, 10:52pm

In that case, perhaps it makes sense to add it as an ISD::RBIT operation straight away.

The rest of the patch looks good to me.

-Jim

Evan_Cheng1 · January 18, 2010, 7:04pm

>>> Hi,
>>>
>>> On ARMv6T2 this turns cttz into rbit, clz instead of the 4
>>> instruction sequence it is now.
>>>
>>> I'm not sure if adding RBIT to ARMISD and doing this optimization in
>>> the legalize pass is the best option, but the only better way I
>>> could think of doing it was to add a bitreverse intrinsic to llvm
>>> ir, which itself might not be the best option since bitreverse
>>> probably isn't too common.
>>
>> I haven't looked at the patch in detail, but this approach makes sense
>> to me.
>>
>>> Other targets that I know of that could potentially benefit from
>>> this optimization being global (that have a clz and bitreverse
>>> instruction but not ctz) are AVR32 and C64x, neither of which llvm
>>> has backends for yet.
>>
>> When/if another target wants this, we could add an ISD::RBIT operation;
>> it doesn't need to be added at the llvm ir level.
>
> Bit reversal turns up in most FFT algorithms, so it wouldn't hurt to
> be able to add an instcombine that recognizes it, etc.

I agree with Chris that it doesn't make sense to add an llvm instruction for this, since it's rare. But it's something that can be recognized in dag combine / isel. Can you attach some examples?

Evan

Evan_Cheng1 · January 18, 2010, 7:07pm

>>>> Other targets that I know of that could potentially benefit from
>>>> this optimization being global (that have a clz and bitreverse
>>>> instruction but not ctz) are AVR32 and C64x, neither of which llvm
>>>> has backends for yet.
>>>
>>> When/if another target wants this, we could add an ISD::RBIT
>>> operation; it doesn't need to be added at the llvm ir level.
>>
>> The XCore also has ctlz and bitreverse instructions and not cttz. At
>> the moment in the XCore backend cttz is marked as legal and expanded
>> to this pair of instructions in a pattern in the InstrInfo.td.
>
> In that case, perhaps it makes sense to add it as an ISD::RBIT
> operation straight away.

Since only a couple of targets can use this, it shouldn't block this patch from going in. Jim, can you commit this?

Thanks,

Evan

James_Grosbach · January 18, 2010, 7:59pm

Works for me. Done in r93758.

Thanks for doing this, David.

-Jim

Jakob_Stoklund_Olese · January 19, 2010, 1:10am

Blackfin can add with backwards carry, essentially doing

(rbit (add (rbit a), (rbit b)))

This is used for FFTs.
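
To make the expression concrete, here is a portable C sketch of that reverse-carry add and of one way it can step through FFT indices in bit-reversed order. It only emulates the behaviour described above and is not Blackfin code: `rbit32`, `add_reverse_carry`, and `next_bit_reversed` are hypothetical names, and `bits` is assumed to lie between 1 and 32.

```c
#include <stdint.h>

/* Software bit reversal of a 32-bit word. */
static uint32_t rbit32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    return (x >> 16) | (x << 16);
}

/* (rbit (add (rbit a), (rbit b))): an add whose carries travel from
   the most significant bit toward the least significant bit. */
static uint32_t add_reverse_carry(uint32_t a, uint32_t b) {
    return rbit32(rbit32(a) + rbit32(b));
}

/* Next index in bit-reversed order for a table of 2^bits entries.
   The index is aligned to the top of the word, the top bit is added
   with reverse carry, and the result is shifted back down. */
static uint32_t next_bit_reversed(uint32_t i, unsigned bits) {
    unsigned shift = 32u - bits; /* assumes 1 <= bits <= 32 */
    return add_reverse_carry(i << shift, 0x80000000u) >> shift;
}
```

Starting from 0 with `bits` set to 3, repeated calls to `next_bit_reversed` visit 4, 2, 6, 1, 5, 3, 7 and then wrap back to 0.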

I wasn't hoping to be able to pattern-match something so complicated.

Eli_Friedman1 · January 19, 2010, 2:29am

Feel free to add target intrinsics where appropriate...

-Eli
