Instructions cost definition

I have issues with the LLVM backend, where it prefer to use some instructions over others, where they are actually slower.

is there a way to define the cost of an instruction in the backend?

In addition, is there a way to define different copy cost for registers based on the source and destination?

Can you be more specific about what you’re trying to do?

LLVM has the TargetTransformInfo cost callbacks to indicate throughput/latency/codesize costs for IR operations - these are mainly used for vectorization, but do get used in other middle passes as well.

A backend can also use scheduler models to indicate latency, micro-op and resource usage to help improve scheduling.

If you are trying to control backend instruction selection based on costs, then you’re probably going to end up using a mixture of TargetLowering callbacks to control some DAG combines (e.g. isCtlzFast(), convertSetCCLogicToBitwiseLogic(), etc…), backed tuning flags for specific hardware (to control both combines and instruction selection patterns) and maybe even custom MachineFunctionPass passes.

1 Like

Thank you for answering :slight_smile:

Specifically, I’m working on a backend in which bitwise operations are slower than arithmetic, and I saw that llvm prefers to use the bitwise where possible. Probably since that’s not the case in most backends.

To be honest, I’m having a hard time following your flow. Given the goal, can you describe the general actions I need to do?

By bitwise do you mainly mean bitshifts (constant/variable shift amounts?) or boolean ops or something else?

I expect you will need to investigate the various TargetLowering callbacks which will control various combines/canonicalizations in the SelectionDAG.

e.g. preferIncOfAddToSubOfNot will fold between:

(sub %y, (xor %x, -1)) ↔ (add (add %x, 1), %y)

For your target I guess you’d prefer IncOfAdd? This is the default already, but hopefully you get the idea.

If you have more specific examples you can share that would help - the callbacks are not exhaustive and have been added over the years so there will almost certainly be cases you need addressing that nobody has considered (but might actually be useful for some in tree targets making it easier to add support upstream).

I see.

I’m talking about basically any bitwise operation, such as shl, shr, and, xor etc.

First of all, is there a list of those functions (like preferIncOfAddToSubOfNot you mentioned) to follow?

Also, besides the operation cost, is there a way to define the copy register cost?

I’d recommend looking at llvm-project\llvm\include\llvm\CodeGen\TargetLowering.h

Regarding copy costs - please can you give a bit more explanation as to what you’re wanting to do? Is this just to optimize instruction scheduling or is this to influence instruction combines / selection?

Thank you, I will :slight_smile:

The thing about the copy cost is that the copy speed depends on the source and destination, and I would like it to allocate physical registers according to that cost.