PTX optimizations

Hi everyone,
I am trying to add some optimizations to LLVM’s PTX backend, but I am unaware of the existing optimizations. Could you please point me to what is already implemented?

Thank you :)

Hi Adarsh,

Have you had a chance to take a look at lib/Target/PTX? All PTX-specific
logic is there, and it's only 13 .cpp files.
In particular, you can start with lib/Target/PTX/PTXISelDAGToDAG.cpp.

krasin

So far, we have been focusing on code correctness and coverage, not PTX-specific optimization. Unfortunately, I have not had the time I had hoped for to work on this over the summer. We do collapse multiply-add pairs into FMA, but that’s about the extent of our optimizations. I want to start looking into converting branches into predicated code and into load/store scheduling, but I want to finish the function call implementation first.
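To make the multiply-add collapsing concrete, here is a minimal sketch of the idea on a toy instruction list. It is not the code in lib/Target/PTX: the `Inst` struct, the `Op` enum, and the `fuseMulAdd` and `countUses` functions are hypothetical names invented for illustration. The sketch also assumes a single basic block in which each virtual register is written once, and it only looks at adjacent instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy IR for illustration only; these are not LLVM classes.
enum class Op { Mul, Add, Fma, Other };

struct Inst {
    Op op;
    int dst;      // virtual register written by this instruction
    int a, b, c;  // virtual registers read (-1 means "unused")
};

// Count how many times register r is read anywhere in the block.
static int countUses(const std::vector<Inst>& block, int r) {
    assert(r >= 0 && "only real registers should be queried");
    int n = 0;
    for (const Inst& i : block) {
        n += (i.a == r) + (i.b == r) + (i.c == r);
    }
    return n;
}

// Rewrite "t = a * b; d = t + c" as "d = fma(a, b, c)" when the add is the
// only reader of t. Assumption: each register is written exactly once.
// Note: a fused multiply-add rounds once instead of twice, so a real
// compiler only does this when the floating-point rules in effect allow it.
std::vector<Inst> fuseMulAdd(const std::vector<Inst>& block) {
    std::vector<Inst> out;
    for (std::size_t i = 0; i < block.size(); ++i) {
        const Inst& cur = block[i];
        if (cur.op == Op::Mul && i + 1 < block.size()) {
            const Inst& next = block[i + 1];
            bool feedsAdd = next.op == Op::Add &&
                            (next.a == cur.dst || next.b == cur.dst);
            if (feedsAdd && countUses(block, cur.dst) == 1) {
                // The other operand of the add becomes the FMA addend.
                int addend = (next.a == cur.dst) ? next.b : next.a;
                out.push_back({Op::Fma, next.dst, cur.a, cur.b, addend});
                ++i;  // skip the add that was folded into the FMA
                continue;
            }
        }
        out.push_back(cur);
    }
    return out;
}
```

The single-use check matters: if the product is read by any other instruction, the multiply has to stay, so the sketch leaves such pairs untouched rather than duplicating work.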

What kinds of optimizations are you wanting to implement? We should coordinate on this so as not to duplicate work.