I didn’t understand this part – is this post-PTX? The scenario I was referring to is the MLIR JIT (not AOT). There isn’t a linker step after PTX generation, and you need to know where libdevice is when linking prior to that, AFAIU.
Looks like I described it wrong. This is how one can link it in manually, and it’s not too late
to perform optimizations: https://cs.opensource.google/tensorflow/tensorflow/+/master:tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc;l=283;drc=3381da37560d64c7cb62b53879a0a931ff9036c4
I have a workaround that does exactly this in the MLIR gpu-to-cubin
pass and then runs LLVM passes at the desired opt level after the link (sketch below).
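For concreteness, here’s a minimal sketch of that sequence, assuming a recent LLVM with the new pass manager. The helper names (`linkLibdevice`, `optimizeModule`) are mine, not from the linked XLA code, and using `LinkOnlyNeeded` is an assumption based on how bitcode libraries are usually linked in:

```cpp
// Hypothetical sketch: link libdevice into the kernel module first,
// then run the standard pipeline so libdevice calls can be inlined
// and optimized along with everything else, before PTX emission.
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Linker/Linker.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Support/SourceMgr.h"

// Link libdevice (e.g. libdevice.10.bc) into `module`.
// Returns false on failure.
static bool linkLibdevice(llvm::Module &module,
                          llvm::StringRef libdevicePath) {
  llvm::SMDiagnostic err;
  std::unique_ptr<llvm::Module> libdevice =
      llvm::parseIRFile(libdevicePath, err, module.getContext());
  if (!libdevice)
    return false;
  // LinkOnlyNeeded pulls in just the libdevice functions the kernel
  // actually references. linkModules returns true on error.
  return !llvm::Linker::linkModules(module, std::move(libdevice),
                                    llvm::Linker::Flags::LinkOnlyNeeded);
}

// Run the default pipeline at the requested opt level post-link.
static void optimizeModule(llvm::Module &module,
                           llvm::OptimizationLevel level) {
  llvm::LoopAnalysisManager lam;
  llvm::FunctionAnalysisManager fam;
  llvm::CGSCCAnalysisManager cgam;
  llvm::ModuleAnalysisManager mam;
  llvm::PassBuilder pb;
  pb.registerModuleAnalyses(mam);
  pb.registerCGSCCAnalyses(cgam);
  pb.registerFunctionAnalyses(fam);
  pb.registerLoopAnalyses(lam);
  pb.crossRegisterProxies(lam, fam, cgam, mam);
  llvm::ModulePassManager mpm = pb.buildPerModuleDefaultPipeline(level);
  mpm.run(module, mam);
}
```

The ordering is the key bit: once libdevice is linked in at the LLVM IR level, running the pipeline at -O2/-O3 can inline its function bodies into the kernel, so nothing is left to resolve post-PTX.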