CUDA kernel call with template parameters not appearing in Clang AST

Hello all,

I’m using Clang to refactor CUDA code. I want to traverse the AST and reach the CUDAKernelCallExpr node for a kernel call that has template parameters. Using the AST matchers, I can already match and access CUDA kernel calls that have no template parameters. However, when my code encounters a kernel call with template parameters, the expression appears to be missing from the dumped AST entirely.

This is the expression that I’m having trouble parsing:

ex. matrixMulCUDA<16><<< grid, threads >>>(d_C, d_A, d_B, dimsA.x, dimsB.x);

This expression does not show up at all when I dump the AST. By contrast, the following kernel call parses successfully and shows up in the AST as a “CUDAKernelCallExpr” node.

ex. matrixMulCUDA<<< grid, threads >>>(d_C, d_A, d_B, dimsA.x, dimsB.x);

As you can see, the only difference is the <16> after the kernel name, and the second call is clearly represented in the AST.

Here is a screenshot showing the lack of a node to represent the call with template parameters.

If I could get any assistance on why this is happening, I would appreciate it greatly.

Thank you,

Michael Ehmke

Hi, Michael.

Are you able to create a reduced testcase which demonstrates your problem?

I tried

template <int N>
__global__ void Kernel() {}

void test() {
  Kernel<16><<<100, 200>>>();
}

which I compiled with

$ clang -c -Xclang -ast-dump test.cu

and the AST looks fine to me – the CUDAKernelCallExpr shows up as expected.

Regards,
-Justin

Hi Justin,

Thanks so much for the response. I’ll try that out and get back to you.

Sincerely,
Michael Ehmke

Hi Justin,

To follow up, I tried the sample code you provided, and the CUDAKernelCallExpr was indeed in the AST, as you said. After further troubleshooting with my own code and with modified versions of your sample, I found the source of the issue: I had been getting a missing-file error for a header that my source code includes. That error had not interfered with anything for weeks, so I had been disregarding it for the time being. It was somehow interfering with AST generation, causing the CUDAKernelCallExpr node to be absent from the AST only when the kernel call has template parameters. When I added the #include directive for the missing header to your sample code, it had the same effect: the CUDAKernelCallExpr node disappeared. I have since resolved the missing dependency and all is well with my original source code; all of the kernel call expressions now show up.
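
For anyone hitting the same symptom, here is a minimal sketch of the reproduction described above. It is an illustration, not the exact file from this thread: the header name missing_header.h and the REPRO_MISSING_HEADER macro are hypothetical, and the kernel's template parameter is assumed to be an int.

```cuda
// test.cu: sketch of the reproduction. Names marked "hypothetical" are
// illustrative and do not come from the original code.

// Defining REPRO_MISSING_HEADER pulls in a header that deliberately does not
// exist, which makes Clang report a fatal "file not found" error.
#ifdef REPRO_MISSING_HEADER
#include "missing_header.h"  // hypothetical file name; must NOT exist on disk
#endif

// Kernel with one non-type template parameter (assumed to be an int).
template <int N>
__global__ void Kernel() {}

void test() {
  // Templated kernel launch: the call reported missing from the AST dump
  // while the include error is present.
  Kernel<16><<<100, 200>>>();
}
```

Dumping the AST with the command from earlier in the thread (clang -c -Xclang -ast-dump test.cu) should show the CUDAKernelCallExpr. Adding -DREPRO_MISSING_HEADER to that command enables the bad include; according to the report above, the node then no longer appears. The exact behavior may differ between Clang versions, so treat a "file not found" diagnostic as the first thing to fix before trusting the dump.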

Thank you for your help!
Michael Ehmke