Hello all,
I’m using Clang to refactor CUDA code. I want to traverse the AST to reach a CUDAKernelCallExpr node where the kernel call has template parameters. Using the AST matchers to match on CUDAKernelCallExpr, I can successfully match and access CUDA kernel calls that have no template parameters. However, when my code reaches a kernel call with template parameters, the expression does not appear in the dumped AST at all.
This is the expression that I’m having trouble parsing:
ex. matrixMulCUDA<16><<< grid, threads >>>(d_C, d_A, d_B, dimsA.x, dimsB.x);
This expression does not show up anywhere in the AST dump. By contrast, I can successfully parse the following kernel call expression, and it shows up in the AST as a “CUDAKernelCallExpr” node.
ex. matrixMulCUDA<<< grid, threads >>>(d_C, d_A, d_B, dimsA.x, dimsB.x);
As you can see, the only difference is the <16> after the kernel name, and the call without it is clearly represented in the AST.
Here is a screenshot showing the lack of a node to represent the call with template parameters.
If I could get any assistance on why this is happening, I would appreciate it greatly.
Thank you,
Michael Ehmke
Hi, Michael.
Are you able to create a reduced testcase which demonstrates your problem?
I tried
template <int N>
__global__ void Kernel() {}
void test() {
  Kernel<16><<<100, 200>>>();
}
which I compiled with
$ clang -c -Xclang -ast-dump test.cu
and the AST looks fine to me – the CUDAKernelCallExpr shows up as expected.
Regards,
-Justin
Hi Justin,
Thanks so much for the response. I’ll try that out and get back to you.
Sincerely,
Michael Ehmke
Hi Justin,
To follow up: I tried the sample code you provided, and the CUDAKernelCallExpr was in the AST, as you said. After further troubleshooting with my own code and with modified versions of your sample, I found the source of the problem. My source file includes a header that Clang could not find, which produced a missing-file error. That error had not interfered with anything for weeks, so I had been disregarding it. It turns out the error was somehow interfering with AST generation: the CUDAKernelCallExpr node disappears from the AST, but only when the kernel call has template parameters. I confirmed this by adding an #include directive for the missing header to your sample code, which made the CUDAKernelCallExpr node disappear there as well. I have since resolved the missing dependency, and all of the kernel call expressions in my original source now show up.
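For anyone who hits the same symptom, the reproduction described above might look roughly like the sketch below. The header name missing_header.h is hypothetical (any header Clang cannot find should do), and the template parameter name N is an assumption, since only the Kernel<16> call site is known. The file is meant to trigger a "file not found" error, so it is not expected to compile cleanly.

```cuda
// test.cu: Justin's reduced test case plus one unresolved include.
// "missing_header.h" is a hypothetical name; it must NOT exist on the
// include path, so that Clang reports a missing-file error here.
#include "missing_header.h"

// Kernel with a non-type template parameter (the name N is an assumption).
template <int N>
__global__ void Kernel() {}

void test() {
  // Templated kernel launch. This is the call that was reported missing
  // from the AST dump while the include error above was present.
  Kernel<16><<<100, 200>>>();
}
```

Dumping the AST with the same command used earlier in the thread (clang -c -Xclang -ast-dump test.cu), once with the #include line and once without it, should show whether the CUDAKernelCallExpr node is affected on a given Clang version. The behavior may differ between versions, so treat this as a diagnostic aid rather than a guaranteed reproduction.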
Thank you for your help!
Michael Ehmke