Recursively Rewriting Expressions

I am working on a source-to-source translator plugin for clang to transform CUDA into OpenCL. I have made my own ASTConsumer and am using the Rewriter class to perform the rewrites. I traverse the AST manually, stopping to rewrite the AST nodes of interest.

Much of the rewriting involves changing function calls into other ones (CUDA API calls to equivalent OpenCL ones), along with rewriting some data structures. My issue is with nested expressions in which there may be multiple portions that must be rewritten at separate levels. The simplest way I could think of is to rewrite the portions I have to in a string and combine those with clang’s statement printer so as to get a final string. I would then pass the string to the Rewriter and replace the existing Expr at the top level.

For example, let’s look at creating a dim3 in CUDA:

dim3 a(1, 2, 3);
a = dim3(a.y, a.z, a.x);

In OpenCL, the dim3s would be rewritten as size_t arrays, and would be initialized as arrays normally are:

size_t a[3] = {1, 2, 3};
a = {a[1], a[2], a[0]};

The issue here is that, in rewriting the dim3 constructor expression on the second line, I do not know how to recursively rewrite the argument expressions. I currently rewrite the constructor to use braces, but then just use the Stmt class’s printPretty call to print the argument expressions. As a result, the argument expressions are not rewritten. This is what I get, then:

size_t a[3] = {1, 2, 3};
a = {a.y, a.z, a.x};

The reason for this is that I use clang’s existing printPretty call for each argument (to pass to the Rewriter), which allows me to avoid creating my own printing methods for every possible Expr type. In order to make use of this but allow for my own printing when rewrites are necessary, my idea is to create a new class that extends clang’s StmtPrinter. That way, I could override the methods that print the Expr classes of interest (say, MemberExprs that reference members of dim3s). As for the rest, I would allow the existing Visit* implementations to do what they normally do. However, StmtPrinter is an internal class, so extending it outside of clang isn’t really possible.

So, my questions are:

  1. Is this a good approach, trying to rewrite expressions recursively by outputting the rewrites to a string?
  2. Would it be possible to extend the StmtPrinter?
  3. If not, is there a good alternative?

Thanks,
Gabriel

Why are you using the Rewriter just to replace at the top level and not inside the nested expressions?

Well, in the example I gave I could. The bigger problem comes with rewriting some function calls, as the order of arguments in OpenCL may differ from the order in CUDA.

For example:

cudaMemcpy(dst, src, count, cudaMemcpyHostToDevice);

Becomes:

clEnqueueWriteBuffer(commandQueue, dst, CL_TRUE, 0, count, src, 0, NULL, NULL);

Here the original dst, src, and count are argument expressions that are reused, but in different locations of the new call. In order to support this, I want to be able to take the relevant argument expressions and write them into the string that will replace the full CUDA call. Even better, I wanted to create a generic method of rewriting expressions to strings while printing the rest as normal, which is what I proposed.
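
The reordering described here can be sketched as a small string-building helper. This is only an illustration: the function name is hypothetical, the argument texts are assumed to have been extracted (and rewritten, if needed) beforehand, and commandQueue is an assumed variable name taken from the example. The device buffer dst is passed as the buffer argument and the host pointer src as the host-pointer argument, which is the order the OpenCL clEnqueueWriteBuffer signature expects (queue, buffer, blocking flag, offset, size, host pointer, then the event arguments).

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of Clang or OpenCL.
// args holds the source text of each cudaMemcpy argument, in CUDA order:
//   args[0] = dst, args[1] = src, args[2] = count (args[3], the kind, is unused here).
// Returns the replacement text for a host-to-device copy.
std::string rewriteMemcpyHostToDevice(const std::vector<std::string> &args) {
    const std::string &dst = args[0];    // device buffer in a host-to-device copy
    const std::string &src = args[1];    // host pointer in a host-to-device copy
    const std::string &count = args[2];  // size in bytes

    // "commandQueue" is an assumed name for a cl_command_queue in scope.
    return "clEnqueueWriteBuffer(commandQueue, " + dst + ", CL_TRUE, 0, " +
           count + ", " + src + ", 0, NULL, NULL)";
}
```

Because the helper only concatenates strings, each argument text can itself be the result of an earlier rewrite, which is what makes the nested case work.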

Gabriel

Instead of pretty-printing the expression, how about getting the string that is contained in its SourceRange?
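
As a rough illustration of that idea, here is a toy model in plain C++ that reads an expression's text straight out of the file buffer. CharRange and sourceText are made-up names, not Clang types. In Clang itself, I believe Lexer::getSourceText combined with CharSourceRange::getTokenRange serves this purpose, and the token-range form matters because a SourceRange's end location points at the start of the last token rather than past it.

```cpp
#include <cstddef>
#include <string>

// Toy stand-in for a source range (NOT clang::SourceRange):
// half-open character offsets [begin, end) into the file buffer.
struct CharRange {
    std::size_t begin;
    std::size_t end;
};

// Return the original text covered by the range, exactly as the user wrote it.
std::string sourceText(const std::string &buffer, CharRange r) {
    return buffer.substr(r.begin, r.end - r.begin);
}
```

Unlike pretty-printing, this keeps the user's own spelling, macros, and formatting, since nothing is regenerated from the AST.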

Hmm, that very well could work. I will give it a try and let you know how it goes.

Thanks,
Gabriel

Well, after sitting down and thinking things over for a while, I have come up with a good solution. I began by trying to make use of the strings already contained in each Expr’s SourceRange, which worked for the most part. However, things got complicated if multiple child Exprs had to be rewritten.

I realized that I was starting to duplicate a lot of functionality found in the Rewriter class. So, instead, I found a simple way to make use of Rewriters. What I do now is create a new Rewriter for the current Expr. I walk the children and recursively invoke my rewriting function. When I hit an Expr of interest, I rewrite it and return the rewritten string. Then, in the parent Expr, I use the new string to replace the original child Expr’s text. After that, I simply return the rewritten text using the Expr’s SourceRange.
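
The recursion described above can be sketched with a toy tree in plain C++. This is a model of the idea under stated assumptions, not the actual plugin code: Node, Kind, and rewrite are hypothetical names, character offsets stand in for SourceRanges, and an ordinary std::string replaced from right to left stands in for the per-Expr Rewriter (which would track shifted offsets itself).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy node kinds (NOT Clang classes); only the two dim3 cases get rewritten.
enum class Kind { Other, Dim3Member, Dim3Ctor };

// Toy expression node. begin/end are half-open character offsets [begin, end)
// into the original source; children are in source order and do not overlap.
struct Node {
    Kind kind;
    std::size_t begin;
    std::size_t end;
    std::vector<Node> children;
};

// Return the rewritten text for n, recursing into its children first.
std::string rewrite(const std::string &src, const Node &n) {
    // Start from the node's original text (the role of the fresh Rewriter).
    std::string text = src.substr(n.begin, n.end - n.begin);

    // Splice rewritten children in from right to left so that the offsets
    // of children further left remain valid after each replacement.
    for (auto it = n.children.rbegin(); it != n.children.rend(); ++it) {
        text.replace(it->begin - n.begin, it->end - it->begin, rewrite(src, *it));
    }

    switch (n.kind) {
    case Kind::Dim3Member: {
        // "a.x" -> "a[0]", "a.y" -> "a[1]", "a.z" -> "a[2]"
        std::size_t dot = text.rfind('.');
        if (dot != std::string::npos && dot + 2 == text.size()) {
            char m = text[dot + 1];
            if (m >= 'x' && m <= 'z')
                return text.substr(0, dot) + "[" + std::to_string(m - 'x') + "]";
        }
        return text;
    }
    case Kind::Dim3Ctor: {
        // "dim3(args)" -> "{args}", keeping the already rewritten arguments.
        std::size_t open = text.find('(');
        std::size_t close = text.rfind(')');
        if (open == std::string::npos || close == std::string::npos || close < open)
            return text;
        return "{" + text.substr(open + 1, close - open - 1) + "}";
    }
    default:
        return text;  // untouched nodes keep their original spelling
    }
}
```

Given the source line a = dim3(a.y, a.z, a.x); with a node for the assignment, a Dim3Ctor child, and three Dim3Member grandchildren, the function returns the text with both levels rewritten, which is the case printPretty alone could not handle.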

I figure that using multiple Rewriters will be more efficient than my own string manipulations.

Thanks for the help, Agyrios.

Gabriel

Hi Gabriel,

We talked a little through email, and I appreciate that.

I have some questions based on your last post (I know it is a little dated ;)).

Would you suggest implementing the VisitCallExpr(CallExpr *CE) function in RecursiveASTVisitor and then visiting the expressions in the argument list recursively?

Lastly, SourceRange seems to be the right technique for replacement, but for the recursive call you spoke of, are we talking about something strictly based on expressions?