Inserting a high level transformation pass that uses TreeTransform

Hi,

I want to do some high-level transformations in clang (the compiler itself, not
a clang-based tool), and I would like to use TreeTransform to generate the new
code. After spending a couple of hours browsing through clang's code, I still
could not find a suitable place to insert my transformation. Can anybody help
me?

I tried to do it in a clang plugin, but there CompilerInstance::hasSema()
returns false. Is it possible to use TreeTransform in a plugin?

I would like to do my high-level transformation as early as possible in the
compiler, so that what the transformation sees corresponds directly to the
input source code.

Regards,
John.

Could you elaborate on what your transformation does?

– Sean Silva

Hi Sean,

I would like to transform loops like this:

  for(...)
    S;

to:

  enter_loop(&loop_info_123);
  for(...)
  {
    iteration_loop(&loop_info_123);
    S;
  }
  leave_loop(&loop_info_123);

and certain data accesses like

  a[i]++;

to

  a[(index_expression(i, &index_info_123), i)]++;
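The rewritten subscript relies on the comma operator: the hook call is evaluated first and its result discarded, then `i` becomes the value of the parenthesised expression. A minimal sketch of how that could behave at run time follows. The layout of `index_info`, its two bookkeeping fields, the signature of `index_expression`, and the helper `bump` are all assumptions made for illustration; only the hook name and `index_info_123` come from the message.

```cpp
#include <cassert>

// Hypothetical record layout: the message only says it holds line and
// column information; the two bookkeeping fields are assumptions.
struct index_info {
  unsigned line;
  unsigned column;
  long last_index;     // assumed: most recent index reported
  unsigned long hits;  // assumed: number of reported accesses
};

// Stub runtime hook (assumed signature) that records the index it is given.
void index_expression(long index, index_info *info) {
  assert(info != nullptr);
  info->last_index = index;
  ++info->hits;
}

// Made-up source position for the access below.
static index_info index_info_123 = {20, 5, -1, 0};

// Hand-written equivalent of the rewritten access a[i]++.
void bump(int *a, int i) {
  // Comma operator: the call runs first, its result is discarded,
  // and the value of the whole parenthesised expression is i.
  a[(index_expression(i, &index_info_123), i)]++;
}
```

Note that the index expression appears twice in the rewritten form, so an index with side effects (for example `a[i++]`) would presumably need to be captured in a temporary first.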

The loop_info_XXX/index_info_XXX structs contain line and column information
for the loop and the index. As I understand it, column information is not
available at the IR level, so I would like to do this transformation at the
AST level. I am looking for a place in the compiler where the AST is built and
where I can run TreeTransform passes, and I hope that I can express my
transformations that way. Doing this in a clang plugin would be nice but is not
necessary.
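To make the loop rewrite from earlier in this message concrete, here is a minimal sketch of the instrumented code together with a stub runtime. The layout of `loop_info`, its counters, the hook bodies, and the function `sum_instrumented` are assumptions for illustration; only the three hook names and `loop_info_123` come from the message.

```cpp
#include <cassert>

// Hypothetical layout: the message only says the struct carries the line
// and column of the loop; the counters are assumptions.
struct loop_info {
  unsigned line;
  unsigned column;
  unsigned long entries;     // assumed: bumped by enter_loop
  unsigned long iterations;  // assumed: bumped by iteration_loop
  unsigned long exits;       // assumed: bumped by leave_loop
};

// Stub runtime hooks; a real runtime would do its own bookkeeping here.
void enter_loop(loop_info *info) { assert(info != nullptr); ++info->entries; }
void iteration_loop(loop_info *info) { ++info->iterations; }
void leave_loop(loop_info *info) { ++info->exits; }

// One static record per instrumented loop (the position is made up).
static loop_info loop_info_123 = {12, 3, 0, 0, 0};

// Hand-written equivalent of the transformed loop.
int sum_instrumented(const int *a, int n) {
  int sum = 0;
  enter_loop(&loop_info_123);
  for (int i = 0; i < n; ++i)
  {
    iteration_loop(&loop_info_123);
    sum += a[i];  // stands in for the original statement S
  }
  leave_loop(&loop_info_123);
  return sum;
}
```

A `break` inside S would still reach `leave_loop`, but a `return` or `goto` out of the loop body would skip it, so a real transformation would probably have to treat those exits separately.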

Any help is highly appreciated.

Regards,
John.

These aren't the only two options. A third option would be to modify
Clang's IRGen to generate the extra function calls at the relevant
places (no guarantees about being able to upstream such changes, of
course).

Hi,

I should have mentioned that I want to do a bit more than insert calls. I also
want to duplicate some pieces of code, like:

  for(...)
    S;

to:

  if(...)
  {
    for(...)
      S;
  }
  else
  {
    for(...)
      S;
  }
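As a concrete reading of the duplicated-loop shape above, the following sketch guards two copies of the same loop with a runtime flag, one copy instrumented and one left untouched. The condition is elided in the message, so the flag `profiling_enabled`, the counter `iterations_seen`, and the function `scale` are purely hypothetical.

```cpp
#include <cassert>

// Hypothetical runtime switch; the message does not say what the
// condition of the generated if statement is.
static bool profiling_enabled = false;
static unsigned long iterations_seen = 0;

// Hand-written equivalent of a loop that has been duplicated under a guard.
void scale(int *a, int n, int factor) {
  if (profiling_enabled)
  {
    // Copy 1: the loop with extra bookkeeping.
    for (int i = 0; i < n; ++i)
    {
      ++iterations_seen;
      a[i] *= factor;
    }
  }
  else
  {
    // Copy 2: the loop exactly as written in the source.
    for (int i = 0; i < n; ++i)
      a[i] *= factor;
  }
}
```

Both copies compute the same result; only the guarded copy pays for the bookkeeping.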

So I hope that I can duplicate code in a TreeTransform, for example when I
encounter the for loop in the example above.

I still have not found a place to insert my transformation. Can anybody point
me to a suitable place in the code? Moreover, how should I use TreeTransform?
I construct it with a Sema, but how do I start it? With the root of the AST of
a translation unit, or with a function?

Is there documentation available that could help me?

Regards,
John.

Hi John,

JohnH wrote:

Moreover, how should I use TreeTransform? I construct it with a Sema, but
how do I start it? With the root AST of a translation unit or a function?

I have just finished a small toy tool which merely duplicates declarations.
I used LibTooling, which may not be exactly what you want, but I guess that
part of what I did can be reused in another context.

I defined a subclass of ASTFrontendAction and overrode CreateASTConsumer so
that it creates a SemaConsumer instead of a plain ASTConsumer. My
HandleTopLevelDecl method creates a RecursiveASTVisitor and hands it the
Sema, which is needed for TreeTransform. The VisitStmt method of my
RecursiveASTVisitor then creates an instance of my subclass of TreeTransform
and calls the transforming method.

Hope this helps.
Regards,

Béatrice.

Looks a little bit like loop unswitching. I do this in Scout (http://scout.zih.tu-dresden.de/) but without using TreeTransform.
However you will also find a TreeTransform example in Scout.

Best Olaf

Hi Béatrice,

This is really helpful, but I got stuck somewhere.

I tried to create a similar setup, getting hold of the Sema in my
ASTFrontendAction and then trying to transform function bodies.
My VisitFunctionDecl runs TransformStmt (ultimately TransformCompoundStmt)
on the function's body, but it crashes because at this point
Sema::FunctionScopes is an empty vector. The first thing
TreeTransform::TransformCompoundStmt tries to do (via
Sema::ActOnStartOfCompoundStmt() -> Sema::PushCompoundScope()) is push
the new CompoundScope onto the current FunctionScope, which of course
doesn't exist since FunctionScopes is empty.

When exactly do you start traversing with the RecursiveASTVisitor? Is it
after semantic analysis, after the Parser has already populated the entire
AST, or somehow during?

Hi Dan,

I haven’t tried to modify whole function bodies yet, just variable declarations and inner loop bodies, so I may not be very helpful. Is your ASTConsumer a SemaConsumer?

When the traversal starts is decided entirely by clang’s machinery, so I don’t know exactly when it happens; after semantic analysis, I guess. In my subclass of ASTFrontendAction I override CreateASTConsumer to create an object that derives from SemaConsumer, so I expect the Sema to be correctly initialized by the time the HandleTopLevelDecl of my SemaConsumer subclass is called.

Regards,
Béatrice.

Hi Béatrice,

In fact the problem I'm having is with transforming a CompoundStmt, which I
believe is also what inner loop bodies are (when they contain more than one
statement), so from that point of view inner loop bodies and function bodies
look like the same kind of node to me. I am indeed using a SemaConsumer
subclass; otherwise I wouldn't have a Sema to initialise my TreeTransform
with. My setup sounds exactly like yours, but it crashes when constructing
the CompoundScopeRAII, which is the first thing that happens in
TransformCompoundStmt.

Would you happen to have a small piece of code that illustrates transforming
an inner function body successfully? I think that would be of great help in
figuring out why you're not encountering this problem.

Regards,
Dan

Hi Dan,

In fact, I realized that I had not used TreeTransform for my compound statement modifications.

However, as I am also interested in doing it this way, I tried it and ran into the same problem as you. My program crashes in ActOnCompoundStmt, at line 334 (revision 179053):

  if (NumElts != 0 && !CurrentInstantiationScope &&
      getCurCompoundScope().HasEmptyLoopBodies)

in the call to getCurCompoundScope(). It seems that FunctionScopes is empty.

I temporarily solved the issue by adding:

  SemaRef.PushFunctionScope();
  SemaRef.PushCompoundScope();

before the call to RebuildCompoundStmt in my subclass of TreeTransform. It seems to work, but I don’t know whether this is reliable for a more complex tool, or whether the default behavior of TreeTransform for CompoundStmt is intentional.
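The nesting behind this crash and the workaround can be illustrated with a toy model. The sketch below is not clang code: `ToySema`, `ToyScopeGuard`, and the pop functions are hypothetical stand-ins that only mimic compound scopes living inside function scopes, and the model throws an exception where the real compiler crashes. The idea of undoing the pushes with an RAII guard is an assumption, not something confirmed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-ins, NOT clang's classes: compound scopes nest inside
// function scopes, as described in the messages above.
struct ToyCompoundScope { bool HasEmptyLoopBodies = false; };
struct ToyFunctionScope { std::vector<ToyCompoundScope> CompoundScopes; };

class ToySema {
  std::vector<ToyFunctionScope> FunctionScopes;

public:
  void PushFunctionScope() { FunctionScopes.emplace_back(); }
  void PopFunctionScope() {
    assert(!FunctionScopes.empty());
    FunctionScopes.pop_back();
  }
  // Needs an enclosing function scope; the toy throws if there is none.
  void PushCompoundScope() {
    if (FunctionScopes.empty())
      throw std::logic_error("no enclosing function scope");
    FunctionScopes.back().CompoundScopes.emplace_back();
  }
  void PopCompoundScope() {
    assert(!FunctionScopes.empty() &&
           !FunctionScopes.back().CompoundScopes.empty());
    FunctionScopes.back().CompoundScopes.pop_back();
  }
  ToyCompoundScope &getCurCompoundScope() {
    if (FunctionScopes.empty() || FunctionScopes.back().CompoundScopes.empty())
      throw std::logic_error("no current compound scope");
    return FunctionScopes.back().CompoundScopes.back();
  }
  std::size_t functionDepth() const { return FunctionScopes.size(); }
};

// RAII guard: performs the two pushes and undoes them in reverse order.
class ToyScopeGuard {
  ToySema &Sema;

public:
  explicit ToyScopeGuard(ToySema &S) : Sema(S) {
    Sema.PushFunctionScope();
    Sema.PushCompoundScope();
  }
  ~ToyScopeGuard() {
    Sema.PopCompoundScope();
    Sema.PopFunctionScope();
  }
  ToyScopeGuard(const ToyScopeGuard &) = delete;
  ToyScopeGuard &operator=(const ToyScopeGuard &) = delete;
};
```

In this model, asking for the current compound scope on a fresh `ToySema` fails, while the same request inside the lifetime of a `ToyScopeGuard` succeeds and leaves the scope stack empty again afterwards.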

If somebody could comment on this, I would appreciate it.

Regards,

Béatrice.