Rewriter in a macro

Hi,

It seems that when I use the Rewriter class to rewrite source code that comes
from an expanded macro, the rewrite does not take place. Suppose I write a
source-to-source transformation that rewrites while-loops to for-loops. Then
the following code:

#define WHILE(x) while(x)

  WHILE(cond)
    ;

  while(cond)
    ;

would become:

#define WHILE(x) while(x)

  WHILE(cond)
    ;

  for(;cond;)
    ;

Is this a known limitation of the Rewriter class?

Regards,
John.

I do not know, but this is an inherently intractable issue with
source-to-source transformation in the presence of a macro processor or
template system, because there are no semantics associated with what a
macro or template really represents.

For example, imagine you have
#define W w
#define HILE(x) hile(x)

        W##HILE(x)

Do you plan to have your program rewritten as
#define W f
#define HILE(x) or(x)

        W##HILE(x)

?

And what if W and HILE are also used in another context with different
semantics and you do not want it to be changed there, such as:
#define W w
#define HILE(x) hile(x)

puts("This is a complex " ##W "orld");
        W##HILE(x)

which displays "This is a complex world"

Is it to be changed to:
#define W f
#define HILE(x) or(x)

puts("This is a complex " ##W "orld");
        W##HILE(x)

which displays "This is a complex forld"
?

I have not tested these contrived examples, so they certainly contain some
errors, and they may need another level of macro expansion for ## to
work, but you get the idea :)

I also work on some source-to-source translator projects, and this is
difficult to do in real life. The heuristic we use right now is to do
the transformations on the expanded source and try to recover the
#include directives afterwards.

This is a simplistic approach that is wrong in the general case, but
it does not give bad results in practice.

Anyway, I'm interested in hearing more about any solutions to your
issue and to others in this area. :)

The result of rewriting should be preprocessed code. I can imagine that there
are applications where you want unpreprocessed output, so perhaps the Rewriter
class should have the option to work in either the preprocessed or the
unpreprocessed domain.

As far as I understand, the Rewriter works on the actual source code. We've
done refactorings which work inside macros without problems - the hard part
is getting the right source locations; the Rewriter deals with whatever
source locations you throw at it.

Clang has in principle all the information you need to deal with macros, but
sometimes it's hard to get to it :)
You'll obviously not be able to refactor arbitrarily crazy token-pasted
code, but I've not seen this to be a problem in practice.

Similar problems exist with templates, by the way: for example, if you want
to rename Foo::f to Foo::g, and you instantiate a template with Foo which
calls f(), you cannot figure out in general whether renaming the call to
f() is correct.

Cheers,
/Manuel

I am not interested in code such as the token paste example. I mentioned the
hypothetical loop transformation. Code like this is not uncommon:

#define copy_array(a, b, n) { int i; for(i = 0; i < n; i++) a[i] = b[i]; }

  copy_array(foo, bar, n)

I would like the for loop inside the expanded macro to be rewritten as well.

If clang is able to rewrite the definition of the macro, what would happen
when the macro is expanded multiple times? In a for-to-while loop rewrite,
the for loop would be detected once per expansion, but it needs to be
rewritten only once.

Yes, you generally need to deduplicate rewrites. This is true even for
things as simple as includes :)

Note that
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Tooling/Refactoring.h?view=markup
has some infrastructure around the Rewriter to help with that :)

Cheers,
/Manuel