Source to source transformation w/ comments

Hi all,

I am working on a project to rework a given program with a large code
base. While Clang offers fine and elaborate ways to understand and
massage an AST, I am missing a way to parse and transfer comments. I
have seen some efforts so far to get Doxygen comments for methods and
so on. But what about the comments that document the workings of
functions, i.e., all the comments that are in the .cpp files?

Is there any support for this?

What would be a good way to get them transferred? A new AST node at
statement level that refers to a comment in the SourceManager, i.e.,
something quite lightweight? Or are there other loose ends that I can
attach to?

Because I am constructing the transformed code in a bottom-up way
(bottom being the leaves of the AST), and am just streaming the result
into a stringstream, I can't traverse the resulting code a second time
and inject the comments while looking for source locations.
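
One possible way around this constraint, sketched below without any Clang dependency, is to collect the comments up front into a side table keyed by byte offset, and then let each statement-level emitter pull in the comments that fall into its source range. The names CommentTable, collectComments and takeComments are hypothetical, and the hand-written scanner is only a stand-in for whatever the real lexer would report. It does not handle raw string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical side table: byte offset of a comment -> its text.
using CommentTable = std::map<std::size_t, std::string>;

// Scan a source buffer for // and /* */ comments. String and character
// literals are skipped so that "//" inside a literal is not mistaken
// for a comment. Raw string literals are NOT handled in this sketch.
CommentTable collectComments(const std::string &src) {
  CommentTable table;
  std::size_t i = 0;
  while (i < src.size()) {
    char c = src[i];
    if (c == '"' || c == '\'') {
      char quote = c;
      ++i;
      while (i < src.size() && src[i] != quote) {
        if (src[i] == '\\')
          ++i; // skip the escaped character
        ++i;
      }
      ++i; // step over the closing quote
    } else if (c == '/' && i + 1 < src.size() && src[i + 1] == '/') {
      std::size_t end = src.find('\n', i);
      if (end == std::string::npos)
        end = src.size();
      table[i] = src.substr(i, end - i);
      i = end;
    } else if (c == '/' && i + 1 < src.size() && src[i + 1] == '*') {
      std::size_t end = src.find("*/", i + 2);
      end = (end == std::string::npos) ? src.size() : end + 2;
      table[i] = src.substr(i, end - i);
      i = end;
    } else {
      ++i;
    }
  }
  return table;
}

// Return, and remove from the table, every comment that starts in
// [begin, end), one per line. An emitter for a statement would call
// this with the statement's source range before streaming its output.
std::string takeComments(CommentTable &table, std::size_t begin,
                         std::size_t end) {
  assert(begin <= end);
  std::string out;
  auto first = table.lower_bound(begin);
  auto last = table.lower_bound(end);
  for (auto it = first; it != last; ++it) {
    out += it->second;
    out += '\n';
  }
  table.erase(first, last);
  return out;
}
```

Because comments are consumed as they are emitted, anything left in the table after the traversal is a comment that no node claimed, which makes lost comments easy to detect.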

Furthermore, would a way to control the transformation process be
welcome, i.e., a way to tell the clang tool that it should do something
specific for a certain routine? In other words, some kind of directive
should be available. Any thoughts on this?

I am willing to develop and provide patches to get that functionality,
but for this I would like to know a design that would be well received
and has a chance of making it into Clang, so that the effort is not
wasted.

- Andre

Generally roundtripping through ASTs is not something that’s encouraged/reliable for reasons such as this. Most tools doing source transformations do so by editing the original code, rather than producing a new derived piece of code.

Well, how is this supposed to guide me on the right way? I mean even when I stick to the AST when transforming, the comments aren’t in there one way or the other. And given that we are transforming to Java it will get somewhat difficult to store the transformed program in the existing AST structures, because they aren’t made for Java. So what are you proposing, David?

Regards,
Andre

> Well, how is this supposed to guide me on the right way? I mean even when
> I stick to the AST when transforming, the comments aren't in there one way
> or the other.

I think there might be a mode/flag/setting for parsing comments that's not
on by default for performance reasons.

> And given that we are transforming to Java it will get somewhat difficult
> to store the transformed program in the existing AST structures, because
> they aren't made for Java. So what are you proposing, David?

The intention isn't to change the AST, but to use the AST to figure out how
to change the source (e.g., get the source location from the AST, then insert
text at the right location, etc.).
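
To illustrate the idea of editing the original text rather than regenerating it, here is a minimal sketch with no Clang dependency. Insertion and applyInsertions are hypothetical names, and the offsets are computed by hand where a real tool would take them from AST source locations (for example via SourceManager::getFileOffset). In Clang itself this job is done by clang::Rewriter and its InsertText member. The sketch only shows why comments survive: every byte that is not edited is carried over unchanged.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical edit record: insert `text` at byte `offset` of the
// ORIGINAL buffer.
struct Insertion {
  std::size_t offset;
  std::string text;
};

// Apply all insertions to a copy of the original source. Edits are
// applied from the highest offset down, so an insertion never shifts
// the position of one that is still pending. The relative order of
// insertions at the same offset is unspecified in this sketch.
std::string applyInsertions(std::string source,
                            std::vector<Insertion> edits) {
  std::sort(edits.begin(), edits.end(),
            [](const Insertion &a, const Insertion &b) {
              return a.offset > b.offset;
            });
  for (const Insertion &e : edits) {
    std::size_t pos = std::min(e.offset, source.size());
    source.insert(pos, e.text);
  }
  return source;
}
```

For example, inserting "static " at the offset of a function declaration leaves a comment on the line above it exactly as written, because that text is never touched.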

If the transformation is sufficiently invasive, like changing languages,
that approach might not be feasible - in which case you'll be in some
pretty untested territory & I'm not sure of the right direction.

- Dave