Clang -ast-dump-xml question

Hi

We are using clang and its -ast-dump-xml feature. Our goal is to convert the serialized clang AST to a different code representation.
We think the output that -ast-dump-xml produces is not well suited for parsing. Statements are represented as ASCII-style trees and also contain parts that confuse XML parsers (e.g. <line:18:2, col:18>).
Is there a better way to get a serialized version of the clang AST (XML, or any other format that is easier to parse)?
We would like to avoid writing a clang AST visitor for this purpose.

Thanks in advance

- Matthias Grimmer

Any reason you couldn't write a clang tool to work directly on the AST to produce the different code representation?

Hi

We are using clang and its -ast-dump-xml feature. Our goal is to convert the serialized clang AST to a different code representation.
We think the output that -ast-dump-xml produces is not well suited for parsing. Statements are represented as ASCII-style trees and also contain parts that confuse XML parsers (e.g. <line:18:2, col:18>).

The pseudo-XML dump is a debugging aid. It’s not a stable, useful format on which to build tools. Tools should be built on top of the Clang AST, either through libclang (for a stable but not-very-rich AST representation) or the C++ AST.

Is there a better way to get a serialized version of the clang AST (XML, or any other format that is easier to parse)?

No, there isn’t.

We would like to avoid writing a clang AST visitor for this purpose.

Writing a Clang AST visitor or libclang client is really the best way to do this. There is no way to get sufficient information out of the debugging dumps to build a tool, unless your goal is to build a simple toy example that handles only a small part of C(++).

  • Doug

The pseudo-XML dump is a debugging aid. It's not a stable, useful format
on which to build tools. Tools should be built on top of the Clang AST,
either through libclang (for a stable but not-very-rich AST representation)
or the C++ AST.

Wasn't there a plan to get rid of the XMLish dump?

Cheers,
/Manuel

I removed an XML printer that was underdeveloped and unmaintained (see http://llvm.org/viewvc/llvm-project?view=revision&revision=127141). Now that the normal AST dumping has been greatly improved, it’s probably time to remove the XMLish dump as well. Its only advantage is that it was (at one point) more detailed than the normal AST dump.

  • Doug

No, the goal is definitely not to build a toy example.
We have to be able to process all real-life C applications, and therefore serialize all information that is necessary.
Can you recommend ASTDumper.cpp / -ast-dump as a good reference implementation? It seems that ASTDumper.cpp / -ast-dump does pretty much what we want, except for the output format.

Thanks in advance

- Matthias

As Doug mentioned, a far easier solution would be to use a clang plugin with an AST consumer that simply serializes the AST in the way you want. Check out http://clang.llvm.org/docs/LibTooling.html and the related pages under "Using Clang as a Library" at http://clang.llvm.org/docs/.

No, the goal is definitely not to build a toy example.
We have to be able to process all real-life C applications, and therefore
serialize all information that is necessary.
Can you recommend ASTDumper.cpp / -ast-dump as a good reference
implementation? It seems that ASTDumper.cpp / -ast-dump does pretty much
what we want, except for the output format.

You may want to use RecursiveASTVisitor (RAV) instead of modelling your
tool on ASTDumper. I think RAV is the preferred interface for C++ tools,
and I consider it a flaw that ASTDumper duplicates some of the traversal
logic that RAV contains.
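
To make the suggestion above concrete, here is a minimal, self-contained sketch of the serialization half of such a tool. It does not use the Clang API: Node, escapeXml, serialize and toXml are hypothetical names invented for this illustration, and Node is only a stand-in for the declarations and statements a real tool would receive from a RecursiveASTVisitor or libclang client. It shows one way to emit output that XML parsers accept, by escaping source ranges such as <line:18:2, col:18> and placing them in attributes.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node. A real tool would walk Clang's
// own node types rather than build a tree like this one.
struct Node {
  std::string kind;            // e.g. "FunctionDecl"
  std::string range;           // source range text, e.g. "<line:18:2, col:18>"
  std::vector<Node> children;  // nested nodes
};

// Replace the five characters that have special meaning in XML.
std::string escapeXml(const std::string &text) {
  std::string out;
  for (char c : text) {
    switch (c) {
      case '&':  out += "&amp;";  break;
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&apos;"; break;
      default:   out += c;
    }
  }
  return out;
}

// Recursively emit one XML element per node, indented two spaces per level.
void serialize(const Node &node, std::ostream &os, int depth = 0) {
  const std::string indent(depth * 2, ' ');
  os << indent << "<node kind=\"" << escapeXml(node.kind)
     << "\" range=\"" << escapeXml(node.range) << "\"";
  if (node.children.empty()) {
    os << "/>\n";  // leaf: self-closing element
    return;
  }
  os << ">\n";
  for (const Node &child : node.children)
    serialize(child, os, depth + 1);
  os << indent << "</node>\n";
}

// Convenience wrapper that returns the serialized tree as a string.
std::string toXml(const Node &root) {
  std::ostringstream os;
  serialize(root, os);
  return os.str();
}
```

With this sketch, a node whose range is <line:18:2, col:18> is written with the attribute value &lt;line:18:2, col:18&gt;, which a conforming XML parser reads back as the original text. In an actual tool the recursion would be driven by the visitor's callbacks instead of a hand-built tree.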