Tokens for rewriting C++ constructor initializers


I'm writing a refactoring tool as a way to learn Clang internals. A problem I run into is: how do you get the locations of the tokens that are not present in the AST?

My setup is this: I have an ASTConsumer that I run through ParseAST. In the consumer, I look for CXXConstructorDecl nodes with the TraverseDecl method, and then I want to rewrite the constructor (for example, to move its body from the header to an implementation file).

Getting the function body is easy, but I'm not sure what to do with the initializers. For each item in the initializer list that I get from init_begin() and init_end(), the source range covers the item itself, but not the preceding colon or the commas in between. Is there a way to get those parts from the AST, or do I need to get that lexical information elsewhere?

Thank you,

The typical way to handle this is to point the lexer to the position just before where you expect to see the token, then lex until you find the token you're looking for. It doesn't work in cases of insane macro hackery, but it works fairly well in general.
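As a rough illustration of that scan, here is a sketch that works on a plain source buffer rather than on Clang's lexer. The helper name findPunctFrom and its byte-offset interface are assumptions made up for this example, not Clang APIs. It skips comments and `::`, but it knows nothing about string literals or macros.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not a Clang API): scan forward from byte offset
// `start` and return the offset of the first `want` character that is
// outside a comment. "::" is skipped as a unit so that a scope operator
// is never mistaken for the initializer-list colon.
std::optional<std::size_t> findPunctFrom(std::string_view src,
                                         std::size_t start, char want) {
  assert(start <= src.size());
  std::size_t i = start;
  while (i < src.size()) {
    if (src.substr(i, 2) == "//") {
      // Line comment: jump to the end of the line.
      i = src.find('\n', i);
      if (i == std::string_view::npos) return std::nullopt;
    } else if (src.substr(i, 2) == "/*") {
      // Block comment: jump past the closing "*/".
      std::size_t end = src.find("*/", i + 2);
      if (end == std::string_view::npos) return std::nullopt;
      i = end + 2;
    } else if (src.substr(i, 2) == "::") {
      i += 2;  // scope operator, not the token we want
    } else if (src[i] == want) {
      return i;
    } else {
      ++i;
    }
  }
  return std::nullopt;
}
```

For a constructor, the scan for the colon would start just past the closing paren of the parameter list, and the scan for each comma would start at the end of the preceding initializer's source range.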

  - Doug

A less complicated idea would be to always infer them from the information you have:

  • put the colon right after the closing paren of the constructor's parameter list
  • put a comma right after the closing paren of each initializer except the last

It may change the file slightly, but that will harmonize the style and won’t cause any major issue.
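A minimal sketch of that inference, working on already-extracted text. The function name joinInitializers and the string-based interface are assumptions for illustration only. It also assumes the signature text ends at the closing paren of the parameter list and that each initializer is passed as its own source text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rebuild "Ctor(params): init1, init2" from the
// pieces whose source ranges the AST does provide. The colon and commas
// are not looked up in the original file; they are simply re-emitted.
std::string joinInitializers(const std::string& signature,
                             const std::vector<std::string>& inits) {
  // Assumption: the signature ends with the parameter list's closing paren.
  assert(!signature.empty() && signature.back() == ')');
  std::string out = signature;
  for (std::size_t i = 0; i < inits.size(); ++i) {
    // Colon right after the parameter list, comma after each initializer.
    out += (i == 0) ? ": " : ", ";
    out += inits[i];
  }
  return out;
}
```

Because the punctuation is regenerated, the original spacing and any comments between initializers are lost, which is the style change mentioned above.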

– Matthieu