If I’m writing a clang-tidy check that focuses on replacing macros
with a more suitable modern construct, it’s easy to analyze the macro
definitions by using the PP callbacks. From the callbacks I can also
monitor the macro expansions, but the callback only gives me a
SourceRange for the expansion.
Is there a way I can correlate the SourceRange to the appropriate
nodes in the AST for further analysis?
As you’re aware, there’s no map from source locations to AST nodes, nor would one be a good use of memory in general. You could search the AST manually, I suppose, but that is a lot of work, given that you’d have to look at nodes in the right order, etc.
So, having exhausted the alternatives, let’s consider what I think is the “right” way to do it, given that this could be very helpful in other tools too: introduce a handle for ASTConsumers to call during ASTContext::Allocate<T>(). Something like this:
class ASTContext {
  ...
  bool AnyAllocateCallbacks;
  … DeclAllocateCallbacks; // a DeclVisitor, or a chain of them a la PPCallbacks
  … TypeAllocateCallbacks; // a TypeVisitor, ditto
  … StmtAllocateCallbacks; // a StmtVisitor, ditto
public:
  // Non-type default argument, so the Decl/Type/Stmt overloads stay distinct.
  template <typename T,
            std::enable_if_t<std::is_base_of_v<Decl, T>, int> = 0>
  void DispatchAllocateCallback(T *D) const {
    if (DeclAllocateCallbacks)
      DeclAllocateCallbacks->Visit(D);
  }
  // …Same for Type, Stmt

  template <typename T> T *Allocate(size_t Num = 1) const {
    T *res = static_cast<T *>(Allocate(Num * sizeof(T), alignof(T)));
    DispatchAllocateCallback(res);
    return res;
  }
};
An ASTConsumer would add its Decl/Type/StmtAllocateCallbacks via some ASTConsumer::handle* virtual method called before parsing, i.e. same way it would add PPCallbacks.
I can’t imagine this would add noticeably to compilation times when there are no callbacks defined, but that would be the key measurement.
Then, for your case I think you could just keep track of the last macro expansion, and the last Decl/Type/StmtAllocateCallbacks, and just check if their source locations match — probably a little more complexity involved, but not more than any other solution.
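A minimal sketch of that bookkeeping, using stand-in types rather than real Clang classes. Everything here is hypothetical: Loc and Range are plain offsets standing in for SourceLocation/SourceRange, ExpansionCorrelator and its two methods are invented names, and the node-creation hook it assumes does not exist in Clang today. "Match" is taken to mean that the node begins inside the last expansion range, which is itself an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Stand-ins for clang::SourceLocation / clang::SourceRange: plain offsets.
using Loc = unsigned;
struct Range { Loc Begin; Loc End; };

// Hypothetical tracker, fed by a PPCallbacks-style hook and by the
// proposed (not yet existing) node-allocation hook.
class ExpansionCorrelator {
  std::optional<Range> LastExpansion;
  std::vector<std::pair<Range, std::string>> Matches;

public:
  // Would be driven by something like PPCallbacks::MacroExpands.
  void onMacroExpansion(Range R) {
    assert(R.Begin <= R.End && "malformed expansion range");
    LastExpansion = R;
  }

  // Would be driven by the proposed allocation callback; NodeName is just
  // a label for the node that was created.
  void onNodeCreated(Loc NodeBegin, std::string NodeName) {
    if (LastExpansion && LastExpansion->Begin <= NodeBegin &&
        NodeBegin <= LastExpansion->End)
      Matches.emplace_back(*LastExpansion, std::move(NodeName));
  }

  const std::vector<std::pair<Range, std::string>> &matches() const {
    return Matches;
  }
};
```

Only the most recent expansion is remembered, as suggested above; nested or back-to-back expansions would presumably need a stack or a list instead.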
More importantly, this would be a big step toward e.g. enabling ASTMatchers to work during parsing, for consumers that want to see e.g. Sema or Parser state at the time the node was created. Maybe doing that would even be trivial; if it were, then you could use your old solution and IdentifierInfo::hasMacroDefinition/Preprocessor::isMacroDefined.
In article <00DEDF73-162C-4EA7-93D3-90D3ED411F42@gmail.com>,
David Rector via cfe-dev <cfe-dev@lists.llvm.org> writes:
> As you're aware there's no map from source locations to AST nodes, nor
> would that be a good use of memory in general; and while you could manually
> search the AST I suppose, that is a lot of work, given that you'd have to
> look at nodes in the right order etc.
I was thinking of writing an ASTVisitor. At the moment I'm playing
around with a matcher for TranslationUnitDecl(); is there an AST node
at a higher level than that?
> So having exhausted the alternatives let's consider what I think is the
> ``right'' way to do it, given that this could be very helpful in other
> tools too: introduce a handle for ASTConsumers to call during
> ASTContext::Allocate<T>(). Something like this:
> [...]
> Then, for your case I *think* you could just keep track of the last macro
> expansion, and the last Decl/Type/StmtAllocateCallbacks, and just check if
> their source locations match --- probably a little more complexity
> involved, but not more than any other solution.
That's an interesting idea. In my case, I don't think it's essential
that I be able to examine the nodes during parsing, but it feels like
less work than building the entire AST and visiting it after the fact.
Worth a shot?
I'm going to prototype something with the existing matching framework
first and see how poorly that performs.
I'm less concerned about "torture" cases and more concerned with
common cases that occur in normal code. Anything that helps migrate
away from macros (particularly now that modules are in C++20) is
useful.
>> As you’re aware there’s no map from source locations to AST nodes, nor
>> would that be a good use of memory in general; and while you could manually
>> search the AST I suppose, that is a lot of work, given that you’d have to
>> look at nodes in the right order etc.
> I was thinking of writing an ASTVisitor. At the moment I’m playing
> around with a matcher for TranslationUnitDecl(); is there an AST node
> at a higher level than that?
Yup, that’s the highest. Actually a RecursiveASTVisitor might not be so bad, so long as there are not too many locations to look up.
This is probably the most efficient way to write it:
/// Finds the outermost declaration whose getBeginLoc() matches SearchLoc.
struct FindOuterDeclFromBeginLoc : RecursiveASTVisitor<FindOuterDeclFromBeginLoc> {
  SourceLocation SearchLoc; // initialize from ctor
  Decl *Res = nullptr;
  static bool LocIsBetween(SourceLocation Loc, SourceLocation Begin, SourceLocation End) {…}
  bool TraverseDecl(Decl *D) {
    if (!D)
      return true; // Nothing to look at; keep going
    if (D->getBeginLoc() == SearchLoc) {
      Res = D;
      return false; // We’re done, halt traversal
    }
    if (!LocIsBetween(SearchLoc, D->getBeginLoc(), D->getEndLoc()))
      return true; // Don’t traverse children, but don’t halt traversal
    // Traverse children
    return RecursiveASTVisitor<FindOuterDeclFromBeginLoc>::TraverseDecl(D);
  }
};
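To make the pruning logic concrete, here is a toy version of the same search that runs without Clang. The Node type, locIsBetween, and findOuterFromBegin are all hypothetical stand-ins, and locations are plain offsets. With real SourceLocations the ordering test would presumably go through the SourceManager (e.g. isBeforeInTranslationUnit) rather than a raw comparison.

```cpp
#include <cassert>
#include <vector>

using Loc = unsigned; // stand-in for clang::SourceLocation

// Toy stand-in for a Decl: a source extent plus child nodes.
struct Node {
  Loc Begin, End;
  std::vector<Node> Children;
};

// Inclusive containment test on plain offsets.
inline bool locIsBetween(Loc L, Loc Begin, Loc End) {
  return Begin <= L && L <= End;
}

// Returns the outermost node whose Begin equals Search, or nullptr.
inline const Node *findOuterFromBegin(const Node &N, Loc Search) {
  assert(N.Begin <= N.End && "malformed node extent");
  if (N.Begin == Search)
    return &N; // outermost match: stop, do not look at children
  if (!locIsBetween(Search, N.Begin, N.End))
    return nullptr; // Search cannot be inside this subtree: prune it
  for (const Node &Child : N.Children)
    if (const Node *Found = findOuterFromBegin(Child, Search))
      return Found;
  return nullptr;
}
```

As in the visitor above, a subtree is skipped as soon as its extent cannot contain the location, so the cost per lookup is roughly the depth of the tree times the fan-out along one path, not the size of the whole AST.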
>> So having exhausted the alternatives let’s consider what I think is the
>> ``right'' way to do it, given that this could be very helpful in other
>> tools too: introduce a handle for ASTConsumers to call during
>> ASTContext::Allocate<T>(). Something like this:
>> […]
>> Then, for your case I think you could just keep track of the last macro
>> expansion, and the last Decl/Type/StmtAllocateCallbacks, and just check if
>> their source locations match --- probably a little more complexity
>> involved, but not more than any other solution.
> That’s an interesting idea. In my case, I don’t think it’s essential
> that I be able to examine the nodes during parsing, but it feels like
> less work than building the entire AST and visiting it after the fact.
Now that I think about it, it wouldn’t work to add the callback in the typed ASTContext::Allocate<T>(…): many nodes allocate via an llvm::TrailingObjects base and call the untyped Allocate overload, through a line like Context.Allocate(totalSizeToAlloc<…>(…)), so all of those would need special handling. The callback would probably need to be called in each node’s Create function instead of in ASTContext::Allocate. Not too bad, but not as simple as before.
And it definitely wouldn’t be trivial to get ASTMatchers to work with this. Instead, a user would probably use these callbacks simply to map relevant Parser/Sema/Preprocessor state, captured at the time of allocation, to the nodes of interest, and then do the real processing against that map during HandleTranslationUnit.
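Roughly, that two-phase pattern might look like the sketch below. None of these names are Clang API: Node stands in for a Decl/Stmt/Type, CreationState is a made-up snapshot of whatever parser or preprocessor state is of interest, and onNodeCreated assumes the proposed per-node creation callback exists.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Node { std::string Name; }; // stand-in for Decl/Stmt/Type

// Hypothetical snapshot of preprocessor state at node-creation time.
struct CreationState {
  bool InsideMacroExpansion = false;
  std::string MacroName;
};

class StateRecordingConsumer {
  std::unordered_map<const Node *, CreationState> StateAtCreation;
  CreationState Current;

public:
  void enterMacro(std::string Name) { Current = {true, std::move(Name)}; }
  void leaveMacro() { Current = {}; }

  // Phase 1: would be driven by the proposed creation callback.
  // Only nodes created during a macro expansion are recorded.
  void onNodeCreated(const Node *N) {
    assert(N && "callback should always receive a node");
    if (Current.InsideMacroExpansion)
      StateAtCreation.emplace(N, Current);
  }

  // Phase 2: the analogue of HandleTranslationUnit, run once the whole
  // tree exists; it consults the map built in phase 1.
  std::vector<std::string>
  handleTranslationUnit(const std::vector<const Node *> &AllNodes) const {
    std::vector<std::string> Report;
    for (const Node *N : AllNodes) {
      auto It = StateAtCreation.find(N);
      if (It != StateAtCreation.end())
        Report.push_back(N->Name + " <- " + It->second.MacroName);
    }
    return Report;
  }
};
```

Keying the map on the node pointer keeps phase 1 cheap; the trade-off is that the recorded state has to be copied out at creation time, since the parser will have moved on by the time phase 2 runs.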
Still though, I think introducing these handles would be a huge step forward for AST tooling.