Template representation in AST (and RAV) with explicit instantiations

The AST nodes for template code, and RecursiveASTVisitor’s traversal, have inconsistencies that cause bugs in tools. (Example, another and I’ve heard of similar in clang-tidy). I’d like to fix some of these, but want some feedback on what we’re aiming for:

  • which nodes should be generated?
  • which source locations these nodes correspond to?
  • what subset should RAV traverse when shouldVisitTemplateInstantiations() is false?

It’d be really useful if by default RAV visited each significant location exactly once. This seems to be the spirit behind !shouldVisitTemplateInstantiations() - to traverse the code as written. But this is pretty badly broken today in the presence of explicit instantiation.


An easy case is implicit template instantiation. This looks like:

template <typename T> T var = T();
int z = var<int>;

Here there are three interesting declarations:

  • a VarTemplateDecl
  • a generic “templated” VarDecl for var<T>
  • a specific VarTemplateSpecializationDecl for var<int>, with getTemplateSpecializationKind() == TSK_ImplicitInstantiation

Their locations all point at the template, and RAV skips the VarTemplateSpecializationDecl.
:+1: This is as expected, and classes/functions behave the same way.
(There’s no FunctionTemplateSpecializationDecl, just another FunctionDecl, but it’s okay).

My mental model is: template instantiation produces some fictitious code: int var<int> = int(); Because this code was produced by instantiation, it is not traversed at all.


A difficult and buggy case is explicit template instantiation. This looks like:

template <typename T> T var = T();
template int var<int>;

Again, this generates three nodes:

  • a VarTemplateDecl
  • a generic “templated” VarDecl
  • a VarTemplateSpecializationDecl with getTemplateSpecializationKind() == TSK_ImplicitInstantiation

Locations are inconsistent between template kinds:

  • :warning: The VarTemplateSpecializationDecl's locations point at the template. This means there is no AST node pointing at the second line of code, and there are multiple traversed AST nodes pointing at the first line.
  • :warning: Functions have the same behaviour as variables.
  • :+1: Classes behave differently: the locations of a ClassTemplateSpecializationDecl will point at the explicit specialization! (Obviously not for locations inside the body).

Traversal is also inconsistent:

  • :+1: the VarTemplateSpecializationDecl is traversed, but not its initializer (which was instantiated, rather than written)
  • :warning: the FunctionDecl is not traversed at all! If explicitly passed to TraverseDecl(), the instantiated body will also be visited.
  • :+1: the ClassTemplateSpecializationDecl is traversed, but not its body (which was instantiated, rather than written).

My conclusion is that class templates behavior is basically correct here and we should fix functions/variables to match:

  • explicit instantiations produce a node whose “body” is instantiated and whose “head” was written.
  • The node should be visited and its head should be traversed, but not its body, including when passed to TraverseDecl().
  • Its head locations should point at the explicit instantiation, its body locations should point at the template.

Do others agree?

If this makes sense, my first step is to document this in RecursiveASTVisitor, and add tests of the current behavior (with FIXMEs). Details: ⚙ D120498 [AST] Test RecursiveASTVisitor's current traversal of templates.

That patch uncovers a couple of other discrepancies (e.g. initializers of VarTemplateSpecializationDecl being skipped too often). I think these are fairly clear-cut bugs where the right thing to do is more obvious, but feel free to comment if I got something wrong.

There is an alternative model that might make sense: for any instantiation case, a fullly instantiated SpecializationDecl exists under the TemplateDecl, and then the explicit instantiation is a redecl of it.

This seems a bit neater in some ways: you have a more obvious and consistent parent of each node, and you can still see in the AST the correspondence between e.g. instantiated function parameters and generic ones.

However it’s further from where we are today: none of the nodes behave this way. Even if that’s where we want to go eventually, I’d prefer to first get to a closer reasonable behavior that’s consistent between template kinds.
(In the short term, I can’t take the larger project on myself)

It’d be really useful if by default RAV visited each significant location exactly once . This seems to be the spirit behind !shouldVisitTemplateInstantiations() - to traverse the code as written.

+1, I think this is the principle of RAV, it should traverse the code as written.

Another alternative: similar to the behavior of implicit instantiation, all these 3 decl locations point at the template, and by default (shouldVisitTemplateInstantiations() is false), RAV skips the VarTemplateSpecializationDecl. In addition, we create a new AST node (ExplicitInstantiationDecl etc) for the explicit instantiation on the second line, the new AST node holds all source locations (BuiltinTypeLoc etc) of the line, and RAV visits it.

This seems clearer, and matches the mental model that code produced by implicit/explicit instantiation is not written code thus it should not traverse in RAV by default.

A specific AST node for explicit instantiations is an unimplemented feature in clang, there are some FIXMEs. However, it might take some work…

Just be clear I think you meant TSK_ExplicitInstantiationDefinition for this one, right?

Re the traversal problems, I think the root issue is that the FunctionDecl explicit instantiations are not represented in syntax, whereas classes/vars are. This necessitates hacks like traversing the function decl explicit instantiations while traversing the template…

…but as you have found this only currently happens when shouldVisitTemplateInstantiations()==true, even though it also needs to happen when that is false (but with the further complexity, as you note, that the function body should not be traversed in that case, since it is not written).

To hack around it by adding another loop somewhere is doable, and easier than creating a whole new node, but before doing either actually I don’t see why we can’t just fix this by treating FunctionDecl explicit instantiations exactly like VTSD/CTSD explicit instantiations, and add them to their lexical context in addition to adding them to specializations():

Check out that linked FIXME, that overload of ActOnExplicitInstantiation handles variables and functions I believe (which also implies the SourceLocation problem lives there, incidentally). Suppose that FIXME line were replaced with CurContext->addDecl(Specialization). That would add a FunctionDecl marked TSK_ExplicitInstantiation* to the lexical context.

That would mean that for each of var, class, and function explicit instantiations we would have a separate node representing the written syntax, which could be distinguished as an explicit instantiation by its TSK – no separate node needed. And they could all be treated the same in RAV, no hacks needed.

A simple patch just to show what I mean:

https://reviews.llvm.org/D120779

There is an alternative model that might make sense: for any instantiation case, a fullly instantiated SpecializationDecl exists under the TemplateDecl, and then the explicit instantiation is a redecl of it.
This seems a bit neater in some ways: you have a more obvious and consistent parent of each node, and you can still see in the AST the correspondence between e.g. instantiated function parameters and generic ones.

Looks like this is the alternative I suggest in the reply (sorry, I somehow missed that). IMO this is a clearer model, which aligns with the explicit instantiation decl/def defined in the C++ spec.

I think we are discussing about a generic question – how do we model the explicit instantiations (e.g. template int var<int>;) in the clang AST? We have two options: 1. use the head part of the VarTemplateSpecializationDecl (original proposal) 2. use a separate AST node (the alternative mentioned here).

However it’s further from where we are today: none of the nodes behave this way.

Yeah, this would require a big piece of work.

Even if that’s where we want to go eventually, I’d prefer to first get to a closer reasonable behavior that’s consistent between template kinds.
(In the short term, I can’t take the larger project on myself)

+1, agree, it is sad that we don’t have enough bandwidth. Considered all constrains and the badly-broken behavior, I think the original proposal would improve the current situation, it is a good plan.
My slight concern is that it might create more work if we switch to the alternative model in the future…

It sounds not much disagreement on where we want to go short-term.
I agree that splitting the “instantiation request” into a fourth node distinct from the instantiated definition would be cleaner in some ways, but changes to the AST carry risks and costs to many parties, so I’d like to proceed with the smaller change that solves the pressing problems.

I’ve seen this comment, but it’s not correct to suggest there’s any node missing from the AST (maybe I misunderstood it) compared to class/var.
In each case (Function/Class/Var), defining a template and explicitly instantiating it creates 3 nodes.

Yeah, that looks like a clean approach. I suppose this is adding a new edge to the AST rather than new nodes.
And reusing the code that handles = default is much nicer than the mess I came up with: ⚙ D120504 [AST] RAV doesn't traverse explicitly instantiated function bodies by default. Thanks for the pointers, I’ll try this approach out!

Yes I definitely agree with the goal of always traversing the written part of the explicit instantiation, and traversing the body as well only when shouldVisitTemplateInstantiations().

Yes precisely; while the node already exists at least in the parent template’s specializations() in all three cases, the idea would be to add an extra pointer to the FunctionDecl in its lexical context. So long as that doesn’t screw anything up, I think that placement would make sense to users of the AST, and would mirror what we do for classes.

To add some complexity I now recall there is something funky about var template instantiations which only make them appear to be represented in their lexical context in simple examples. In other words the three template instantiations kinds are each handled completely differently in this regard:

Takeaways:

  1. Class template explicit instantiations are always represented in their lexical context: perfect.
  2. Variable template instantiations, both explicit and implicit, are added to their semantic context, which is not necessarily their lexical context. (This doesn’t hinder traversal so long as the whole TU is being traversed, but does mean the AST does not always represent explicit instantiations in quite the right place syntactically, as you indeed might have noted in the other thread.)
  3. Function template explicit instantiations are not added to any context; that’s the immediate problem necessitating traversal hacks.

Frankly I’m not sure why var instantiations are added to their semantic context, given that they are already in specializations() of their template – can’t be needed lookup I assume; is it something to do with CodeGen? Would be interesting to see if removing them had any effect at all. If not, perhaps a future patch could remove those and instead add the explicit instantiations to their lexical context, same as functions, so long as that change doesn’t break stuff.

Correction, this does cause a slight traversal problem: implicit var template instantiations are currently visited twice when shouldVisitTemplateInstantiations()==true: once during TraverseTemplateInstantiations() of their parent template, and once while traversing their semantic context.