RFC: Improving Diagnostics with Template Specialization Resugaring

Hi everyone!

I have been working on the Clang GSoC project which has the objective of improving the situation with regards of loss of type sugar when dealing with template specializations.

The problem we are trying to solve here is old and well-known:

template <class T> auto foo(T) -> T;

int x = foo(std::string("hello"));

On the last line you would get this diagnostic: error: no viable conversion from 'std::basic_string<char>' to 'int'
Ie the infamous basic_string, which you never wrote and had no reason to even know existed, appearing on the diagnostics.

The problem here is that we always instantiate templates using the ‘simplest’ type, stripped of all decorations, when instantiating a template.

std::string is just a typedef to std::basic_string<char>, and typedefs are always just a decoration, which I will refer here on by the term ‘type sugar’.

The simplest type, which I will refer here on by the term ‘canonical type’, is used to instantiate the template, and the important reason for that is performance: we don’t want a different instance for each possible way to decorate a type, because it would be slow and waste much more memory.

Suppose the user created his own type alias: using MyString = std::basic_string<char>.

We don’t want foo<std:::string> and foo<MyString> to refer to two different instances of the function template.

Note that clang already does have a hacky, incomplete solution for a very narrow set of this problem:
The clang::preferred_name attribute.
It allows you to, with a lot of inconvenience as described in the link, associate, to the template definition, type aliases to particular specializations which will be used by the type printer as the preferred name of that specialization. libc++ uses this attribute, and it notably though that can’t deal with the MyString case: It will print std::string instead.

Alright, so with problem and limited existing solutions explained, I hereby explain my proposal.

We will implement a new pass that will add back the lost type sugar when performing member access on a template specialization.

First though, note that we will not change the situation with regards to the template instantiation themselves, they will still only be performed with the canonical type. Ie when instantiating this template with std::string:

template <class T> auto foo() {
    int n = T();
}

We will still get the diagnostic: error: no viable conversion from 'std::basic_string<char>' to 'int'.

But we will fix the diagnostic from the initial example so it will print: no viable conversion from 'std::string' (aka 'basic_string<char>') to 'int'.

To implement this new pass, we will leverage an existing feature of the AST: When Clang performs substitution as part of template instantiation, it leaves behind type sugar nodes which indicate what was substituted, ie what template parameter was replaced by what canonical type.

Implementing this as a new kind of TreeTransform, upon member access we will build a naming context used to map template parameters into sugared template arguments, and upon encountering these substitution nodes, we will get the replaced type, look it up in the naming context, and replace the replacement type again with the sugared argument we found, rebuilding the whole type.

Note that this pass will only rebuild types, not expressions, but we will hook this up in the right places in Sema so that member access expressions will benefit from this, Ie the expression will be built with the type already resugared, instead of resugaring the expression itself later.

Note as well that we will never emit new declarations as part of resugaring, in order to avoid their heavy cost.
Example: when resugaring a typedef instantiated from a dependent one, instead of emitting a TypedefDecl redeclaration with the new resugared type, we will modify TypedefType so that it can carry an underlying type different from the declared one, but still canonically the same.

The work in progress implementation here is in an advanced state feature wise, which I wanted to get to before writing this RFC so that folks would have something to toy around to explore the effects we want to achieve.

The compiler explorer team added an experimental Clang variant based upon my branch (kudos!) and here
is the link to a small demo: https://godbolt.org/z/o8aePToMf

It still has a few bugs, and has not been worked on from the performance side yet.

Right now this resugaring is always performed, eagerly, unless turned off by a frontend flag, as you can see on the demo.

There is also the WIP PR I have been working this feature on: https://reviews.llvm.org/D127695

The PR is not clean enough for detailed review (please don’t complain about strew around dumps and other litter!), but I think the big picture is already there and can be discussed.

There is lots that can be discussed on the performance side:

  • Do we do it eagerly or lazily? Lazily would make sense if we wanted to perform this resugaring only for diagnostics. Though we would have to pack more information around in the AST, increasing memory use. But some users, like the ROOT project, want this functionality so they can use type sugar with semantics. This can also be used to preserve a Typedef which has attributes like gnu::aligned attached. Other such examples are the magic typedefs that ObjC specifies (NSInteger and friends). One problem perhaps that makes those two languages cooperate badly when mixed together in ObjC++ is that these are not preserved in templates. If this resugaring had to be performed for every type so that it survives to CodeGen, lazily would not buy much and would perhaps even be detrimental, so this would at least need to be controlled by a switch.
  • We can implement a new kind of dependence, substitution dependence, so that we can avoid recursing into types which don’t contain substitution nodes.
  • We can implement caching to avoid repeated resugaring of the same thing.

Note that the patch is getting big, and there is also a stack in there with other interesting stuff that needs to progress as well, and only Richard Smith has been reviewing the stack consistently, and even then I think he hasn’t even looked at the main patch yet, because as you know he is quite busy and it’s maybe a bit unfair to put all of this burden on his desk :slight_smile:

So review volunteers are certainly welcome, we have about two months to merge this whole thing :slight_smile:

But even just testing this, and noticing bugs and other limitations I might have missed is certainly helpful as well. So I can already consider feature requests and bug reports. Specially if they already come with reduced test cases :slight_smile:

Note that I have been working on type sugar preservation for some time, including some of the prerequisites here.

Last year we merged https://reviews.llvm.org/D110216, which modified the type deduction machinery to preserve type sugar. That already improved diagnostics quite a lot, and was a prerequisite so that we can have type sugar which to resugar with. That patch also implemented retention of type sugar for auto deduction, which previously would only keep the canonical type as well.

There is still work left to complete this, that you can find on the stack. For example, we don’t have a mechanism in Clang to combine the type sugar of different deductions, so one current limitation is that we will arbitrarily pick the type sugar of the value of the first return statement in the case of return type deduction.

That mechanism is implemented by https://reviews.llvm.org/D111283 and expanded by https://reviews.llvm.org/D130308, and then on https://reviews.llvm.org/D111509 we hook this mechanism into many other places where we had two types we wanted to reduce to one, but were making arbitrary choices about what sugar to preserve.

So there is still lots that can be talked about here, design decisions and implementation choices, but I think this post is big enough!

Thanks everyone! Special thanks to the mentors @vvassilev and @zygoloid !

PS:

One frequently asked question I get about this is, what is the relationship of this with ‘Strong Typedefs’?

Well, that depends on what one means by ‘Strong Typedef’. Without a concrete proposal, I think in general the term refers to some extension of the kind of aliasing you can get with enum classes, Ie a kind of type alias which can implicitly or explicitly convert to/from the underlying type, but that represents a different type in the canonical sense, so that it can be overloaded on and such.

Though Resugaring deals with type sugar: We are resugaring a type, to produce another one but which is still canonically the same, so that they can’t overload each other really.

The one connection here is that, if we had a Strong Typedef implemented, and this applies to enums as well, and it was instantiated from a declaration with a dependent underlying type, then we can resugar it so that it appears as if the underlying type had the lost type sugar added back when accessing it as explained above.

If instead Strong Typedef is understood in colloquial terms, then we are making Typedefs stronger because we are not losing them around so much :slight_smile:

3 Likes

Thanks for the post, it clarifies things for me that i did not necessarily understood in the C++ WG meeting last week.
I really like this direction.

When you say that you will only implement that for declarations and not instanciations, I’m assuming it’s for performance?
When you later say that you are considering caching, would the caching allow resugaring in instantiation,
is that something you considered?
Either way, an incremental approach will give us a good grasp on the performance.

Have you tried on interesting libraries with non-trivial scenarios? say C++ standard ranges?

Oh sorry about that, and thanks for sharing your thoughts on this!

Yes, one point is that declarations are usually big and slow to create compared to the type nodes which refer to them.

One other consideration is uniquing. We want to avoid creating redundant objects referring to the same thing when resugaring with similar inputs, and uniquing of declarations in Clang would be some strange, unnecessary innovation. Whereas for types that is the norm.

But I made this point to clarify the situation with regards to Typedefs.

For example:

using Int1 = int;
using Int2 = int;

template <class T> struct A {
  using type = T;
};

using t1 = A<Int1>::type;
using t2 = A<Int2>::type;

Ie with regards to the A::type typedef, there will be only two TypedefDecls here:

  • The original one from the template in which the underlying type is T (aka type-parameter-0-0).
  • The one instantiated from the TypedefDecl from the first bullet, with T = int;

A<Int1>::type and A<Int2>::type will both name TypedefTypes which refer to the TypedefDecl from the second bullet, but unlike what exists implemented in Clang now, they will not desugar to the type declared in the TypedefDecl, they will contain an extra UnderlyingType field which will have the resugared type. This will make these TypedefTypes bigger by one pointer, to store that QualType, but non-resugared typedefs will not need this extra storage, and indeed we will not make them bigger. We will just need to use one free bit for this bookeeping.

Well, the idea of caching is to avoid recomputing the same resugared type over and over when we refer to them with the same inputs. A couple of tests in the clang test suite will require this, clang\test\CodeGenCXX\pr29160.cpp and clang\test\CodeGenCXX\mangle-ms-back-references-pr13207.cpp. Resugaring the types created in those without caching would be extremely inefficient.

Regarding performing resugaring in instantiations, yes we can do that, and the WIP PR already does that, it does not need caching, but still that means we are only considering the canonical types of the instantiation arguments.

For clarification, we will resugar this example:

using Int = int;

template <class T> struct A {
  using type = T;
};

template <class U> void foo() {
  /*...*/ typename A<U>::type /*...*/;

 using X = U;
 /*...*/ typename A<X>::type /*...*/;
};

template void foo<Int>();

When instantiating the foo function template, U will still refer to just int, not Int, but we will be able to handle the accesses into A from the function template, so that A<U>::type will resugar to a typedef which will contain a Subst node pointing to U.
Likewise for A<X>::type, this will resugar to something which will refer to X.

We will also resugar accesses to template specializations during the initial template parse, and if the specialization does not have a dependent argument, this will work just like resugaring from a non-dependent context.

But the benefits wrt diagnostics here are diminished of course, unfortunately there are too many limitations wrt to the design of C++ which limit our ability to diagnose templates before we instantiate them.

Forgot to answer this one.

I haven’t tried many complex scenarios yet or tried building real world code to see the results.

There some known limitations which I wanted to solve first, while I am still testing on a smaller scale.

The libc++ bootstrapping build seems to be broken, although I haven’t managed to look or try to reduce it yet.

Similarly there is an issue with the ASTImporter wrt to function templates that only a lldb test seems to be catching. Probably needs similar changes to which I had to make for variable and class templates, but things are trickier there.

Simple examples written using libc++ seem to be ok, but I haven’t tested much yet.

For libstdc++, there are some other limitations that need to be fixed. Things are a lot more complex there, as more types need to be resugared through the allocator template parameter, which is usually defaulted, and this is one limitation I am still working on.

Also, I think more discussion on design is needed before starting any complicated work on the performance of this whole thing.

I think I understand the broad design as you describe it, but understanding the details is difficult because the number of changes in the WIP patch is so staggering. It would be great if the final version could be broken up wherever possible, and have documentation & explanatory comments added.

In particular it might be helpful to split into at least two patches: the first introduces the ability to resugar (i.e. the Resugarer : TreeTransform<Resugarer> class and the Sema::resugar overloads which use it, I think), and a second patch would make use of this ability to produce the desired diagnostics (and other independent uses of resugaring, if any, would go in their own patches). This would help clarify the purpose of the many secondary changes throughout the codebase.

Also, can you comment on how ASTMatchers or other AST tools might make use of resugaring? Or is this strictly for improving diagnostics?

A broader issue that might be considered is whether a flag should be introduced which disables all type sugar, everywhere, for faster compiling whenever accurate diagnostic info is not needed (because diagnostics are not expected/will not be acted on by the user). Implementing might be as simple as tweaking the QualType ASTContext::get*Type(...) implems to return the inputted canonical type, unwrapped, when this flag is enabled. This would alleviate the developer’s burden of deciding whether further info should be piled into the AST via sugar, at the cost of performance – a decision which is probably better left to the user. Adding such a flag might make it justifiable to resugar eagerly in this case.

Yes I want to do that. But bear with me here, this is already somewhat split up :slight_smile:
If you take a look, there are five other patches in the stack, some of them have been in review for almost a year…

At some point reviews lagged and keeping a stack this deep started becoming detrimental on my side.

I thought then that moving forward fast to have some sort of MVP to garner interest was more important than having those changes easily reviewable at this point in time.

The resugarer itself is very self contained, and hooking it up to the right places in Sema looks self-isolated enough that I honestly think we don’t need to split it up along those lines. The devil here is in other pre-requisites.

I will expound on one small aspect of that: As I said in a previous post, we leverage in this new resugar pass the capacity to represent in the AST where substitutions occurred, mainly with the SubstTemplateTypeParmType node. It’s very good that whoever designed the Clang AST thought to leave this stone in place so that we would not evolve away from being able to implement this.

But I think this resugarer is the first real user of this feature, and I found that there are some fundamental missing pieces and some design issues we can improve on.

One fundamental and very basic missing piece is that we don’t store the pack index from a substitution from a pack in the AST. Without that, we can’t look up the argument that we need from the naming context.

One of the patches in the stack is what I believe a very straight forward implementation of this feature: D128113
I think this should have been an easy and uncontroversial patch to review and move forward. On one hand, it’s useless without a user, but on the other hand it is essential for resugaring. There is negligible downside to merging this, yet it has been stuck in review for a while.

Now, embedded in the resugaring patch there is yet another fix for a fundamental issue with how we represent a SubstTemplateTypeParmType. This other change is bigger and scarier, could have downsides and merits more discussion. I think this would be a perfect thing to split up here. I am not even going to write about it in this post because it would be a huge digression. But it seems pointless to split it off if we can’t even move forward with the much more basic D128113.

There are other such ‘pre-requisite’ changes we can split up, this was just one example.

So my hope is that we can achieve a cheap enough resugaring pass which we can simply have it done eagerly and on all the time. That we would not need to add a command line switch to turn it off, or add a new AST node to do it lazily.

If we can achieve that, then there are no consequential direct changes required to ASTMatchers. Only the effect that we will now have more of the existing type sugar around, and specially that we will have them in some places where we never had type sugar before. We would probably need to fix some existing deficient matchers to cope with that, but all along the APIs that already exist.

If we have to, or decide to, implement lazy resugaring, then there needs to be a new type node to hold the resugaring context. This new type node presumably would have a normal ‘desugar’ method, which would simply ignore this context and return the underlying type as is, without resugaring.

In this way, the AST Matchers would not need to be modified, but they would not get any benefit from resugaring either. Some of them would perhaps just need to be fixed so they don’t get poisoned by this new node.

This new type node would presumably have a method which would return the underlying type resugared. This method would certainly require the ASTContext as a parameter, unlike the traditional ‘desugar’ method. How to integrate that with ASTMatchers I am not sure, but I think that we shouldn’t complicate ASTMatchers, they are fundamentally designed to do quick and dirty things and perhaps the right choice is to always feed the resugared type to those, regardless if we do it lazy or not.

There are many uses for this besides just prettier printing of types. Type sugar can affect CodeGen. The simplest example here is the [[gnu::aligned]] attribute, which can be attached to typedefs. There are other out-of-tree users, such as the ROOT project, that want other such effects, and so we would at least need to offer the option to resugar eagerly.

Well that is a different project that needs a separate discussion!

There are other things to consider here. Some type sugar affects program semantics. If completely disabling it breaks too much code, and specially if it breaks libc / STL implementations, then this could easily become infeasible. The other thing is the added complexity needed to leverage that advantage. There are many codepaths in Clang that could be turned off for a speedup. Many AST nodes which carry sugar could be changed from a static size to a dynamic size, so they can be smaller when they don’t need to carry it.

Can also influence what appears in debug info, I suspect. There has been debate over time about using name-as-in-source versus canonical-name. Sony carries a private patch to prefer using typedef names.

Agree - perfect. I’m very intrigued.

That helps me understand everything better: the resugaring is done while determining the type of CallExprs and MemberExprs (and perhaps others) involving template instantiations (e.g. foo(std::string()) or Bar<std::string>().data). The resugaring “pass” is a TreeTransform applied to the type that would otherwise be attached to those expressions (equivalent to the type of foo(std::basic_string<char>() etc). But once that type is calculated nothing further is needed (so long as we resugar eagerly, which seems like the way to go).

I’m curious about the specific needs of the ROOT project – do they also want to use attributes on typedefs? I.e. are AttributedTypes the only sugar which users want to affect semantics?

My gut reaction is that any type sugar which can affect semantics should not be considered sugar. For example:

typedef int slow_int [[gnu::aligned(1)]];
typedef int fast_int [[gnu::aligned(8)]];

template<typename T>
class Foo {
  char c;
  T t;
};

Right now we ignore the attribute when instantiating Foo, so that
sizeof(Foo<slow_int>) == sizeof(Foo<fast_int>) == sizeof(Foo<int>) == 8.

If we resugared, we would have access to those attributes, but we would still only have one instantiation, so it would still be true that sizeof(Foo<slow_int>) == sizeof(Foo<fast_int>) == sizeof(Foo<int>) == ?.

But if we considered each of those differently attributed types as a distinct canonical type we would naturally get sizeof(Foo<slow_int>) == 5, sizeof(Foo<int> == 8, sizeof(Foo<fast_int>) == 16.

Am I missing something, or is the last case the most desirable?

Interesting, thanks for sharing that!

I don’t think I ever broke (accidentally or otherwise) anything related to debug info in the course of implementing this, and I am afraid I don’t know enough about that matter to consider the ramifications here, so that is unexplored from my side.

I expect that if we ever run into any troubles, it could be related with “resugared” typedefs, ie a TypedefType with an underlying type different from the corresponding TypedefDecl. This is something that
must be implemented, it is essential for resugaring to have any relevant real world effect.

There could be issues with representing those depending on how this is specified and implemented in practice.

Yes that is the right idea, the existing ASTMatchers API is sufficient to express the changes we need to innovate here: A SubstTemplateParmType with a non-canonical underlying type, or a Typedef/Using/Enum with a different type from the corresponding Decl.

The invariants change of course so this could potentially break expectations for particular matchers.

@vvassilev can elaborate, but I believe in CERN they have a special sugar name for a floating point type which they wish to behave in a particular way in the execution model, but serialize in a different way.

Sure, if you define type sugar as something that has no semantic effects :slight_smile:
But the definition I use is simply something that is not part of the canonical type.

If you mean that we should not have type sugar with semantic effects, I agree, in the ideal world sense.

The right approach could be to implement them as qualifiers, kind of how we do it for address spaces, but more complicated because we have to allow the appropriate conversions and implement the proper ranking in overload resolution.

If alignment attr were part of the canonical type, the last one is what you would get. But resugaring has no effect here, there is nothing to resugar in a sizeof expression of a class template specialization.

In a hypothetical world where the sizeof expression could call some class method to obtain the size or refer to some member typedef , we could resugar the resulting type of that expression, but I don’t think anyone ever proposed that!

We have an I/O optimization which stores a datatype as a float (in lower precision) and reads it in-memory as a double (in higher precision). That happens through using Double32_t = float; and we need to track this datatype through instantiations and tell the I/O system how to act upon it. We do not rely on AttributedTypes. There has been a bit more discussion here [cfe-dev] template instantiation, typedef used as template argument and dependent type for data members.

1 Like