[clang] Extend clang AST to provide information for the type as written in template instantiations

When instantiating a template, the template arguments are canonicalized before being substituted into the template pattern. Clang does not preserve type sugar when subsequently accessing members of the instantiation.

std::vector<std::string> vs;
int n = vs.front(); // bad diagnostic: [...] aka 'std::basic_string<char>' [...]

template<typename T> struct Id { typedef T type; };
Id<size_t>::type // just 'unsigned long', 'size_t' sugar has been lost

Clang should “re-sugar” the type when performing member access on a class template specialization, based on the type sugar of the accessed specialization. The type of vs.front() should be std::string, not std::basic_string<char, [...]>.

Suggested design approach: add a new type node to represent template argument sugar, and implicitly create an instance of this node whenever a member of a class template specialization is accessed. When performing a single-step desugar of this node, lazily create the desugared representation by propagating the sugared template arguments onto inner type nodes (and in particular, replacing Subst*Parm nodes with the corresponding sugar). When printing the type for diagnostic purposes, use the annotated type sugar to print the type as originally written.
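
To make the idea concrete, here is a minimal toy sketch of the re-sugaring step. It is not Clang code: `ToyType`, `make`, `print`, `desugarOnce`, `resugar` and `runDemo` are hypothetical names invented for illustration, and the real AST is far richer. It models a member declared as `T *` in `Id<T>` and accessed through `Id<size_t>`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these names are hypothetical and are not Clang's classes.
struct ToyType;
using TypePtr = std::shared_ptr<const ToyType>;

struct ToyType {
  enum Kind { Builtin, Typedef, Pointer, SubstParm };
  Kind kind;
  std::string name; // Builtin: spelling; Typedef: alias name; otherwise unused
  TypePtr inner;    // Typedef: aliased type; Pointer: pointee;
                    // SubstParm: the canonical replacement type
  unsigned index;   // SubstParm: index of the substituted template parameter
};

TypePtr make(ToyType::Kind k, std::string n, TypePtr in, unsigned idx = 0) {
  return std::make_shared<ToyType>(ToyType{k, std::move(n), std::move(in), idx});
}

// Print a type as written: typedefs keep their alias name.
std::string print(const TypePtr &t) {
  switch (t->kind) {
  case ToyType::Builtin:
  case ToyType::Typedef:
    return t->name;
  case ToyType::Pointer:
    return print(t->inner) + " *";
  case ToyType::SubstParm:
    return print(t->inner);
  }
  return "";
}

// Single-step desugar: peel exactly one layer of sugar, if there is one.
TypePtr desugarOnce(const TypePtr &t) {
  if (t->kind == ToyType::Typedef || t->kind == ToyType::SubstParm)
    return t->inner;
  return t;
}

// Re-sugar: replace each substituted-parameter node with the template
// argument as it was written. Only pointers are traversed in this toy.
TypePtr resugar(const TypePtr &t, const std::vector<TypePtr> &writtenArgs) {
  switch (t->kind) {
  case ToyType::SubstParm:
    assert(t->index < writtenArgs.size());
    return writtenArgs[t->index];
  case ToyType::Pointer:
    return make(ToyType::Pointer, "", resugar(t->inner, writtenArgs));
  default:
    return t;
  }
}

struct Demo {
  std::string canonical, resugared, oneStep;
};

Demo runDemo() {
  TypePtr ul = make(ToyType::Builtin, "unsigned long", nullptr);
  TypePtr sizeT = make(ToyType::Typedef, "size_t", ul);
  // Member type 'T *' after instantiation with the canonical argument.
  TypePtr member =
      make(ToyType::Pointer, "", make(ToyType::SubstParm, "", ul, 0));
  std::vector<TypePtr> writtenArgs{sizeT};
  return {print(member), print(resugar(member, writtenArgs)),
          print(desugarOnce(sizeT))};
}
```

In this toy, `runDemo()` yields "unsigned long *" for the instantiated member type, "size_t *" once the written argument is propagated back in, and "unsigned long" for a single-step desugar of `size_t`.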

For good results, template argument deduction will also need to be able to deduce type sugar (and reconcile cases where the same type is deduced twice with different sugar).
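
As a small illustration of the "deduced twice with different sugar" situation, the sketch below deduces `T` from two arguments that spell the same type differently. `MySize`, `pickFirst` and `deducedTwice` are hypothetical names, and `MySize` merely stands in for an alias such as `size_t`.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias standing in for something like size_t.
using MySize = unsigned long;

template <typename T>
T pickFirst(T a, T) { return a; }

// T is deduced once from an argument spelled 'MySize' and once from an
// argument spelled 'unsigned long'. Both are the same canonical type, so
// deduction succeeds; only the sugar differs between the two deductions.
static_assert(
    std::is_same<decltype(pickFirst(MySize{}, 0ul)), unsigned long>::value,
    "both spellings deduce the same canonical type");

inline unsigned long deducedTwice() {
  MySize a = 1;
  unsigned long b = 2;
  return pickFirst(a, b);
}
```

The language treats the two deductions as identical. Which spelling a diagnostic should show for `T` in such a call is exactly the reconciliation question raised above, and this sketch does not answer it.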

Expected results

Diagnostics preserve type sugar even when accessing members of a template specialization. T<unsigned long> and T<size_t> are still the same type and the same template instantiation, but T<unsigned long>::type single-step desugars to unsigned long and T<size_t>::type single-step desugars to size_t.
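
The "same type, same instantiation" half of this can be checked in plain C++. This is a minimal sketch under stated assumptions: `MySize` is a hypothetical alias standing in for `size_t` (whose underlying type varies by platform), and the `instances` member and `sharesOneInstantiation` are added only for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias standing in for something like size_t.
using MySize = unsigned long;

template <typename T> struct Id {
  typedef T type;
  static int instances; // one object per distinct instantiation
};
template <typename T> int Id<T>::instances = 0;

// The alias does not create a new type or a new specialization.
static_assert(std::is_same<Id<MySize>, Id<unsigned long>>::value,
              "same template specialization");
static_assert(std::is_same<Id<MySize>::type, unsigned long>::value,
              "same member type as far as the language is concerned");

inline bool sharesOneInstantiation() {
  Id<MySize>::instances = 7;
  // Both spellings name the same static data member.
  return Id<unsigned long>::instances == 7 &&
         &Id<MySize>::instances == &Id<unsigned long>::instances;
}
```

The difference the project is after is therefore invisible to `std::is_same`; it would only show up in how the AST records the type and in how diagnostics spell it.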

Desirable skills

Good knowledge of clang API, clang’s AST, intermediate knowledge of C++.

Project type

Large

Mentors

@vvassilev, Richard Smith

Hi, @vvassilev! I’m a college student from China, and I’m very interested in this topic! I have read some material about it and have a basic understanding, though some things still confuse me. I would like to share some of my thoughts and also raise some questions here.

So the problem we are facing is that clang does not preserve type sugar when subsequently accessing members of a template instantiation, which produces bad diagnostic messages when something goes wrong. What we can do is basically “save” the type info in a “sugar type” and store it in the return type of the member function. Finally, we can substitute the corresponding type when we hit the member function call.

I guess the sugar type can be declared like:

class SugarType : public Type, public llvm::FoldingSetNode {

  QualType TemplateSpecializationType;
  QualType FunctionProtoType;

public:
  SugarType(QualType TemplateSpecializationTy, QualType FunctionProtoTy)
      : Type(Sugar, FunctionProtoTy, FunctionProtoTy->getDependence()),
        TemplateSpecializationType(TemplateSpecializationTy),
        FunctionProtoType(FunctionProtoTy) {}

  QualType getInnerType() const { return TemplateSpecializationType; }

  bool isSugared() const { return true; }
  QualType desugar() const { return getInnerType(); }

};

Though I have an approximate solution in mind, I still have a lot of questions.

  • When do we create the sugar node? Do we form the node during AST parsing?
  • What do “single-step desugar of this node” and “lazily create the desugared representation” mean? I’m not familiar with these technical terms…
  • Is performing member access on a class template specialization the only case we should deal with? Are there any other corner cases we should pay attention to?
  • Can you give more info about how clang deals with templates? Any docs or code references would be welcome.

I would appreciate it if you can give me more details about it. :^)

Best regards,
Jun Zhang

Hi @junaire, thanks for reaching out. This project has been proposed multiple times in GSoC. You can search this forum for more information but here are the relevant topics I found:

Thanks to the efforts of a recent contributor, @mizvekov, Clang has been moving in the direction of preserving type sugar. See:

In the context of the proposed project we need to take a step further and retain the information for typedefs, etc.

Hi, @vvassilev, thanks for taking a look at this! I have already looked at these materials, and I think [llvm-dev] GSOC Projects should be the most helpful one.

I’m glad to see there are already folks working on this! These patches are great references and I’ll take a look at them.

BTW, do you think it is possible to send some warm-up patches for this? If so, can you recommend a direction?

Best regards,
Jun

Hi @junaire, you’re welcome. If you still have concrete questions please don’t hesitate to post them here.

As for evaluation/warm-up, this is usually what I personally prefer:

If you have interest in working on the project there is a list of things to do
in order to maximize your chances to get selected:

  1. Contact the mentors and express interest in the project. Make sure you attach
    your CV;
  2. Download the source code of the project, build it and run some basic examples;
  3. Start familiarizing yourself with the codebase;
  4. If you have questions you can always contact a mentor or, better, write here if the questions are relevant to this post.

The mentors are interested in working with all candidates but unfortunately the
rules allow only one to be selected. There are a few tasks which give bonus
points to a candidate’s application:

  • Submit a valid bug – demonstrates that the candidate has completed steps 2
    and 3 from the previous section.
  • Fix a bug – demonstrates the technical skills of the candidate and shows
    he/she can work independently on the project. The mentors can suggest looking
    into these good first issues.
    Fixing a couple of issues may be enough to become a successful candidate.

PS:
I’d like to add this to the description of the project, but apparently the Discourse edit window has closed. @akorobeynikov, can we at least allow an unlimited edit window for authors of GSoC posts?

Hi @vvassilev, thanks for your feedback, I appreciate it. As for the CV or proposal, I will post it after I finish polishing it. And may I ask how to contact Richard? It looks like he doesn’t have a Discourse account. Should I try Discord?

I should note that we are at a very early stage of GSoC. I am not even sure if Google has reviewed and selected the mentoring organizations yet. I’d encourage you to engage early (as you did) but work on a proposal after we hear back from Google about their list of selected organizations (that’s scheduled to happen on the 7th of March).

I think Discord is a good way to contact Richard; otherwise you can send an email to me and I can cc him.

Hi, @vvassilev, I have sent an email to you and Richard. Could you please take a look when you get a chance? (I found your email addresses in the commit history :wink:)