Late-parsing function templates in tools

Hello all,

In include-what-you-use, we need to parse and AST-visit templates
whether they're instantiated or not, in order to record all uses. This
is somewhat at odds with `-fdelayed-template-parsing` for MSVC
compatibility. Uninstantiated function templates are never visited in
this mode. So we have a hack, suggested here [1] that explicitly
parses these function templates, implemented here [2].

This works surprisingly well, but we've run into a corner case
triggered by llvm/include/llvm/Support/MathExtras.h. When run under
include-what-you-use on Windows, the GNU builtins are still available,
so we end up trying to parse the equivalent of the following code:

    template <class T>
    int myctz(unsigned int value) {
     return __builtin_ctz(value);

Essentially a GNU builtin called from a function template. This causes
the call to LateTemplateParser to crash because parsing the builtin
needs the TUScope member to be non-null. More details in a patch by
@zjturner here [3].

Some open questions borrowed from that review:

Can we seed Sema with a valid `TUScope` before invoking
`LateTemplateParser`, and if so, how?

Or is this because we invoke the parser multiple times? Can we avoid
that somehow?

For us, it sure would be nice if Clang had a more proofed API for
doing this somewhere, but I'm not sure where it belongs or how it
might be implemented.

Many thanks for any input,
- Kim


Ping! I would love some input on this.


  • Kim

It sounds like this issue would be fixed if we did what you are doing (parsed all late parsed function templates) within Sema::ActOnEndOfTranslationUnit, is that right?

If so, that would be a pretty reasonable extension for tools.

No, it doesn’t do that, but it’s right where we’d go through and parse all the late-parsed templates for tools.

Honestly, I think delayed template parsing may be nearing the end of its useful life. It may be causing more problems than it solves for users. We should re-examine what MSVC allows in its default build modes and see if we can get it turned off by default. Most Windows system headers these days are supposed to be compilable with /permissive-, which is more conforming.

If that's the case, that would be a time for celebration for portable code!

But then again, there are probably several active VS2012 shops still
out there; when I was still using MSVC, I found the ecosystem was
often very difficult to upgrade (often precompiled non-source
third-parties, several different build configurations, etc, etc). So
at least from IWYU's perspective, it might be a little premature to
kill off support for delayed template parsing.

- Kim

I dabbled some more with this.

IWYU built with Clang trunk still segfaults on this input:

template<class T>
int myctz(unsigned int value) {
  return __builtin_ctz(value);

backtrace [1]

I also found an interesting test case of ours:

IWYU short-circuits analysis for files it won't rewrite (by default,
everything but the main source file and its associated header [x.cpp
and x.h]). So if we were to move this logic into Sema, IWYU would need
some kind of filtering feature to be able to use it. I'm not sure
exactly what a tooling API would look like, but it might need to be
two-pass (give me all candidates for late-parsing, filter externally,
then execute on a selection) or allow for a filter to be registered.

- Kim

[1] #0 0x000055555611825c in clang::Scope::getEntity (this=0x0) at
#1 0x000055555627a6ed in clang::Sema::PushOnScopeChains
(this=0x5555590df860, D=0x5555590eac28, S=0x0, AddToContext=true) at
#2 0x000055555627d2cb in clang::Sema::LazilyCreateBuiltin
(this=0x5555590df860, II=0x5555590b6a08, ID=282, S=0x0,
ForRedeclaration=false, Loc=...)
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Sema/SemaDecl.cpp:2021
#3 0x00005555567b70a1 in LookupBuiltin (S=..., R=...) at
#4 0x00005555567b775c in LookupDirect (S=..., R=...,
DC=0x5555590972c8) at
#5 0x00005555567b7c9c in CppNamespaceLookup (S=..., R=...,
Context=..., NS=0x5555590972c8, UDirs=...) at
#6 0x00005555567b8cd3 in clang::Sema::CppLookupName
(this=0x5555590df860, R=..., S=0x5555590e9c80) at
#7 0x00005555567ba70a in clang::Sema::LookupName
(this=0x5555590df860, R=..., S=0x555559102a50,
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Sema/SemaLookup.cpp:1830
#8 0x0000555556275cb9 in clang::Sema::getTypeName
(this=0x5555590df860, II=..., NameLoc=..., S=0x555559102a50,
SS=0x7fffffffc8d0, isClassName=false, HasTrailingDot=false,
    IsCtorOrDtorName=false, WantNontrivialTypeSourceInfo=true,
IsClassTemplateDeductionContext=true, CorrectedII=0x0) at
#9 0x0000555555fc6aec in
(this=0x5555590e4c90, SS=..., IsNewScope=true)
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/Parser.cpp:1779
#10 0x0000555555fc69c0 in clang::Parser::TryAnnotateTypeOrScopeToken
(this=0x5555590e4c90) at
#11 0x0000555556029bee in clang::Parser::ParseCastExpression
(this=0x5555590e4c90, isUnaryExpression=false,
isAddressOfOperand=false, NotCastExpr=@0x7fffffffd2c7: false,
    isTypeCast=clang::Parser::NotTypeCast, isVectorLiteral=false) at
#12 0x0000555556027886 in clang::Parser::ParseCastExpression
(this=0x5555590e4c90, isUnaryExpression=false,
isAddressOfOperand=false, isTypeCast=clang::Parser::NotTypeCast,
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/ParseExpr.cpp:531
#13 0x0000555556026129 in clang::Parser::ParseAssignmentExpression
(this=0x5555590e4c90, isTypeCast=clang::Parser::NotTypeCast)
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/ParseExpr.cpp:174
#14 0x0000555556025eb1 in clang::Parser::ParseExpression
(this=0x5555590e4c90, isTypeCast=clang::Parser::NotTypeCast) at
#15 0x0000555556089d63 in clang::Parser::ParseReturnStatement
(this=0x5555590e4c90) at
#16 0x0000555556083c24 in
(this=0x5555590e4c90, Stmts=..., Allowed=clang::Parser::ACK_Any,
TrailingElseLoc=0x0, Attrs=...)
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/ParseStmt.cpp:266
#17 0x0000555556083465 in clang::Parser::ParseStatementOrDeclaration
(this=0x5555590e4c90, Stmts=..., Allowed=clang::Parser::ACK_Any,
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/ParseStmt.cpp:111
#18 0x0000555556086856 in clang::Parser::ParseCompoundStatementBody
(this=0x5555590e4c90, isStmtExpr=false) at
#19 0x000055555608a195 in clang::Parser::ParseFunctionStatementBody
(this=0x5555590e4c90, Decl=0x5555590eab08, BodyScope=...)
    at /home/kgrasman/code/llvm-trunk/llvm/tools/clang/lib/Parse/ParseStmt.cpp:1972
#20 0x000055555609a6f8 in clang::Parser::ParseLateTemplatedFuncDef
(this=0x5555590e4c90, LPT=...) at
#21 0x000055555609a0c5 in clang::Parser::LateTemplateParserCallback
(P=0x5555590e4c90, LPT=...) at
#22 0x00005555557eaff2 in
(this=0x55555907d400, decl=0x5555590972a8)