Type source info proposal


This is a more detailed proposal for adding source info about types.

The basic idea is that source information about a "declarator type" (a type coming out of declarator parsing) will be stored in a flat contiguous memory block
that will be interpreted based on the type. For example, for:

MyType **

we have a pointer -> pointer -> typedef and the flat memory block will contain

-source location for the outer pointer star
-source location for the inner pointer star
-source location for typedef name

Source information will contain more than just source locations. For example, for:

MyType * x[N+1]

the flat block will contain:

-source loc for '['
-source loc for ']'
-Expr* for size expression
-source loc for '*'
-source loc for typedef name
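
To make the layout concrete, here is a minimal sketch of such a flat block for "MyType * x[N+1]". Everything in it (SourceLoc, FlatBlock, the *Info records, collectLocs, and the column offsets used as locations) is a hypothetical stand-in for illustration and is not taken from the attached patches. The point is only that the block itself carries no tags: the reader walks the type structure and uses each node's kind to decide how to interpret the next bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins; the real patches use Clang's own types.
struct SourceLoc { unsigned Raw; };
struct Expr;  // opaque size expression

enum class TypeKind { Pointer, Array, Typedef };

// One record per kind of type node.
struct PointerInfo { SourceLoc StarLoc; };
struct ArrayInfo { SourceLoc LBracketLoc; SourceLoc RBracketLoc; Expr *SizeExpr; };
struct TypedefInfo { SourceLoc NameLoc; };

// Untagged contiguous storage; records are appended back to back.
class FlatBlock {
  std::vector<unsigned char> Bytes;

public:
  template <typename T> void push(const T &Rec) {
    static_assert(std::is_trivially_copyable<T>::value, "records are raw bytes");
    const unsigned char *P = reinterpret_cast<const unsigned char *>(&Rec);
    Bytes.insert(Bytes.end(), P, P + sizeof(T));
  }

  // Copies the record out with memcpy so alignment is never an issue.
  template <typename T> T read(std::size_t &Offset) const {
    T Rec;
    std::memcpy(&Rec, Bytes.data() + Offset, sizeof(T));
    Offset += sizeof(T);
    return Rec;
  }
};

// Builds the block for "MyType * x[N+1]" using column offsets as locations.
FlatBlock buildExampleBlock() {
  FlatBlock B;
  B.push(ArrayInfo{{10}, {14}, nullptr});  // '[' , ']' , size Expr* (omitted)
  B.push(PointerInfo{{7}});                // '*'
  B.push(TypedefInfo{{0}});                // typedef name
  return B;
}

// Walks the type structure (outermost first) and interprets the block with it.
std::vector<unsigned> collectLocs(const std::vector<TypeKind> &Shape,
                                  const FlatBlock &Block) {
  std::vector<unsigned> Out;
  std::size_t Off = 0;
  for (TypeKind K : Shape) {
    switch (K) {
    case TypeKind::Array: {
      ArrayInfo A = Block.read<ArrayInfo>(Off);
      Out.push_back(A.LBracketLoc.Raw);
      Out.push_back(A.RBracketLoc.Raw);
      break;
    }
    case TypeKind::Pointer:
      Out.push_back(Block.read<PointerInfo>(Off).StarLoc.Raw);
      break;
    case TypeKind::Typedef:
      Out.push_back(Block.read<TypedefInfo>(Off).NameLoc.Raw);
      break;
    }
  }
  return Out;
}
```

Calling collectLocs with the shape {Array, Pointer, Typedef} on the block from buildExampleBlock() yields the locations in the same order as the list above.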

For function types:

void (*f1)(int *x, int *y); #1
void f1(int *x, int *y); #2

type source info will contain the ParmVarDecls.
We can probably have the FunctionDecl created from #2 share this ParmVarDecls array with type source info.

Now, where should this information be stored?

One idea was having a special Type subclass that will also keep source info for a declarator and will be non-canonical (like TypedefType).
It would also enter the type system, in that types of parameters of a FunctionType will be this special Type (since a parameter comes out of a declarator).

This was not a good idea because of 2 issues:

1) Types start being created & uniqued unnecessarily, e.g.:

int *x; #1
int **y = &x; #2

For #1 we create a new PointerType to keep source info, and for the "&x" expression we also create & unique another PointerType, instead of using the unique "pointer to pointer to int" type.

2) Types change because of semantic analysis: they decay, merge, change because of attributes, etc. You can't really have source info "hanging off" a Type.

What I propose instead is that type source info is decoupled from the actual Type that the declarator resolved to;
type source info should be stored into the Decls (FieldDecl, VarDecl, etc.) and Exprs that contain types (like SizeofAlignofExpr).

We call "type source info" "DeclaratorInfo" and we create a new Decl subclass "DeclaratorDecl", which contains a DeclaratorInfo* pointer.
It will enter the hierarchy like this:

ValueDecl -
         DeclaratorDecl -
               FieldDecl -
               VarDecl -
               FunctionDecl -

That way, EnumConstantDecl (which has no use for DeclaratorInfo* at all) will not change size.
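
As a rough illustration of the size argument, the sketch below models the hierarchy with heavily simplified classes. The members, constructors and the getInfoIfAny helper are assumptions made up for this example; the real Decl classes carry much more state. It only shows that the extra pointer lives in DeclaratorDecl, so a class deriving directly from ValueDecl does not pay for it.

```cpp
#include <cassert>

struct DeclaratorInfo {};  // opaque here; holds the type source info

// Simplified stand-in for the real QualType.
struct QualType { const void *Ty = nullptr; };

class ValueDecl {
public:
  enum Kind { EnumConstant, Field, Var, Function };

  ValueDecl(Kind K, QualType T) : DeclKind(K), DeclType(T) {}
  Kind getKind() const { return DeclKind; }
  QualType getType() const { return DeclType; }

private:
  Kind DeclKind;
  QualType DeclType;
};

// Only declarations that come out of a declarator pay for the pointer.
class DeclaratorDecl : public ValueDecl {
public:
  DeclaratorDecl(Kind K, QualType T, DeclaratorInfo *DI)
      : ValueDecl(K, T), DInfo(DI) {}
  DeclaratorInfo *getDeclaratorInfo() const { return DInfo; }

private:
  DeclaratorInfo *DInfo;
};

class VarDecl : public DeclaratorDecl {
public:
  VarDecl(QualType T, DeclaratorInfo *DI) : DeclaratorDecl(Var, T, DI) {}
};

// Derives from ValueDecl directly, so it has no DeclaratorInfo* member.
class EnumConstantDecl : public ValueDecl {
public:
  explicit EnumConstantDecl(QualType T) : ValueDecl(EnumConstant, T) {}
};

// Hypothetical helper: returns the source info if the decl can carry any.
DeclaratorInfo *getInfoIfAny(const ValueDecl &D) {
  if (D.getKind() == ValueDecl::EnumConstant)
    return nullptr;
  return static_cast<const DeclaratorDecl &>(D).getDeclaratorInfo();
}
```

On common ABIs sizeof(EnumConstantDecl) equals sizeof(ValueDecl) in this model, while sizeof(VarDecl) grows by the DeclaratorInfo* pointer.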

To read a DeclaratorInfo, you use "TypeLoc" wrappers to get at the information, e.g.:

DeclaratorDecl *DD = cast<DeclaratorDecl>(ASTLoc.getDecl());
DeclaratorInfo *DInfo = DD->getDeclaratorInfo();
TypeLoc TL = DInfo->getTypeLoc();

if (FunctionLoc *FTL = dyn_cast<FunctionLoc>(&TL)) {
  // Print the paren locations of the function declarator.
  FTL->getLParenLoc().print(OS, SrcMgr);
  FTL->getRParenLoc().print(OS, SrcMgr);
} else if (ArrayLoc *ATL = dyn_cast<ArrayLoc>(&TL)) {
  // Print the bracket locations of the array declarator.
  ATL->getLBracketLoc().print(OS, SrcMgr);
  ATL->getRBracketLoc().print(OS, SrcMgr);
}

Now, for a given declarator we have both a QualType and a DeclaratorInfo*, and we want to pass both to the Parser, while
the Parser operates on single pointers (e.g. it will store the type pointer that it got from Sema into an annotation token).
For that we create a "special" Type subclass ("LocInfoType") that keeps the DeclaratorInfo* and whose *only* purpose is to be passed back and forth between Parser & Sema;
it will *not* participate in the type system semantics at all.

Currently, LocInfoType gets created by ASTContext and consumes memory, but since it is only "transient", intended for the Parser/Sema interaction,
we can do clever stuff like destroy/cache them when Sema knows that a declaration is finished so that they don't consume memory.
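
The round trip through a single pointer can be sketched as follows. This is a simplified model under stated assumptions, not the code from typeinfo4.patch: the Type class is reduced to a kind tag, a plain Type pointer stands in for QualType, and Context, wrapForParser, unwrapFromParser and declarationFinished are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct DeclaratorInfo { unsigned NameLoc; };  // stand-in for type source info

class Type {
public:
  enum TypeClass { Builtin, Pointer, LocInfo };
  explicit Type(TypeClass TC) : TC(TC) {}
  virtual ~Type() = default;
  TypeClass getTypeClass() const { return TC; }

private:
  TypeClass TC;
};

// Carrier only: pairs the real type with its source info for the Parser.
class LocInfoType : public Type {
public:
  LocInfoType(const Type *T, DeclaratorInfo *DI)
      : Type(LocInfo), Underlying(T), DInfo(DI) {}
  const Type *getUnderlyingType() const { return Underlying; }
  DeclaratorInfo *getDeclaratorInfo() const { return DInfo; }

private:
  const Type *Underlying;
  DeclaratorInfo *DInfo;
};

using OpaqueTypePtr = void *;  // the single pointer the Parser stores

// Owns the transient carriers (a stand-in for the ASTContext's role).
class Context {
  std::vector<std::unique_ptr<LocInfoType>> Transient;

public:
  OpaqueTypePtr wrapForParser(const Type *T, DeclaratorInfo *DI) {
    Transient.push_back(std::unique_ptr<LocInfoType>(new LocInfoType(T, DI)));
    return static_cast<Type *>(Transient.back().get());
  }
  // Once a declaration is finished the carriers are no longer needed.
  void declarationFinished() { Transient.clear(); }
  std::size_t numTransient() const { return Transient.size(); }
};

// Sema side: recover the real type and, if present, the source info.
const Type *unwrapFromParser(OpaqueTypePtr P, DeclaratorInfo **DIOut = nullptr) {
  const Type *T = static_cast<const Type *>(P);
  if (DIOut)
    *DIOut = nullptr;
  if (T->getTypeClass() == Type::LocInfo) {
    const LocInfoType *LIT = static_cast<const LocInfoType *>(T);
    if (DIOut)
      *DIOut = LIT->getDeclaratorInfo();
    return LIT->getUnderlyingType();
  }
  return T;  // a plain type that was passed without source info
}
```

Because unwrapping always strips the carrier, the rest of semantic analysis only ever sees the underlying type, which is what keeps the carrier out of the type system.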

OK, that is the high-level overview. I've attached incremental patches of this implementation:

typeinfo1.patch : Introduce TypeLoc and DeclaratorInfo
typeinfo2.patch : Introduce DeclaratorDecl and pass DeclaratorInfo through the Decl & Sema interfaces.
typeinfo3.patch : Actually build the DeclaratorInfo out of a parsed declarator
typeinfo4.patch : Introduce LocInfoType
typeinfo5.patch : Pass type source info through the Parser using LocInfoType

Currently there is no flag to enable/disable type source info, but one can be added easily. In general, I would strongly prefer that we always keep type source info
since, apart from getting a complete AST, we will also be able to get rid of the "type specifier start location" SourceLocation that Field/Var/FunctionDecls have,
get rid of ConstantArrayWithExprType, and maybe simplify other things that I'm missing.

I'd really appreciate any feedback that you may have on the above; feel free to ask me any questions.


Oops, forgot to attach the patches. Here they are in 2 parts so that cfe-dev accepts them immediately:

First part:

typeinfo1.patch (24.9 KB)

typeinfo2.patch (74.9 KB)

Second part:

typeinfo3.patch (8.26 KB)

typeinfo4.patch (5.45 KB)

typeinfo5.patch (18.4 KB)