Wouldn't the better solution be to have some mode which enables
ElaboratedTypes even in C?
Of course this is a possibility, but IMHO I don't think it is a good
idea to increase the memory footprint for a need that is not (or not
yet) materialized.
Let's be clear here: the memory footprint is going to increase with this,
just as it's increased with all the other source range changes you've made
recently. I expect that the increase in memory footprint from allocating
a uniqued ElaboratedType per RecordType in C is going to be quite small
when compared with storing all the extra source locations. One of the nice
things about a mode to enable ElaboratedTypes is that we have the
option of disabling *both* of those costs.
Having precise source ranges is a design goal for clang (and a must for
several applications, ours included), so we try, as always, to achieve
this while keeping the memory needed as small as possible.
Having precise source locations is certainly a desirable goal, but it does
need to be balanced against other goals like keeping memory usage
reasonable. For example, we could theoretically store the locations of
semicolons and commas, but we don't, because we assume it's possible to
discover them by re-lexing from a known location.
I'm not saying that elaborated types in C are necessarily in this category,
particularly since (unlike commas) they *precede* the known location,
but perfect source location fidelity is, in the end, one goal among many.
Case in point, I understood your reasoning for adding InnerLocStart to
DeclaratorDecl, but that was a noticeable inflation in AST size for the
benefit of a fairly small number of consumers.
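To make the re-lexing point concrete, here is a minimal sketch of finding a
semicolon or comma by scanning forward from a known offset. It is a toy over
a raw source buffer, not clang's lexer; findPunctAfter and npos are
hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper, not a clang API.
constexpr std::size_t npos = std::string_view::npos;

// Starting at offset `from`, skip whitespace and comments. Return the offset
// of `punct` if it is the first real character found, otherwise npos.
std::size_t findPunctAfter(std::string_view src, std::size_t from, char punct) {
  std::size_t i = from;
  while (i < src.size()) {
    char c = src[i];
    if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
      ++i;                                      // skip whitespace
    } else if (src.compare(i, 2, "//") == 0) {
      i = src.find('\n', i);                    // skip a line comment
      if (i == npos) return npos;
    } else if (src.compare(i, 2, "/*") == 0) {
      std::size_t end = src.find("*/", i + 2);  // skip a block comment
      if (end == npos) return npos;
      i = end + 2;
    } else {
      return c == punct ? i : npos;             // first real character decides
    }
  }
  return npos;
}
```

A forward scan like this only works because the punctuation follows the
stored location. A struct keyword sits before the location we keep, which
is why it may not fall into the same category.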
You don't see changing the assumption, made throughout the compiler,
that a TagType is always canonical as having any cost?
I was under the impression that getting the canonical type of a type had
a negligible cost. Note, however, that alternative 2b does not change
this assumption.
There are several core algorithms (like getAs) which change depending on
whether a type can be sugar, although maybe this proposal's changes
would be fairly mild on them. Having to worry about multiple RecordTypes
per canonical decl would introduce some non-trivial costs, though.
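As a rough illustration of why sugar changes such algorithms, here is a toy
model. The names (Kind, Type, isRecordNoSugar, getAsRecord) are hypothetical
and only loosely mirror clang; this is a sketch of the idea, not the real
getAs.

```cpp
// Toy type node: `inner` is the next layer under a sugar node, or nullptr
// when the node itself is canonical.
enum class Kind { Record, Elaborated };

struct Type {
  Kind kind;
  const Type *inner;
};

// If a record type can never be wrapped in sugar, checking the node's own
// kind is sufficient.
inline bool isRecordNoSugar(const Type *t) { return t->kind == Kind::Record; }

// Once a node may be sugar, the lookup has to peel the layers first.
inline const Type *getAsRecord(const Type *t) {
  while (t->inner) t = t->inner;
  return t->kind == Kind::Record ? t : nullptr;
}
```

Code written against the first form silently gives the wrong answer on a
wrapped node, so every such check would need auditing.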
Having read your preferences, I can make another proposal:
Alternative 3:
C language: struct S is a RecordType (always canonical)
C++ language: struct S is an ElaboratedType (always non-canonical) with
an inner RecordType (always canonical)
This would be exactly like it is now, but KeywordLoc would be moved
from ElaboratedTypeLoc to RecordTypeLoc so as to be available for both C
and C++.
This would be quite expensive for C++, which uses unelaborated class
types a lot — in addition to the obvious uses, recall that we store a
TypeLoc for every constructor or destructor name.
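A back-of-the-envelope sketch of that cost, assuming a source location is
stored as a 32-bit value. The struct layouts and the extraBytes helper are
hypothetical, not clang's actual TypeLoc data.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: one source location occupies 32 bits.
using Loc = std::uint32_t;

// Hypothetical per-TypeLoc payloads for a record type.
struct RecordLocNow  { Loc nameLoc; };                  // as it is today
struct RecordLocAlt3 { Loc keywordLoc; Loc nameLoc; };  // under alternative 3

// Extra bytes if every record TypeLoc carried a keyword location, whether
// or not a keyword was actually written.
constexpr std::size_t extraBytes(std::size_t recordTypeLocs) {
  return recordTypeLocs * (sizeof(RecordLocAlt3) - sizeof(RecordLocNow));
}
```

Under these assumptions every unelaborated use of a class name, including
each constructor and destructor name, pays for a location it never uses.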
John.