Out of date IdentifierInfo instances associated with macro expansions loaded from AST files

Attached is a patch against Clang trunk to correct IdentifierInfo associations for macro definitions and expansions read from AST files when the name of the macro matches the name of another identifier (such as a keyword).

I'm not particularly confident that the patch represents the best way to correct this issue, so further review and comments would be appreciated. No regressions occurred when running the Clang test suite with this change in place.

For testing, I've been using this code that defines a macro with the same name as a keyword.

$ cat t.c
#define register
void f(void) {
   register int i;
}

The code is then compiled and emitted to a .ast file with a detailed preprocessing record. An in-house utility is then used to read it.

$ clang -emit-ast -Xclang -detailed-preprocessing-record t.c

The problem occurs when attempting to retrieve macro information associated with macro expansions within a source range. I don't have a standalone test case I can share, but the relevant code looks similar to this:

void f(Preprocessor &preprocessor, const SourceRange &range) {
   PreprocessingRecord *preprocessing_record
     = preprocessor.getPreprocessingRecord();

   std::pair<
     PreprocessingRecord::iterator,
     PreprocessingRecord::iterator> preproc_range
     = preprocessing_record->getPreprocessedEntitiesInRange(range));

   PreprocessingRecord::iterator preproc_current = preproc_range.first;
   PreprocessingRecord::iterator preproc_end = preproc_range.second;
   for (; preproc_current != preproc_end; ++preproc_current) {
     MacroExpansion* macro_expansion
       = dyn_cast<MacroExpansion>(*preproc_current));
     if (!macro_expansion)
       continue;

     const IdentifierInfo *macro_id = macro_expansion->getName();
     const MacroInfo *macro_info = preprocessor.getMacroInfo(
       const_cast<IdentifierInfo*>(macro_id));
     if (!macro_info && macro_id->hadMacroDefinition()) {
       // The identifier does not correspond to a currently
       // (at end of TU) defined macro. Get the most recently
       // #undef'd definition.
       macro_info = preprocessor.getMacroInfoHistory(
         const_cast<IdentifierInfo*>(macro_id));
     }
     assert(macro_info &&
       "Encountered a macro expansion with no definition history");
   }
}

The assert at the end fails for the test case above. The reason is that the IdentifierInfo instance retrieved from the MacroExpansion instance is an out of date instance (macro_id->OutOfDate == true) corresponding to the "register" keyword and neither hasMacroDefinition() nor hadMacroDefinition() return true for it (Preprocessor::getMacroInfo() return NULL because macro_id->hasMacroDefinition() returns false).

The patch corrects this by changing ASTReader::DecodeIdentifierInfo() to check if calls to IdentifierTable::get() return an out of date IdentifierInfo instance and, if so, to explicitly retrieve an IdentifierInfo instance from the external identifier lookup instance associated with the PP.getIdentifierTable() IdentifierTable instance. This causes MacroExpansion (and MacroDefinition) instances to be deserialized with the correctly preserved IdentifierInfo associations.

I had also tried updating IdentifierTable::get() to check if it is about to return an out of date IdentifierInfo instance and, if so, to check for an external instance and preferentially return that. This change also resolved the problem for me, but triggered a single regression in the Clang test suite (test/PCH/pch__VA_ARGS__.c). I did not research further why that test case failed and moved on to a more localized solution.

Tom.

clang-module-identifier-out-of-date.patch (1 KB)

Could someone please review/comment/commit this patch?

Tom.

clang-module-identifier-out-of-date.patch (1 KB)