Location (in Source Code) of String Literal in MacroExpansion

https://gist.github.com/T35R6braPwgDJKq/6439fda090e4ee8440d726f8dafa4dbb#file-mectest-cpp

In the gist above, I modified the MacroExpansionContext unit test.

The relevant code:

//--------------

TEST_F(MacroExpansionContextTest, Custom) {
  // From the GCC website.
  const auto Ctx = getMacroExpansionContextFor(R"code(
  #define CONCATMACRO(parm) L##parm
  int f(wchar_t *b){}
  int main()
  {
    int a= f(CONCATMACRO("TEST"));
    int b= f(L"TEST");
    return 0;
  }
  )code");

  Ctx->dumpExpansionRanges();
  //> :6:14, :6:33

  Ctx->dumpExpandedTexts();
  //> :6:14 -> 'L"TEST"'

  printf("Exp: %s\n",Ctx->getExpandedText(at(6, 14)).getValue());
  //Exp: L"TEST"

  printf("Org: %s\n",Ctx->getOriginalText(at(6, 14)).getValue());
  //Org: CONCATMACRO("TEST"));
  // int b= f(L"TEST");
  // return 0;
  //}

  StringRef sourceText = clang::Lexer::getSourceText(CharSourceRange(SourceRange(at(6, 14),at(6, 33)),false),SourceMgr, LangOpts);
  printf("sourceText: %s\n",sourceText);
  // sourceText: CONCATMACRO("TEST"));
  // int b= f(L"TEST");
  // return 0;
  // }
}

//--------------

I am interested in getting the range of "TEST" in the source text. Since CONCATMACRO is preprocessed to L"TEST", I thought the MacroExpansionContext would help.

It reports the expansion range as :6:14, :6:33, which is CONCATMACRO("TEST").

What I need is: 6:26, 6:32

Obviously I could just calculate a bit like:

range.begin (14) + macronameLen (11) +1 (opening parenthesis) = 26

to

range.end (33) - 1 (closing parenthesis) = 32

But would that be reliable and is there no other way?
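For illustration, the offset arithmetic above can be written as a small standalone helper. This is only a sketch on plain column numbers, not a Clang API: `ColumnRange` and `argumentColumns` are hypothetical names, columns are 1-based with an exclusive end (as in the dump), and it assumes a single-argument call of the form NAME(arg) on one line with no whitespace or comments around the parentheses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper type: 1-based columns, End is exclusive.
struct ColumnRange {
  std::size_t Begin;
  std::size_t End;
};

// Derives the columns of the single macro argument from the expansion range.
// Assumption: the call is written exactly as NAME(arg), so the argument
// starts right after "NAME(" and stops right before ")".
ColumnRange argumentColumns(ColumnRange Expansion, const std::string &MacroName) {
  // The range must at least hold the name and both parentheses.
  assert(Expansion.End >= Expansion.Begin + MacroName.size() + 2);
  // Skip the macro name and the opening parenthesis.
  const std::size_t Begin = Expansion.Begin + MacroName.size() + 1;
  // Drop the closing parenthesis.
  const std::size_t End = Expansion.End - 1;
  return {Begin, End};
}
```

Calling `argumentColumns({14, 33}, "CONCATMACRO")` gives 26 and 32 for the example. The assumptions are exactly what makes this fragile: whitespace such as `CONCATMACRO ( "TEST" )`, several arguments, or a call spanning multiple lines would all break it.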

Furthermore: Why do getSourceText/getOriginalText print the whole rest of the source code and not just from (6,14) to (6,33)?

Hi Chiasa,

For the example you attached, the MacroExpansionContext seems to work as I intended.

It records which range gets replaced by which tokens at the end of preprocessing, thus we get

CONCATMACRO("TEST") -> L"TEST"

About Lexer::getSourceText(), I’m not sure; I haven’t used it. The at() function was designed as a helper for defining the unit tests. I might have misused something when I created this code and the tests, so let me know!

Balazs

Oh, now I see why you were puzzled!

printf() expects a null-terminated string ("%s"), but you pass an llvm::StringRef instead, which is not null-terminated!

Thus, you end up with undefined behavior, which in this case prints until the end of the file.
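A minimal sketch of a length-aware print, in case it helps. To keep it self-contained, `std::string_view` stands in for `llvm::StringRef` (both are a pointer plus a length), and `formatLabeled` is a hypothetical helper name, not something from LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// std::string_view stands in for llvm::StringRef here: a pointer plus a
// length, with no guarantee of a terminating '\0'.
std::string formatLabeled(const char *Label, std::string_view Text) {
  // "%.*s" takes the length as an int argument, so formatting stops after
  // Text.size() characters instead of running on to the next '\0'.
  const int Len = static_cast<int>(Text.size());
  const int Needed =
      std::snprintf(nullptr, 0, "%s: %.*s\n", Label, Len, Text.data());
  assert(Needed >= 0 && "formatting failed");
  // One extra slot for the terminator that snprintf always writes.
  std::string Out(static_cast<std::size_t>(Needed) + 1, '\0');
  std::snprintf(Out.data(), Out.size(), "%s: %.*s\n", Label, Len, Text.data());
  Out.resize(static_cast<std::size_t>(Needed));
  return Out;
}
```

With a real StringRef S, the same idea is `printf("%.*s\n", (int)S.size(), S.data())`. Copying first with `S.str().c_str()` or streaming with `llvm::outs() << S` should work as well.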

I hope this is it!

Balazs

Thanks, you were right; that fixed the minor printing issue.

But the actual problem of string literals in macros still persists. Maybe someone else has an idea on that. The information isn’t within reach of your MacroExpansionContext/MacroExpansionRangeRecorder, is it?

Just a short follow-up on that. I played a bit with your code, and the needed information seems to be available in the MacroExpands callback:

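void MacroExpands(const Token &MacroName, const MacroDefinition &MD, SourceRange Range, const MacroArgs *Args) override {
  // Args is null for object-like macros, so guard before using it.
  if (!Args)
    return;
  for (unsigned i = 0; i < Args->getNumMacroArguments(); ++i) {
    // Print begin:end of the first token of each unexpanded argument.
    // The output stream is an assumption; SM is the SourceManager.
    llvm::errs() << Args->getUnexpArgument(i)->getLocation().printToString(SM) << ":"
                 << Args->getUnexpArgument(i)->getEndLoc().printToString(SM) << "\n";
  }
}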

I went ahead and basically copied your code in order to add a new collection for unexpanded arguments.

In contrast to your code, the tokenWatcher does not make sense for this, since it probably lexes the preprocessed, hence already expanded, source code.

Thanks for your work.