[PATCH] preprocessor pragma push_macro support

Hi,

This patch implements the preprocessor #pragma push_macro/pop_macro
functionality (a microsoft extension but also supported by gcc). This
is required to make clang handles Windows/VS headers correctly. This
is my first contribution to clang and the C/C++ preprocessor is quite
complex so please review my patch.

My interest is to use clang to do code analysis on Visual Studio vcprojs.

pragma_push.patch (4.5 KB)

Wow, thanks for implementing this. It was on my list of things to do
:P. I'll take a look as soon as I can.

- Michael Spencer

It looks like some changes are missing from this patch. Mainly the
changes to Preprocessor.cpp

- Michael Spencer

yes sorry

push_macro.patch (9.89 KB)

Overall the patch looks good. HandlePragmaPushMacro and
HandlePragmaPopMacro seem to share a lot of code and you may consider
refactoring the parsing and lookup into another function. There are
some things that I'm not sure about and will have to be checked by
others.

It passes the test I wrote while going through the MS extensions. I've
attached that test which you should add to your patch.

One other thing is that clang currently warns about macro
redefinitions after a push. It would be nice if it didn't on the first
redefinition after a push, as you pushed it so you could redefine it
:P.

Now for some inline comments.

Index: include/clang/Basic/DiagnosticLexKinds.td

--- include/clang/Basic/DiagnosticLexKinds.td (revision 109329)
+++ include/clang/Basic/DiagnosticLexKinds.td (working copy)
@@ -230,6 +230,12 @@
  "pragma comment requires parenthesized identifier and optional string">;
def err_pragma_message_malformed : Error<
  "pragma message requires parenthesized string">;
+def err_pragma_push_macro_malformed : Error<
+ "pragma push_macro requires a parenthesized string">;
+def err_pragma_pop_macro_malformed : Error<
+ "pragma pop_macro requires a parenthesized string">;
+def warn_pragma_pop_macro_no_push : Warning<
+ "#pragma pop_macro : '%0' no previous #pragma push_macro for this identifier">;

This may be better worded as "pragma pop_macro could not pop '%0', no
matching push_macro", but that's really up to personal preference and
whatnot. I just matched what warn_pragma_diagnostic_clang_cannot_ppp
says for consistency.

def warn_pragma_message : Warning<"%0">;
def warn_pragma_ignored : Warning<"unknown pragma ignored">,
   InGroup<UnknownPragmas>, DefaultIgnore;
Index: lib/Lex/Pragma.cpp

--- lib/Lex/Pragma.cpp (revision 109329)
+++ lib/Lex/Pragma.cpp (working copy)
@@ -16,6 +16,7 @@
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/LiteralSupport.h"
#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/MacroInfo.h"
#include "clang/Lex/LexDiagnostic.h"
#include "clang/Basic/FileManager.h"
#include "clang/Basic/SourceManager.h"
@@ -483,7 +484,131 @@
    Callbacks->PragmaMessage(MessageLoc, MessageString);
}

+/// HandlePragmaPushMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma push_macro("macro")
+void Preprocessor::HandlePragmaPushMacro(Token &PushMacroTok) {
+ // Remember the pragma token location.
+ SourceLocation PragmaLoc = PushMacroTok.getLocation();

+ // Read the '('.
+ Token Tok;
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Read the macro name string.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Remember the macro string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

+
+ // Read the ')'.
+ Lex(Tok);
+ if (Tok.isNot(tok::r_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ assert(StrVal[0] == '"' && StrVal[StrVal.size()-1] == '"' &&
+ "Invalid string token!");

This should be an actual diagnostic instead of an assert. Input
provided by the user should _never_ cause an assert.

+
+ // Create a Token from the string.
+ Token MacroToPushTok;
+ MacroToPushTok.startToken();
+ MacroToPushTok.setKind(tok::identifier);
+ CreateString(&StrVal[1], StrVal.size() - 2, MacroToPushTok);
+
+ // Get the IdentifierInfo of MacroToPushTok.
+ IdentifierInfo *IdentInfo = LookUpIdentifierInfo(MacroToPushTok);
+
+ // Get the MacroInfo and flag IsPragmaPush.
+ MacroInfo *MI = getMacroInfo(IdentInfo);
+ if (MI) {
+ MI->setIsPragmaPush(true);
+ }
+
+ // Push the MacroInfo pointer so we can retrieve in later.
+ PragmaPushMacroInfo[IdentInfo].push(MI);
+}
+
+/// HandlePragmaPopMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma pop_macro("macro")
+void Preprocessor::HandlePragmaPopMacro(Token &PopMacroTok) {
+ SourceLocation PragmaLoc = PopMacroTok.getLocation();
+ Token Tok;
+
+ // Read the '('.
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Read the '"..."'.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Remember the string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

+
+ // Read the ')'.
+ Lex(Tok);
+ if (Tok.isNot(tok::r_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ assert(StrVal[0] == '"' && StrVal[StrVal.size()-1] == '"' &&
+ "Invalid string token!");

This should be an actual diagnostic instead of an assert. Input
provided by the user should _never_ cause an assert.

+
+ // Create a Token from the string.
+ Token MacroToPopTok;
+ MacroToPopTok.startToken();
+ CreateString(&StrVal[1], StrVal.size() - 2, MacroToPopTok);
+ MacroToPopTok.setKind(tok::identifier);
+
+ // Get the IdentifierInfo of MacroToPopTok.
+ IdentifierInfo *IdentInfo = LookUpIdentifierInfo(MacroToPopTok);
+
+ // Retrived the stack<MacroInfo*> associated with the macro.
+ std::map<IdentifierInfo*, std::stack<MacroInfo*> >::iterator iter =
+ PragmaPushMacroInfo.find(IdentInfo);
+ if (iter != PragmaPushMacroInfo.end()) {
+ // PushedMI is the MacroInfo will want to reinstall.
+ MacroInfo *PushedMI = iter->second.top();
+
+ // CurrentMI is the current MacroInfo to release.
+ MacroInfo *CurrentMI = getMacroInfo(IdentInfo);
+ if (CurrentMI && CurrentMI != PushedMI)
+ ReleaseMacroInfo(CurrentMI);
+
+ // Reinstall the pushed macro.
+ setMacroInfo(IdentInfo, PushedMI);
+
+ // Reset the IsPragmaPush flag to false.
+ if (PushedMI) PushedMI->setIsPragmaPush(false);
+
+ // Pop PragmaPushMacroInfo stack.
+ iter->second.pop();
+ if (iter->second.size() == 0)
+ PragmaPushMacroInfo.erase(iter);
+ }
+ else {

No newline before the else.

+ Diag(PragmaLoc, diag::warn_pragma_pop_macro_no_push) << getSpelling(MacroToPopTok);

Keep lines below 80 columns.

+ }
+}
+
/// AddPragmaHandler - Add the specified pragma handler to the preprocessor.
/// If 'Namespace' is non-null, then it is a token required to exist on the
/// pragma line before the pragma string starts, e.g. "STDC" or "GCC".
@@ -699,6 +824,23 @@
  }
};

+/// PragmaPushMacroHandler - "#pragma push_macro" marks the file as atomically included.

Keep lines below 80 columns. And the comment is wrong.

+struct PragmaPushMacroHandler : public PragmaHandler {
+ PragmaPushMacroHandler() : PragmaHandler("push_macro") {}
+ virtual void HandlePragma(Preprocessor &PP, Token &PushMacroTok) {
+ PP.HandlePragmaPushMacro(PushMacroTok);
+ }
+};
+
+
+/// PragmaPopMacroHandler - "#pragma pop_macro" marks the file as atomically included.

Keep lines below 80 columns. And the comment is wrong.

+struct PragmaPopMacroHandler : public PragmaHandler {
+ PragmaPopMacroHandler() : PragmaHandler("pop_macro") {}
+ virtual void HandlePragma(Preprocessor &PP, Token &PopMacroTok) {
+ PP.HandlePragmaPopMacro(PopMacroTok);
+ }
+};
+
// Pragma STDC implementations.

enum STDCSetting {
@@ -780,6 +922,8 @@
void Preprocessor::RegisterBuiltinPragmas() {
  AddPragmaHandler(new PragmaOnceHandler());
  AddPragmaHandler(new PragmaMarkHandler());
+ AddPragmaHandler(new PragmaPushMacroHandler());
+ AddPragmaHandler(new PragmaPopMacroHandler());

  // #pragma GCC ...
  AddPragmaHandler("GCC", new PragmaPoisonHandler());

- Michael Spencer

pragma-pushpop-macro.c (825 Bytes)

Overall the patch looks good. HandlePragmaPushMacro and
HandlePragmaPopMacro seem to share a lot of code and you may consider
refactoring the parsing and lookup into another function. There are
some things that I'm not sure about and will have to be checked by
others.

It passes the test I wrote while going through the MS extensions. I've
attached that test which you should add to your patch.

One other thing is that clang currently warns about macro
redefinitions after a push. It would be nice if it didn't on the first
redefinition after a push, as you pushed it so you could redefine it
:P.

Now for some inline comments.

Index: include/clang/Basic/DiagnosticLexKinds.td

--- include/clang/Basic/DiagnosticLexKinds.td (revision 109329)
+++ include/clang/Basic/DiagnosticLexKinds.td (working copy)
@@ -230,6 +230,12 @@
"pragma comment requires parenthesized identifier and optional string">;
def err_pragma_message_malformed : Error<
"pragma message requires parenthesized string">;
+def err_pragma_push_macro_malformed : Error<
+ "pragma push_macro requires a parenthesized string">;
+def err_pragma_pop_macro_malformed : Error<
+ "pragma pop_macro requires a parenthesized string">;
+def warn_pragma_pop_macro_no_push : Warning<
+ "#pragma pop_macro : '%0' no previous #pragma push_macro for this identifier">;

This may be better worded as "pragma pop_macro could not pop '%0', no
matching push_macro", but that's really up to personal preference and
whatnot. I just matched what warn_pragma_diagnostic_clang_cannot_ppp
says for consistency.

def warn_pragma_message : Warning<"%0">;
def warn_pragma_ignored : Warning<"unknown pragma ignored">,
  InGroup<UnknownPragmas>, DefaultIgnore;
Index: lib/Lex/Pragma.cpp

--- lib/Lex/Pragma.cpp (revision 109329)
+++ lib/Lex/Pragma.cpp (working copy)
@@ -16,6 +16,7 @@
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/LiteralSupport.h"
#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/MacroInfo.h"
#include "clang/Lex/LexDiagnostic.h"
#include "clang/Basic/FileManager.h"
#include "clang/Basic/SourceManager.h"
@@ -483,7 +484,131 @@
   Callbacks->PragmaMessage(MessageLoc, MessageString);
}

+/// HandlePragmaPushMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma push_macro("macro")
+void Preprocessor::HandlePragmaPushMacro(Token &PushMacroTok) {
+ // Remember the pragma token location.
+ SourceLocation PragmaLoc = PushMacroTok.getLocation();

+ // Read the '('.
+ Token Tok;
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Read the macro name string.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Remember the macro string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

+
+ // Read the ')'.
+ Lex(Tok);
+ if (Tok.isNot(tok::r_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ assert(StrVal[0] == '"' && StrVal[StrVal.size()-1] == '"' &&
+ "Invalid string token!");

This should be an actual diagnostic instead of an assert. Input
provided by the user should _never_ cause an assert.

+
+ // Create a Token from the string.
+ Token MacroToPushTok;
+ MacroToPushTok.startToken();
+ MacroToPushTok.setKind(tok::identifier);
+ CreateString(&StrVal[1], StrVal.size() - 2, MacroToPushTok);
+
+ // Get the IdentifierInfo of MacroToPushTok.
+ IdentifierInfo *IdentInfo = LookUpIdentifierInfo(MacroToPushTok);
+
+ // Get the MacroInfo and flag IsPragmaPush.
+ MacroInfo *MI = getMacroInfo(IdentInfo);
+ if (MI) {
+ MI->setIsPragmaPush(true);
+ }
+
+ // Push the MacroInfo pointer so we can retrieve in later.
+ PragmaPushMacroInfo[IdentInfo].push(MI);
+}
+
+/// HandlePragmaPopMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma pop_macro("macro")
+void Preprocessor::HandlePragmaPopMacro(Token &PopMacroTok) {
+ SourceLocation PragmaLoc = PopMacroTok.getLocation();
+ Token Tok;
+
+ // Read the '('.
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Read the '"..."'.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Remember the string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

Are you sure? getSpelling returns a new string, I think, you can't
keep a reference to it.

- Daniel

Overall the patch looks good. HandlePragmaPushMacro and
HandlePragmaPopMacro seem to share a lot of code and you may consider
refactoring the parsing and lookup into another function. There are
some things that I'm not sure about and will have to be checked by
others.

It passes the test I wrote while going through the MS extensions. I've
attached that test which you should add to your patch.

One other thing is that clang currently warns about macro
redefinitions after a push. It would be nice if it didn't on the first
redefinition after a push, as you pushed it so you could redefine it
:P.

Now for some inline comments.

Index: include/clang/Basic/DiagnosticLexKinds.td

--- include/clang/Basic/DiagnosticLexKinds.td (revision 109329)
+++ include/clang/Basic/DiagnosticLexKinds.td (working copy)
@@ -230,6 +230,12 @@
"pragma comment requires parenthesized identifier and optional string">;
def err_pragma_message_malformed : Error<
"pragma message requires parenthesized string">;
+def err_pragma_push_macro_malformed : Error<
+ "pragma push_macro requires a parenthesized string">;
+def err_pragma_pop_macro_malformed : Error<
+ "pragma pop_macro requires a parenthesized string">;
+def warn_pragma_pop_macro_no_push : Warning<
+ "#pragma pop_macro : '%0' no previous #pragma push_macro for this identifier">;

This may be better worded as "pragma pop_macro could not pop '%0', no
matching push_macro", but that's really up to personal preference and
whatnot. I just matched what warn_pragma_diagnostic_clang_cannot_ppp
says for consistency.

def warn_pragma_message : Warning<"%0">;
def warn_pragma_ignored : Warning<"unknown pragma ignored">,
InGroup<UnknownPragmas>, DefaultIgnore;
Index: lib/Lex/Pragma.cpp

--- lib/Lex/Pragma.cpp (revision 109329)
+++ lib/Lex/Pragma.cpp (working copy)
@@ -16,6 +16,7 @@
#include "clang/Lex/HeaderSearch.h"
#include "clang/Lex/LiteralSupport.h"
#include "clang/Lex/Preprocessor.h"
+#include "clang/Lex/MacroInfo.h"
#include "clang/Lex/LexDiagnostic.h"
#include "clang/Basic/FileManager.h"
#include "clang/Basic/SourceManager.h"
@@ -483,7 +484,131 @@
Callbacks->PragmaMessage(MessageLoc, MessageString);
}

+/// HandlePragmaPushMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma push_macro("macro")
+void Preprocessor::HandlePragmaPushMacro(Token &PushMacroTok) {
+ // Remember the pragma token location.
+ SourceLocation PragmaLoc = PushMacroTok.getLocation();

+ // Read the '('.
+ Token Tok;
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Read the macro name string.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ // Remember the macro string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

+
+ // Read the ')'.
+ Lex(Tok);
+ if (Tok.isNot(tok::r_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_push_macro_malformed);
+ return;
+ }
+
+ assert(StrVal[0] == '"' && StrVal[StrVal.size()-1] == '"' &&
+ "Invalid string token!");

This should be an actual diagnostic instead of an assert. Input
provided by the user should _never_ cause an assert.

+
+ // Create a Token from the string.
+ Token MacroToPushTok;
+ MacroToPushTok.startToken();
+ MacroToPushTok.setKind(tok::identifier);
+ CreateString(&StrVal[1], StrVal.size() - 2, MacroToPushTok);
+
+ // Get the IdentifierInfo of MacroToPushTok.
+ IdentifierInfo *IdentInfo = LookUpIdentifierInfo(MacroToPushTok);
+
+ // Get the MacroInfo and flag IsPragmaPush.
+ MacroInfo *MI = getMacroInfo(IdentInfo);
+ if (MI) {
+ MI->setIsPragmaPush(true);
+ }
+
+ // Push the MacroInfo pointer so we can retrieve in later.
+ PragmaPushMacroInfo[IdentInfo].push(MI);
+}
+
+/// HandlePragmaPopMacro - Handle #pragma push_macro.
+/// The syntax is:
+/// #pragma pop_macro("macro")
+void Preprocessor::HandlePragmaPopMacro(Token &PopMacroTok) {
+ SourceLocation PragmaLoc = PopMacroTok.getLocation();
+ Token Tok;
+
+ // Read the '('.
+ Lex(Tok);
+ if (Tok.isNot(tok::l_paren)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Read the '"..."'.
+ Lex(Tok);
+ if (Tok.isNot(tok::string_literal)) {
+ Diag(PragmaLoc, diag::err_pragma_pop_macro_malformed);
+ return;
+ }
+
+ // Remember the string.
+ std::string StrVal = getSpelling(Tok);

This should be a llvm::StringRef instead of an std::string.

Are you sure? getSpelling returns a new string, I think, you can't
keep a reference to it.

- Daniel

Ah, you're right, I saw the getSpelling version that returns
llvm::StringRef and thought it could be used here.

- Michael Spencer

This is the patch after rework.
The only thing I didn't implement is to remove the assert. User code
cannot trigger the assert (I believe), the check on line 503 will
prevent it:
    if (Tok.isNot(tok::string_literal)) {

Comments greatly appreciated.

pragma_push_macro2.patch (11.1 KB)

Looks good to me. +1 for commit.

- Michael Spencer

Thanks for working on this! The patch is looking good, but I have a few more minor comments:

+ /// IsPragmaPush - True if this macro has been pushed using #pragma push_macro
+ /// This is used to prevent releasing the macro.
+ bool IsPragmaPush : 1;

Your implementation approach introduces sharing of the MacroInfo object: the identifier info table can have one pointer and the pragma push_macro stack can have another. I think it would be cleaner to just *make a copy* of the MacroInfo object on pragma push, to avoid this sharing. This is slightly less efficient than your solution, but simplifies the code, and push_macro hopefully shouldn't be a performance bottleneck :slight_smile:

+ /// isPragmaPush - Return true if this macro is on the #pragma push_macro stack.
+ bool isPragmaPush() const { return IsPragmaPush; }

Please wrap to 80 columns.

+ /// PragmaPushMacroInfo - For each IdentifierInfo used in a #pragma
+ /// push_macro directive, we keep a MacroInfo stack used to restore
+ /// previous macro value.
+ std::map<IdentifierInfo*, std::stack<MacroInfo*> > PragmaPushMacroInfo;

Please use
llvm::DenseMap<IdentifierInfo*, std::vector<MacroInfo*> > PragmaPushMacroInfo;

for consistency with our containers.

+ IdentifierInfo* ParsePragmaPushOrPopMacro(Token &Tok);

Space before the * :slight_smile:

       // Macros must be identical. This means all tokes and whitespace

Not your bug but tokes -> tokens.

+ // Also do not warn if this is the first redefinition after #pragma
+ // push_macro.
+ if (!MI->isIdenticalTo(*OtherMI, *this) && !OtherMI->isPragmaPush()) {

Ah. This is an interesting and rather unfortunate rule. With the implementation approach suggested above, I think it would make sense to keep a bit in macroinfo, but the bit would be "AllowRedefinitionsWithoutWarning" or something like that. To me, the semantics of this bit are much more clear and possible more reusable than the 'ispragmapush' bit.

+IdentifierInfo* Preprocessor::ParsePragmaPushOrPopMacro(Token &Tok) {

Space before the *, not after :slight_smile:

+void Preprocessor::HandlePragmaPopMacro(Token &PopMacroTok) {
+ SourceLocation MessageLoc = PopMacroTok.getLocation();

Hi

This is a new patch after rework based on your comments.

pragma_push3.patch (13.9 KB)

Please ignore the latest patch it contains a major bug.
Here is the new (hopefully) bug free patch.

pragma_push4.patch (14 KB)

Any reason why this has not been committed?

I am also looking forward to it.
Thanks,

    Roberto

Looks great! Applied in r111234, thanks!

-Chris