extern "C" support

This adds initial extern "C" support... It includes ast, sema, ast printer, and stub codegen but does not include ast dumper.

Thoughts?

Doing diffs in .:
--- ./AST/Decl.cpp.~1~ 2007-12-20 15:32:21.000000000 -0800
+++ ./AST/Decl.cpp 2007-12-21 13:58:52.000000000 -0800
@@ -37,6 +37,7 @@ static unsigned nObjcImplementationDecls
  static unsigned nObjcCategoryImpl = 0;
  static unsigned nObjcCompatibleAlias = 0;
  static unsigned nObjcPropertyDecl = 0;
+static unsigned nLinkageSpecDecl = 0;

  static bool StatSwitch = false;

@@ -151,12 +152,28 @@ void Decl::PrintStats() {
      nObjcPropertyDecl, (int)sizeof(ObjcPropertyDecl),
      int(nObjcPropertyDecl*sizeof(ObjcPropertyDecl)));

+ fprintf(stderr, " %d linkage specifications, %d each (%d bytes)\n",
+ nLinkageSpecDecl, (int)sizeof(LinkageSpecDecl),
+ int(nLinkageSpecDecl*sizeof(LinkageSpecDecl)));

Hi Mike,

Very cool, thanks for tackling this!

Please attach patches as attachments, not inline. If you're using 'mail', just name the file "whatever.patch" and this will happen. Until this happens, I can't properly review the patch as it's all wrapped and nastified :frowning:

Some random thoughts:

+++ ./Driver/ASTConsumers.cpp 2007-12-21 13:58:52.000000000 -0800
+void DeclPrinter:: PrintDecl(Decl *D) {
+ if (FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
+ PrintFunctionDeclStart(FD);

Ok...

This moves the decl printer back up into DeclPrinter.

tested with make check on x86 darwin... no new regressions.

d.patch (4.74 KB)

Applied, thanks!

FYI, the mailing lists are apparently having issues right now. I'm working on resolving them, in the meantime, the web archives are still working.

-Chris

Please attach patches as attachments, not inline. If you're using 'mail', just name the file "whatever.patch" and this will happen. Until this happens, I can't properly review the patch as it's all wrapped and nastified :frowning:

Odd, you must be using an old mail viewer... In modern viewers, it should come out just fine. I pretested cutting and pasting and ensured there were no problems with it. Oh well... What mail viewer do you use?

+/// LinkageSpecDecl - This represents a linkage specification.

+class LinkageSpecDecl : public Decl {

Please give an example of what a linkage spec is in the header. At some point, we probably want to make a DeclCXX.h file, but deferring this for now is fine.

+private:
+ /// Language - The language for this linkage specification.
+ Language language;
+ bool inside_braces;
+ Decl *D;

inside_braces is good for "remembering" the original source,

I did that merely because we're calling them ASTs and because we form ParenExprs. If we didn't want ASTs, we could get rid of ParenExprs?

but unless there is a client,

-ast-print uses that information to print the ast. Absent that information we can't faithfully reproduce the syntax tree. Do we care that the printed ast is wrong? Do we care if the ast isn't actually an ast?

I'd suggest not including it. How does your Decl structure handle stuff like:

extern "C" {
int x; int y;
}

work in progress... but roughly, an AST should have one linkage spec with two decls under it.

Is there just one LinkageSpecDecl?

Yes. This choice was mainly driven by keeping the AST an ast.

+++ ./Parse/Parser.cpp 2007-12-21 13:58:52.000000000 -0800
@@ -378,6 +378,12 @@ Parser::DeclTy *Parser::ParseDeclaration
      return ParseObjCAtInterfaceDeclaration(AtLoc,
DS.getAttributes());
  }

+ if (Tok.is(tok::string_literal)
+ && DS.getStorageClassSpec() == DeclSpec::SCS_extern)
+ {
+ return ParseLinkage(Declarator::FileContext);

I'm not sure if this is sufficient:

It wasn't, but that was a known issue.

does this allow 'extern int "C"' or 'extern inline "C"'

I'm sure...

The DeclSpec::getParsedSpecifiers() method can be used to make sure there is just a storage class specified.

Ah, perfect, I've fixed it and tested extern inline "C"; that is now rejected.

+Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
+ std::string Lang,
+

Please don't pass std::strings by-value, pass by const reference, or better yet, pass in the string as a const char* or something like that.

Ok, fixed.

d.patch (12.2 KB)

Mike Stump wrote:

Please attach patches as attachments, not inline. If you're using 'mail', just name the file "whatever.patch" and this will happen. Until this happens, I can't properly review the patch as it's all wrapped and nastified :frowning:

Odd, you must be using an old mail viewer... In modern viewers, it should come out just fine. I pretested cutting and pasting and ensured there were no problems with it. Oh well... What mail viewer do you use?

I think the line-breaks may be caused by oter tools than the mail viewer. At least the source my viewer received had additional linebreaks in it.

Nevertheless, the new patch you provided as attachment was ok as far as line-breaks. I did have other problems using it with my diff/patch tools, but I managed to apply it with some manual hacking. Problems may have been cased by us not being in sync with svn revisions. I am not that hot on svn, but svn-diff and friends may be a better choice in . I attached the output of running

svn diff > cxx-extern-linkage-spec.patch

in my tools/clang directory after I applied Mike's patches the best I could.

It seems Mike are done with most of what I set out to do :-). I reviewed the patch for the sake of learning, and applied it to my source. It now work fine with my test code for C++ extern keword. Extern template instanciation stuff is of scale and no tests is written for that.

I also attach my test source which run nicely with Mike's patch :slight_smile: Good work Mike.

+/// LinkageSpecDecl - This represents a linkage specification.

+class LinkageSpecDecl : public Decl {

Please give an example of what a linkage spec is in the header. At some point, we probably want to make a DeclCXX.h file, but deferring this for now is fine.

+private:
+ /// Language - The language for this linkage specification.
+ Language language;
+ bool inside_braces;
+ Decl *D;

inside_braces is good for "remembering" the original source,

I did that merely because we're calling them ASTs and because we form ParenExprs. If we didn't want ASTs, we could get rid of ParenExprs?

but unless there is a client,

-ast-print uses that information to print the ast. Absent that information we can't faithfully reproduce the syntax tree. Do we care that the printed ast is wrong? Do we care if the ast isn't actually an ast?

Does this support nested linkage specifiers as well? There is a test case in my test file.

Should we not let support for linkage be a backend issue (e.g. llvm CodeGen), not frontend, as it is now, the AST is unusable for backends supporting linking to other languages.

This would mean a change of Sema::ActOnLinkageSpec to support more lang_xxx enums and/or pass the language string to the AST consumer. Something like:

Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
                     const char *Lang,
                     bool inside_brace,
                     DeclTy *D) {
  LinkageSpecDecl::Language language;
  Decl *dcl = static_cast<Decl *>(D);
  if (strcmp (Lang, "\"C\"") == 0)
    language = LinkageSpecDecl::lang_c;
  else if (strcmp (Lang, "\"C++\"") == 0)
    language = LinkageSpecDecl::lang_cxx;
  // could include enum support for other typical, but not required languages also
  else
    language = LinkageSpecDecl::lang_other;
  return new LinkageSpecDecl(Loc, language, inside_brace, dcl);
}

And let the backend codegen emit diagnostics, something similar to pseudocode:

    switch ( laguage )
    {
       case lang::lang_cxx:
         DoCXXLinkageCodeGen(...);
         break; case lang::lang_c:
         DoCLinkageCodeGen(...);
         break; case lang::lang_other:
         if ( (strcmp (language_str, "\"FORTRAN\"") == 0) &&
              (strcmp (language_str, "\"Fortran\"") == 0) )
            DoFortranLinkageCodeGen(...);
         else if (strcmp (language_str, "\"Ada\"") == 0)
            DoAdaLinkageCodeGen(...);
         Diag(Loc, diag::err_bad_language);
         break; }

cxx-extern-linkage-spec.patch (13.7 KB)

cxx-extern.cpp (1.74 KB)

-ast-print uses that information to print the ast. Absent that information we can't faithfully reproduce the syntax tree. Do we care that the printed ast is wrong? Do we care if the ast isn't actually an ast?

Does this support nested linkage specifiers as well?

I don't see why it would not, though, I didn't test it. I did write it to work in general, including that case.

[ testing ] Yup, works just fine.

Should we not let support for linkage be a backend issue (e.g. llvm CodeGen), not frontend, as it is now, the AST is unusable for backends supporting linking to other languages.

There is an extra not above...

This would mean a change of Sema::ActOnLinkageSpec to support more lang_xxx enums and/or pass the language string to the AST consumer.

And let the backend codegen emit diagnostics, something similar to pseudocode:

any thoughts ?

Sure, my thought would be, if one has a plug in for a code generator that supports a new language, call it Pascal, if you want, it does seem reasonable to have the supported by a previously compiled front end. Do we have such a clean architecture? There is a cost in that though, space, ownership of strings and the like. Pay now, or pay when someone wants the feature, I'll leave that for others to chime in with. Also, if you defer it, do you give up issuing an error when you say:

extern "unknown" static int i;
void foo() {
}

and i is deleted by the optimizer because it isn't used? Do we want that or not? Anyway, I'd rather get the entire rest of the core of the language done than fiddle the fine details. If written well, they are flexible enough to change.

I did postpone it out of the parser so that one could have a parser parse the above, if they wanted. :slight_smile:

Mike Stump wrote:

-ast-print uses that information to print the ast. Absent that information we can't faithfully reproduce the syntax tree. Do we care that the printed ast is wrong? Do we care if the ast isn't actually an ast?

Does this support nested linkage specifiers as well?

I don't see why it would not, though, I didn't test it. I did write it to work in general, including that case.

Good, I was just wandering :slight_smile:

[ testing ] Yup, works just fine.

Good,
I will see how much time I am able to use on this, but right now I have made up a few more test cases. I try to work myselves systematically through the cases in

from the "Changes to C99 versus C++98" and onword. I.e: Incompatibilities Between ISO C and ISO C++

I figure we just as well sort these conflict issues out before we go on to the C++ specific stuff. If a case is already handled I just make the appropriate test cases and move on to the next issue. This includes making tests for C90 and C99 to verify that differences is honored in -pedantic. If I find cases that are not supported I intend to post on the list before I will considder to start working on them. I am a little unclear how aggressive (incremental) I should be in posting test code I have added. I do not get a good feel for how much focus the maintainers with SVN write-access have on applying patches. So it does not make sence to post patches and additions to the list if nobody pick up the tab. Is bugzilla a better way to avoid stuff get lost?

Should we not let support for linkage be a backend issue (e.g. llvm CodeGen), not frontend, as it is now, the AST is unusable for backends supporting linking to other languages.

There is an extra not above...

Well, yes - such negation is common in my native language to emphasise a point, funny thing really :slight_smile: But I should probabbly avoid it in English, sorry.

This would mean a change of Sema::ActOnLinkageSpec to support more lang_xxx enums and/or pass the language string to the AST consumer.

And let the backend codegen emit diagnostics, something similar to pseudocode:

any thoughts ?

Sure, my thought would be, if one has a plug in for a code generator that supports a new language, call it Pascal, if you want, it does seem reasonable to have the supported by a previously compiled front end. Do we have such a clean architecture?

I don't know if we do. The obvious choise is to attempt to keep a complete list of relevant programming languages as enums. That would not give any added cost, but danger of missing a language or two. This would probbably not be very likely if we kan get a list of , let us say the 20 most likely languages, and language binding technologies. One challenge is however that we also need the correct authorative language string. The C++ standard offer some guidance here.

If we are to pass the string or make them accessable to AST consumers, the following thoughts come to mind (before I have dug in):

- pass the string in Linkage Specifier AST node --> very costly and probbably inefficient.

- lett the AST consumer, if it needs to, deal with lang_other by accessing the source through sorce location info in AST --> maybe not a bad idea in combination of a fairly complete list of enums.

- Use some efficient way of refering to language strings as is done with types or identifiers. Maybe there is something that can more or less be reused as is in the AST, I have not digged in to look.

There is a cost in that though, space, ownership of strings and the like. Pay now, or pay when someone wants the feature, I'll leave that for others to chime in with.

Fair enough, only thing is to avoid major changes later in the AST.

Also, if you defer it, do you give up issuing an error when you say:

extern "unknown" static int i;
void foo() {
}

and i is deleted by the optimizer because it isn't used?

I am probably missing something, but it sound like you say we are optimizing in Parser and Sema. Are we? I thought all this would be in the AST no mather what?

Do we want that or not?

I guess as far as building libs and apps it may not mather if this diagnostics get lost. For other use-cases of the AST, I guess it depend on the use-case if it should care. Some use-cases should probably never emit diagnostics as there is no such concept as an unsuported language. Say you do AST based pretty printing, code refactoring, or do edit/build-time introspection - what defines if linkage to a language is legal? But then it may be wrong for the AST information being lost in the first place, diagnostics or not.

Anyway, I'd rather get the entire rest of the core of the language done than fiddle the fine details. If written well, they are flexible enough to change.

I did postpone it out of the parser so that one could have a parser parse the above, if they wanted. :slight_smile:

Fair enogh. I don't intend to touch CodeGen for now, so I will most definently leave it :wink:

I don't know, I'm 15 days or so into a trivial little patch. At that rate, that'd be 24 a year. :frowning: To do a front end for C++ at that rate, it'd take 66 years to get it in. I was kinda hoping it would go slightly faster than that.

Mike Stump skrev:

I would chalk up any delay in patch reviews as due to lack of time because of the holidays. :slight_smile: The LLVM group tends to be pretty good at reviewing such things in a timely fashion. If you feel that something is pressing though, maybe it would be good to ping the group again with said patch, or, if the patch could be split up into smaller increments, submit those as they are easier to review/apply. :slight_smile:

-bw

Hi Mike,

Sorry for the delay, I've been focusing on some llvm backend related projects, which I'd like to see happen before the 2.2 release. Steve is still out on vacation, which is why there is limited coverage. I'll try to review and apply your patch soon.

-Chris

+private:
+ /// Language - The language for this linkage specification.
+ Language language;
+ bool inside_braces;
+ Decl *D;

inside_braces is good for "remembering" the original source,

I did that merely because we're calling them ASTs and because we form ParenExprs. If we didn't want ASTs, we could get rid of ParenExprs?

I don't understand what you're saying here. ASTs can take many forms.

but unless there is a client,

-ast-print uses that information to print the ast. Absent that information we can't faithfully reproduce the syntax tree. Do we care that the printed ast is wrong? Do we care if the ast isn't actually an ast?

You seem to be using an overly constrained definition of what an AST is :). At this point, I'm happy with -ast-print emitting *valid* code, but it does not have to be exactly what the user wrote. In this specific case, I'd be fine with -ast-print always emitting {}'s for example.

clang does not preserve absolutely everything that the user wrote, and I don't think we can achieve that without significant expense to other goals. That said, in this specific case, there is no space overhead, so preserving the info is fine.

I'd suggest not including it. How does your Decl structure handle stuff like:

extern "C" {
int x; int y;
}

work in progress... but roughly, an AST should have one linkage spec with two decls under it.

Ok.

Is there just one LinkageSpecDecl?

Yes.

This makes sense.

+++ ./Parse/Parser.cpp 2007-12-21 13:58:52.000000000 -0800
@@ -378,6 +378,12 @@ Parser::DeclTy *Parser::ParseDeclaration
     return ParseObjCAtInterfaceDeclaration(AtLoc,
DS.getAttributes());
}

+ if (Tok.is(tok::string_literal)
+ && DS.getStorageClassSpec() == DeclSpec::SCS_extern)
+ {
+ return ParseLinkage(Declarator::FileContext);

I'm not sure if this is sufficient:

It wasn't, but that was a known issue.

Please add FIXME markers to everything you *know* is incomplete or incorrect.

does this allow 'extern int "C"' or 'extern inline "C"'
The DeclSpec::getParsedSpecifiers() method can be used to make sure there is just a storage class specified.

Ah, perfect, I've fixed it and tested extern inline "C"; that is now rejected.

Nice.

+Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
+ std::string Lang,
+

Please don't pass std::strings by-value, pass by const reference, or better yet, pass in the string as a const char* or something like that.

Ok, fixed.

Ok, some more thoughts:

Many of the changes to AST/Decl.cpp and the change to Type.h are unrelated to this patch. It would be good to split them out as they aren't controversial at all :slight_smile:

+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *B, LinkageSpecDecl *LS) {

This would mean a change of Sema::ActOnLinkageSpec to support more
lang_xxx enums and/or pass the language string to the AST consumer.

And let the backend codegen emit diagnostics, something similar to
pseudocode:

any thoughts ?

Sure, my thought would be, if one has a plug in for a code generator
that supports a new language, call it Pascal, if you want, it does
seem reasonable to have the supported by a previously compiled front
end. Do we have such a clean architecture? There is a cost in that
though, space, ownership of strings and the like. Pay now, or pay
when someone wants the feature, I'll leave that for others to chime in
with. Also, if you defer it, do you give up issuing an error when you
say:

extern "unknown" static int i;
void foo() {
}

and i is deleted by the optimizer because it isn't used? Do we want
that or not? Anyway, I'd rather get the entire rest of the core of
the language done than fiddle the fine details. If written well, they
are flexible enough to change.

I completely agree with you Mike.

I did postpone it out of the parser so that one could have a parser
parse the above, if they wanted. :slight_smile:

I think that postponing it is the right thing to do. There is a lot more than just "how it gets lowered in the backend" that changes based on the extern C. There is also extra semantic analysis that can go along with it ("you can't pass this C++ object into a extern C function" or "extern C functions can't take references" or something). At this point, how to handle this is not at all clear, and I'd prefer to get the simple stuff working first.

Thanks Mike!

-Chris

I did that merely because we're calling them ASTs and because we form ParenExprs. If we didn't want ASTs, we could get rid of ParenExprs?

I don't understand what you're saying here. ASTs can take many forms.

Sorry, the eyes read AST, but the brain was thinking parse trees, due to ParenExprs.

You seem to be using an overly constrained definition of what an AST is :).

:slight_smile: Yeah, I agree, we don't need to preserve the parse tree in the AST!

In this specific case, I'd be fine with -ast-print always emitting {}'s for example.

Done.

Please add FIXME markers to everything you *know* is incomplete or incorrect.

Ok.

Many of the changes to AST/Decl.cpp and the change to Type.h are unrelated to this patch. It would be good to split them out as they aren't controversial at all :slight_smile:

I sent the Type.h separately. I'm holding the other hostage. :slight_smile: Oops, it went missing in the Objc ObjC rewrite, oh well...

+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *B, LinkageSpecDecl *LS) {
+
+}

Please have this emit an unsupported warning or something (see CodeGenModule::WarnUnsupported) so that we know when this is happening.

Gosh, none of the existing ones could be used, and I didn't want to learn how to do all that... [ pause ] Oh well. :slight_smile: Done.

Also, please add a FIXME.

Done.

+/// LinkageSpecDecl - This represents a linkage specification.
+class LinkageSpecDecl : public Decl {

As I requested previously, please add an example.

"Language language" is strange. I'd suggest renaming the enum to LanguageIDs or something.

Done.

+ bool inside_braces;

If you want to keep this

Gone.

How about getDecls() ?

Done.

Please stay in 80 columns and use mixed caps instead of _'s.

Done.

Ok, but please format with the llvm style (see surrounding code) which has the &&'s on the previous line, no space before the (), and either drop the {} or make them K&R style.

Done.

Likewise in the implementation of ParseLinkage.

Done.

+// C++ 7.5p2
+Parser::DeclTy *Parser::ParseLinkage(unsigned Context) {
+ std::string lang = PP.getSpelling(Tok);

Please add an assertion that the Tok.is(stringliteral), and a comment above the method, like all other parser methods have.

Done.

Also, this is the expensive form of getSpelling, [ ... ] Instead, please use something like this:

Done.

In addition to the formatting issues above, please rearrange this to make the small case come first.

Done.

Also, RBrace is dead, so you can just remove its declaration.

Done....

Better yet, you could pass *it* into ActOnLinkageSpec instead of 'inside_brace'.

Sure, done.

+ return Actions.ActOnLinkageSpec(Loc, lang.c_str(), saw_braces, D);

Can D be null here?

Sure.

If so, you should check for it and return an error instead of building an invalid AST node.

Done.

+Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
+ const char *Lang,
+ bool inside_brace,
+ DeclTy *D) {

This looks good, but please clean up the formatting above and add a fixme if there are semantic checks missing :slight_smile:

Well, almost all of them are missing... :slight_smile: Added a FIXME.

Let's try this version:

externc-2.patch (13.1 KB)

I did that merely because we're calling them ASTs and because we form ParenExprs. If we didn't want ASTs, we could get rid of ParenExprs?

I don't understand what you're saying here. ASTs can take many forms.

Sorry, the eyes read AST, but the brain was thinking parse trees, due to ParenExprs.

Gotcha. We definitely have an AST, we just keeps some parse tree'y features in it :).

Many of the changes to AST/Decl.cpp and the change to Type.h are unrelated to this patch. It would be good to split them out as they aren't controversial at all :slight_smile:

I sent the Type.h separately. I'm holding the other hostage. :slight_smile: Oops, it went missing in the Objc ObjC rewrite, oh well...

ok :slight_smile:

+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *B, LinkageSpecDecl *LS) {
+
+}

Please have this emit an unsupported warning or something (see CodeGenModule::WarnUnsupported) so that we know when this is happening.

Gosh, none of the existing ones could be used, and I didn't want to learn how to do all that... [ pause ] Oh well. :slight_smile: Done.

Thanks!

+Sema::DeclTy* Sema::ActOnLinkageSpec(SourceLocation Loc,
+ const char *Lang,
+ bool inside_brace,
+ DeclTy *D) {

This looks good, but please clean up the formatting above and add a fixme if there are semantic checks missing :slight_smile:

Well, almost all of them are missing... :slight_smile: Added a FIXME.

Hehe, yeah :slight_smile:

Let's try this version:

This is looking very nice. Some last remaining nit-picky details:

+Parser::DeclTy *Parser::ParseLinkage(unsigned Context) {
+ assert(Tok.is(tok::string_literal) && "Not a stringliteral!");
+ llvm::SmallVector<char, 8> LangBuffer;
+ LangBuffer.resize(Tok.getLength()); // LangBuffer is guaranteed to be big enough.

Please fit to 80 cols.

+/// CodeGenLinkageSpec - Emit the specified linkage space to LLVM.
+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *Builder, LinkageSpecDecl *LS) {

likewise

@@ -156,7 +157,8 @@ void Decl::PrintStats() {
          nFileVars*sizeof(FileVarDecl)+nParmVars*sizeof(ParmVarDecl)+
          nFieldDecls*sizeof(FieldDecl)+nSUC*sizeof(RecordDecl)+
          nEnumDecls*sizeof(EnumDecl)+nEnumConst*sizeof(EnumConstantDecl)+
- nTypedef*sizeof(TypedefDecl)) /* FIXME: add ObjC decls */);
+ nTypedef*sizeof(TypedefDecl)+nLinkageSpecDecl*sizeof(LinkageSpecDecl))
+ /* FIXME: add ObjC decls */);
  }

likewise.

+void LinkageSpecDecl::EmitInRec(Serializer& S) const {
+ Decl::EmitInRec(S);
+ S.EmitInt(getLanguage());
+ S.EmitPtr(D);
+}

Emitting the language enum as part of the serialization format implicitly makes the enum value part of the serialized encoding. I'm fine with this, but please add a comment about this issue above the enum definition:

+class LinkageSpecDecl : public Decl {
+public:
+ enum LanguageIDs { lang_cxx, lang_c };

How about:
    // Note that changing these enum values will break compatibility with serialized ASTs.
+ enum LanguageIDs { lang_cxx, lang_c };

or something.

+void DeclPrinter::PrintLinkageSpec(LinkageSpecDecl *LS) {
+ std::string l;
+ if (LS->getLanguage() == LinkageSpecDecl::lang_c)
+ l = "C";
+ else if (LS->getLanguage() == LinkageSpecDecl::lang_cxx)
+ l = "C++";

I'd declare "l" as const char* instead of std::string, to avoid a malloc/free. The performance of the declprinter isn't very important, but it's good to avoid gratuitous heap thrashing :).

+class LinkageSpecDecl : public Decl {
...
+ Decl *getDecl() { return D; }

Please add a const version of the accessor as well:
+ const Decl *getDecl() const { return D; }

+++ ./include/clang/Parse/Action.h 2008-01-08 18:59:11.000000000 -0800
@@ -154,6 +154,12 @@ public:
+ virtual DeclTy *ActOnLinkageSpec(SourceLocation Loc, SourceLocation LBrace,
+ SourceLocation RBrace, const char *,
+ unsigned, DeclTy *D) {

Please give a name to the const char* and unsigned argument so that it is more obvious what is expected. The names you have in Sema.h are good.

+/// ParseLinkage - We knnow that the current token is a string_literal

knnow -> know

+/// and just before that, that extern was seen.
+///
+/// linkage-specification: [C++ 7.5p2: dcl.link]
+/// extern string-literal { declaration-seq opt }
+/// extern string-literal declaration

Very nice, but please follow the form of the other parser routines, for example 'extern' instead of extern, '{' instead of {, etc.

+ // std::string lang = PP.getSpelling(Tok);

This comment is dead, plz remove.

+ if (D)
+ return Actions.ActOnLinkageSpec(Loc, LBrace, RBrace, LangBufPtr, StrSize, D);

This is looking very nice. Some last remaining nit-picky details:

Time will tell if these are truly the last of the nit-pics... :slight_smile:

+Parser::DeclTy *Parser::ParseLinkage(unsigned Context) {
+ assert(Tok.is(tok::string_literal) && "Not a stringliteral!");
+ llvm::SmallVector<char, 8> LangBuffer;
+ LangBuffer.resize(Tok.getLength()); // LangBuffer is guaranteed to be big enough.

Please fit to 80 cols.

Done.

+/// CodeGenLinkageSpec - Emit the specified linkage space to LLVM.
+void clang::CodeGen::CodeGenLinkageSpec(CodeGenModule *Builder, LinkageSpecDecl *LS) {

Done.

@@ -156,7 +157,8 @@ void Decl::PrintStats() {
        nFileVars*sizeof(FileVarDecl)+nParmVars*sizeof(ParmVarDecl)+
        nFieldDecls*sizeof(FieldDecl)+nSUC*sizeof(RecordDecl)+
        nEnumDecls*sizeof(EnumDecl)+nEnumConst*sizeof(EnumConstantDecl)+
- nTypedef*sizeof(TypedefDecl)) /* FIXME: add ObjC decls */);
+ nTypedef*sizeof(TypedefDecl)+nLinkageSpecDecl*sizeof(LinkageSpecDecl))
+ /* FIXME: add ObjC decls */);
}

Done.

+void LinkageSpecDecl::EmitInRec(Serializer& S) const {
+ Decl::EmitInRec(S);
+ S.EmitInt(getLanguage());
+ S.EmitPtr(D);
+}

Emitting the language enum as part of the serialization format implicitly makes the enum value part of the serialized encoding. I'm fine with this, but please add a comment about this issue above the enum definition:

Done. Hum, yeah, would be nice to have a nice clean separation between the two. For an abi, I think I'd rather have the value chosen by the dwarf standard, I think they both enumerate languages and select values. [ pause ] Ok, I fixed up the code.

I'd declare "l" as const char*

Done.

Please add a const version of the accessor as well:

Done.

Please give a name to the const char* and unsigned argument so that it is more obvious what is expected.

Done.

knnow -> know

Fixed.

Very nice, but please follow the form of the other parser routines, for example 'extern' instead of extern, '{' instead of {, etc.

Done.

Please use:
+ if (!D) return 0;

Done.

Please add a comment explaining the logic here.

Done.

+ if (strncmp (Lang, "\"C\"", StrSize) == 0)
+ Language = LinkageSpecDecl::lang_c;
+ else if (strncmp (Lang, "\"C++\"", StrSize) == 0)
+ Language = LinkageSpecDecl::lang_cxx;

I'm not sure if strncmp is the right thing here. Will it work if "Lang" is not null terminated?

Uhm, this is why strncmp was invented!? Yes. strncmp only fails when languages are defined to have '\0' in their names. When that happens, we have bigger problems.

Also, please format the code as "strncmp(" instead of "strncmp (".

Done.

Finally, should 'extern "c"' be allowed?

No.

Ok, We'll try this one...

externc-3.patch (13.7 KB)

This is looking very nice. Some last remaining nit-picky details:

Time will tell if these are truly the last of the nit-pics... :slight_smile:

Hehe, the first significant patch is always the hardest.

+void LinkageSpecDecl::EmitInRec(Serializer& S) const {
+ Decl::EmitInRec(S);
+ S.EmitInt(getLanguage());
+ S.EmitPtr(D);
+}

Emitting the language enum as part of the serialization format implicitly makes the enum value part of the serialized encoding. I'm fine with this, but please add a comment about this issue above the enum definition:

Done. Hum, yeah, would be nice to have a nice clean separation between the two. For an abi, I think I'd rather have the value chosen by the dwarf standard, I think they both enumerate languages and select values. [ pause ] Ok, I fixed up the code.

Yep. At this point, we really don't care about 100% stability (because things are changing and immature), but when we do start caring we want to know what pieces implicitly define the abi.

+ if (strncmp (Lang, "\"C\"", StrSize) == 0)
+ Language = LinkageSpecDecl::lang_c;
+ else if (strncmp (Lang, "\"C++\"", StrSize) == 0)
+ Language = LinkageSpecDecl::lang_cxx;

I'm not sure if strncmp is the right thing here. Will it work if "Lang" is not null terminated?

Uhm, this is why strncmp was invented!? Yes. strncmp only fails when languages are defined to have '\0' in their names. When that happens, we have bigger problems.

Ah ok, this works because you're comparing StrSize bytes, not StrSize+1, silly me :slight_smile:

Finally, should 'extern "c"' be allowed?

No.

Ok!

Ok, We'll try this one...

Very nice, applied:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20080107/003743.html

-Chris