[cfe-commits] r100942 - /cfe/trunk/lib/CodeGen/CGObjC.cpp

taking this to cfe-dev.

To me, this is a very similar cousin to "cast of super" which we explicitly disallow.

The big issue here is that from a language perspective, "super" is not an expression at all. It is a magic token that is only valid in a message send. Things like super->ivar *should* be disallowed, as should "id x = super" and many other things that we probably accidentally allow.

Currently, super->ivar is disallowed, so is id x = super. These are accepted by GCC, although that's probably not sensible.

I think that the big AST change to model this correctly is really important, because it will fix a number of other problems and cases where Sema probably accidentally accepts invalid code (what about foo(super), passing it as an argument)? This is not about a single special case, this is about the design always being broken and wanting to fix the root problem.

Sema allows foo(super), but it stops in CodeGen. This is probably worth fixing, but I'm not entirely convinced by your proposed solution.

This is accepted by GCC, although it generates a spurious warning about incompatible pointer types.

Just "leaving it in" will make clang accept illegal code - weaning the invalid code off their constructs is going to be more difficult in the future than it is now. Since clang has never accepted this, we just continue rejecting it.

The change doesn't make it accept illegal code, it will make it accept legal code. The following absolutely should be legal:

[(super) foo];

While you would never write this directly, it is relatively common as a result of macro expansion, when a macro is used to optionally remove message sends. Consider a macro like this:

#ifndef DISABLE_LOGS
#define LOG(x) [(x) log]
#else
#define LOG(x)
#endif

The brackets are in the macro, but when you call it like this, gcc accepts it and clang rejects it:

FOO(super);

If you make a message send to super a flag in the AST, then how would you represent [(super) foo] or [((super)) foo] in the AST? Would they be the same as [super foo]? How would this effect AST consumers other than CodeGen?

David

-- Sent from my PDP-11

The big issue here is that from a language perspective, "super" is not an expression at all. It is a magic token that is only valid in a message send. Things like super->ivar *should* be disallowed, as should "id x = super" and many other things that we probably accidentally allow.

Currently, super->ivar is disallowed, so is id x = super.

Good.

These are accepted by GCC, although that's probably not sensible.

No, they aren't.

I think that the big AST change to model this correctly is really important, because it will fix a number of other problems and cases where Sema probably accidentally accepts invalid code (what about foo(super), passing it as an argument)? This is not about a single special case, this is about the design always being broken and wanting to fix the root problem.

Sema allows foo(super), but it stops in CodeGen.

Right, that is bad, my solution would fix that.

Just "leaving it in" will make clang accept illegal code - weaning the invalid code off their constructs is going to be more difficult in the future than it is now. Since clang has never accepted this, we just continue rejecting it.

The change doesn't make it accept illegal code, it will make it accept legal code. The following absolutely should be legal:

[(super) foo];

No, that should not be legal. Super is not an expression. It is not valid to use it in expression contexts: paren exprs require that the subexpression be an expression.

While you would never write this directly, it is relatively common as a result of macro expansion, when a macro is used to optionally remove message sends. Consider a macro like this:

#ifndef DISABLE_LOGS
#define LOG(x) [(x) log]
#else
#define LOG(x)
#endif

Just because something is convenient doesn't mean it should be legal.

If you make a message send to super a flag in the AST, then how would you represent [(super) foo] or [((super)) foo] in the AST?

You wouldn't, we should reject this code, that's the whole point.

-Chris

The big issue here is that from a language perspective, "super" is not an expression at all. It is a magic token that is only valid in a message send. Things like super->ivar *should* be disallowed, as should "id x = super" and many other things that we probably accidentally allow.

Currently, super->ivar is disallowed, so is id x = super.

Good.

These are accepted by GCC, although that's probably not sensible.

No, they aren't.

They seem to be on my copy of GCC. I just tested all of these and they worked.

[(super) foo];

No, that should not be legal. Super is not an expression. It is not valid to use it in expression contexts: paren exprs require that the subexpression be an expression.

While you would never write this directly, it is relatively common as a result of macro expansion, when a macro is used to optionally remove message sends. Consider a macro like this:

#ifndef DISABLE_LOGS
#define LOG(x) [(x) log]
#else
#define LOG(x)
#endif

Just because something is convenient doesn't mean it should be legal.

The fact that something is convenient, supported by GCC, and used in existing code seems like a good reason for accepting it...

If you make a message send to super a flag in the AST, then how would you represent [(super) foo] or [((super)) foo] in the AST?

You wouldn't, we should reject this code, that's the whole point.

I don't really see a convincing argument for rejecting it. Even if super is not an expression, being able to put it in brackets seems logical because you use it in the same place as expressions. If you can put some message receivers in brackets and not others, this makes the language inconsistent and makes macros even more irritating to work with than the C spec normally does.

And, if something can be used in the same place as an expression, but isn't an expression, what is it?

David

-- Sent from my brain

By your logic, we would accept cast of super... But we don't and it is intentional. This is the same issue.

-Chris

No, because cast of super is never a sensible thing to do. The class of super is fixed at compile time, so casting it is always wrong. If code does this, then it should be fixed.

In contrast, putting parentheses around super does not change its semantics. If you wish for clang not to support this, what do you propose as the alternative for the (currently in use in existing code) use case that I described in my last mail?

Cast of super is a work around for calling private methods in the superclass. You can work around its lack trivially with a private category on the superclass. You have not proposed a way of replacing macros that conditionally call a method (a widely-used idiom) when they are passed super as an argument. If the macro takes an expression, it must be parenthetical. If the macro takes super, it must not be parenthetical.

Note that the fact that Smalltalk does not support this construct is not an issue because Smalltalk does not use the C preprocessor, it uses a much more sane model for AST transforms. The same is true of Java. Only the fact that Objective-C is used with the C preprocessor makes this an issue.

David

The point is, super is not an expression, it is a magic keyword that affects messages sends. It cannot be used in any other places that expressions are allowed, so it should not be part of the expression grammar.

-Chris

Which is a side issue. There are three questions here. First, should we support super in parentheses in message send expressions? I say we should because:

- It is supported by GCC.
- It is used in existing code.
- It is required for macros that send messages to their arguments.
  - You have not proposed any way for people to work around this

You say no because is counter to your private sense of aesthetics.

Question two. Should we do so with the existing infrastructure and the one-line change that I just made? I say yes because:

- It is a trivial change.
- It works now.
- It allows existing code to be compiled with Clang that was previously requiring GCC.

You say no because is counter to your private sense of aesthetics.

Question three. Should we work to remove ObjCSuperExpr. You say yes because:

- It is not really an expression.
- It complicates semantic analysis.
- It doesn't truly reflect the language semantics.

I say I don't really care, but it's probably a nicer design longer term and if someone wants to put some effort into it then it will probably make things cleaner and easier.

David

-- Sent from my PDP-11

The point is, super is not an expression, it is a magic keyword that affects messages sends. It cannot be used in any other places that expressions are allowed, so it should not be part of the expression grammar.

Which is a side issue. There are three questions here. First, should we support super in parentheses in message send expressions? I say we should because:

- It is supported by GCC.
- It is used in existing code.
- It is required for macros that send messages to their arguments.
  - You have not proposed any way for people to work around this

You say no because is counter to your private sense of aesthetics.

I say no because it doesn't fit with the rest of the compiler, our current implementation of super is a ecological disaster and needs to be fixed, and it is obviously not been used enough to be reported yet. I would argue that cast of super is much more widely used than parenthesized super.

Question two. Should we do so with the existing infrastructure and the one-line change that I just made? I say yes because:

- It is a trivial change.
- It works now.
- It allows existing code to be compiled with Clang that was previously requiring GCC.

You say no because is counter to your private sense of aesthetics.

This encourages broken code. Continuing to reject it forces people to fix their broken code. This is exactly what we did with cast of super, and the world is better place because of it.

Whether ObjCSuperExpr is removed or not seems completely unrelated to the user visible issue of whether the code is accepted or rejected.

-Chris

I say no because it doesn't fit with the rest of the compiler, our current implementation of super is a ecological disaster and needs to be fixed, and it is obviously not been used enough to be reported yet. I would argue that cast of super is much more widely used than parenthesized super.

I suspect that it hasn't been reported because clang is only very recently able to compile most Objective-C code without random things breaking. Now that most things are working we're getting into tidying up the last bits.

Question two. Should we do so with the existing infrastructure and the one-line change that I just made? I say yes because:

- It is a trivial change.
- It works now.
- It allows existing code to be compiled with Clang that was previously requiring GCC.

You say no because is counter to your private sense of aesthetics.

This encourages broken code. Continuing to reject it forces people to fix their broken code. This is exactly what we did with cast of super, and the world is better place because of it.

Please can we define 'broken' here? I gave a use case for it. You have still, after three replies, not propose a way in which code that uses this idiom could be rewritten to work with clang. Cast of super can, as I have already said be, worked around by declaring the private method in a category. How do you work around macros that need to be able to take any valid receiver as an argument? Give me a solution to this problem, and I'll shut up and write about your proposed work around in my new Objective-C book.

Simply declaring code as broken when it has always worked in GCC and solves a specific problem is not a constructive approach. The world is not a better place when you take code that works, is easy to read and understand, and declare unilaterally that it is broken and needs to be fixed, but don't propose a way of fixing it.

Whether ObjCSuperExpr is removed or not seems completely unrelated to the user visible issue of whether the code is accepted or rejected.

But this is the argument that you seem to be using to motivate the user-visible change.

David

I'm going to weigh in here. I checked the spec, and it's clear that super is not allowed except as the receiver in a message send. Whether this should be supported as an extension is another question (and not one that I can answer satisfactorily, seeing as I don't use Objective-C), but even if we do support it as an extension, it should at least give a warning in all cases.

Sean

I don't think anyone is arguing that we should support the general case of treating super as an expression equivalent to self. GCC does, but it leads to code that doesn't do what it looks like it ought to do and is an artefact of how Objective-C is implemented in GCC rather than a design decision for the language. The only question is whether, in the one context where we do want to allow super, we allow it to be in parentheses, as it may be as a result of macro expansion.

David

I say no because it doesn’t fit with the rest of the compiler, our current implementation of super is a ecological disaster and needs to be fixed, and it is obviously not been used enough to be reported yet. I would argue that cast of super is much more widely used than parenthesized super.

I suspect that it hasn’t been reported because clang is only very recently able to compile most Objective-C code without random things breaking. Now that most things are working we’re getting into tidying up the last bits.

Uh, we’ve compiled a lot of objective-c code, and Apple has been shipping clang as a production quality c/objc compiler since last summer.

This encourages broken code. Continuing to reject it forces people to fix their broken code. This is exactly what we did with cast of super, and the world is better place because of it.

Please can we define ‘broken’ here? I gave a use case for it. You have still, after three replies, not propose a way in which code that uses this idiom could be rewritten to work with clang.

There are lots of ways, remove the parens from the macro or don’t use a macro at all. Amazing.

Simply declaring code as broken when it has always worked in GCC and solves a specific problem is not a constructive approach.

David, I respect you a lot, but I feel like I’ve given you several explanations of why this is wrong. I can’t help that you don’t want to hear it :slight_smile:

Have you looked at all the places and weird ad-hoc things that clang is doing w.r.t. super? It is a real mess right now, it isn’t a keyword, it is special cased in a few bizarre places. I want to make it so the parser just handles this in the message dispatch parser as a context sensitive keyword. This is the semantic behavior of it, it should not be parsed as a general expression because it doesn’t make sense.

-Chris

Uh, we've compiled a *lot* of objective-c code, and Apple has been shipping clang as a production quality c/objc compiler since last summer.

Perhaps this is not an idiom used inside Apple. XCode does not default to clang, and most projects that I have tried compiling only worked without issues in the last few months. ABI support has been a big issue for non-Darwin users, with various random seeming crashes that have only been addressed (and reported - no implied criticism here) recently.

There are lots of ways, remove the parens from the macro or don't use a macro at all. Amazing.

Neither of these are solutions. Removing the parentheses from the macro is not an option, because they are required when something that is an expression is used. Don't use the macro at all is not really a solution either - macros of this form are already in use and suggesting that someone goes through their existing code and manually expands every instance of the macro (including wrapping it in #ifdefs) is more of a problem than a solution.

Macros of this form are included with the standard GNUstep headers, and have been for the past 15 years. Our next release will be the first one to officially support clang as a compiler, although the fact that you broke the address of label extension just before 1.1 means that we have to require clang from svn to build our Foundation implementation.

This is the first time that we are encouraging downstream developers to use clang, and you are intentionally breaking the compilation of their code, if they are using these macros.

Simply declaring code as broken when it has always worked in GCC and solves a specific problem is not a constructive approach.

David, I respect you a lot, but I feel like I've given you several explanations of why this is wrong. I can't help that you don't want to hear it :slight_smile:

I agree that treating super as an expression is wrong in the general case, but I disagree that it is wrong for super to be allowed to be in brackets. Putting something in brackets does not change its semantics, is an idiom used in the wild, and does not require it to be treated as an expression.

Have you looked at all the places and weird ad-hoc things that clang is doing w.r.t. super? It is a real mess right now, it isn't a keyword, it is special cased in a few bizarre places. I want to make it so the parser *just* handles this in the message dispatch parser as a context sensitive keyword. This is the semantic behavior of it, it should not be parsed as a general expression because it doesn't make sense.

With regard to your latest change, I really dislike this warning:

'super' not valid when not in a method

This is misleading, because 'super' is valid when not used in a method, as long as it is declared as a variable. I often use super as a variable name referring to a superclass in Objective-C code outside of methods, especially in free-standing methods that will be attached to a class at run time. And please don't tell me that this is a bug - Objective-C is a pure superset of C and, by design, does not declare any keywords that are valid C identifiers. You also seem to have broken some valid code:

$ cat sup.m && clang -c sup.m
#import <objc/Object.h>

int main(void)
{
  id super = nil;
  [(Object*)super new];
  int *s1 = (int*)super;
  return 0;
}
sup.m:6:12: error: cannot cast 'super' (it isn't an expression)
        [(Object*)super new];
         ~~~~~~~~~^
sup.m:7:18: error: cannot cast 'super' (it isn't an expression)
        int *s1 = (int*)super;
                  ~~~~~~^
2 errors generated.

super is an expression here. It is a local variable. This is entirely valid, especially in the second case where there is no Objective-C at all. Pure C code should always compile correctly in Objective-C mode.

David

-- Sent from my brain

Hello-

Forgive me, but if (super) is really necessary for David, could it not be implemented as a preprocessor hack which just eats the parens, with associated warning, thus avoiding any codegen/parser/sema complications?

Alistair

That would work for me - I don't mind if (super) is only valid as the result of a macro expansion. It's a ugly solution, but it's significantly less ugly than the existing code.

David

Uh, we've compiled a *lot* of objective-c code, and Apple has been shipping clang as a production quality c/objc compiler since last summer.

Perhaps this is not an idiom used inside Apple. XCode does not default to clang, and most projects that I have tried compiling only worked without issues in the last few months.

Arguing about whether people are using clang or not is a moot point. If they aren't already, they will at some point. We intentionally have decided to reject some constructs that GCC supports (for a number of reasons, forcing people to clean up their code is one). Doing this when transitioning from GCC -> Clang is a much easier thing for developers to understand than to find out they upgraded a minor version of their compiler and suddenly their code doesn't build.

ABI support has been a big issue for non-Darwin users, with various random seeming crashes that have only been addressed (and reported - no implied criticism here) recently.

I'm not sure what you mean.

There are lots of ways, remove the parens from the macro or don't use a macro at all. Amazing.

Neither of these are solutions. Removing the parentheses from the macro is not an option, because they are required when something that is an expression is used. Don't use the macro at all is not really a solution either - macros of this form are already in use and suggesting that someone goes through their existing code and manually expands every instance of the macro (including wrapping it in #ifdefs) is more of a problem than a solution.

Macros of this form are included with the standard GNUstep headers, and have been for the past 15 years. Our next release will be the first one to officially support clang as a compiler, although the fact that you broke the address of label extension just before 1.1 means that we have to require clang from svn to build our Foundation implementation.

This is the first time that we are encouraging downstream developers to use clang, and you are intentionally breaking the compilation of their code, if they are using these macros.

Why not include a second version of the macro that works with super? Alternatively, why not just drop the parens from the macro definition?

I'm not sure how address of label is broken, PR#?

Simply declaring code as broken when it has always worked in GCC and solves a specific problem is not a constructive approach.

David, I respect you a lot, but I feel like I've given you several explanations of why this is wrong. I can't help that you don't want to hear it :slight_smile:

I agree that treating super as an expression is wrong in the general case, but I disagree that it is wrong for super to be allowed to be in brackets. Putting something in brackets does not change its semantics, is an idiom used in the wild, and does not require it to be treated as an expression.

I agree that putting an expression in parentheses does not change it semantics, but super isn't an expression. You can't do:

x (+) y

for example. You also can't do:

for (id x (in) ...

etc. Why should you be able to parenthesize super? It doesn't make any sense, and arguing about language semantics from a convenience standpoint doesn't fly well, there are lots of things that don't make sense that would be convenient.

Have you looked at all the places and weird ad-hoc things that clang is doing w.r.t. super? It is a real mess right now, it isn't a keyword, it is special cased in a few bizarre places. I want to make it so the parser *just* handles this in the message dispatch parser as a context sensitive keyword. This is the semantic behavior of it, it should not be parsed as a general expression because it doesn't make sense.

With regard to your latest change, I really dislike this warning:

'super' not valid when not in a method

This is misleading, because 'super' is valid when not used in a method, as long as it is declared as a variable.

Sure, this is the typical problem with diagnostics: you have to guess what the user meant and/or what the most useful message is to help people fix their code. I inferred that a likely reason to get this message is refactoring code from a method out to a C function. ObjC programers consider super to be a magic incantation, not something they're likely to name variables (despite it not actually being a keyword).

I don't think it would be particularly useful to emit "'super' not valid when not in a method, but you could name a variable 'super' if you really wanted to".

I often use super as a variable name referring to a superclass in Objective-C code outside of methods, especially in free-standing methods that will be attached to a class at run time. And please don't tell me that this is a bug - Objective-C is a pure superset of C and, by design, does not declare any keywords that are valid C identifiers.

I agree, that is not a bug.

You also seem to have broken some valid code:

$ cat sup.m && clang -c sup.m
#import <objc/Object.h>

int main(void)
{
  id super = nil;
  [(Object*)super new];
  int *s1 = (int*)super;
  return 0;
}
sup.m:6:12: error: cannot cast 'super' (it isn't an expression)
       [(Object*)super new];
        ~~~~~~~~~^
sup.m:7:18: error: cannot cast 'super' (it isn't an expression)
       int *s1 = (int*)super;
                 ~~~~~~^
2 errors generated.

super is an expression here. It is a local variable. This is entirely valid, especially in the second case where there is no Objective-C at all. Pure C code should always compile correctly in Objective-C mode.

I'll take a look thanks!

-Chris