[clang] declspec(property...) advice

I'm currently working on getting the front-end to eat Microsoft's
__declspec(property...) extension.

http://msdn.microsoft.com/en-us/library/yhfk0thd(v=VS.100).aspx

I've gotten things to the point where I can parse it and generate a
PropertyAttr attached to the FieldDecl. The PropertyAttr has two string
attributes representing the name of the getter and setter member
functions. I am wondering if this is the best way to represent this in
the AST. My impression of how the feature is used is that expressions
are rewritten:

  "a = x.foo" becomes "a = x.getFoo()"
  "x.foo = a" becomes "x.setFoo(a)"
  "x.foo += 1" becomes "x.setFoo(x.getFoo() + 1)"

Also, when generating the layout for the class and the
constructors/destructors/assign methods, these pseudo-fields should be
ignored. These are all non-trivial rewrites. A few questions:

1) In your opinion, is having a normal FieldDecl in the AST with an
attribute the way to go? Or should there be an entirely different kind
of Decl for these pseudo-fields?

2) If the attribute is the right approach, is storing the getters and
setters as strings sensible? FWIW, Microsoft doesn't catch typos in the
getter/setter declaration until the property is actually referenced,
which seems to suggest that its doing a fairly primitive rewrite at the
point of use. So strings would work, IIUC. It does not eagerly perform
name lookup of the getters and setters within the class definition.

3) Where should the rewrite happen? Is it OK to leave in references to
the pseudo-field in the AST, or should the AST reflect what will
actually be executed -- invocations of the getters and setters? My
instinct tells me that the AST should have the pseudo-field, not the
getters and setters. Anything else might confuse things like IDEs.

Thanks in advance,

I'm currently working on getting the front-end to eat Microsoft's
__declspec(property...) extension.

http://msdn.microsoft.com/en-us/library/yhfk0thd(v=VS.100).aspx

I've gotten things to the point where I can parse it and generate a
PropertyAttr attached to the FieldDecl. The PropertyAttr has two string
attributes representing the name of the getter and setter member
functions. I am wondering if this is the best way to represent this in
the AST. My impression of how the feature is used is that expressions
are rewritten:

"a = x.foo" becomes "a = x.getFoo()"
"x.foo = a" becomes "x.setFoo(a)"
"x.foo += 1" becomes "x.setFoo(x.getFoo() + 1)"

Also, when generating the layout for the class and the
constructors/destructors/assign methods, these pseudo-fields should be
ignored. These are all non-trivial rewrites. A few questions:

1) In your opinion, is having a normal FieldDecl in the AST with an
attribute the way to go? Or should there be an entirely different kind
of Decl for these pseudo-fields?

There should be an entirely different kind of Decl for these properties, because they really have nothing to do with fields once we've gotten past their syntax. It'll be a subclass of ValueDecl, much like IndirectFieldDecl or ObjCPropertyDecl.

2) If the attribute is the right approach, is storing the getters and
setters as strings sensible? FWIW, Microsoft doesn't catch typos in the
getter/setter declaration until the property is actually referenced,
which seems to suggest that its doing a fairly primitive rewrite at the
point of use. So strings would work, IIUC. It does not eagerly perform
name lookup of the getters and setters within the class definition.

It makes sense to delay the lookup of the get/put names until after the class is defined (like we do with default arguments and inline member function bodies), but I find it really gross that name lookup doesn't occur until the property is actually referenced.

3) Where should the rewrite happen? Is it OK to leave in references to
the pseudo-field in the AST, or should the AST reflect what will
actually be executed -- invocations of the getters and setters? My
instinct tells me that the AST should have the pseudo-field, not the
getters and setters. Anything else might confuse things like IDEs.

The AST should have the pseudo-field (possibly with a special "MSPropertyRefExpr" expression type to reference them), and we should introduce implicit AST nodes (e.g., an ImplicitCastExpr wrapping a call) to handle the get and put calls. You're in luck, though: Objective-C has properties, and John McCall is currently refactoring lvalue-to-rvalue conversions to make properties (and other constructs) more explicit and cleaner. You should be able to use the Objective-C property implementation as a guide and follow what John is currently doing to implement this feature.

  - Doug

I'm currently working on getting the front-end to eat Microsoft's
__declspec(property...) extension.

http://msdn.microsoft.com/en-us/library/yhfk0thd(v=VS.100).aspx

I've gotten things to the point where I can parse it and generate a
PropertyAttr attached to the FieldDecl. The PropertyAttr has two
string attributes representing the name of the getter and setter
member functions. I am wondering if this is the best way to
represent this in the AST. My impression of how the feature is
used is that expressions are rewritten:

"a = x.foo" becomes "a = x.getFoo()" "x.foo = a" becomes
"x.setFoo(a)" "x.foo += 1" becomes "x.setFoo(x.getFoo() + 1)"

Also, when generating the layout for the class and the
constructors/destructors/assign methods, these pseudo-fields
should be ignored. These are all non-trivial rewrites. A few
questions:

1) In your opinion, is having a normal FieldDecl in the AST with an
attribute the way to go? Or should there be an entirely different
kind of Decl for these pseudo-fields?

There should be an entirely different kind of Decl for these
properties, because they really have nothing to do with fields once
we've gotten past their syntax. It'll be a subclass of ValueDecl,
much like IndirectFieldDecl or ObjCPropertyDecl.

OK, makes sense.

2) If the attribute is the right approach, is storing the getters
and setters as strings sensible? FWIW, Microsoft doesn't catch
typos in the getter/setter declaration until the property is
actually referenced, which seems to suggest that its doing a
fairly primitive rewrite at the point of use. So strings would
work, IIUC. It does not eagerly perform name lookup of the getters
and setters within the class definition.

It makes sense to delay the lookup of the get/put names until after
the class is defined (like we do with default arguments and inline
member function bodies), but I find it really gross that name lookup
doesn't occur until the property is actually referenced.

I agree, it's a bit gross. I can't think of a case where performing
name-lookup at the end of the class definition would produce different
results, but I'm a bit wary of trying to be smarter than the MS compiler
when the ultimate goal is feature parity with MS.

3) Where should the rewrite happen? Is it OK to leave in
references to the pseudo-field in the AST, or should the AST
reflect what will actually be executed -- invocations of the
getters and setters? My instinct tells me that the AST should have
the pseudo-field, not the getters and setters. Anything else might
confuse things like IDEs.

The AST should have the pseudo-field (possibly with a special
"MSPropertyRefExpr" expression type to reference them), and we
should introduce implicit AST nodes (e.g., an ImplicitCastExpr
wrapping a call) to handle the get and put calls.

I'm afraid I didn't follow that.

You're in luck,
though: Objective-C has properties, and John McCall is currently
refactoring lvalue-to-rvalue conversions to make properties (and
other constructs) more explicit and cleaner. You should be able to
use the Objective-C property implementation as a guide and follow
what John is currently doing to implement this feature.

OK, then I might just put off further work on this feature until the
refactorization is done and I have John's work to crib from.

John, can you please alert me when your work in this area is done?

Thanks,

2) If the attribute is the right approach, is storing the getters
and setters as strings sensible? FWIW, Microsoft doesn't catch
typos in the getter/setter declaration until the property is
actually referenced, which seems to suggest that its doing a
fairly primitive rewrite at the point of use. So strings would
work, IIUC. It does not eagerly perform name lookup of the getters
and setters within the class definition.

It makes sense to delay the lookup of the get/put names until after
the class is defined (like we do with default arguments and inline
member function bodies), but I find it really gross that name lookup
doesn't occur until the property is actually referenced.

I agree, it's a bit gross. I can't think of a case where performing
name-lookup at the end of the class definition would produce different
results,

It could matter when the code is ill-formed because the get or put operation doesn't exist (or can't be called). MS probably doesn't diagnose this until the property is used.

but I'm a bit wary of trying to be smarter than the MS compiler
when the ultimate goal is feature parity with MS.

We always have to weigh compatibility against solid design. Should every client of the AST have to cope with the possibility that the get or put operation hasn't been resolved? There's a real cost there.

3) Where should the rewrite happen? Is it OK to leave in
references to the pseudo-field in the AST, or should the AST
reflect what will actually be executed -- invocations of the
getters and setters? My instinct tells me that the AST should have
the pseudo-field, not the getters and setters. Anything else might
confuse things like IDEs.

The AST should have the pseudo-field (possibly with a special
"MSPropertyRefExpr" expression type to reference them), and we
should introduce implicit AST nodes (e.g., an ImplicitCastExpr
wrapping a call) to handle the get and put calls.

I'm afraid I didn't follow that.

Somehow, you need to represent an expression

  foo.property

That probably means creating a new expression node type. This expression node type may be used in an assignment (calling the put operation) or via an lvalue-to-rvalue conversion (calling the get operation), which will be represented via implicitly-generated AST nodes.

You're in luck,
though: Objective-C has properties, and John McCall is currently
refactoring lvalue-to-rvalue conversions to make properties (and
other constructs) more explicit and cleaner. You should be able to
use the Objective-C property implementation as a guide and follow
what John is currently doing to implement this feature.

OK, then I might just put off further work on this feature until the
refactorization is done and I have John's work to crib from.

No need to wait. You can get fairly far without tripping over John's work.

  - Doug

I’m a little worried about adopting our current representation for a C+±oriented feature. Presumably we’d need to allow filling in default arguments, etc., plus a wide range of stuff that we technically have to support with ObjC++ properties but actually haven’t gotten around to yet, like what happens when we’re doing a compound assignment on an implicit property reference and the getter and setter have really different types. Really I think this is an argument for improving the representation of all property expressions.

My intuition here is to just rename OK_ObjCProperty to OK_Property instead of differentiating the two cases. It can mean “this is an entity which can be both read and written to, but requires special code to do so”. The existing hook in Sema for loads (ConvertPropertyForRValue) can easily be modified to handle the MS properties, and we can make BuildBinOp check for OK_Property LHS’s and just immediately defer to some better hook.

John.

I’m a little worried about adopting our current representation for a C+±oriented feature. Presumably we’d need to allow filling in default arguments, etc., plus a wide range of stuff that we

If you are talking about Darwin ObjC property (with no feature extension), there is no allowance for default arguments.

technically have to support with ObjC++ properties but actually haven’t gotten around to yet, like what happens when we’re doing a compound assignment on an implicit property reference and the getter and setter have really different types. Really I think this is an argument for improving the representation of all property expressions.

If you are talking about the same property, they cannot have different types.

  • fariborz

I’m a little worried about adopting our current representation for a C+±oriented feature. Presumably we’d need to allow filling in default arguments, etc., plus a wide range of stuff that we

If you are talking about Darwin ObjC property (with no feature extension), there is no allowance for default arguments.

Sorry, I meant the MS extension; since the property expands to an unresolved method call, we’d have to be able to fill in default arguments.

technically have to support with ObjC++ properties but actually haven’t gotten around to yet, like what happens when we’re doing a compound assignment on an implicit property reference and the getter and setter have really different types. Really I think this is an argument for improving the representation of all property expressions.

If you are talking about the same property, they cannot have different types.

They can if they’re implicit properties.

John.

technically have to support with ObjC++ properties but actually haven't gotten around to yet, like what happens when we're doing a compound assignment on an implicit property reference and the getter and setter have really different types. Really I think this is an argument for improving the representation of all property expressions.

If you are talking about the same property, they cannot have different types.

They can if they're implicit properties.

That's why I never liked to call them properties :). I referred to them implicit call to setter/getters using property-dot syntax.
Looks like we have removed that distinction.

- Fariborz

Well, but you could still do compound assignments on them, so I'm not sure it helps us much in practice. :slight_smile:

John.

(resurrecting this old thread....)

I'm a little worried about adopting our current representation for a
C++-oriented feature. Presumably we'd need to allow filling in default
arguments, etc., plus a wide range of stuff that we technically have to
support with ObjC++ properties but actually haven't gotten around to
yet, like what happens when we're doing a compound assignment on an
implicit property reference and the getter and setter have really
different types. Really I think this is an argument for improving the
representation of all property expressions.

Indeed.

My intuition here is to just rename OK_ObjCProperty to OK_Property
instead of differentiating the two cases.

Yes.

It can mean "this is an
entity which can be both read and written to, but requires special code
to do so". The existing hook in Sema for loads
(ConvertPropertyForRValue) can easily be modified to handle the MS
properties, and we can make BuildBinOp check for OK_Property LHS's and
just immediately defer to some better hook.

And BuildUnaryOp for pre- and post-increment/decrement.

Doug, I've found a compelling reason why at least setter methods cannot
be resolved until the point of use. Consider:

  #include <cstdio>

  struct T {};
  struct U {};

  struct S {
    void setI( T ) { std::printf("T\n"); }
    void setI( U ) { std::printf("U\n"); }
    __declspec(property(put=setI)) int I;
  };

  int main() {
    S s;
    s.I = T();
    s.I = U();
  }

This prints:

  T
  U

Nasty, eh? I don't think the existing property stuff can handle this at
all, so like John, I feel this needs a major reworking.

Any and all suggestions welcome. This has become my #1 priority, and I
want to make forward progress quickly, but I want to do it right.

I'd like to share my current thinking on the property issue. I encourage
any and all interested parties to jump in.

Above, John McCall suggests modifying BuildBinOp to Do Something. I
think that something is to programatically rewrite the AST into the
appropriate getter and setter calls. Ditto for BuildUnaryOp. So for
instance, for this code:

  ++obj.property;

when BuildUnaryOp gets passed a PropertyRefExpr and the
pre-increment op code, it builds the AST for this:

  obj.setProperty(obj.getPropery() + 1)

Well, actually, it /should/ return the AST for the unary op as usual,
but hidden away inside it somewhere should be the AST for the above
rewrite. Why both? The AST should accurately reflect the fact that the
user wrote it as "++obj.property". However, creating the rewritten AST
nodes has the effect of type-checking the expression, AND the rewritten
AST can be used for codegen later (as long as it's accessible somehow).

I've been looking into how to generate those AST nodes. First,
obj.setProperty gets parsed into a MemberExpr, then, the paren and the
arguments are parsed, and the whole thing gets passed to
BuildCallToMemberFunction. I figure if we do all of that in the right
places and it type checks, we're golden. Stuff the result into the AST
somewhere reasonable and that should do it.

So that's my current thinking. This, however, is a radical departure
from how properties are currently handled in clang, and will require
some pretty invasive surgery.

Thoughts?

I've been looking into how to generate those AST nodes. First,
obj.setProperty gets parsed into a MemberExpr, then, the paren and the
arguments are parsed, and the whole thing gets passed to
BuildCallToMemberFunction. I figure if we do all of that in the right
places and it type checks, we're golden. Stuff the result into the AST
somewhere reasonable and that should do it.

Right. It should probably just call the appropriate functions on Sema
to do all this.

So that's my current thinking. This, however, is a radical departure
from how properties are currently handled in clang, and will require
some pretty invasive surgery.

I think that's basically right. It probably deserves its own expression
node to wrap the innards, which hopefully would encompass
compound assignments, non-compound assignments, and unary
inc/dec.

John.

I've been looking into how to generate those AST nodes. First,
obj.setProperty gets parsed into a MemberExpr, then, the paren and the
arguments are parsed, and the whole thing gets passed to
BuildCallToMemberFunction. I figure if we do all of that in the right
places and it type checks, we're golden. Stuff the result into the AST
somewhere reasonable and that should do it.

Right. It should probably just call the appropriate functions on Sema
to do all this.

So that's my current thinking. This, however, is a radical departure
from how properties are currently handled in clang, and will require
some pretty invasive surgery.

I think that's basically right. It probably deserves its own expression
node to wrap the innards, which hopefully would encompass
compound assignments, non-compound assignments, and unary
inc/dec.

In my thinking, we have this hierarchy:

PropertyRefExpr
  >
  +-----CXXMSPropertyRefExpr
  >
  +-----CXXBorlandPropertyRefExpr
  >
  +-----ObjCPropertyRefExpr

PropertyRefExpr defines pure virtuals for BuildBinaryOp and
BuildUnaryOp, as well as (maybe) ConvertToRValue and ConvertToLValue.
These return an Expr* that points to a new type of expression:
RewrittenExpr.

RewrittenExpr is really just a pair of Expr*. The first is the AST
corresponding to what the user actually wrote, e.g. ++obj.prop. The
second is the AST for the rewritten expression:
obj.setProp(obj.getProp()+1). The second is what actually gets used for
codegen.

Thoughts?

- --
Eric Niebler
BoostPro Computing
http://www.boostpro.com

(resurrecting this old thread....)

I'm a little worried about adopting our current representation for a
C++-oriented feature. Presumably we'd need to allow filling in default
arguments, etc., plus a wide range of stuff that we technically have to
support with ObjC++ properties but actually haven't gotten around to
yet, like what happens when we're doing a compound assignment on an
implicit property reference and the getter and setter have really
different types. Really I think this is an argument for improving the
representation of all property expressions.

Indeed.

My intuition here is to just rename OK_ObjCProperty to OK_Property
instead of differentiating the two cases.

Yes.

It can mean "this is an
entity which can be both read and written to, but requires special code
to do so". The existing hook in Sema for loads
(ConvertPropertyForRValue) can easily be modified to handle the MS
properties, and we can make BuildBinOp check for OK_Property LHS's and
just immediately defer to some better hook.

And BuildUnaryOp for pre- and post-increment/decrement.

Doug, I've found a compelling reason why at least setter methods cannot
be resolved until the point of use. Consider:

#include <cstdio>

struct T {};
struct U {};

struct S {
   void setI( T ) { std::printf("T\n"); }
   void setI( U ) { std::printf("U\n"); }
   __declspec(property(put=setI)) int I;
};

int main() {
   S s;
   s.I = T();
   s.I = U();
}

This prints:

T
U

Nasty, eh? I don't think the existing property stuff can handle this at
all, so like John, I feel this needs a major reworking.

Any and all suggestions welcome. This has become my #1 priority, and I
want to make forward progress quickly, but I want to do it right.

I'd like to share my current thinking on the property issue. I encourage
any and all interested parties to jump in.

Above, John McCall suggests modifying BuildBinOp to Do Something. I
think that something is to programatically rewrite the AST into the
appropriate getter and setter calls. Ditto for BuildUnaryOp. So for
instance, for this code:

++obj.property;

when BuildUnaryOp gets passed a PropertyRefExpr and the
pre-increment op code, it builds the AST for this:

obj.setProperty(obj.getPropery() + 1)

I'm not sure I understand why we have to have so expicit AST nodes, traditionally the nodes were as close to the source code as possible.
Can't we do semantic checking otherwise ? Do we need to simplify it to make it easier for codegen ?
One could argue that we should rewrite all
++variable;
to
variable = variable + 1;

-Argiris

(resurrecting this old thread....)

I'm a little worried about adopting our current representation for a
C++-oriented feature. Presumably we'd need to allow filling in default
arguments, etc., plus a wide range of stuff that we technically have to
support with ObjC++ properties but actually haven't gotten around to
yet, like what happens when we're doing a compound assignment on an
implicit property reference and the getter and setter have really
different types. Really I think this is an argument for improving the
representation of all property expressions.

Indeed.

My intuition here is to just rename OK_ObjCProperty to OK_Property
instead of differentiating the two cases.

Yes.

It can mean "this is an
entity which can be both read and written to, but requires special code
to do so". The existing hook in Sema for loads
(ConvertPropertyForRValue) can easily be modified to handle the MS
properties, and we can make BuildBinOp check for OK_Property LHS's and
just immediately defer to some better hook.

And BuildUnaryOp for pre- and post-increment/decrement.

Doug, I've found a compelling reason why at least setter methods cannot
be resolved until the point of use. Consider:

#include <cstdio>

struct T {};
struct U {};

struct S {
  void setI( T ) { std::printf("T\n"); }
  void setI( U ) { std::printf("U\n"); }
  __declspec(property(put=setI)) int I;
};

int main() {
  S s;
  s.I = T();
  s.I = U();
}

This prints:

T
U

Nasty, eh? I don't think the existing property stuff can handle this at
all, so like John, I feel this needs a major reworking.

Any and all suggestions welcome. This has become my #1 priority, and I
want to make forward progress quickly, but I want to do it right.

I'd like to share my current thinking on the property issue. I encourage
any and all interested parties to jump in.

Above, John McCall suggests modifying BuildBinOp to Do Something. I
think that something is to programatically rewrite the AST into the
appropriate getter and setter calls. Ditto for BuildUnaryOp. So for
instance, for this code:

++obj.property;

when BuildUnaryOp gets passed a PropertyRefExpr and the
pre-increment op code, it builds the AST for this:

obj.setProperty(obj.getPropery() + 1)

I'm not sure I understand why we have to have so expicit AST nodes, traditionally the nodes were as close to the source code as possible.
Can't we do semantic checking otherwise ? Do we need to simplify it to make it easier for codegen ?
One could argue that we should rewrite all
++variable;
to
variable = variable + 1;

Unfortunately, compound assignments have semantics that are somewhat more complicated than a simple top-down walk through a syntax tree: we evaluate the LHS as an l-value, coerce the loaded value to the type of the operation, perform the operation, coerce the result back to the l-value type, and store that in. We kindof finesse this right now — clients like IR generation have to duplicate the coercion logic that normally Sema would do. That's acceptable for most things only because there's a limit to how complicated this analysis can get. With property reference l-values, however, the l-value accesses are secretly function calls; we're not getting all the cases right even for Objective-C property references (e.g. with getters and setters of asymmetric type or of C++ class type), and I expect that's probably a non-starter for a C++-centered property system.

John.

Oh wow, no checking for Objective-C props, we leave it at the mercy of codegen which either emits bogus IR or crashes, e.g:

@interface Foo
-(float)myfo;
-(void)setMyfo: (int)p;
@end

void bar(Foo *x) {
  x.myfo++;
}

Call parameter type does not match function signature!
  %6 = fadd float %5, 1.000000e+00
i32 call void bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to void (i8*, i8*, i32)*)(i8* %8, i8* %7, float %6)
Broken module found, compilation aborted!

Ok, being more explicit is goodness but could we be consistent about it ? If we have the mechanism in place to rewrite compound assignments with AST nodes that are explicit but also indicate that there was a compound assignment in source code,
could we use it consistently for all compound assignments, not just properties, thereby simplifying a bit clients like codegen and the analyzer ?

-Argiris

(Resurrecting this old thread for a quick confirmation...)

I'm currently working on getting the front-end to eat Microsoft's
__declspec(property...) extension.

<snip>

There should be an entirely different kind of Decl for these
properties, because they really have nothing to do with fields once
we've gotten past their syntax. It'll be a subclass of ValueDecl,
much like IndirectFieldDecl or ObjCPropertyDecl.

Doug, we've followed your advice and put our property decls (MS and
Borland) under DeclaratorDecl (which is a ValueDecl).

However, we notice that ObjCPropertyDecl is *not* a ValueDecl. It's
merely a NamedDecl. Is that wrong? We don't propose changing it, we just
want to make sure that making MS and Borland properties are in their
rightful place in the decl hierarchy.

TIA,

(Resurrecting this old thread for a quick confirmation...)

I'm currently working on getting the front-end to eat Microsoft's
__declspec(property...) extension.

<snip>

There should be an entirely different kind of Decl for these
properties, because they really have nothing to do with fields once
we've gotten past their syntax. It'll be a subclass of ValueDecl,
much like IndirectFieldDecl or ObjCPropertyDecl.

Doug, we've followed your advice and put our property decls (MS and
Borland) under DeclaratorDecl (which is a ValueDecl).

However, we notice that ObjCPropertyDecl is *not* a ValueDecl. It's
merely a NamedDecl. Is that wrong?

Yes. ObjC has a few oddities like this.

We don't propose changing it, we just
want to make sure that making MS and Borland properties are in their
rightful place in the decl hierarchy.

You are on the right track with DeclaratorDecl.

Hi Eric,

What is the status of this patch?
I am testing clang against MFC code (yes really!!) and I really need
this extension to move forward.
Are you close to submitting something or do you have a partial patch at least?
Otherwise I am planning to go ahead and implement at least parsing
support for this extension.

(Apologies for the delay. We've been having problems with the
boostpro.com domain. Not only are our sites offline, but we're not
getting our email either.)