Adding type source info to ImplicitValueInitExpr.

Hello.

The attached patch adds (optional) type source info to class ImplicitValueInitExpr: the type source info will have a valid value if the ImplicitValueInitExpr results from __builtin_offsetof constructs.

Cheers,
Enea Zaffanella.

ImplicitValueInitExpr.patch (7.24 KB)

I think we'd rather not take this. The primary use of ImplicitValueInitExpr does not involve a type written in the source; in fact, it involves no source code at all. This patch is useful solely because of the current implementation of __builtin_offsetof. That implementation is terrible, and it's long past time for __builtin_offsetof to get its own AST class. When that happens, this patch will be unnecessary.

If you are interested in working on this, it is being tracked as PR 5390: http://llvm.org/bugs/show_bug.cgi?id=5390

John.

We have thought to have __builtin_offsetof as a separate AST class and
this certainly has some advantages (mainly the fact to have the right
source range and some cleaning) and we definitely agree this would be a
good thing, but the patch from Enea try to solve a problem that is
orthogonal to this:

__builtin_offsetof(struct s, f) is currently expressed in the AST as

UnaryOperator(offsetof, MemberExpr(UnaryOperator(deref,
ImplicitValueInitExpr(PointerType(RecordType())))))

As you can see, also if we transform this in:

OffsetOf(MemberExpr(UnaryOperator(deref,
ImplicitValueInitExpr(PointerType(RecordType())))))

as wished in PR5390, we still have an ImplicitValueInitExpr where we
have to insert an appropriate TypeSourceInfo.

Unless you want to replace the use of ImplicitValueInitExpr in offsetof
expression (with what?), I think that patch from Enea should be accepted.

But, also if you hope to replace ImplicitValueInitExpr too someday, it
is not better to have a better intermediate suboptimal implementation
while waiting for the definitive solution to be implemented (after we
have clear how this should be), is it?

The attached patch adds (optional) type source info to class
ImplicitValueInitExpr: the type source info will have a valid value
if the ImplicitValueInitExpr results from __builtin_offsetof
constructs.

I think we'd rather not take this. The primary use of
ImplicitValueInitExpr does not involve a type written in the source;
in fact, it involves no source code at all. This patch is useful
solely because of the current implementation of __builtin_offsetof.
That implementation is terrible, and it's long past time for
__builtin_offsetof to get its own AST class. When that happens, this
patch will be unnecessary.

We have thought to have __builtin_offsetof as a separate AST class and
this certainly has some advantages (mainly the fact to have the right
source range and some cleaning) and we definitely agree this would be a
good thing, but the patch from Enea try to solve a problem that is
orthogonal to this:

__builtin_offsetof(struct s, f) is currently expressed in the AST as

UnaryOperator(offsetof, MemberExpr(UnaryOperator(deref,
ImplicitValueInitExpr(PointerType(RecordType())))))

As you can see, also if we transform this in:

OffsetOf(MemberExpr(UnaryOperator(deref,
ImplicitValueInitExpr(PointerType(RecordType())))))

as wished in PR5390, we still have an ImplicitValueInitExpr where we
have to insert an appropriate TypeSourceInfo.

Unless you want to replace the use of ImplicitValueInitExpr in offsetof
expression (with what?), I think that patch from Enea should be accepted.

Syntactically, __builtin_offsetof does not have any subexpressions; I would prefer that OffsetOfExpr (or whatever) follow this and simply be a leaf node with a complex internal structure, much like its input from the parser.

But, also if you hope to replace ImplicitValueInitExpr too someday, it
is not better to have a better intermediate suboptimal implementation
while waiting for the definitive solution to be implemented (after we
have clear how this should be), is it?

I'm fine with workarounds up to the point that they start affecting unrelated parts of the system.

If you need something now and don't want to change __builtin_offsetof's representation that drastically, perhaps changing the ImplicitValueInitExpr to a CStyleCastExpr of 0? Those at least naturally have a written type.

John.

John McCall wrote:

The attached patch adds (optional) type source info to class
ImplicitValueInitExpr: the type source info will have a valid value
if the ImplicitValueInitExpr results from __builtin_offsetof
constructs.

I think we'd rather not take this. The primary use of
ImplicitValueInitExpr does not involve a type written in the source;
in fact, it involves no source code at all. This patch is useful
solely because of the current implementation of __builtin_offsetof.
That implementation is terrible, and it's long past time for
__builtin_offsetof to get its own AST class. When that happens, this
patch will be unnecessary.

Hello.

A student of mine would like to have a try fixing the thing above
(a.k.a., PR 5390), unless there already is someone working on it.

The approach we would follows is as follows:

1) Take out OffsetOf from UnaryOperator::OpCode and have instead a dedicated class inheriting from Expr (OffsetOfExpr).

2) Create a new class OffsetOfBaseExpr that will replace current (ab-)uses of ImplicitValueInitExpr in the context of the new class above: this new class will embed a TypeSourceInfo object, so as to be able to provide suitable source location info.

Would that be OK?

Regards,
Enea Zaffanella.

I don't understand why you need this second expression class; the base of an offsetof is a type, not an expression. You should be able to get by with just having a TypeSourceInfo as a field in the OffsetOfExpr.

Otherwise this sounds excellent; by all means, go ahead. I don't know of anyone else doing this.

John.

The problem don't come from offsetof base, but from offsetof member:
once taken for granted that the more suitable way to express that is an
Expr we'd need another Expr node to represent the typed base of such
member expression (exactly as it works currently, only with more
appropriate Expr nodes).

We are missing something? There is a smarter way you see?

Forgive me; I missed an important point of your earlier message. I am recommending that you radically restructure the representation of offsetof so that it no longer has any sub-expressions at all and simply carries all the necessary information directly on a single expression node. This node would have an array of "chunks", much like the input Sema gets from the parser. When the type is non-dependent, the actual offset should also be stored directly on the OffsetOfExpr node for the sake of simplicity; it is straightforward to calculate this during validation of the offsetof, and it removes what would otherwise be a great deal of unnecessary complexity from the constant evaluator.

John.