Helpers to compute sizeof/alignof/offsetof?

Currently, there are multiple places that need the values for
sizeof/alignof/offsetof (at least two places in isIntegerConstantExpr
and the visitor methods in CGExprScalar.cpp). However, there is no
common method to compute the needed information. There really should
be common methods (probably in ASTContext?) to get the result of
evaluating these expressions, because the logic isn't trivial.

Sound reasonable?

The reason I came upon this is that sizeof(void) currently crashes,
and there are currently three places that have to be changed to get it
right.

Implementation notes:
The two versions in isIntegerConstantExpr for sizeof/alignof (which
appear to be the same) look mostly correct, although they don't handle
sizeof(void) correctly... the version in CGExprScalar.cpp also
mishandles "sizeof(void ())" (and a theoretical arch where CHAR_BIT !=
8...). Also, ASTContext::getTypeInfo currently mishandles references
in order to get the sizeof case correct, so once an alternative
exists, that should be fixed.

-Eli

Currently, there are multiple places that need the values for
sizeof/alignof/offsetof (at least two places in isIntegerConstantExpr
and the visitor methods in CGExprScalar.cpp). However, there is no
common method to compute the needed information. There really should
be common methods (probably in ASTContext?) to get the result of
evaluating these expressions, because the logic isn't trivial.

Sound reasonable?

Sounds good to me...

snaroff

Currently, there are multiple places that need the values for
sizeof/alignof/offsetof (at least two places in isIntegerConstantExpr
and the visitor methods in CGExprScalar.cpp). However, there is no
common method to compute the needed information. There really should
be common methods (probably in ASTContext?) to get the result of
evaluating these expressions, because the logic isn't trivial.

Sound reasonable?

The reason I came upon this is that sizeof(void) currently crashes,
and there are currently three places that have to be changed to get it
right.

Sounds good to me.

Implementation notes:
The two versions in isIntegerConstantExpr for sizeof/alignof (which
appear to be the same) look mostly correct, although they don't handle
sizeof(void) correctly... the version in CGExprScalar.cpp also
mishandles "sizeof(void ())" (and a theoretical arch where CHAR_BIT !=
8...). Also, ASTContext::getTypeInfo currently mishandles references
in order to get the sizeof case correct, so once an alternative
exists, that should be fixed.

The only big gotcha will be sizeof(VLA), which doesn't return a constant. I don't think there is a good way to represent this without knowing the client. The code generator (f.e.) needs to expand this out to LLVM IR instructions.

-Chris

Currently, there are multiple places that need the values for
sizeof/alignof/offsetof (at least two places in isIntegerConstantExpr
and the visitor methods in CGExprScalar.cpp). However, there is no
common method to compute the needed information. There really should
be common methods (probably in ASTContext?) to get the result of
evaluating these expressions, because the logic isn't trivial.

Sound reasonable?

The reason I came upon this is that sizeof(void) currently crashes,
and there are currently three places that have to be changed to get it
right.

Sounds good to me.

I also think this is a good idea. ASTContext seems like a reasonably place since all the type information is there, and clients of the ASTs are likely to have a reference to ASTContext.

Implementation notes:
The two versions in isIntegerConstantExpr for sizeof/alignof (which
appear to be the same) look mostly correct, although they don't handle
sizeof(void) correctly... the version in CGExprScalar.cpp also
mishandles "sizeof(void ())" (and a theoretical arch where CHAR_BIT !=
8...). Also, ASTContext::getTypeInfo currently mishandles references
in order to get the sizeof case correct, so once an alternative
exists, that should be fixed.

The only big gotcha will be sizeof(VLA), which doesn't return a
constant. I don't think there is a good way to represent this without
knowing the client. The code generator (f.e.) needs to expand this
out to LLVM IR instructions.

Agreed, clients will need to perform their own reasoning about VLAs. What the code generator and (say) the static analysis engine do are context specific.

For the "generic" method that returns the sizeof, we may consider having it return a variant type that wraps an integer primitive. The variant could also have a flag indicating whether or the not the sizeof returned is valid. This way we can have the generic method return "unknown" for VLAs if clients just blindly call sizeof on VLA types. For example:

class SizeofVariant {
   unsigned value; // probably will use something other than unsigned, but you get the idea
   bool IsValid;
public:
   SizeOfVariant(unsigned V) : value(V), IsValid(true) {}
   explicit SizeOfVariant() : value(0), IsValid(false) {}

   unsigned getValue() const { assert (IsValid); return value; }
   bool isValid() const { return IsValid; };

   operator bool() const { return isValid(); }
   operator unsigned() const { return getValue(); }
};

The only big gotcha will be sizeof(VLA), which doesn't return a
constant. I don't think there is a good way to represent this without
knowing the client. The code generator (f.e.) needs to expand this
out to LLVM IR instructions.

Agreed, clients will need to perform their own reasoning about VLAs.
What the code generator and (say) the static analysis engine do are
context specific.

Right.

For the "generic" method that returns the sizeof, we may consider
having it return a variant type that wraps an integer primitive. The
variant could also have a flag indicating whether or the not the
sizeof returned is valid. This way we can have the generic method
return "unknown" for VLAs if clients just blindly call sizeof on VLA
types. For example:

How about a really simple variant: int64_t :slight_smile:

Define -1 to be "invalid" and -2 to be "vla" and you're golden.

-Chris

That's a nice solution because of its simplicity. One thing that I like about the solution that I proposed is the extra checking done by the type system, the automatic assertions, and a clear API to query the sizeof value instead of checking for magic numbers (just use methods: isVLA(), isValid(), etc.). Using a thin class also doesn't cost us anything in terms of runtime overhead, although it will have a negligible impact on compile time.