[PATCH] add constant expression evaluation to the AST and fix PR2413

Attached patch adds some capability to the AST to evaluate constant
expressions more general than integer constant expressions. It then
uses that capability to remove the hack added for
Sema/darwin-align-cast.c, which fixes PR2413.

This is a significant chunk of code, but this appears to be the best
way to fix the original issue in a way which doesn't impact
correctness for valid code. Some code like this is necessary anyway
to correctly honor 6.6p3 for Sema of initializer constant expressions.
I imagine that the additional code could also be useful for external
consumers (as an alternative to the broken isConstantExpr()), and
possibly also for CodeGen (to reduce the number of codepaths
evaluating constant expressions).

The evaluation bits aren't complete, but they're enough to deal with
Sema/darwin-align-cast.c. I'll probably do the rest incrementally.

As-is, this regresses CodeGen/struct-init.c, but I have a separate fix
for that which I'll commit soon.

I'm a bit tired right now, so I haven't reviewed this carefully myself
yet, but it appears to essentially work. I'm sending it to the list
for general discussion, and so the only copy isn't on my computer.

-Eli

tt.txt (18.6 KB)

This is an updated version with a few minor fixes.

-Eli

tt.txt (19.2 KB)

I'm kind of reluctant to land a patch of this size without some sort
of review... would someone mind looking at this?

Thanks,
Eli

Hi Eli,

Eli Friedman wrote:

  

Attached patch adds some capability to the AST to evaluate constant
expressions more general than integer constant expressions. It then
uses that capability to remove the hack added for
Sema/darwin-align-cast.c, which fixes PR2413.
    
Having a general way to evaluate constant expressions is a great idea and very useful for consumers.
However, having

TryEvaluateExprFloat
TryEvaluateExprComplex
TryEvaluateExprInt
...

doesn't seem to me that it "scales" well. There's going to be a lot of code duplication between all those functions and there isn't a way for the consumer to do action based on the type of the constant:

if (expr is integer)
..
else if (expr is float)
...

How about having a single "EvaluateConstantExpr" method, that accepts a structure that can hold all the possible values in a union (APSInt, APFloat, etc.) plus an enum that can indicate what is the type of the value returned (vt_int, vt_float, vt_complex, etc) ?

To indicate a non constant expression, it could return a vt_notConstant enum, or it could be implemented as "TryEvaluateConstantExpr" and return false in that case.

-Argiris

Argiris Kirtzidis wrote:

if (expr is integer)
..
else if (expr is float)
...

Just to be more clear about this, your design with TryEvaluateExprInt, TryEvaluateExprFloat is not the same as the above; these functions work more like

if (expr can be evaluated as integer)
..
else if (expr can be evaluated as float)
..

This way, for example, you cannot distinguish between an integer constant and a character one.

-Argiris

Having a general way to evaluate constant expressions is a great idea and
very useful for consumers.

Okay, cool.

However, having

TryEvaluateExprFloat
TryEvaluateExprComplex
TryEvaluateExprInt
...

doesn't seem to me that it "scales" well. There's going to be a lot of code
duplication between all those functions

There's a bit of duplication, but they're all working on different
types of data, so it's not that much duplication (for example, integer
addition, pointer addition, float addition, and complex addition all
require different calculations.)

and there isn't a way for the
consumer to do action based on the type of the constant:

if (expr is integer)
..
else if (expr is float)
...

That doesn't really end up being an issue, I don't think; an
expression of type float can only have a constant of type float, same
for complex float/int, same for pointer, (mostly) same for int. (The
"mostly" comes from the fact that an int might actually be a pointer
cast to an int; I'll probably end up sending this case through
TryEvaluateExprPtr as a fallback.)

How about having a single "EvaluateConstantExpr" method, that accepts a
structure that can hold all the possible values in a union (APSInt, APFloat,
etc.) plus an enum that can indicate what is the type of the value returned
(vt_int, vt_float, vt_complex, etc) ?

I was considering that at first... it's a better approach in some
ways, but there's an issue that completely kills this: you can't put
an APSInt or APFloat into a union. Any suggestions?

-Eli

Hmm? Are you saying that you'd want to distinguish between an integer
and a character constant? There isn't any significant difference in
terms of constant evaluation. Or are you saying that it would affect
consumers who shouldn't care? For this, see my other email; I expect
consumers would choose based on the type of the expression.

-Eli

Eli Friedman wrote:

  

and there isn't a way for the
consumer to do action based on the type of the constant:

if (expr is integer)
..
else if (expr is float)
...
    
That doesn't really end up being an issue, I don't think; an
expression of type float can only have a constant of type float, same
for complex float/int, same for pointer, (mostly) same for int. (The
"mostly" comes from the fact that an int might actually be a pointer
cast to an int; I'll probably end up sending this case through
TryEvaluateExprPtr as a fallback.)
  
Hmm, I agree, this isn't much of an issue.

How about having a single "EvaluateConstantExpr" method, that accepts a
structure that can hold all the possible values in a union (APSInt, APFloat,
etc.) plus an enum that can indicate what is the type of the value returned
(vt_int, vt_float, vt_complex, etc) ?
    
I was considering that at first... it's a better approach in some
ways, but there's an issue that completely kills this: you can't put
an APSInt or APFloat into a union. Any suggestions?
  
Ah, that certainly kills it.

I'm convinced that the interface of the methods is the right one. Not a major issue, but to avoid code repetition and make maintenance easier, how about having the implementation of the methods use template specializations ?
That way, for example, ParenExpr's evaluation, would be in one place only since it's common for all evaluation methods.

-Argiris

How about a discriminated union, something like this:

class AnyValue {

   enum { Undefined, Int, Float, ComplexInt, ComplexFloat } Kind;

   void *Data[ sizeof(2*max(APInt, APFloat))/sizeof(void*)];
};

The default ctor would initialize the enum to undefined. When it becomes one kind or another, placement new could be used to set up the body, and the dtor could use placement destroy to destroy it (based on the enum).

-Chris

I think this is a great idea, and a huge improvement over 'CalcFakeICEVal' :).

In addition to the discriminated union, it would also be nice to change this to use a StmtVisitor if possible, just to simplify the code and make it more modular. It would be natural to move the implementation to its own file if it is large, even though the entrypoint would stay a method on Expr.

-Chris