Is ConstantFoldConstantExpression meant to be idempotent?

I noticed a change in LLVM’s behavior between 3.2 and 3.3/ToT, but I’m not sure if it qualifies as a bug/regression or not.

The change is that in 3.3 and tip of trunk, calling ConstantFoldConstantExpression on

i64 and (i64 add (i64 add (i64 ptrtoint (i64* getelementptr (i64* null, i32 1) to i64), i64 ptrtoint (i64* getelementptr (i64* null, i32 1) to i64)), i64 15), i64 -16)


i64 and (i64 add (i64 ptrtoint (i64* getelementptr (i64* null, i32 1) to i64), i64 23), i64 -16)

which, when passed back to ConstantFoldConstantExpression, gets boiled down to

i64 16

LLVM 3.2 returns a ConstantInt from the first call. I’ve attached a small testcase which shows this behavior.

Is the fact that ConstantFoldConstantExpression does not produce a “minimal” constant expression a bug, or are clients expected to potentially need to iterate folding to get a final result? I didn’t see anything in 3.3’s release notes about changes to constant folding behavior.

t3.cpp (1.41 KB)

The original design had every constant being folded upon construction, thus
preventing us from ever needing to revisit an CE's operands when trying to
fold a CE. Unfortunately, we actually have two constant folders now, one
which has access to TargetData and the other which doesn't. The TD-aware
folder was designed with the same principle that it shouldn't need to fold
its operands, but in reality this doesn't work since there's no guarantee
that the operands have been folded with the TD-aware folder before. People
have been adding hacks to make the TD-aware folder look deeper into its
operands in an unprincipled fashion.

To answer your direct question, clients shouldn't need to iterate on the
result. That's a bug.

But looking at the larger picture, I think we should fix the underlying
problem by making TargetData a mandatory part of the Module and only having
one constant folder in LLVM.