I have a test case(attached as fc_long.ll) that when run through the optimizer produces 65bit integer math(fc_long-opt.ll).

Now I understand that llvm can have any length integer, but I consider turning a 64bit mul into multiple 65 bit instructions to be a ‘bad’ optimization. This eventually expands to a 128bit multiply call(__multi3) which I have absolutely no interest in supporting. So I’m wondering what optimization might be the culprit here so I can disable it in this situation.

Micah

fc_long.ll (3.22 KB)

fc_long-opt.ll (2.22 KB)

I'm pretty sure this is the fault of one of the loop passes (I forget

which one off the top of my head). It's worth noting that the

optimizer manages to eliminate the loop; computing the exit value of

one of the loop variables uses the wide multiply.

-Eli

It's -indvars. I ran mem2reg, instcombine, and simplifycfg on your

input and got the attached file. I thought I could add "nuw nsw" to

the arithmetic inside the loop to say that i65 is unnecessary, but

that doesn't cause indvars to leave the arithmetic at i64. Is that a

bug in indvars, or in SCEV, or is this just a natural limitation of

SCEV?

fc_long.ll (1.56 KB)

It's a limitation of SCEV; the relevant function is

SCEVAddRecExpr::evaluateAtIteration in ScalarEvolution.cpp.

-Eli

I haven't looked at your attached .ll, but I have added code to the loop analyser to make it do this. Given the choice between not analysing loop evolutions and growing arithmetic by an extra bit, it will expand the arithmetic. It's in SCEV, which is used by pretty much every loop optimizer.

Nick

Villmow, Micah wrote:

It's SCEV. I have this very bit of code disabled in our tree for exactly the

reasons Micah gives. Check out BinomialCoefficient is ScalarEvolution.cpp.

Here's the big hack we put in, which is just the code that used to be there

in 2.3:

// We need at least W + T bits for the multiplication step

// FIXME: A temporary hack; we round up the bitwidths

// to the nearest power of 2 to be nice to the code generator.

unsigned CalculationBits = 1U << Log2_32_Ceil(W + T);

// Cray: Truncate to 64 bits. It's what we did with LLVM 2.3 and

// while it can create miscompilations in some cases if the

// computation happens to overflow, it's no worse than what we did

// before.

if (CalculationBits > 64) {

//return new SCEVCouldNotCompute();

CalculationBits = 64;

DEBUG(DOUT << "*** [dag] SEVERE WARNING: SCEV calculation truncated to 64

bits, check for miscompilation! ***\n");

}

I have never found a test where this actually causes a problem. And we have

tens of thousands of tests.

It is very bad form to create wide arithmetic where none was there before.

This is one of the downfalls of SCEV but I consider it to be minor compared

to the benefits.

-Dave

David,

Thanks for helping me figure out exactly where to put the check. I was

trying to do modifications in other parts of that function and it wasn't

working very well. I've attached a patch that parameterizes this and

allows it to be set from the commandline instead of hardcoding it.

Would it be possible to get this patch pushed into the tree?

Thanks,

Micah

ScalarEvolution.diff (1.09 KB)

LLVM's ScalarEvolution's handling of non-linear recurrences has much room

for improvement. The problem Micah describes sounds like the one

in lib/Analysis/README.txt.

Dan

Villmow, Micah wrote:

What does nsw nuw do?

It means it doesn't overflow whether considered signed or unsigned. For

example, in i8 arithmetic, 255+1 overflows in the unsigned sense, while

127+1 overflows in the signed sense. nsw nuw means neither of these

happens.

Ciao,

Duncan.