Turn off certain IR optimizations

I (Brian, actually) am compiling EMBench, and certain IR optimizations are getting in the way of good code for my architecture. For example, in EMBench's edn.c, the subroutine mac::

    long int
    mac (const short *a, const short *b, long int sqr, long int *sum)
    {
      long int i;
      long int dotp = *sum;

      for (i = 0; i < 150; i++)
        {
          dotp += b[i] * a[i];
          sqr += b[i] * b[i];
        }

      *sum = dotp;
      return sqr;
    }

gets turned into (the C equivalent of)::

    long int
    mac (const short *a, const short *b, long int sqr, long int *sum)
    {
      long int i;
      long int dotp = *sum;

      for (i = 150; i > 0; i--)
        {
          dotp += b[150-i] * a[150-i];
          sqr += b[150-i] * b[150-i];
        }

      *sum = dotp;
      return sqr;
    }

This makes the loop 13 instructions instead of 8.
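
To make the two loop shapes concrete, here is a minimal sketch that places both forms side by side and checks that they compute the same result. The names `mac_up`, `mac_down`, `mac_forms_agree` and the test data are made up for illustration, and the count-down bound is written as `i > 0` so that both loops run exactly 150 iterations. It says nothing about the instruction counts on any particular ISA; it only shows that the rewrite preserves semantics.

```cpp
constexpr long N = 150;  // trip count used by mac() in edn.c

// Original shape: count up and index directly with i.
long mac_up(const short *a, const short *b, long sqr, long *sum) {
    long dotp = *sum;
    for (long i = 0; i < N; i++) {
        dotp += b[i] * a[i];
        sqr += b[i] * b[i];
    }
    *sum = dotp;
    return sqr;
}

// Rewritten shape: count down towards zero and index with N - i.
long mac_down(const short *a, const short *b, long sqr, long *sum) {
    long dotp = *sum;
    for (long i = N; i > 0; i--) {
        dotp += b[N - i] * a[N - i];
        sqr += b[N - i] * b[N - i];
    }
    *sum = dotp;
    return sqr;
}

// Hypothetical helper: runs both shapes on the same small, deterministic
// input and reports whether the results match.
bool mac_forms_agree() {
    short a[N], b[N];
    for (long i = 0; i < N; i++) {
        a[i] = static_cast<short>(i - 70);
        b[i] = static_cast<short>(3 * i + 1);
    }
    long sum_up = 5, sum_down = 5;
    long sqr_up = mac_up(a, b, 7, &sum_up);
    long sqr_down = mac_down(a, b, 7, &sum_down);
    return sqr_up == sqr_down && sum_up == sum_down;
}
```

Compiling each function separately for the target and comparing the loop bodies is one way to see which shape the back end actually prefers.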

{Hint: my architecture has an instruction that does ADD-CMP-BC as 1 instruction in 1 cycle, so this optimization is completely unnecessary for my architecture.
Secondarily: my ISA can perform b[i] in a single instruction, but b[150-i] takes 3 instructions.}

I understand that in many ISAs counting towards zero takes 1 fewer instruction
than counting up to a numeric value. Mine does not happen to be one of those.

How can I turn this off?

In addition, the IR optimizer is converting things like::

    if( u < 16 )

into::

    if( u <= 15 )

{Probably only for the unsigned types}

Is there a way to turn this “optimization” off?
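
For reference, a minimal sketch showing that the two comparison forms are interchangeable for an unsigned operand, which is why an optimizer is free to pick either one. The function names and the 32-bit operand width are assumptions chosen for illustration.

```cpp
#include <cstdint>

// Strict form, as written in the source.
bool lt16(std::uint32_t u) { return u < 16; }

// Non-strict form, as reported after optimization.
bool le15(std::uint32_t u) { return u <= 15; }

// Hypothetical helper: checks both forms on boundary values and extremes.
bool unsigned_forms_agree() {
    const std::uint32_t samples[] = {0u,  1u,          14u,         15u,
                                     16u, 17u,         0x7fffffffu, 0x80000000u,
                                     0xffffffffu};
    for (std::uint32_t u : samples) {
        if (lt16(u) != le15(u)) {
            return false;
        }
    }
    return true;
}
```

Which form is cheaper depends only on what immediates and condition codes the target's compare-and-branch instructions support.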

Thanks, greatly.

Can you please clarify which pass performs this transform? From the middle-end perspective, I don’t think we prefer an IV that counts down, so I’d be surprised if a middle-end transform introduces this.

In the back end, the LSR (loop strength reduce) pass is responsible for bringing loop induction variables into a form that is beneficial for the target. This is the pass that knows about details like whether “b[150-i]” can be lowered efficiently for a specific target or not.
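
As a source-level illustration of what "a form that is beneficial for the target" can mean, the sketch below writes the same reduction with two different induction variables: an integer index and a moving pointer. This is not actual LSR output, and the function names are invented; it only shows the kind of equivalent shapes that a strength-reduction pass chooses between.

```cpp
// Shape 1: integer induction variable, address computed as b + i each time.
long sum_squares_indexed(const short *b, long n) {
    long s = 0;
    for (long i = 0; i < n; i++) {
        s += b[i] * b[i];
    }
    return s;
}

// Shape 2: pointer induction variable, no separate index register.
long sum_squares_pointer(const short *b, long n) {
    long s = 0;
    for (const short *p = b, *end = b + n; p != end; ++p) {
        s += *p * *p;
    }
    return s;
}
```

A target with cheap scaled-index addressing tends to favor the first shape, while one with post-increment loads may favor the second.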

If LSR is the pass performing this transform, then you likely need to implement some TTI hooks that control the behavior of this pass.

This is Brian (aka Mitch’s compiler guy). It is LSR that is responsible. I have implemented two TTI hooks but that does not seem to do the trick. The hooks are:

    bool isNumRegsMajorCostOfLSR()
    { return false; }

    bool isLSRCostLess(TargetTransformInfo::LSRCost &C1,
                       TargetTransformInfo::LSRCost &C2) {
      // My66000-specific: instruction count is the first priority.
      return std::tie(C1.Insns, C1.NumRegs, C1.AddRecCost,
                      C1.NumIVMuls, C1.NumBaseAdds,
                      C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
             std::tie(C2.Insns, C2.NumRegs, C2.AddRecCost,
                      C2.NumIVMuls, C2.NumBaseAdds,
                      C2.ScaleCost, C2.ImmCost, C2.SetupCost);
    }

What additional things do you suggest?
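
For readers unfamiliar with the `std::tie` idiom in the hook, here is a self-contained sketch of how the lexicographic comparison ranks two candidates. `ModelCost` is a hypothetical stand-in that borrows three field names from `TargetTransformInfo::LSRCost`; it is not the LLVM type, and the numbers are invented. Note that a hook of this kind only ranks the candidate solutions LSR has already generated and costed, so it may not be enough by itself to change the outcome.

```cpp
#include <tuple>

// Hypothetical, reduced stand-in for TargetTransformInfo::LSRCost.
struct ModelCost {
    unsigned Insns;
    unsigned NumRegs;
    unsigned AddRecCost;
};

// Instruction count is compared first; later fields only break ties.
bool insnsFirstLess(const ModelCost &A, const ModelCost &B) {
    return std::tie(A.Insns, A.NumRegs, A.AddRecCost) <
           std::tie(B.Insns, B.NumRegs, B.AddRecCost);
}

// Same fields with register count compared first, for contrast.
bool regsFirstLess(const ModelCost &A, const ModelCost &B) {
    return std::tie(A.NumRegs, A.Insns, A.AddRecCost) <
           std::tie(B.NumRegs, B.Insns, B.AddRecCost);
}
```

With a candidate of 8 instructions and 5 registers against one of 13 instructions and 3 registers, `insnsFirstLess` prefers the first and `regsFirstLess` prefers the second, which is the difference the ordering of the tuple fields is meant to express.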

I’m not really familiar with LSR hooks, maybe @davemgreen would have some pointers for you. Possibly the addressing mode hooks (getPreferredAddressingMode, isLegalAddressingMode, etc) would be relevant in your case.

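If “u<16” is becoming “u<=15”, that’s the opposite of the normal canonicalization in InstCombine, which prefers the strict form of a comparison against a constant (icmp ult u, 16 rather than icmp ule u, 15).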

Not much else can be said without a runnable example that shows the problem.

The IR optimizations for our target architecture seem to fall into several categories:

  1. Not applicable - e.g. all the SIMD vector cruft
  2. Unhelpful - e.g. LSR, loop unrolling
  3. Benign
  4. Possibly useful

What I would like to do is turn off all the IR optimizations, then turn them back on one at a time.
How do I do that?

By reading the LSR source I found there is a hidden flag “--disable-lsr”.
Now I just need to discover how to set that boolean from our architecture startup files so I don’t have to do it from the command line.