I'm writing for a backend and have a complicated instruction bundle (3 instructions) that has to be executed like a single block (meaning: if the first instruction is executed, all three have to be executed to obtain the result, though not necessarily without other instructions in between). Unfortunately, MachineCSE gets in the way sometimes and rips it apart.
Is there a way to stop CSE from doing its thing (common subexpression elimination) for certain instructions?
I've already tried glueing (gluing?) them together, but that doesn't seem to make a difference.
You may be interested in the (very) recently added explicit instruction bundle support. For an example of its usage, have a look at the ARM backend's IT-block (Thumb2 predication support) pass, which uses bundles to tie instructions together.
Just out of curiosity, wouldn’t such a mechanism work via the patterns in the instruction definitions?
Sorry, but I'm afraid I don't understand your question. Can you elaborate a bit?
If an instruction is marked as side-effect free then it's a candidate for CSE. Try marking the instruction with hasSideEffects.
In my case the target (Tilera) doesn’t have a full 32-bit mult operation. To perform one, it has to accumulate the results of three 16-bit mults, keeping the operands and the result in the same registers across all three steps. However, the ISel DAG thinks it’s a CSE case. Please note this is not a MAdd/MSub triad.
How could I do this by defining such a sequence or pattern for ISD::MUL in the .def file itself?
I believe the hasSideEffects method applies only to Inline Assembly (IA) blocks. What if such a sequence is not part of IA?
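To make the arithmetic behind "three 16-bit mults" concrete, here is a minimal C++ sketch of how the low 32 bits of a product can be accumulated from three 16x16 partial products. The helper name mul32_from_16 and the order of the steps are assumptions for illustration only; the actual Tilera instructions and their operand conventions are not shown in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Tilera instruction): low 32 bits of a * b,
// accumulated from three 16x16 -> 32 partial products.
std::uint32_t mul32_from_16(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    const std::uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    std::uint32_t acc = a_lo * b_lo;   // step 1: low x low
    acc += (a_hi * b_lo) << 16;        // step 2: high x low, shifted into the upper half
    acc += (a_lo * b_hi) << 16;        // step 3: low x high, shifted into the upper half
    // The high x high term would be shifted by 32 bits, so it never
    // reaches the low 32 bits and needs no fourth multiply.
    return acc;
}

// Small self-check comparing against the native wrapping 32-bit multiply.
inline void check_mul32() {
    assert(mul32_from_16(0x12345678u, 0x9ABCDEF0u) == 0x12345678u * 0x9ABCDEF0u);
    assert(mul32_from_16(0xFFFFFFFFu, 0xFFFFFFFFu) == 1u);
}
```

Note that every step reads the same two source operands and updates the same accumulator, which is why the three steps look like redundant computations unless the accumulator dependency is made explicit.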
Eh? There’s a hasSideEffects attribute on the instruction definitions in the .td file. It has nothing to do with inline assembly.
Ah, OK. I think I understand much better now. Thanks! You shouldn’t need bundles for that sort of thing. A custom lowering or a fancy pattern should be sufficient, depending on the details of how your target is defined.
For patterns, look at the various targets’ use of the Pat<>, Pattern<>, ComplexPattern<> and related classes in the .td files.
For examples of custom lowerings, have a look at how other targets handle any operations marked in ISelLowering.cpp as “Custom” operation actions.
I’m doing custom lowering, but I have a very basic issue. The situation is like this:
Mul Dest, Src1, Src2
[Expanded from EmitInstrWithCustomInserter]
Step1 Dest, Src1, Src2 <=== BuildMI(…, Step1, Dest).addReg(Src1).addReg(Src2)
Step2 Dest, Src1, Src2 <=== BuildMI(…, Step2, Dest).addReg(Src1).addReg(Src2)
Step3 Dest, Src2, Src1 <=== BuildMI(…, Step3, Dest).addReg(Src2).addReg(Src1)
I managed to skip CSE on those steps! The mul operation is now expanded into multiple (3 in this case) steps, using the BuildMI calls above. But the “Live Intervals” computation gives a fatal error about multiple definitions of the destination register (Dest), from lib/CodeGen/LiveIntervalAnalysis.cpp. Presumably those addReg calls are wrong. Any hint as to what the correct steps would be?
Adding what was essentially a missing register copy in between solved the problem.
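The final code is not shown in the thread, so the following is only a toy C++ model of why the original expansion trips the "multiple definitions" check and what a fixed shape might look like. It does not use LLVM's API: ToyInstr, isSingleDef, brokenExpansion and fixedExpansion are all hypothetical names, and the placement of the copy is an assumption. The one fact it relies on is that machine code before register allocation is in SSA form, so each virtual register may be defined only once.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction: one defined virtual register
// and a list of used virtual registers. Not an LLVM class.
struct ToyInstr {
    std::string opcode;
    int def;
    std::vector<int> uses;
};

// Returns true if no virtual register is defined more than once.
bool isSingleDef(const std::vector<ToyInstr>& seq) {
    std::map<int, int> defCount;
    for (const ToyInstr& mi : seq)
        if (++defCount[mi.def] > 1)
            return false;
    return true;
}

// Register numbering (assumed): 1 = Src1, 2 = Src2, 3 = Dest.
// Mirrors the three BuildMI calls from the question: Dest is defined 3 times.
std::vector<ToyInstr> brokenExpansion() {
    return {{"Step1", 3, {1, 2}},
            {"Step2", 3, {1, 2}},
            {"Step3", 3, {2, 1}}};
}

// One possible fix (assumed): fresh temporaries 4, 5, 6 carry the
// accumulator from step to step, and a final copy defines Dest once.
std::vector<ToyInstr> fixedExpansion() {
    return {{"Step1", 4, {1, 2}},
            {"Step2", 5, {4, 1, 2}},
            {"Step3", 6, {5, 2, 1}},
            {"COPY",  3, {6}}};
}
```

In this model isSingleDef(brokenExpansion()) is false and isSingleDef(fixedExpansion()) is true. Threading the accumulator through as an explicit use also gives each step a distinct operand list, so the steps no longer look identical to one another.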