Hi Evgeny,
Hi Quentin,
Yes, the code allows connected instructions to be processed. However, keep in mind that the instruction immediately after the one currently being processed must never be erased, because that invalidates the iterator.
Indeed.
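To make the iterator point concrete, here is a minimal sketch. It assumes a std::list of opcode names stands in for the basic block, and the names Block and eraseDeadCompares are made up for illustration; this is not the actual PeepholeOptimizer loop.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: a linked list of opcode names.
using Block = std::list<std::string>;

// Walks the block with the iterator already advanced past the current
// instruction. Erases every "CMP" that is directly followed by an
// unconditional "BR" (pretending the compare has become dead).
// Returns the number of erased instructions.
int eraseDeadCompares(Block &bb) {
  int erased = 0;
  for (auto it = bb.begin(); it != bb.end();) {
    auto cur = it++;  // `it` now refers to the *next* instruction
    if (*cur == "CMP" && it != bb.end() && *it == "BR") {
      bb.erase(cur);  // safe: only iterators to `cur` are invalidated
      ++erased;
    }
    // Calling bb.erase(it) here instead would leave the loop holding a
    // dangling iterator, which is the situation to avoid.
  }
  return erased;
}
```

Erasing the current instruction (or an earlier one) is fine in this scheme; only the element the loop iterator already points at is off limits.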
I’ve been fixing a bug in AArch64InstrInfo::optimizeCompareInstr: instructions are converted into S form but it’s not checked that they produce the same flags as CMP. The bug exists upstream as well.
Could you file a PR or just push the patch? :)
Together with the fix I want to add some peephole rules for the combinations CMP+BRC and CMP+SEL. In the context of optimizeCmpInstr I have all the information about CmpInstr. I simply go down and check whether each instruction that uses AArch64::NZCV can be replaced with a simpler version. Finally, I delete CmpInstr. This approach contradicts the PeepholeOptimizer design, because BRC and SEL must be processed in their corresponding functions.
OK, I get your concern: basically you want to do the CMP+BRC or CMP+SEL optimization inside optimizeCmpInstr instead of in optimizeSelect and optimizeCondBranch, so that you don’t do the analysis twice.
Historically the peephole optimizer has processed patterns bottom-up (use to def). The rationale is that we only have one def but we may have several uses. In other words, it is easy to replace a use once you have proven that it is correct. What you want is top-down (def->use), and in that case you need some extra checks (the potential other uses) to prove that the def can be optimized.
Bottom line: I believe it is not done this way because it is not peephole-ish in terms of complexity.
Yes, ‘analyzeCompare’ is cheap, but in optimizeCondBranch and in optimizeSelect we need to go up to find the instruction that defines the condition flags.
Going up is generally cheap: we just ask for the unique definition of the vreg. I believe in your case it is not cheap because you are tracking a physical reg and not a vreg.
Is that the problem?
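A rough sketch of the difference, under the assumption that a virtual register has exactly one definition recorded in a table, while the flags register can be redefined many times. The types and function names below are hypothetical and only model the cost, not the real LLVM data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical instruction: only records whether it defines the flags.
struct MInst {
  bool defsFlags;
};

// Virtual register: one definition, so a single table lookup is enough.
// Returns the index of the defining instruction, or -1 if unknown.
int findVRegDef(const std::unordered_map<int, int> &defs, int vreg) {
  auto it = defs.find(vreg);
  return it == defs.end() ? -1 : it->second;
}

// Physical register (the flags): the nearest definition has to be found
// by walking backwards from the use, which costs time proportional to
// the distance. Returns the index of the def, or -1 if none is found.
int findFlagsDef(const std::vector<MInst> &bb, int use) {
  assert(use >= 0 && static_cast<std::size_t>(use) <= bb.size());
  for (int i = use - 1; i >= 0; --i) {
    if (bb[i].defsFlags) return i;
  }
  return -1;
}
```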
In the case of BRC, CMP should not be far from it, but I am not sure about SEL. Also, when BRC is replaced with BR, CMP can be removed (BTW, processing of the instructions below BRC can be stopped). I don’t know if there are any restrictions on the instructions below BRC.
You should have only terminators at the end of the BB.
You may have another branch though.
Anyway, I don’t expect many of them. In the case of CMP+SEL we cannot remove CMP after simplifying SEL, because there can be other SEL instructions using the flags from CMP.
This is what I meant when I said that defs need more checks. That’s why optimizeSelect seems a good fit for that.
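As a sketch of the extra check on the def side: before a flag-setting instruction can be erased, every later instruction up to the next flag definition has to be examined. The Inst struct and flagsDefIsDead are hypothetical, and the sketch assumes the flags are not live out of the block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: only tracks its effect on flags.
struct Inst {
  bool readsFlags;   // e.g. a conditional branch or a select
  bool writesFlags;  // e.g. a compare or another flag-setting instruction
};

// Returns true if the flag-setting instruction at index `def` has no
// remaining readers before the flags are overwritten or the block ends.
// Assumption: the flags are not live out of the block.
bool flagsDefIsDead(const std::vector<Inst> &bb, std::size_t def) {
  assert(def < bb.size());
  for (std::size_t i = def + 1; i < bb.size(); ++i) {
    if (bb[i].readsFlags) return false;  // still has a user
    if (bb[i].writesFlags) return true;  // flags are redefined here
  }
  return true;
}
```

With one CMP feeding two SELs, simplifying the first SEL is not enough: the check keeps returning false until the second SEL no longer reads the flags either.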
> I have to admit I don’t see the concern with the instruction being condition dependent; we don’t want to call optimizeCondBranch :).
> I believe I missed your point.
I missed your point too :) I think it’s always good to get rid of CondBranch.
I was talking about the code in the peephole optimizer :).
Like:
if isCondBranch then optimizeCondBranch
We don’t want an unconditional call to optimizeCondBranch. I.e., optimizeCondBranch expects a condbranch as its argument.
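Spelled out as a small self-contained sketch (the Kind enum, Instr struct, and visit function are invented for illustration and are not the peephole optimizer's real types):

```cpp
#include <cassert>

enum class Kind { CondBranch, Select, Other };

// Hypothetical instruction with just enough state for the dispatch.
struct Instr {
  Kind kind;
  bool isCondBranch() const { return kind == Kind::CondBranch; }
};

// Hypothetical hook: it assumes its argument is a conditional branch.
bool optimizeCondBranch(const Instr &mi) {
  assert(mi.isCondBranch() && "caller must check the instruction kind");
  return true;  // pretend the branch was simplified
}

// The guard comes first, so the hook is never reached for other kinds.
bool visit(const Instr &mi) {
  return mi.isCondBranch() && optimizeCondBranch(mi);
}
```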
Cheers,
-Quentin