[RFC] Disabling DAG combines in /O0

Hi all,

The DAGCombiner pass actually runs even if the optimize level is set to None. This can result in incorrect debug information or unexpected stepping/debugging experience. Not to mention that having good stepping/debugging experience is the major reason to compile at /O0.

I recently suggested a patch to disable one specific DAG combine at /O0 that broke stepping on a particular case (http://reviews.llvm.org/D19268), but other similar cases could appear. In the same way, another patch was submitted last year for a similar reason (http://reviews.llvm.org/D7181).

So, since the DAGCombiner is in fact an optimization pass, could it be disabled completely (or partially?) in /O0? And how should it be done?

For example, would this patch be too aggressive?

Index: DAGCombiner.cpp

Hi Marianne,

This has been tried before. The problem is that although, ideally, DAGCombine should only be used for optimization, in practice there exist some combines that are required for correctness. So making this work properly may require a fairly large clean-up job and a lot of testing.I suggest you look at PR22346, http://reviews.llvm.org/D8614 and http://reviews.llvm.org/D9992 for context.

Michael

(Remembering to re-add llvm-dev again…)

Yes, this would be too aggressive. DAG combiner does more than just optimization (which I discovered as part of D7181, see PR22346).

Internally we’ve thrown around other ideas, for example inside the worklist loop, skip loads and stores at O0, but haven’t done any actual experiments. There are probably other bits of the combiner that could be turned off at O0, but I think they’d have to be considered individually. If you want to run with this one, go right ahead.

–paulr

Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Hi,

DAG combiner does indeed optimizations but also canonicalization.
The latter, we probably need to keep, the former, we should be able to disable.

When I asked Marianne to do a RFC about this, I was hoping we could get new ideas on how to tackle this problem.

I am fine with the approach of disabling the optimizations one by one when we know it is indeed an optimization. However, I am not a fan of having "if (OptLevel == <...>)" scattered all around the place.
Instead, I would like to have a sort of list of optimizations that differ between the opt level == none and the other opt level, then we run just that list.

In other words, I was thinking we could have some kind of registration mechanism for each optimization and this is a matter of classifying them.

Anyhow, what are other people thoughts?

Note: I understand this is a big refactoring and maybe not one that we want to do now, but I believe we could reuse some of the ideas we are discussing here when we eventually do combines in GobalISel.

Cheers,
-Quentin

This has been discussed before and AFAIR, the problem is that the exact handshakes between combining and legalization are not documented. A specification of the canonical forms during various stages of legalization would make it much easier to decouple canonicalization from optimization.

Also, do we know that the required canonicalization will not have the same problem as the combining? That is, whether the necessary canonicalization steps will not affect the "debuggability" of the program?

-Krzysztof

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Krzysztof Parzyszek via llvm-dev
Sent: Tuesday, May 17, 2016 6:25 AM
To: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] [RFC] Disabling DAG combines in /O0

> DAG combiner does indeed optimizations but also canonicalization.

This has been discussed before and AFAIR, the problem is that the exact
handshakes between combining and legalization are not documented. A
specification of the canonical forms during various stages of
legalization would make it much easier to decouple canonicalization from
optimization.

Also, do we know that the required canonicalization will not have the
same problem as the combining? That is, whether the necessary
canonicalization steps will not affect the "debuggability" of the program?

If canonicalization has the same problem as combining, then we already
have the debug-info problem at -O0 in some cases, and we'd want to fix
those anyway.

I'd hope that canonicalization transformations are relatively simple.
In that case, loss of debug info is generally an oversight or a lack
of straightforward bookkeeping (based on my experience in some other
compilers). Fixing the canonicalization debug-info problems, if there
are any, should not be a major undertaking.
--paulr

I think the hard part here is that we’re really, really bad at registration systems. ;]

I think switching DAGCombine to use a registration system so that we can use that registration system to also declare a classification between optimizations and canonicalizations that are necessary for lowering would be a huge change, if a really good one.

My question is: is it worth that in the face of progress towards global isel? I’m not sure it is, but I’m also not sure it isn’t. =/ this seems to be a hard tradeoff question.

I wonder, is the sprinkling of conditions in dag combining code for this going to be significantly worse than the existing sprinkling of conditions to select whether a combine is pre- or post-legalization (for each of the different phases of legalization)? If it is going to be significantly less intrusive, then it might make more sense to just hack the DAG combines with conditions to address the immediate problem rather than taking on the whole task of moving to a mechanical dispatching system for combines. If it would make the situation 2x worse though (that is, it has the same level of overhead), then perhaps it does make sense to do the big refactoring first.

-Chandler

My question is: is it worth that in the face of progress towards global isel? I'm not sure it is, but I'm also not sure it isn't. =/ this seems to be a hard tradeoff question.

I wondered the same. GlobalISel is going to have to reimplement pretty much everything a backend has in order to be equivalent.

It would certainly make life easier if 'SelectionDag -O0' only did lowering, legalisation and selection with no combines (and possibly no canonicalisation) as then 'GlobalIsel -O0' can get equivalence much sooner. Also makes diffing their outputs at -O0 much easier. And then diffing again at -O1 to see what else is missing to match outputs there.

Pete

Hi,

DAG combiner does indeed optimizations but also canonicalization.The latter, we probably need to keep, the former, we should be able to disable.

When I asked Marianne to do a RFC about this, I was hoping we could get new ideas on how to tackle this problem.

I am fine with the approach of disabling the optimizations one by one when we know it is indeed an optimization. However, I am not a fan of having “if (OptLevel == <…>)” scattered all around the place.
Instead, I would like to have a sort of list of optimizations that differ between the opt level == none and the other opt level, then we run just that list.

In other words, I was thinking we could have some kind of registration mechanism for each optimization and this is a matter of classifying them.

Anyhow, what are other people thoughts?

I think the hard part here is that we’re really, really bad at registration systems. ;]

I think switching DAGCombine to use a registration system so that we can use that registration system to also declare a classification between optimizations and canonicalizations that are necessary for lowering would be a huge change, if a really good one.

My question is: is it worth that in the face of progress towards global isel? I’m not sure it is, but I’m also not sure it isn’t. =/ this seems to be a hard tradeoff question.

Agree, I don’t think it makes sense to do that in SDISel. Though I was curious to have ideas to see how we would tackle that when we eventually do combines in GISel.

I wonder, is the sprinkling of conditions in dag combining code for this going to be significantly worse than the existing sprinkling of conditions to select whether a combine is pre- or post-legalization (for each of the different phases of legalization)? If it is going to be significantly less intrusive, then it might make more sense to just hack the DAG combines with conditions to address the immediate problem rather than taking on the whole task of moving to a mechanical dispatching system for combines. If it would make the situation 2x worse though (that is, it has the same level of overhead), then perhaps it does make sense to do the big refactoring first.

Agreed.

Q.