MLIR dead code analysis

tblah · January 9, 2023, 2:16pm

I am writing a pass for Flang using mlir::dataflow::DenseDataFlowAnalysis (⚙ D140415 [flang] stack arrays pass). The dense data flow analysis requires mlir::dataflow::DeadCodeAnalysis to run first.

DeadCodeAnalysis::visitBranchOperation attempts to fetch conditional branch operand values from the constant propagation analysis pass. If constant values cannot be determined for the branch operands, visitBranchOperation will fall back to marking no edges live.

I think it would make more sense to fall back to marking all successor blocks as live, as they are presumably branch targets reachable depending upon information available at runtime.

Have I understood this behavior correctly, or have I set up the analysis wrong? I am loading mlir::dataflow::SparseConstantPropagation before DeadCodeAnalysis into the mlir::DataFlowSolver instance.

tblah · January 10, 2023, 3:38pm

Tagging @phisiart @Mogball

Mogball · January 10, 2023, 5:07pm

The order in which you load the analyses doesn’t matter. If you have loaded SparseConstantPropagation but DeadCodeAnalysis is still marking all successors as dead, then either those successors are dead or there’s a bug somewhere… Can you post more details?

tblah · January 10, 2023, 6:12pm

Ahh I have found a bug on my end. I am sorry for the noise. This is not a bug.

Nonetheless, would you mind correcting my understanding of DeadCodeAnalysis::visitBranchOperation?

Why doesn’t visitRegionBranchOperation loop over all branch successors even when constant propagation information for branch arguments is not available?

phisiart · January 12, 2023, 1:46am

I’m not the original author but I can provide my theory.

The DataFlowSolver runs all the loaded analyses simultaneously in the same worklist algorithm execution. During the execution, information about ConstantPropagation and 'DeadCodeAnalysis` is propagated together.

When ConstantPropagation and 'DeadCodeAnalysisare both loaded, the fact that some branch operands have noConstantPropagationinformation attached to them means thatConstantPropagation` information has not propagated there yet, which means that the branch op is considered not reachable. Therefore, keeping all successors as dead is the right thing to do.

I think the real problem is that, this design means that you must always load both SparseConstantPropagation and DeadCodeAnalysis into the DataFlowSolver, otherwise you will never mark any edge as live. This limits the flexibility of the API.

I tend to believe that we might as well just make dead code analysis a built-in feature of DataFlowSolver, and ask the analysis writer to provide a callback to specify which successors are live.

jpienaar · January 12, 2023, 3:06am

Is that true? I’ve used just DeadCodeAnalysis along with just another but without constant analysis and it did the correct thing (now to be fair I was surprised I needed to add DeadCodeAnalysis and all the input of interest to me had one block per region, which wasn’t visited without dead code analysis but fine with). It may be good/point to needing more tests for this upstream. I find it surprising that constant analysis is needed (it feels like wrong conservative state).

tblah · January 12, 2023, 9:50am

@jpienaar The constant propagation is used to attempt to predict the outcome of cf.cond_br operations, so if you only had one block per region, DeadCodeAnalysis will work correctly without constant propagation analysis.

@phisiart thank you for the explanation. I agree that including DeadCodeAnalysis as a built-in feature in the DataFlowSolver is a good idea - it was surprising to me first that I needed DeadCodeAnalysis to get the general dense data flow analysis to work, and secondly that I needed the constant propagation analysis to handle control flow blocks correctly…

mikeurbach · January 12, 2023, 8:51pm

I tried out the data flow analysis framework, and also realized the DeadCodeAnalysis was required for pretty much every analysis I wanted to do. So I’m sympathetic to making this easier to get started with, but I’m also curious what making it a “built-in” feature would entail. Maybe rather than built-in in the sense of “always included in DataFlowSolver” is too strong, but we could provide some helper that includes some sensible defaults (e.g. DeadCodeAnalysis and SparseConstantPropagation), which users could extend.

AlexDenisov · May 25, 2025, 11:57am

Just hist a very similar issue. Thanks all for the hints and suggestions.

At first, my custom analysis would not visit any operations without DeadCodeAnalysis.
Then, it wouldn’t visit scf.if/scf.for without SparseConstantPropagation.

It took me way too many hours to find out the reasons behind this behavior.

Shouldn’t the defaults be different? i.e., everything is live/reachable unless proven otherwise?

IMO, DeadCodeAnalysis/SparseConstantPropagation should be an optimization, and not the requirement. Unless I’m missing something?

j2kun · June 9, 2025, 5:11pm

I coincidentally spent an hour helping a colleague debug this same problem again today, in addition to hitting this footgun in 2023. My colleague had added DeadCodeAnalysis to their solver, but not SparseConstantPropagation, and this resulted in the analysis failing to process any BranchOpInterface ops.

j2kun · June 9, 2025, 5:20pm

If anyone has advice on how to properly fix this upstream, I might take a crack at it. Otherwise, I am open to adding an upstream helper function that returns a dataflow solver pre-populated with DeadCodeAnalysis and SparseConstantPropagation.

krzysz00 · June 9, 2025, 7:27pm

I’d argue that

There should be some interface for dataflow analyses to declare their dependencies. Perhaps each lattice element gets a type ID and you can say “I need these lattices available to function” + “these are the lattice kinds I produce”
The dead code analysis should be optional - all the “if this is dead, do X” should be “if a dead code analysis is available and this is dead”

I’d also like to flag [mlir][IntRangeAnalysis] Handle unstructured loop arguments correctly by krzysz00 · Pull Request #119459 · llvm/llvm-project · GitHub , where the current sparse constant propagation + dead code analysis setup causes correctness failures for integer range analysis

Topic		Replies	Views
Some questions regarding deadcode analysis MLIR	0	67	September 8, 2025
SparseDataFlowAnalysis don't visit linalg.generic operation MLIR	4	214	September 13, 2023
Inspecting mlir dataflow analysis MLIR	8	460	January 4, 2024
Another guide on dataflow analysis in MLIR MLIR	0	141	June 2, 2025
[RFC] Add visitBranchPropertyArgument API to SparseDataFlowAnalysis and apply it in LivenessAnalysis/RemoveDeadValues/IntegerRangeAnalysis MLIR	2	176	December 27, 2025

MLIR dead code analysis

Related topics